SQL parser

Related products: Software Factory

While developing I have run into issues multiple times where my solution was "Thinkwise has to implement it's own SQL parser". One example: I'm renaming a column, now I have to look through all templates using the name of this column to make sure all of them are renamed.. Now if the SF would be able to read my code, it should be able to rename those references automatically without a hitch.

Also things like validations could be implemented more easily, like the example given here: https://community.thinkwisesoftware.com/ideas-18/code-template-validation-trigger-with-parameter-237

Another benefit would be that after editing my model I could validate my code based on the model instead of the (out-of-date) database.
Long in depth answer ahead. The good news is in the last paragraph.



One of the SF's core design principles has always been to be language agnostic. This has allowed us to not just write templates for Microsoft's SQL Server, but also in various SQL dialects such as those used by DB2 and Oracle databases. We've had templates in Java, C#, JavaScript, and XML. And if a developer requires it, he could write templates in whatever language is the flavor of the year. This kind of flexibility yields a degree of raw power that cannot be overestimated.



Notwithstanding design principles, Transact-SQL has been given preferential treatment because most application logic is written in it and as such warrants implementing a SQL parser. Writing a parser for a particular SQL dialect though means that the SF's full power will only be available to partners who choose to work with one particular RDBMS, and who write their application logic in SQL. Are you sure you're looking for this kind of lock-in? Or are there expectations that (a) the used language is static and will never change, or (b) whenever the preferred target language changes, we'll have another parser for it on our shelf?



You see, we cannot just use an existing 3rd party parser, because our templates are only fragments of the SQL that's yet going to be generated. We'd have to accurately predict where templates end up, and that totally depends on what code lives inside the control procedure that distributes the template over various program objects. If we require 100% accuracy, we'd either have to give up on control procedures, solve the halting problem, or generate code each time a template is saved (performance cost?) and find a way to map error messages back onto the templates it generated - and what if e.g. a variable rename in the edited template causes an error in another...? This is a dead end, unfortunately.



So if we do this, we have to rely on a heuristic parser, which takes dependencies on model settings. And this means that whatever target language we choose, we'll always have to roll our own parser. It is guaranteed to give many false positives and negatives, and won't hardly be as useful as we want it to be. Right now it seems to be in the realm of projects with greater costs than benefits, if not outright unfeasible. And that's just one language.



Meanwhile, we're developing a Business Rule Modeler, which models application logic and has verifiable references to model objects. It addresses the maintenance issues in a different way, and will be a more valuable addition to the SF once released.
Wow, thank you very much for the in-depth answer.



I think I understand the problems with having your own parser. I don't think I fully agree with all your statements, but I think you do have me convinced.



It got me wondering though, how do others do it. So as an example I looked at Unity which allows coding in C# and javaScript, not as many as you strive to support, but multiple anyways. What Unity does though (as far as I can see), is delegate managing of the (C#) code to Visual Studio. This makes a lot of sense, because Visual Studio is very very good at handling C# code. Unfortunately SSMS does not seem to work as well... so delegating all this to others (SSMS) (more than is done now) does not seem to be an option either.



I still feel I want to be helped more (I have been brought up using Visual Studio). I'm not sure this is fully resolved with a Business Rule Modeler, but I can see how that will bring us closer.
As you speculated, the difficulty to analyze or compile code from an environment such as the Software Factory (or Unity for that matter) greatly depends on the language. Some languages have great tools available that enable you to do these things quite easily. Others not so much.



For .NET, compiling arbitrary .NET code (C#, VB.NET) has always been quite easy. A while ago, Microsoft introduced their compiler platform called Roslyn, which provides even more powerful features. Roslyn has an SDK with APIs for code analysis, syntax tree inspection, code generation and compilation. I'm not that familiar with Unity, so I'm not quite sure how they handle their C# code, but these days there are plenty of ways to validate and compile C#, including delegating various tasks to Visual Studio. It's worth mentioning that Unity has dropped support for UnityScript (it's not actually Javascript) in favor of C#, probably due to the flexibility provided by Microsoft's tooling.



The main reason the Business Rule Modeler will alleviate the issues described by this post is that it will store the application logic in the application model in a form similar to a syntax tree. You could say that the Business Rule Modeler is a friendly viewer and editor for the syntax tree that does not require any parsing of text. The way in which the application logic will be stored allows us to query the application logic to detect errors and do things like finding identifiers that need to be renamed.



I hope this clarifies a few things.
Updated idea status OpenClosed