Software Secret Weapons™  
Code Linguine posted by Pavel Simakov on 2007-01-04 15:32:05 under Code Linguine

If you have a large-scale Delphi project, please read on!

The Code Linguine package was created for large-scale reengineering, analysis, and internationalization of Delphi applications. We do mean large scale: so large that manual fixes are a no-no! Large scale means millions of lines of source code!

Often the source code of a large project reminds you of seafood linguine in an Italian restaurant. Just like on a plate of linguine, all the pieces are intertwined... This is where the name Code Linguine comes from.

We had to do this kind of reengineering for two large commercial projects. This package helped us parse and analyze several hundred *.pas files. First, the source files were parsed with Code Linguine. Then the parse trees (abstract syntax trees, ASTs) of these files were converted into XML DOM documents. Finally, XSLT and XPath were used to query the XML representations of these files. The queries answered various questions about the source code structure and coding style.
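The parse, convert, query pipeline described above can be illustrated with a minimal Python sketch. It is not Code Linguine itself: the tuple-based tree, the node names (`unit`, `procedure`, `try`, `raise`) and the attributes are all made up for illustration, and the query uses the limited XPath subset of Python's standard `xml.etree.ElementTree` module rather than a full XSLT/XPath engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical AST: (node_kind, attributes, children). This is NOT the
# actual Code Linguine tree format, only a stand-in for illustration.
ast = (
    "unit", {"name": "Orders"}, [
        ("procedure", {"name": "Save", "line": "12"}, [
            ("try", {"line": "14"}, [
                ("raise", {"class": "EOrderError", "line": "16"}, []),
            ]),
        ]),
        ("procedure", {"name": "Load", "line": "30"}, [
            ("raise", {"class": "EIOError", "line": "33"}, []),
        ]),
    ],
)


def ast_to_xml(node):
    """Recursively turn a (kind, attrs, children) tuple into an XML element."""
    kind, attrs, children = node
    element = ET.Element(kind, attrs)
    for child in children:
        element.append(ast_to_xml(child))
    return element


root = ast_to_xml(ast)

# XPath-style query: every raise statement anywhere in the unit.
raises = [(r.get("class"), r.get("line")) for r in root.iterfind(".//raise")]
print(raises)
```

Once the tree is XML, each new question about the code becomes a new query instead of a new hand-written tree walker, which is what makes the approach practical across hundreds of files.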

Here are some sample questions we were able to answer in the process:

  • Where do I raise exceptions?
  • Where do I catch exceptions?
  • Where do I NOT catch exceptions?
  • Do I have any constants declared?
  • Do I have any constants declared that contain string literals?
  • Do I have any string literals not declared as constants?
  • Do I have any variables not bound to a class or function?
  • Do I have any variables globally accessible in the interface section?
  • Do I have any resource strings?
  • Do I have any string literal properties?
and so on. There is a page that explains how this works in depth.
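As a rough illustration of how a few of these questions turn into queries, here is a small Python sketch. The XML below is hypothetical: its element and attribute names are invented for this example and do not reflect the schema Code Linguine actually produces. It also uses the XPath subset in Python's `xml.etree.ElementTree`, which has no ancestor axis, so the "not declared as a constant" check is done with a small helper instead of a single XPath expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML for one parsed unit; element and attribute names are
# invented for this sketch and do not reflect Code Linguine's real schema.
SOURCE_XML = """
<unit name="Orders">
  <const name="MaxItems"><number value="100"/></const>
  <const name="AppTitle"><string value="Order Desk"/></const>
  <resourcestring name="SSaved"><string value="Saved"/></resourcestring>
  <procedure name="Save">
    <call name="ShowMessage"><string value="Order saved"/></call>
    <raise class="EOrderError"><string value="Save failed"/></raise>
  </procedure>
</unit>
"""

root = ET.fromstring(SOURCE_XML)

# ElementTree elements have no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def has_ancestor(element, tags):
    """Return True if any ancestor of element has a tag listed in tags."""
    node = parent_of.get(element)
    while node is not None:
        if node.tag in tags:
            return True
        node = parent_of.get(node)
    return False


# "Do I have any constants declared that contain string literals?"
string_consts = [c.get("name") for c in root.iterfind(".//const[string]")]

# "Do I have any resource strings?"
resource_strings = [r.get("name") for r in root.iterfind(".//resourcestring")]

# "Do I have any string literals not declared as constants?"
hardcoded = [
    s.get("value")
    for s in root.iter("string")
    if not has_ancestor(s, {"const", "resourcestring"})
]

print(string_consts)
print(resource_strings)
print(hardcoded)
```

With a full XPath 1.0 engine, the last question could be written as one expression, along the lines of `//string[not(ancestor::const or ancestor::resourcestring)]`, again assuming these made-up element names.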

Why not use plain grep? Grep will work, of course, but not for a large product or a large software development team. In a large product, the work of pulling strings out must be coordinated, planned, and completed in chunks, one subsystem at a time. So you would be running grep day and night, over and over again...

Microsoft went through a similar experience when they adapted the Windows codebase to support Unicode. They actually wrote a source code analysis tool (for C and C++) to work out a migration strategy.

Both projects were a great success, both technically and commercially. A large software product was reengineered to support a multilingual user interface. Compliance tools and processes were created to maintain the quality of the product. All of it was done in record time and with superb quality.

Having a parser for the language you are working with is a great thing! It allows an entirely new level of source code manipulation and code improvement! You can also visualize schemas and language grammars; check out the Linguine Maps project to learn more. Please inquire if you need similar help with your own projects.

Good luck!

Comments (2)

  • Comment by Niek Sluyter — June 20, 2007 @ 3:39 am

    Hello Pavel,
    Looks like a nice piece of software; I see you have even implemented the {$Define} preprocessor.
    I was searching for universal parsers to make an object tree of the parsed source.
    Regards,
    Niek

  • Comment by Yogi Yang — June 6, 2008 @ 3:42 am

    This is a good project. Thanks for releasing it as GPL.

    I am actually a newbie to Delphi and Object Pascal, so your project will help me learn Delphi's Pascal implementation better.

    I have a question though:

    I want to parse files similar to what I am pasting. Is it possible to parse such files easily?

    Sample text that I want to parse.

    store( &Copyright ) "Copyright © 2008 nothing"

    c need to include international keyboards in this list also
    store(nonK) "ACDFJKMPSTVWXYZ`|<>0123456789=^&*()';" "œ"
    store(notrans) "abcdefghijklmnopqrstuvwxyz- /"
    + "f" > "ph"
    "^" + "e" > "ê"
    + "a" > "b"
    + "b" > "c"
    "b" + "c" > "d"
    "c" + "d" > "e"
    store(vowels) "aeiou"
    + any(vowels) > "."

    store(lowercase) "abcdefghijklmnopqrstuvwxyz"
    store(uppercase) "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    + any(lowercase) > index(uppercase,1)

    "a" + any(store1) > index(store2,2)
    "ab" any(store1) + "c" > index(store2,3)

    d065 any(store1) "B" + any(store2) > index(store3,4)

    any(st1) + any(st2) > index(st1,2) index(st2,1) index(st3,2)

    Thanks,

    Regards,

    Yogi Yang




  Copyright © 2004-2007 by Pavel Simakov