
  • To: xml-dev@l...
  • Subject: Schema for Programming Language
  • From: Allen Razdow <arazdow@m...>
  • Date: Wed, 14 May 2003 17:37:01 -0400

I'm writing a schema to represent documents that are programs in a particular programming language.  The most useful part of the schema would represent the "abstract syntax" of the program, essentially a canonical form of the program stripped of concrete syntax and sugar.  Can anyone give guidance on accepted or conventional ways to include the concrete syntax and sugar in the schema while keeping them separate from the abstract syntax?

 

For example, consider the code fragment:

 

     if (x<0)
            ++x;
     else
            --x;

 

Something like

 

<if-then-else-statement>
            <block>
                        <unary-op type="increment" arg="x"/>
            </block>
            <block>
                        <unary-op type="decrement" arg="x"/>
            </block>
</if-then-else-statement>

 

could represent the abstract syntax of the fragment, but it doesn't capture the concrete syntax, such as the programmer's choices for indenting.  The specific operator syntax such as "++" can be generated with XSLT because it is part of the language, but what about the whitespace, indenting, and line breaks?  Where should they appear in the XML of the fragment?  Having two separate schemata, one echoing the program and one its parse tree, seems wrong.  XML should be able to annotate the parse tree with the syntactic specifics somehow.
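
To make the question concrete, here is a minimal sketch of one arrangement I can imagine; it is not an accepted convention as far as I know.  It assumes the layout details are carried as attributes in a separate namespace, so that stripping that namespace leaves only the abstract syntax.  The namespace URI, the "fmt" prefix, and the attribute names "indent" and "newline-before" are all made up for illustration, and the sketch uses Python's standard xml.etree.ElementTree rather than XSLT.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for concrete-syntax annotations (an assumption,
# not an established vocabulary).
FMT_NS = "urn:example:concrete-syntax"

# The abstract-syntax fragment, annotated with made-up layout attributes.
ANNOTATED = f"""
<if-then-else-statement xmlns:fmt="{FMT_NS}" fmt:indent="5">
  <block fmt:indent="12" fmt:newline-before="1">
    <unary-op type="increment" arg="x"/>
  </block>
  <block fmt:indent="12" fmt:newline-before="1">
    <unary-op type="decrement" arg="x"/>
  </block>
</if-then-else-statement>
"""


def strip_concrete_syntax(elem):
    """Remove, in place, every attribute and child element in FMT_NS."""
    prefix = "{" + FMT_NS + "}"
    # ElementTree stores namespaced names as "{uri}local".
    for name in [a for a in elem.attrib if a.startswith(prefix)]:
        del elem.attrib[name]
    for child in list(elem):
        if child.tag.startswith(prefix):
            elem.remove(child)
        else:
            strip_concrete_syntax(child)
    return elem


annotated = ET.fromstring(ANNOTATED)
abstract = strip_concrete_syntax(ET.fromstring(ANNOTATED))

# The stripped tree carries only the abstract syntax.
print(ET.tostring(abstract, encoding="unicode"))
```

With this layout, a schema for the abstract syntax could ignore the annotation namespace entirely, and a pretty-printer could read the same attributes to reproduce the programmer's layout.  Whether attributes are expressive enough for comments and irregular spacing is an open question in this sketch.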

 

Any advice would be appreciated.

 

-Allen

