- From: Rick Jelliffe <rjelliffe@a...>
- To: xml-dev <xml-dev@l...>
- Date: Thu, 20 Jan 2022 13:51:37 +1100
Wicked complicated? That seems a low bar, if anything that is not minimally simple is regarded as complex: the fallacy of the excluded middle. And, of course, the simplest way of expressing a grammar or regular expression may not always be the best (for performance, for tools, for human explanation).
The more you try to do in a single production, the more you need either a more expressive grammar or more complex rules. So if you treat XML tags as two levels, where the first says whitespace delimits tokens and the second forms tokens into tag names and attributes with no consideration for whitespace, each grammar is super "simple".
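A minimal sketch of that two-level idea in Python, for illustration only. The names `TOKEN`, `tokenize` and `parse_start_tag` are hypothetical, and the name pattern is a simplification rather than the full XML Name rule. Level 1 treats whitespace purely as a token delimiter; level 2 groups tokens into a tag name and attributes and never sees whitespace.

```python
import re

# Level 1 grammar: whitespace only delimits tokens and is then discarded.
# The name pattern is a simplified assumption, not the XML Name production.
TOKEN = re.compile(r"""
    \s+                                # whitespace: delimiter, no token emitted
  | (?P<name>[A-Za-z_:][\w:.-]*)       # tag name or attribute name
  | (?P<eq>=)                          # equals sign
  | (?P<value>"[^<"]*"|'[^<']*')       # quoted attribute value
  | (?P<close>/?>)                     # '>' or '/>'
""", re.VERBOSE)


def tokenize(tag):
    """Level 1: turn the text after '<' into (kind, text) tokens."""
    if not tag.startswith("<"):
        raise ValueError("expected '<'")
    pos, tokens = 1, []
    while pos < len(tag):
        m = TOKEN.match(tag, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if m.lastgroup is not None:    # lastgroup is None for whitespace
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse_start_tag(tag):
    """Level 2: name (name '=' value)* close, with no whitespace rules."""
    tokens = tokenize(tag)
    if len(tokens) < 2 or tokens[0][0] != "name" or tokens[-1][0] != "close":
        raise ValueError("malformed tag")
    body = tokens[1:-1]
    if len(body) % 3 != 0:
        raise ValueError("malformed attribute list")
    attrs = {}
    for i in range(0, len(body), 3):
        (k1, attr_name), (k2, _), (k3, quoted) = body[i:i + 3]
        if (k1, k2, k3) != ("name", "eq", "value"):
            raise ValueError("malformed attribute")
        attrs[attr_name] = quoted[1:-1]   # strip the surrounding quotes
    return tokens[0][1], attrs, tokens[-1][1] == "/>"


print(parse_start_tag("<a  b = \"1\"  c='2' />"))
# ('a', {'b': '1', 'c': '2'}, True)
```

The price of the split is that this sketch is more permissive than XML itself: it accepts adjacent attributes with no whitespace between them and whitespace directly after '<', so a real parser would still need extra checks.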
Rick
On Thu, 20 Jan 2022, 10:58 Roger L Costello <costello@m...> wrote:
Hi Folks,
XML start tags have a simple structure, right?
Wrong!
Here are some of the permutations of a start tag:
'<' tag-name '>'
'<' tag-name "/>"
'<' tag-name WSP '>'
'<' tag-name WSP "/>"
'<' tag-name WSP attribute-name '=' "value" '>'
'<' tag-name WSP attribute-name WSP '=' "value" '>'
'<' tag-name WSP attribute-name '=' WSP "value" '>'
'<' tag-name WSP attribute-name WSP '=' WSP "value" '>'
'<' tag-name WSP attribute-name WSP '=' WSP "value" WSP '>'
... a lot more ...
Now, let's play parser: We are scanning and encounter these items
... '<'
... tag-name
... WSP
... attribute/value pair
... WSP
Trouble!
What does the WSP (WSP = whitespace) signify? Does it signify:
(a) Space between the first attribute and a second attribute? E.g. WSP attribute-name '=' "value"
(b) Space just prior to the end angle bracket? I.e., WSP '>'
The only way to know the answer is to look ahead beyond the WSP to see what token comes next. But a two-token lookahead requires a more powerful parser than a one-token lookahead parser.
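The decision point can be illustrated with a small Python sketch. The function `after_whitespace` is a hypothetical helper, not part of any XML library: it skips the run of whitespace starting at a given offset and reports which of the cases follows it.

```python
def after_whitespace(text, pos):
    """Skip the whitespace run starting at pos and classify what follows."""
    end = pos
    while end < len(text) and text[end] in " \t\r\n":
        end += 1
    if end == len(text):
        return "incomplete", end           # input ended inside the tag
    if text.startswith("/>", end):
        return "empty-element close", end  # WSP "/>"
    if text[end] == ">":
        return "tag close", end            # case (b): WSP '>'
    return "attribute", end                # case (a): WSP attribute-name ...


# Offset 8 is the whitespace that follows the first attribute/value pair.
print(after_whitespace('<a b="1" >', 8))        # ('tag close', 9)
print(after_whitespace('<a b="1" c="2">', 8))   # ('attribute', 9)
```

In both calls the meaning of the whitespace is only known once the character after it has been inspected.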
So the next time someone tells you that the structure of an XML start tag is simple, tell 'em it ain't so!
/Roger
_______________________________________________________________________
XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.
[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@l...
subscribe: xml-dev-subscribe@l...
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php