
  • From: Pete Cordell <pete++xmldev@c...>
  • To: Roger L Costello <costello@m...>,"xml-dev@l..." <xml-dev@l...>
  • Date: Thu, 17 Mar 2022 11:48:25 +0000

On 17/03/2022 11:25, Roger L Costello wrote:
So this is perfectly well-formed XML:

<Test foo="&#x3C;x>blah&#x3C;/x>"/>

And the numeric character references will be replaced during the parsing process to yield this:

<Test foo="<x>blah</x>"/>

I'd say that parsing <Test foo="&#x3C;x>blah&#x3C;/x>"/> yields an attribute named "foo" with a value of "<x>blah</x>".

It's not creating an alternate piece of XML that is then parsed again.
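
A quick way to check this, sketched with Python's standard xml.etree.ElementTree (the choice of parser is an assumption made for illustration; any conforming parser should report the same thing):

```python
import xml.etree.ElementTree as ET

doc = '<Test foo="&#x3C;x>blah&#x3C;/x>"/>'
elem = ET.fromstring(doc)

# The character references are already resolved in the attribute value.
foo_value = elem.attrib["foo"]

# The "<x>" inside the value is plain text, so no child element is created.
child_count = len(list(elem))

print(repr(foo_value), child_count)
```

Here foo_value is the string <x>blah</x> and child_count is 0: the expanded value is data attached to "Test", not markup that gets parsed a second time.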

The sequence is more like:

- Low level XML tokeniser reads attribute name "foo"

- Low level parser checks it's followed by "=" and quotes

- Low level parser reads until it finds end quote, getting "&#x3C;x>blah&#x3C;/x>"

- Internal logic converts "&#x3C;x>blah&#x3C;/x>" to "<x>blah</x>"

- Internal logic creates a data record for an attribute of name "foo" with value "<x>blah</x>" and associates it with the element "Test".
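
As a rough illustration of those steps, here is a toy sketch in Python. It is not how any particular parser is implemented: the helper names expand_char_refs and read_attribute are made up for this example, and it handles only numeric character references, ignoring named entities, attribute-value normalisation and most error checking.

```python
import re

# Hypothetical helpers for illustration only.
_NAME = re.compile(r"[A-Za-z_:][\w:.\-]*")
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def expand_char_refs(raw):
    """Replace &#xHH; and &#DD; with the characters they name."""
    def repl(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return chr(code)
    return _CHAR_REF.sub(repl, raw)


def read_attribute(text, pos):
    """Read name="value" starting at pos; return (name, value, next_pos)."""
    # Step 1: read the attribute name.
    match = _NAME.match(text, pos)
    if match is None:
        raise ValueError("expected an attribute name")
    name = match.group(0)
    pos = match.end()

    # Step 2: check it is followed by "=" and an opening quote.
    quote = text[pos + 1:pos + 2]
    if text[pos:pos + 1] != "=" or quote not in ('"', "'"):
        raise ValueError("expected '=' followed by a quote")

    # Step 3: read up to the matching end quote (raw text, references intact).
    end = text.index(quote, pos + 2)
    raw = text[pos + 2:end]

    # Step 4: convert the references. The result is plain data and is
    # never scanned for markup again.
    return name, expand_char_refs(raw), end + 1


source = '<Test foo="&#x3C;x>blah&#x3C;/x>"/>'
name, value, _ = read_attribute(source, source.index("foo"))

# Step 5: record the attribute against the element "Test".
attributes = {"Test": {name: value}}
print(attributes)
```

The point of the sketch is the ordering: the end quote is located in the raw text before any reference is expanded, so the "<" characters produced by the expansion can never be mistaken for the start of a tag.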

Regards,

Pete.
--
---------------------------------------------------------------------
Pete Cordell
Codalogic Ltd
Read & write XML in C++, http://www.xml2cpp.com
---------------------------------------------------------------------

