On 27/12/2021 12:03, Roger L Costello wrote:
[snip]
> If the XML document is not associated with a schema (XSD, DTD, or
> RNG), then the answer is always (a) and the whitespace may be safely
> discarded.

I think it's the other way round. In the absence of a schema/DTD,
whitespace must be retained and passed to the application. Only a
schema/DTD can identify where whitespace can safely be ignored.

> So, sometimes the content of <Document> is one thing, sometimes it's
> another thing. This complicates lexers (and parsers) because they must
> have external, out-of-band knowledge about the document.

Yes, exactly.

> Is that good language design?

For the original purposes of SGML and XML (large text documents with
both element content and mixed content), yes. In those cases, a schema
is pretty much always used, so the question never arises (it's [a]).

If you use XML to hold what is essentially rectangular data (rows and
columns), or if your application can dispense with mixed content, the
question also never arises (it's [b], and it's up to the application to
ignore whitespace-only nodes).

Basically it's a feature, not a bug 🐞

The only notable bug is (was?) in software that discards a
whitespace-only node that is the sole node between adjacent elements
when a schema/DTD has identified the context as being mixed content.
That is /always/ wrong.

Peter
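
To make the two cases concrete, here is a minimal sketch, assuming Python's standard xml.dom.minidom as the parser (the thread names no particular tool). The sample documents and the helper names child_nodes and drop_whitespace_only are hypothetical. With no schema/DTD, the parser passes every whitespace-only node through. Dropping them is an application decision that is harmless for row-like data but loses the space between adjacent inline elements in mixed content.

```python
from xml.dom import minidom

# Hypothetical sample documents, neither associated with a schema/DTD.
DATA_DOC = "<Document>\n  <row/>\n  <row/>\n</Document>"   # row-like data
MIXED_DOC = "<p><b>bold</b> <i>italic</i></p>"             # mixed content


def child_nodes(xml_text):
    """Parse with no DTD/schema and list the root's children as (kind, value)."""
    root = minidom.parseString(xml_text).documentElement
    nodes = []
    for node in root.childNodes:
        if node.nodeType == node.ELEMENT_NODE:
            nodes.append(("element", node.tagName))
        elif node.nodeType == node.TEXT_NODE:
            nodes.append(("text", node.data))
    return nodes


def drop_whitespace_only(nodes):
    """Application-level choice: discard text nodes that are only whitespace."""
    return [(kind, value) for kind, value in nodes
            if kind != "text" or value.strip()]


if __name__ == "__main__":
    # The parser retains the indentation between the <row/> elements.
    print(child_nodes(DATA_DOC))
    # For row-like data the application may ignore those nodes.
    print(drop_whitespace_only(child_nodes(DATA_DOC)))
    # In mixed content the same filter removes the space between <b> and <i>.
    print(child_nodes(MIXED_DOC))
    print(drop_whitespace_only(child_nodes(MIXED_DOC)))
```

The filter is safe only because the application knows DATA_DOC is not mixed content; the parser has no way to know that without a schema/DTD.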




