At 5:54 PM -0800 1/5/04, Jeff Rafter wrote:
>This is one of those questions that are more for curiosity than anything,
>but does anyone have any information on why ignorableWhitespace was included
>in ContentHandler as opposed to LexicalHandler? Based on my understanding of
>the guidelines used in determining what belongs in the default interfaces
>and what belongs in the extension interfaces it seems to fall under the
>latter. It is non-imperative lexical information associated with the parse.
>Comments?

That is incorrect. XML parsers must report all content, ignorable or 
otherwise. It is not optional to report this content, unlike, for 
example, CDATA section boundaries. The word "ignorable" is an 
unfortunate choice here. It means the application receiving the data 
may choose to ignore it. However, the parser cannot ignore this 
content. It must provide it.

It's also the case that a lot of white space many people think is 
ignorable really isn't. White space is only really ignorable if 
there's a DTD, and even then you may choose not to ignore it. I 
prefer the less loaded term "boundary white space", which identifies 
all white-space-only text nodes, not just those that are ignorable.
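
As a rough illustration of the point above (a sketch, not part of the 
original message), the following Java program parses the same small 
document with and without a DTD using the JDK's built-in SAX parser. 
The class name WhitespaceDemo, the count helper, and the sample 
documents are made up for this example. With a DTD that declares 
element content, a validating parser should deliver the boundary white 
space through ignorableWhitespace; without a DTD it should arrive 
through characters. In both cases the parser hands the white space to 
the application.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the names here are illustrative only.
class WhitespaceDemo {
    // A document whose root element contains white space around its child.
    static final String BODY = "<doc>\n  <item>x</item>\n</doc>";

    // Internal DTD subset declaring that doc has element-only content.
    static final String DTD =
        "<!DOCTYPE doc [\n"
      + "  <!ELEMENT doc (item)>\n"
      + "  <!ELEMENT item (#PCDATA)>\n"
      + "]>\n";

    // Returns {chars seen via ignorableWhitespace,
    //          white space chars seen via characters}.
    static int[] count(String xml, boolean validating) throws Exception {
        final int[] counts = new int[2];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                counts[0] += length;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // characters() may be called in several chunks, so count
                // individual white space characters rather than calls.
                for (int i = start; i < start + length; i++) {
                    if (Character.isWhitespace(ch[i])) {
                        counts[1]++;
                    }
                }
            }
        };
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validating);
        SAXParser parser = factory.newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return counts;
    }

    public static void main(String[] args) throws Exception {
        int[] withDtd = count(DTD + BODY, true);
        int[] noDtd = count(BODY, false);
        System.out.println("with DTD:    ignorableWhitespace=" + withDtd[0]
            + " characters(white space)=" + withDtd[1]);
        System.out.println("without DTD: ignorableWhitespace=" + noDtd[0]
            + " characters(white space)=" + noDtd[1]);
    }
}
```

An application that wants to treat all boundary white space the same 
way, whether or not a DTD is present, would have to handle both 
callbacks.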
-- 

   Elliotte Rusty Harold
   elharo@m...
   Effective XML (Addison-Wesley, 2003)
   http://www.cafeconleche.org/books/effectivexml
   http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA
