Manos Batsis wrote:
> Hi list,
>
> Short version: Must character references in attribute values get
> expanded by an XML parser?
>
> Long version: When a document like
>
> <?xml version="1.0" encoding="iso-8859-1"?>
> <foo bar="λ"/>
>
> is accessed by an API like SAX on top of an XML parser like piccolo
> must the exposed attribute value be "λ" or "&lgr;" (greek lambda)?

You could conceivably have a partial parser that does not expand
character references. But then you would have two kinds of strings
floating around, which could cause confusion.

I guess it would be useful if
 * you wanted to stick to ASCII or 8859-1 encoded strings
 * you were just shovelling characters from input to output as fast
   as possible and you weren't interested in looking at the contents
   at all.

There are lots of kinds of partial or lazy parsing possible...

Cheers
Rick Jelliffe
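
For comparison, here is a minimal sketch of what a full (non-partial) parser exposes through a SAX-style API. It uses Python's standard xml.sax module on top of expat rather than Java SAX on piccolo, so it only illustrates the general behaviour. It also assumes the attribute in the question was written as the numeric character reference `&#955;`, since the archived copy shows only the literal character. The names `AttrCollector` and `parse_attrs` are made up for this example.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class AttrCollector(ContentHandler):
    """Hypothetical handler that records every attribute value it is given."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def startElement(self, name, attrs):
        # attrs holds the values as the parser reports them to the application.
        for key in attrs.getNames():
            self.attrs[(name, key)] = attrs.getValue(key)


def parse_attrs(xml_bytes):
    """Parse a document and return {(element, attribute): value}."""
    handler = AttrCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.attrs


# Assumed form of the document from the question: a numeric character
# reference for Greek small letter lambda (U+03BB) in an ISO-8859-1 document.
DOC = b'<?xml version="1.0" encoding="iso-8859-1"?>\n<foo bar="&#955;"/>'

value = parse_attrs(DOC)[("foo", "bar")]
print(repr(value), len(value))
```

With this parser the application sees the single character U+03BB, not the seven characters of the reference, so only one kind of string reaches the handler.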



