Thanks for clearing that up - I should have asked around when I had the
problem originally, I guess! You have correctly inferred the source of
our problem - using a JDK InputStreamReader in front of the parser.

cheers

-Mike

On 5/20/2011 9:28 PM, Michael Glavassevich wrote:
> John Cowan <cowan@c...> wrote on 05/20/2011 06:59:04 PM:
>
> > Mike Sokolov scripsit:
> >
> > > BOM in UTF-8 seems to cause problems with some XML parsers
> > > (incl. Xerces 2.9.1). They seem to believe it is white space in the
> > > prolog. To deal with this, we have had to insert a processor prior to
> > > our parser which checks for BOM and strips it out.
> >
> > Support for the 8-BOM was not explicitly required until the XML 1.0
> > Third Edition of 2004. Xerces 2.9.1 may be out of date.
>
> What doesn't work? Xerces has known how to handle the UTF-8 BOM for
> much longer than that. All releases since 2003 [1] have supported it.
>
> Note that you need to let the parser use its own encoding support for
> the InputStream.
>
> Don't pass in a UTF-8 Reader from the JDK. The JDK UTF-8
> InputStreamReader [2] apparently doesn't recognize the BOM and perhaps
> never will.
>
> > --
> > XQuery Blueberry DOM            John Cowan
> > Entity parser dot-com           cowan@c...
> > Abstract schemata               http://www.ccil.org/~cowan
> > XPointer errata
> > Infoset Unicode BOM             --Richard Tobin
>
> [1] http://svn.apache.org/viewvc/xerces/java/trunk/src/org/apache/xerces/impl/XMLEntityManager.java?r1=318934&r2=318940&diff_format=h
> [2] http://bugs.sun.com/view_bug.do?bug_id=4508058
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@c...
> E-mail: mrglavas@a...
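
For anyone landing on this thread later, here is a minimal sketch of the two approaches discussed above. It assumes the JAXP parser bundled with the JDK (which is Xerces-derived) rather than a standalone Xerces 2.9.1 jar, so behaviour with other parsers may differ. The class and method names (BomParsingSketch, rootNameFromStream, rootNameFromReader, skipBom, sampleWithBom) are made up for illustration and do not come from the thread. The first method hands the parser the raw InputStream so it can detect the BOM itself; the second is a fallback for cases where a Reader is unavoidable, and drops a leading U+FEFF before the parser sees it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class BomParsingSketch {

    // Preferred: pass the byte stream so the parser does its own
    // encoding detection, including recognising a UTF-8 BOM.
    public static String rootNameFromStream(InputStream in) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(in));
        return doc.getDocumentElement().getNodeName();
    }

    // Fallback (hypothetical helper): if the first character is U+FEFF,
    // consume it; otherwise push it back so nothing is lost.
    public static Reader skipBom(Reader reader) throws IOException {
        PushbackReader pushback = new PushbackReader(reader, 1);
        int first = pushback.read();
        if (first != -1 && first != '\uFEFF') {
            pushback.unread(first);
        }
        return pushback;
    }

    // Parses from a character stream after removing any leading BOM.
    public static String rootNameFromReader(Reader reader) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(skipBom(reader)));
        return doc.getDocumentElement().getNodeName();
    }

    // Builds a small UTF-8 document preceded by the bytes EF BB BF.
    public static byte[] sampleWithBom() {
        byte[] bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
        byte[] xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><doc/>"
                .getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[bom.length + xml.length];
        System.arraycopy(bom, 0, out, 0, bom.length);
        System.arraycopy(xml, 0, out, bom.length, xml.length);
        return out;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = sampleWithBom();
        System.out.println(rootNameFromStream(new ByteArrayInputStream(data)));
        System.out.println(rootNameFromReader(new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)));
    }
}
```

The stream variant is the one Michael recommends. The Reader variant is roughly what the "processor prior to our parser" workaround amounts to, and is only worth keeping if the input really does arrive as characters rather than bytes.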