Subject: RE: Encoding problem or what else?
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Thu, 8 Dec 2005 08:31:18 -0000
> i checked the file with an HEX editor and it turned out that there are three
> bytes whose hex code is EF BB BF at the beginning of the offending file.
> I guess this must be the reason why the parser is complaining, though I am
> still not clear if this is some sort of multi-byte character or just some
> junk that happens to be there.
This is a byte order mark (BOM): the character U+FEFF, which encodes as the
three bytes EF BB BF in UTF-8 and acts as a signal that the file is in UTF-8
encoding. Some parsers allow a BOM at the start of a UTF-8 file, others don't.
The latest specs say that it should be accepted, but older parsers don't
recognize it. It's generated automatically by some software that writes UTF-8,
especially Microsoft software.
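
If you are stuck with a parser that rejects the BOM, one workaround is to
remove those three bytes before the parser sees the file. The following is only
a minimal sketch in Python, not something taken from the original report: the
helper name strip_utf8_bom and the sample document are made up for
illustration, and it assumes the input really is UTF-8.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present.

    Hypothetical helper for illustration; assumes the input is UTF-8.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Made-up sample: a tiny XML document preceded by the three BOM bytes.
sample = b"\xef\xbb\xbf<?xml version='1.0'?><doc/>"
cleaned = strip_utf8_bom(sample)

# Alternative when decoding to text: the "utf-8-sig" codec drops a leading
# BOM if there is one and otherwise behaves like plain UTF-8.
text = sample.decode("utf-8-sig")
```

Stripping at the byte level leaves the rest of the file untouched, so it is
safe to apply to files that have no BOM as well.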
Michael Kay
http://www.saxonica.com/