  • From: John Cowan <cowan@l...>
  • To: eldarm@m... (Eldar Musayev)
  • Date: Wed, 24 May 100 21:21:56 -0400 (EDT)

Eldar Musayev scripsit:

> People outside may want just to slip a few lines into a text without
> bothering themselves with an encoding header. Would you like to add
> charset information to every XML document you create?

You *must* do so, unless the document is in UTF-8 or UTF-16.  US-ASCII,
which is a subset of UTF-8, will also work, but ISO 8859-1, or KOI8-R,
or EUC-JP, is illegal without an encoding declaration or the equivalent
charset declaration in a MIME header.
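
A minimal sketch of this rule, assuming Python's standard
xml.etree.ElementTree (which wraps the expat parser) as the XML processor.
The sample document and the helper name `parses` are made up for
illustration; they are not from the thread.

```python
import xml.etree.ElementTree as ET

# "é" is the single byte 0xE9 in ISO 8859-1, which is not valid UTF-8 here.
body = "<note>caf\u00e9</note>".encode("iso-8859-1")


def parses(data: bytes) -> bool:
    """Return True if the parser accepts the bytes as a well-formed document."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# No declaration: the parser has to assume UTF-8 and rejects the 0xE9 byte.
undeclared = body

# Same bytes, but with an encoding declaration in front.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body

# Pure US-ASCII is a subset of UTF-8, so no declaration is needed.
ascii_only = b"<note>cafe</note>"

print(parses(undeclared), parses(declared), parses(ascii_only))
```

Only the undeclared ISO 8859-1 document should be rejected; the other two
are accepted.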

> Because what you are proposing strips the whole world, except a few
> purely English-language countries, of the convenience of a default
> charset.

The only default charsets are UTF-8 and UTF-16.
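
As a rough sketch of what "default" means in practice: with no encoding
declaration and no external charset information, a processor can only
choose between UTF-16 (signalled by a byte order mark) and UTF-8.  The
function name `default_encoding` is hypothetical, and this is a
simplification of the autodetection described in the XML 1.0
specification, not a full implementation of it.

```python
def default_encoding(data: bytes) -> str:
    """Guess the encoding of an XML entity that has no encoding declaration.

    Simplified assumption: a UTF-16 entity starts with a byte order mark,
    and anything else is treated as UTF-8.
    """
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"  # big-endian byte order mark
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"  # little-endian byte order mark
    return "utf-8"          # no UTF-16 mark: UTF-8 is the only other default


print(default_encoding(b"<a/>"))
print(default_encoding(b"\xff\xfe" + "<a/>".encode("utf-16-le")))
```

Note that there is no branch for ISO 8859-1, KOI8-R, or EUC-JP: without a
declaration there is nothing in the bytes to select them.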

> In short, non-valid characters are errors, but they should not be fatal.

We are not talking about invalid characters (such as U+0001), which are
already fatal errors.  We are talking about invalid encodings.
An FF byte in a UTF-8 document means the document is nonsense; there
is no telling what it means.
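
The distinction can be sketched as follows, again assuming Python's
standard library as a stand-in for an XML processor.  The helper names
`is_utf8` and `is_well_formed` and the two sample documents are
illustrative only.

```python
import xml.etree.ElementTree as ET


def is_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 at all."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def is_well_formed(data: bytes) -> bool:
    """Return True if the parser accepts the bytes as an XML document."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Invalid character: the byte 0x01 decodes fine as U+0001,
# but U+0001 is not a legal XML 1.0 character.
bad_character = b"<a>\x01</a>"

# Invalid encoding: 0xFF never occurs in UTF-8, so the bytes
# cannot be turned into characters in the first place.
bad_encoding = b"<a>\xff</a>"

print(is_utf8(bad_character), is_well_formed(bad_character))
print(is_utf8(bad_encoding), is_well_formed(bad_encoding))
```

In the first case the processor knows exactly which character it is
rejecting; in the second it never gets as far as characters.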

-- 
John Cowan                                   cowan@c...
	Yes, I know the message date is bogus.  I can't help it.
		--me, on far too many occasions

***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@x...&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************