Re: [Summary] UTF-8 Question: e with acute accent should require two bytes, right?

From: "Pete Cordell" <petexmldev@t...>
To: "Michael Kay" <mike@s...>,"'Alessandro Triglia'" <sandro@m...>,"'Costello, Roger L.'" <costello@m...>, <xml-dev@l...>
Date: Sat, 29 Sep 2007 10:28:30 +0100

----- Original Message From: "Michael Kay"

>> It is not correct to say that a Unicode character can be
>> either an "ASCII character" or a "non-ASCII character".  It
>> is better to say that some Unicode characters (those with
>> codes below 128) have a corresponding character in ASCII.
>
> Why?
>
> You're claiming that the character which ASCII calls "Capital Letter A" is
> a different character from the one which Unicode calls "LATIN CAPITAL
> LETTER A". (Actually I don't know what ASCII calls it, but it doesn't
> affect the argument.) What makes you say that these are different
> characters? They aren't different just because different documents give
> them different names.


I agree with Alessandro.

Just because Unicode "LATIN CAPITAL LETTER A" and ASCII "Capital Letter A" 
represent the same character does not mean that Unicode "LATIN CAPITAL 
LETTER A" _IS_ ASCII "Capital Letter A".  It is the A character itself, to 
which both refer, that is the authoritative entity, not the ASCII "Capital 
Letter A" character code.

Also, in the case of XML instances, the whole document has the same 
character encoding.  We don't say that some of it is ASCII and some of it is 
Unicode (i.e. UTF-8 in the given examples).  In an XML context, in Roger's 
original string, the e acute cannot be represented in ASCII, so the document 
as a whole is not ASCII, and the other characters cannot be said to be ASCII 
either.  That doesn't mean that the character code used to represent, say, A 
in this as yet unknown character encoding can't be the same as that used in 
ASCII.
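
As a rough illustration of that last point, here is a small Python sketch.  
The sample string "A" followed by e acute is my own stand-in, not Roger's 
original string, and the variable names are made up for the example.  It 
shows that A gets the same byte in ASCII and in UTF-8, that e acute needs 
two bytes in UTF-8, and that the string as a whole cannot be encoded as 
ASCII at all.

```python
# Hypothetical sample: "A" followed by e acute (U+00E9).
# This is an assumed stand-in, not the string from the original thread.
text = "A\u00e9"

# The code for "A" happens to coincide in the two encodings.
a_ascii = "A".encode("ascii")
a_utf8 = "A".encode("utf-8")

# e acute has no ASCII code; in UTF-8 it takes two bytes.
e_acute_utf8 = "\u00e9".encode("utf-8")

# The document as a whole has one encoding, and ASCII cannot be it.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

utf8_bytes = text.encode("utf-8")

print(a_ascii == a_utf8)   # same byte value for "A" in both encodings
print(len(e_acute_utf8))   # number of UTF-8 bytes for e acute
print(ascii_ok)            # whether the whole string fits in ASCII
```

The matching byte for A is a property of how UTF-8 was designed, not a 
sign that part of the document is "in ASCII".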

my 2 cents!

Pete.
--
=============================================
Pete Cordell
Codalogic
for XML Schema to C++ data binding visit
 http://www.codalogic.com/lmx/
=============================================

Follow-Ups:
- Re: [Summary] UTF-8 Question: e with acute accent should require two bytes, right?
  - From: richard@i... (Richard Tobin)

References:
- UTF-8 Question: e with acute accent should require two bytes, right?
  - From: "Costello, Roger L." <costello@m...>
- [Summary] UTF-8 Question: e with acute accent should require two bytes, right?
  - From: "Costello, Roger L." <costello@m...>
- RE: [Summary] UTF-8 Question: e with acute accent should require two bytes, right?
  - From: "Alessandro Triglia" <sandro@m...>
- RE: [Summary] UTF-8 Question: e with acute accent should require two bytes, right?
  - From: "Michael Kay" <mike@s...>
