> Notice: é (the character "e" with an acute accent). It is U+00E9
>
> Since its code point is greater than U+0080, it requires more than one
> byte.

It depends on the encoding. In ISO 8859-1 (Latin-1) and Windows-1252 (the default for many editors), only 1 byte is required: 0xE9.

> Thus, é should be encoded in UTF-8 as:
>
> C3A9

Yes.

> Something is wrong. Here's what I think may be wrong:
> - the editor that I am using to display the hex values is displaying
> the code points and not the hex values. However, I have now tried two
> editors, and they both display the same thing (E9).

PSPad has 2 methods to invoke a hex view of a file, giving somewhat different results:

1. Open the file in the default Text Editor mode, then switch to View/Hex Edit Mode. Here, encoding conversions come into play when switching views of the "bytes in memory."

2. Open the file directly in the Hex Editor, by selecting File/Open in Hex Editor. In this mode you get a better view of the "bytes on disk" without encoding conversions. When I come across encoding problems, this is the view that I use.

Perhaps the editors you've tried don't have the second type of hex view, which I think is what you want.

Mike Waters
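
For anyone who wants to check the byte counts above without relying on an editor's hex view, here is a minimal Python sketch. It is only an illustration, not part of the original exchange: the file name `sample.txt` and the helper `bytes_on_disk` are hypothetical, and the file is written as UTF-8 on purpose so the result can be compared with the Latin-1 case.

```python
import os
import tempfile

ch = "\u00e9"  # the character é, code point U+00E9

# The same code point encoded three ways.
latin1_bytes = ch.encode("latin-1")  # ISO 8859-1: one byte
cp1252_bytes = ch.encode("cp1252")   # Windows-1252: one byte
utf8_bytes = ch.encode("utf-8")      # UTF-8: two bytes

print("code point:  U+%04X" % ord(ch))
print("Latin-1:    ", latin1_bytes.hex().upper())
print("Windows-1252:", cp1252_bytes.hex().upper())
print("UTF-8:      ", utf8_bytes.hex().upper())


def bytes_on_disk(path):
    """Return the raw bytes of a file, with no decoding applied.

    Hypothetical helper: opening in binary mode is the rough equivalent
    of a hex editor that shows bytes on disk rather than decoded text.
    """
    with open(path, "rb") as f:
        return f.read()


# Write the character to a temporary file as UTF-8 (an assumption made
# for this sketch), then read the raw bytes back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(ch)
    raw = bytes_on_disk(path)

print("on disk:    ", raw.hex().upper())
```

If the raw bytes of your own file turn out to be a single E9, the file was most likely saved as Latin-1 or Windows-1252 rather than UTF-8, and the hex view is showing the bytes faithfully.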



