Daniel Veillard wrote:
> ...
>
> And using UCS-2 for memory encoding is also in a lot of cases
> a really bad choice. Processor performances are cache related nowadays.
> Filling them up with 0 for half of your data processed can simply
> trash your caches. I will stick to UTF8 internally, it also allows
> some processor to use hardcoded CISC instructions for 0 terminated C
> strings (IIRC the Power line of processors have such a set of instructions).

The costs and benefits of UTF-8 are well-known. Random access at the
character level becomes quite inefficient. Neither UCS-2 nor UTF-8 is
right as the in-memory model for all applications.

 Paul Prescod
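
To make the random-access point concrete, here is a minimal sketch in Python. It is not taken from any parser discussed in this thread; the helper names utf8_char_offset and ucs2_char_offset and the sample string are hypothetical, chosen only for illustration. It assumes well-formed UTF-8 input and, for the fixed-width case, text restricted to the Basic Multilingual Plane (which is all UCS-2 can represent).

```python
def utf8_char_offset(buf: bytes, index: int) -> int:
    """Return the byte offset of code point number `index` in UTF-8 `buf`.

    Hypothetical helper: it has to scan from the start, so the cost grows
    with `index`. Assumes `buf` is well-formed UTF-8.
    """
    seen = 0
    for offset, byte in enumerate(buf):
        # Bytes of the form 10xxxxxx are continuation bytes; every other
        # byte starts a new code point.
        if byte & 0xC0 != 0x80:
            if seen == index:
                return offset
            seen += 1
    raise IndexError("code point index out of range")


def ucs2_char_offset(index: int) -> int:
    """Return the byte offset of code unit `index` in a UCS-2 buffer.

    Hypothetical helper: fixed two-byte units make this plain arithmetic.
    """
    return 2 * index


# Sample text with two non-ASCII characters (U+00EF and U+00E9).
text = "na\u00efve caf\u00e9"
utf8 = text.encode("utf-8")
ucs2 = text.encode("utf-16-le")  # same bytes as UCS-2 for BMP-only text

print(len(utf8), len(ucs2))
print(utf8_char_offset(utf8, 9), ucs2_char_offset(9))
```

The UTF-8 buffer is smaller for this mostly-ASCII sample, which is the cache argument in the quoted text, while locating the tenth character requires a scan in UTF-8 and a single multiplication in UCS-2. Which cost matters more depends on the application.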



