My guess would be that Microsoft chose an ASCII encoding for this file rather than a UTF-8 encoding because, at the time, CVS repositories could be very temperamental about file encodings.

End-user applications, provided they use a real XML parser, are going to see exactly the same data as if it had all been encoded in UTF-8.

Michael Kay
Saxonica

> On 16 Jan 2022, at 07:03, Mukul Gandhi <mukulg@s...> wrote:
>
> Hi all,
> I came across the following XML instance document, provided with the W3C XML Schema test suite:
>
> <doc value="؀؁؂؃؄؅؆؇؈؉؊؋،؍؎؏ؘؙؚؐؑؒؓؔؕؖؗ؛؜؝؞؟ؠءآأؤإئابةتثجحخدذرزسشصضطظعغػؼؽؾؿـفقكلمنهوىيًٌٍَُِّْٕٖٜٟٓٔٗ٘ٙٚٛٝٞ٠١٢٣٤٥٦٧٨٩٪٫٬٭ٮٯٰٱٲٳٴٵٶٷٸٹٺٻټٽپٿڀځڂڃڄڅچڇڈډڊڋڌڍڎڏڐڑڒړڔڕږڗژڙښڛڜڝڞڟڠڡڢڣڤڥڦڧڨکڪګڬڭڮگڰڱڲڳڴڵڶڷڸڹںڻڼڽھڿۀہۂۃۄۅۆۇۈۉۊۋیۍێۏېۑےۓ۔ەۖۗۘۙۚۛۜ۝۞ۣ۟۠ۡۢۤۥۦۧۨ۩۪ۭ۫۬ۮۯ۰۱۲۳۴۵۶۷۸۹ۺۻۼ۽۾ۿ"/>
>
> Within the above-mentioned XML document, the content of the attribute "value" is Arabic characters (specified with their Unicode code points). I guess specifying Unicode characters with the notation &#x.... (as in the example cited above) is a preferred way to write and transport such XML documents across software application systems.
>
> My questions, please:
> What would end-user applications do with such XML documents? I guess most likely they'll render them within a UI (then relevant fonts would also be needed), or extract text content from the XML documents for specific computations (like string comparison, etc.). Am I right on these points?
>
> Any thoughts on this topic would be great.
>
> --
> Regards,
> Mukul Gandhi
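
To illustrate the point in the reply that a real XML parser hands the application the same data either way, here is a minimal Python sketch. The three-character sample (U+0627, U+0628, U+062A) and the helper name `attribute_value` are assumptions made for illustration; they are not taken from the test suite file. One variant is a pure-ASCII document that uses numeric character references, the other carries the same characters literally in UTF-8.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: three Arabic letters, U+0627, U+0628, U+062A.
text = "\u0627\u0628\u062A"

# Variant 1: pure-ASCII bytes, characters written as numeric character references.
ascii_doc = (
    '<?xml version="1.0" encoding="US-ASCII"?>'
    '<doc value="&#x627;&#x628;&#x62A;"/>'
).encode("ascii")

# Variant 2: the same characters written literally, serialized as UTF-8.
utf8_doc = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<doc value="' + text + '"/>'
).encode("utf-8")


def attribute_value(xml_bytes: bytes) -> str:
    """Parse the document and return the decoded 'value' attribute."""
    return ET.fromstring(xml_bytes).get("value")


v1 = attribute_value(ascii_doc)
v2 = attribute_value(utf8_doc)

print(v1 == v2)                           # True
print([f"U+{ord(c):04X}" for c in v1])    # ['U+0627', 'U+0628', 'U+062A']
```

After parsing, both variants yield the same three-character string, so rendering or string comparison in the application does not depend on which serialization was chosen.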




