RE: [xsl] possible workarounds to process files with invalid

Subject: RE: possible workarounds to process files with invalid character encoding ...
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Fri, 12 Dec 2008 21:26:38 -0000

If you're capable of writing a Java Reader that will process this file into
a stream of characters, then you can get Saxon to use this Reader by
nominating a custom UnparsedTextURIResolver.
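
As a rough illustration of the Reader half of that suggestion, here is a minimal sketch that assumes the custom encoding is single-byte, so a 256-entry lookup table is enough. The class name TableDecodingReader and the table contents (ASCII unchanged, byte 0x80 mapped to the euro sign) are hypothetical, since the real encoding is not described in this thread. The wiring into Saxon is deliberately left out: the resolver would simply return such a Reader for the requested URI, and its exact interface depends on the Saxon version.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;

/** Decodes a single-byte custom encoding through a 256-entry lookup table. */
class TableDecodingReader extends Reader {
    private final InputStream in;
    private final char[] table; // table[b] is the Unicode character for byte value b

    TableDecodingReader(InputStream in, char[] table) {
        if (table.length != 256) {
            throw new IllegalArgumentException("table must have 256 entries");
        }
        this.in = in;
        this.table = table;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) return 0;
        byte[] buf = new byte[len];
        int n = in.read(buf, 0, len);
        if (n < 0) return -1; // end of stream
        for (int i = 0; i < n; i++) {
            cbuf[off + i] = table[buf[i] & 0xFF]; // one byte maps to one char
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    /** Hypothetical table: ASCII as-is, 0x80 -> EURO SIGN, anything else -> U+FFFD. */
    static char[] exampleTable() {
        char[] t = new char[256];
        for (int i = 0; i < 256; i++) {
            t[i] = i < 0x80 ? (char) i : '\uFFFD';
        }
        t[0x80] = '\u20AC';
        return t;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { 'A', (byte) 0x80, 'B' };
        try (Reader r = new TableDecodingReader(new ByteArrayInputStream(data), exampleTable())) {
            StringBuilder sb = new StringBuilder();
            for (int c; (c = r.read()) != -1; ) {
                sb.append((char) c);
            }
            System.out.println(sb);
        }
    }
}
```

A multi-byte or stateful encoding would need a real decoder rather than a table, but the Reader contract stays the same.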

Alternatively, I suspect you can do it at the Java level by registering an
encoding name for the encoding and associating it with a decoder for that
encoding - but I'm not familiar with the details.
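
For what it is worth, the standard Java mechanism for this is java.nio.charset.spi.CharsetProvider. The following is only a sketch under assumptions: the charset name "X-CUSTOM" and the byte mapping are hypothetical, the decoder handles a single-byte encoding only, and whether Saxon's unparsed-text() then accepts the name in its encoding argument has not been verified here.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.spi.CharsetProvider;
import java.util.Collections;
import java.util.Iterator;

/** Decode-only charset for a hypothetical single-byte encoding. */
class CustomCharset extends Charset {
    static final String NAME = "X-CUSTOM"; // hypothetical name, no aliases

    CustomCharset() {
        super(NAME, null);
    }

    @Override
    public boolean contains(Charset cs) {
        return cs instanceof CustomCharset;
    }

    @Override
    public CharsetDecoder newDecoder() {
        // One byte always yields exactly one char in this sketch.
        return new CharsetDecoder(this, 1.0f, 1.0f) {
            @Override
            protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
                while (in.hasRemaining()) {
                    if (!out.hasRemaining()) return CoderResult.OVERFLOW;
                    int b = in.get() & 0xFF;
                    // Hypothetical mapping: ASCII as-is, 0x80 -> EURO SIGN, else U+FFFD.
                    out.put(b < 0x80 ? (char) b : b == 0x80 ? '\u20AC' : '\uFFFD');
                }
                return CoderResult.UNDERFLOW;
            }
        };
    }

    @Override
    public CharsetEncoder newEncoder() {
        throw new UnsupportedOperationException("decode-only sketch");
    }

    @Override
    public boolean canEncode() {
        return false;
    }

    public static void main(String[] args) {
        byte[] data = { 'A', (byte) 0x80, 'B' };
        System.out.println(new String(data, new CustomCharset()));
    }
}

/**
 * Makes the charset discoverable by name. For real use this class has to be
 * public, in its own source file, so that the service loader can instantiate it.
 */
class CustomCharsetProvider extends CharsetProvider {
    private final Charset custom = new CustomCharset();

    @Override
    public Charset charsetForName(String name) {
        return CustomCharset.NAME.equalsIgnoreCase(name) ? custom : null;
    }

    @Override
    public Iterator<Charset> charsets() {
        return Collections.singletonList(custom).iterator();
    }
}
```

To register it, the jar containing the provider needs a file named META-INF/services/java.nio.charset.spi.CharsetProvider whose content is the provider's fully qualified class name, and the jar must be on the application class path. After that, Charset.forName("X-CUSTOM") should find it.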

Michael Kay
http://www.saxonica.com/ 

> -----Original Message-----
> From: Matthias Einbrodt [mailto:matthias.einbrodt@xxxxxxxxxxxxx] 
> Sent: 12 December 2008 21:14
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject:  possible workarounds to process files with 
> invalid character encoding ...
> 
> Hello,
> 
> I'm trying to transform a text file with XSLT using the 
> unparsed-text and tokenize functions. Unfortunately, the text 
> file consists of characters encoded with a 
> non-Unicode-compliant encoding scheme. So, as expected, my Saxon 
> processor (version 9.1.0.3 Basic) throws a 
> *MalformedInputException* when I try to parse the file.
> 
> Now my question is if there are any "workarounds" to make 
> Saxon process the file anyway. Maybe by:
> 
> (1) Writing a sort of plugin that lets Saxon also support 
> non-Unicode-compliant encodings;
> 
> (2) Adding metadata to the input file, in some way, which 
> Saxon or another XSLT processor can handle and which specifies a 
> mapping from the custom character codes to the appropriate 
> code points of a Unicode-compliant encoding.
> 
> And if such a workaround exists, is it even worth trying 
> to implement it, or would one be better off preprocessing 
> the file with a custom Java program, or even modifying 
> the program that creates these text files so 
> that it uses a Unicode-compliant encoding scheme rather than 
> its own custom one?
> 
> What are your opinions?
> 
> Best Regards
> 
> Matthias Einbrodt
