possible workarounds to process files with invalid ch
Hello,

I'm trying to transform a text file with XSLT using the unparsed-text and tokenize functions. Unfortunately, the characters in the text file are encoded with a custom encoding scheme that is not Unicode-compliant. So, as expected, my Saxon processor (version 9.1.0.3 Basic) throws a MalformedInputException when I try to read the file.

My question is whether there are any "workarounds" to make Saxon process the file anyway. Maybe by:

(1) writing a sort of plugin that lets Saxon support encodings that are not Unicode-compliant;

(2) adding metadata to the input file, in a form that Saxon or another XSLT processor can handle, that maps the characters of the custom encoding to the appropriate code points of a Unicode-compliant encoding.

And if such a workaround exists, is it even worth implementing? Or would one be better off preprocessing the file with a custom Java program, or even modifying the program that creates these text files so that it uses a Unicode-compliant encoding scheme rather than its own custom one?

What are your opinions?

Best regards,
Matthias Einbrodt
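For illustration, the preprocessing option mentioned above could look roughly like the following Java sketch. It rests on assumptions that are not stated in the message: that the custom encoding is single-byte, that bytes below 0x80 are plain ASCII, and that the upper half can be described by a lookup table. The class name CustomTranscoder and both table entries are hypothetical placeholders, not the real mapping.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class CustomTranscoder {

    // Hypothetical mapping from custom byte values to Unicode code points.
    // The real table has to come from the program that writes the files.
    private static final Map<Integer, Integer> CUSTOM_TO_UNICODE = new HashMap<>();
    static {
        CUSTOM_TO_UNICODE.put(0x80, 0x20AC); // assumed: 0x80 means EURO SIGN
        CUSTOM_TO_UNICODE.put(0x81, 0x00E4); // assumed: 0x81 means a-umlaut
    }

    // Decodes raw bytes into a Java String (UTF-16 internally).
    // Assumes one byte per character and an ASCII-compatible lower half.
    static String decode(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length);
        for (byte b : raw) {
            int value = b & 0xFF; // treat the byte as unsigned
            if (value < 0x80) {
                sb.append((char) value);
            } else {
                Integer codePoint = CUSTOM_TO_UNICODE.get(value);
                // Unmapped bytes become U+FFFD REPLACEMENT CHARACTER
                sb.appendCodePoint(codePoint != null ? codePoint : 0xFFFD);
            }
        }
        return sb.toString();
    }

    // Usage: java CustomTranscoder input.txt output-utf8.txt
    public static void main(String[] args) throws IOException {
        byte[] raw = Files.readAllBytes(Paths.get(args[0]));
        byte[] utf8 = decode(raw).getBytes(StandardCharsets.UTF_8);
        Files.write(Paths.get(args[1]), utf8);
    }
}
```

The stylesheet could then read the converted file with unparsed-text($href, 'UTF-8'). The same table could in principle be wrapped in a java.nio.charset.Charset implementation instead, but whether Saxon 9.1 would pick up such a charset is not something this sketch verifies.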