Re: Character encoding/representation from ISO-8859-1
> On 11 Oct 2016, at 21:00, Bridger Dyson-Smith bdysonsmith@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> <?xml version="1.0" encoding="iso-8859-1"?>
> <documents>
> <document>The reality of the effect of natural ventilation in a residential attic cavity has been the topic of many debates and scholarly reports since the 1930C"b,b"s.</document>
> </documents>

It looks very much like 1) the XML header claims the document is ISO-8859-1 encoded, while 2) it really is not.

I can see that one character, that ’, was decoded as three (C"b,b"). Had the document really been encoded in ISO-8859-1, any decoding would have ended up with at most one character, because ISO-8859-1 does not use multibyte characters.

Try replacing "iso-8859-1" in the XML header with "utf-8". Does that work?

Regards,
Soren
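For anyone who wants to see the effect in isolation, here is a minimal Python sketch of the mismatch described above. It assumes the original character was U+2019 (the curly apostrophe), which is a guess from context and not something the sample file confirms; the helper name `misdecode` is made up for illustration.

```python
# Sketch: what happens when UTF-8 bytes are read under an ISO-8859-1 label.
# Assumption: the original character was U+2019 (RIGHT SINGLE QUOTATION MARK).

def misdecode(text: str, actual: str = "utf-8", claimed: str = "iso-8859-1") -> str:
    """Encode text with the actual encoding, then decode it with the claimed one."""
    return text.encode(actual).decode(claimed)

apostrophe = "\u2019"
original = "1930" + apostrophe + "s"

# In UTF-8 the apostrophe occupies three bytes.
raw = apostrophe.encode("utf-8")

# Decoding those bytes as ISO-8859-1 yields one character per byte,
# so the single apostrophe turns into three characters.
garbled = misdecode(original)

# Reversing the wrong step recovers the text, because ISO-8859-1 maps
# every byte value to exactly one character and back.
fixed = garbled.encode("iso-8859-1").decode("utf-8")

print(len(raw), len(original), len(garbled), fixed == original)
```

The round trip at the end only shows that no bytes were lost. For the actual file, relabelling the header as suggested above is the simpler fix, provided the bytes on disk really are UTF-8.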