(The ��� characters actually appear as a "?" within a black diamond in SS.)
What I don't understand is that the source document 'renders' OK when viewed in SS, but the resultant XML doesn't. The problem has something to do with the 'dodgy' characters '˙˙˙˙˙˙˙˙˙˙˙˙˙˙'. But why is my resulting XML in error, when all I'm doing is 'echoing' the dodgy characters to the output XML?
Note: I developed the XSLT in SS but apply it in my VBA application.
I've now tracked the problem down to something to do with 'encoding': if I open my resultant file in Notepad and choose "Save As...", the encoding shown is 'ANSI', and if I save it as "UTF-8", IE will open it OK.
Whereas if I do the same with my source document, it already says 'UTF-8'.
So where do I tell my application to save the resultant XML file as UTF-8? In the stylesheet, somewhere in VBA, or both?
Richard,
it depends on how your VBA application saves the file. If it relies on the MSXSL processor to actually write the file, it should be enough to add an <xsl:output encoding="utf-8"/> instruction right after the opening <xsl:stylesheet> tag.
If your application instead gets the result as a VBA string, you should investigate whether the file API you use from VBA allows writing data in Unicode, instead of using the current locale's code page.
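For the second case, a minimal sketch of writing a VBA string out as UTF-8 with an ADO stream might look like the following. The Sub name and its parameters are made up for illustration, and it assumes ADO is installed so that CreateObject("ADODB.Stream") succeeds.

```vba
' Sketch only: SaveStringAsUtf8 and its parameters are hypothetical names.
Sub SaveStringAsUtf8(ByVal sText As String, ByVal sPath As String)
    Dim stm As Object
    Set stm = CreateObject("ADODB.Stream")   ' late-bound, so no project reference is needed
    stm.Type = 2                             ' adTypeText
    stm.Charset = "utf-8"                    ' encode the text as UTF-8 when saving
    stm.Open
    stm.WriteText sText
    stm.SaveToFile sPath, 2                  ' adSaveCreateOverWrite
    stm.Close
End Sub
```

It could be called as SaveStringAsUtf8 xslProc.output, sExtractFileName. Note that the Unicode flag on Scripting.FileSystemObject's CreateTextFile produces UTF-16 rather than UTF-8. It is also worth checking the XML declaration in the string: when the result is taken as a string it may name UTF-16, which would then disagree with a file saved as UTF-8.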
Subject:dodgy(non-ascii) characters causing confusion Author:Richard Potts Date:15 Feb 2007 06:48 AM
Thanks for the pointer Alberto.
In my VBA code I was creating the resultant file using "WriteLine", i.e.:
'
'Output to file
'
ExtractFile.WriteLine xslProc.output
Further investigation on the web for similar problems suggested that ADO 'streams' were the answer. I've now changed it to:
'
' Set the stream's charset to UTF-8, open it, and make it the transform output
'
Str.Charset = "UTF-8"
Str.Open
xslProc.output = Str
'
' Apply Transform
'
xslProc.transform
'
' Save stream to file
'
Str.SaveToFile sExtractFileName
So hope that helps someone else.
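For anyone wanting the pieces in one place, here is a rough end-to-end sketch of the same approach. It is not the exact code from my application: the Sub name, the parameter names and the MSXML 6.0 ProgIDs are assumptions, so adjust them to whatever MSXML version you have installed.

```vba
' Sketch only: TransformToUtf8File and its parameters are hypothetical names,
' and the MSXML 6.0 ProgIDs are an assumption about the installed version.
Sub TransformToUtf8File(ByVal sXmlPath As String, ByVal sXslPath As String, ByVal sOutPath As String)
    Dim xmlDoc As Object, xslDoc As Object
    Dim xslTemplate As Object, xslProc As Object, stm As Object

    ' Load the source document
    Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
    xmlDoc.async = False
    If Not xmlDoc.Load(sXmlPath) Then Err.Raise vbObjectError + 1, , xmlDoc.parseError.reason

    ' XSLTemplate needs the stylesheet in a free-threaded document
    Set xslDoc = CreateObject("MSXML2.FreeThreadedDOMDocument.6.0")
    xslDoc.async = False
    If Not xslDoc.Load(sXslPath) Then Err.Raise vbObjectError + 2, , xslDoc.parseError.reason

    ' Compile the stylesheet and create a processor for it
    Set xslTemplate = CreateObject("MSXML2.XSLTemplate.6.0")
    Set xslTemplate.stylesheet = xslDoc
    Set xslProc = xslTemplate.createProcessor()
    xslProc.input = xmlDoc

    ' Send the transform output to a UTF-8 ADO stream instead of a VBA string
    Set stm = CreateObject("ADODB.Stream")
    stm.Charset = "UTF-8"
    stm.Open
    xslProc.output = stm
    xslProc.transform

    ' Save the stream to file, overwriting any existing file
    stm.SaveToFile sOutPath, 2               ' adSaveCreateOverWrite
    stm.Close
End Sub
```

The point of the design is that the result never passes through a VBA string or WriteLine, so the bytes are not re-encoded in the current ANSI code page on the way to disk.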
Not sure if it is possible or not, but as a thought: is there a potential enhancement to SS here? For example, a 'properties' view for each file loaded into the XML editor, so you can see what 'encoding' it was created with. Also, would it be possible to highlight the encoding issue to the user when the "? in diamonds" are displayed for an XML file? And how about an SS feature to 'convert this file from one encoding format to another'?
Subject:dodgy(non-ascii) characters causing confusion Author:Tony Lavinio Date:15 Feb 2007 09:40 AM
The XML standard says that if the encoding isn't declared in the first
line of the file, then it MUST be one of only a handful (UTF-8, or
UTF-16 with a byte order mark). The rules are very specific, and anything
else should be considered broken and rejected by the parser.
The problem is that if the actual encoding of the file differs
from the stated encoding, then it is not always possible to determine
unambiguously what the encoding should have been. It is possible to
have a file that could be read successfully using several different
encodings and yielding several different results.
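As a small illustration of that ambiguity: the two bytes C3 A9 are the single character "é" in UTF-8, but the two characters "Ã©" in Windows-1252, and both readings are perfectly legal. The sketch below decodes the same bytes both ways through an ADO stream. DecodeBytes and ShowAmbiguity are hypothetical names, and the charset names assume the usual Windows charset registrations are present.

```vba
' Sketch only: DecodeBytes and ShowAmbiguity are hypothetical helpers.
Function DecodeBytes(bytes() As Byte, ByVal sCharset As String) As String
    Dim stm As Object
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = 1                 ' adTypeBinary: write the raw bytes first
    stm.Open
    stm.Write bytes
    stm.Position = 0             ' Type can only be changed at position 0
    stm.Type = 2                 ' adTypeText: now read the same bytes as text
    stm.Charset = sCharset
    DecodeBytes = stm.ReadText
    stm.Close
End Function

Sub ShowAmbiguity()
    Dim b(0 To 1) As Byte
    b(0) = &HC3
    b(1) = &HA9
    ' Same bytes, two different but equally valid interpretations
    Debug.Print Len(DecodeBytes(b, "utf-8"))
    Debug.Print Len(DecodeBytes(b, "windows-1252"))
End Sub
```

Neither decode raises an error, which is why a tool cannot always tell from the bytes alone which encoding was intended.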
We do the best we can, but it is not deterministically possible to
tell in all cases, so the safest course is for Stylus Studio to
complain.
Subject:dodgy(non-ascii) characters causing confusion Author:Tony Lavinio Date:15 Feb 2007 09:42 AM
... and there is a way inside of Stylus Studio to change the encoding,
but it assumes that the file in the editor is displayed properly.
Really, to fix a broken encoding you'd need a non-Unicode-aware tool,
or more ideally you would get the source of the broken XML to not write
invalid XML.