Re: Using entities for em dash problem
On 9/12/03 6:23 AM, "David Carlisle" <davidc@n...> wrote:

>> Yes, but I thought using the glyph instead of the NCR was not valid XML?
>
> Glyphs are pictorial representations of a character (as produced by
> fonts, typically). You don't mean glyph here. In XML you can use any
> character that is in the specified encoding. So if you are using ASCII
> for example you can use "A" as a character directly, or as &#65;
> But you can't enter an e-acute or an em dash directly as they are not in
> the encoding so you have to use an NCR for them. UTF8 on the other hand
> encodes every character so you can always use character data directly (or
> you can use NCR if you want)
>
>> Glyphs are what you are reading right now,
> But they are not what is in an XML file.
> The shapes I see on my screen depend on the shapes specified in the
> fonts that I am using.
>
>> if you want to call them
>> "characters" then that's fine by me.
>
> As you are finding, characters, glyphs and encodings can be confusing,
> it's best to keep the different layers clearly distinguished.
> You might want to look at

I'm not finding them confusing, thank you. I'm finding that arguing over semantics is unnecessary. Yes, a glyph is representative of a certain *font style*, but in typographic circles when I say glyph people know what I'm talking about. There's no need to get technical about it in this discussion either. As far as I'm concerned, ? <-- that's a glyph, not a character, rendered in Monaco or whatever default font your computer is using. Remember, I'm not a computer scientist, so the difference between a glyph and a character means very little to me at present.

>> transform a document with UTF encoded XML, it should output
>> NCR data, not glyphs, or characters.
>
> No, XSLT will use character data (in most implementations) not NCR if
> you ask for utf8 encoded output (or accept that as the default).
> To force NCRs to be used, specify an encoding such as US-ASCII that does not
> contain the characters.

Thanks again, as I stated before, this works just fine. But as I've stated, if I ask for ASCII output, it's so that the NCR will be preserved.

>> it's false. &#8212; is not ASCII data, is it?
>
> Yes, it is. that bit of your message contains 6 bytes of information.
> You want me to understand it as
> ampersand-hash-eight-two-one-two
> I can only do that if I know that the bytes represent letters in ascii
> (or an ascii compatible encoding) that is what the xml encoding
> declaration is for. It does not specify the underlying character set,
> there is no need to specify that as it is _always_ unicode in an XML
> context..

Once again, you're arguing computer logic with me. Okay, yes, it's ASCII, fine. But I wanted to *preserve* the NCR and keep the declaration as UTF-8. That's a perfectly acceptable thing to ask for. Unfortunately, XSL does not allow me to preserve the NCR. It's "dumb".

If anything, your above statement *proves* that the output method shouldn't be linked to the result declaration, because then the computer is assuming what the declaration should be based on how it was transformed. If the transformed result does not necessarily represent the declaration, I should be able to change the declaration. In other words, if I've preserved the NCR for the sake of making the result UTF-8, then it shouldn't say US-ASCII just because I *had* to transform it due to the way the computer is programmed to encode these documents.

To make it simpler, if I want to preserve NCR, there should be an option without using ASCII encoding, or rather, I should be able to declare whatever encoding I wish the result to be, regardless of how the transformation was encoded. Kind of like "browser spoofing" I suppose. Because in the end I'm just going to change it anyway, right?
So that when it's rendered to screen, we see em dashes and not 8212 all over the place, because it's specified as what it *is*, not how it's been *transformed*.

>> Is this an application/parsing error or is this
>> currently how XSL works?
>
> It's how XSLT works, which is quite logical once you get to grips with
> the meaning of an encoding declaration in XML.

I think I've come to grips with the fact that it's illogical and output encoding should NOT be linked to the result declaration as they can be two different things.

/johnny :)

--
"You'll see it when you believe it."
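
A note for readers of this thread: the behaviour described in the quoted replies (character data for UTF-8 output, NCRs when the output encoding cannot hold the character) can be reproduced outside XSLT. The following is only a minimal sketch, assuming Python's standard `xml.etree.ElementTree` as a stand-in for an XSLT serializer; neither poster used it, and the helper name `serialize` and the sample text are hypothetical.

```python
import xml.etree.ElementTree as ET

EM_DASH = "\u2014"  # U+2014 EM DASH, decimal code point 8212


def serialize(text, encoding):
    """Serialize <p>text</p> to bytes in the requested output encoding.

    Hypothetical helper for illustration; stands in for an XSLT processor
    honouring an output encoding setting.
    """
    elem = ET.Element("p")
    elem.text = text
    return ET.tostring(elem, encoding=encoding)


sample = "a" + EM_DASH + "b"

# US-ASCII cannot hold the em dash, so the serializer falls back to an NCR.
ascii_bytes = serialize(sample, "us-ascii")

# UTF-8 can hold it, so the character is written directly as bytes.
utf8_bytes = serialize(sample, "utf-8")

# Either form parses back to the same single character.
ascii_text = ET.fromstring(ascii_bytes).text
utf8_text = ET.fromstring(utf8_bytes).text

print(ascii_bytes)
print(utf8_bytes)
print(ascii_text == utf8_text)
```

Since every ASCII byte sequence is also a valid UTF-8 byte sequence, the NCR form produced for US-ASCII can be read back by a parser that assumes UTF-8, which is relevant to the question of which encoding the result declares.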