Re: Feeler for SML (Simple Markup Language)
Tim Bray wrote:
>
> At 01:08 PM 11/15/99 -0800, David Brownell wrote:
> >> The UTF-*'s are logically equivalent to most users, in that they share
> >> the property that almost no real-world data objects are encoded in either.
> >
> >Quite true, from what I know, if you don't consider all the documents
> >encoded in ASCII (which is a subset of UTF-8). Many of them aren't
> >tagged as to encoding; assert they're UTF-8 not ASCII, and disproof is
> >often going to be impossible!
>
> I used to think so too, but actually, if you look closely, the proportion
> of "ascii" that's actually pure US-ASCII is not that high.

Well, ASCII is ASCII -- if it's not pure, it's not ASCII (and hence it's not usable as UTF-8 either). ASCII uses only seven bits; always has (modulo parity), and I can't see that changing.

But while that's key to what I was saying (if it _really_ is ASCII, it's also UTF-8, and there's lots of real ASCII), I suspect that was likely not what you were getting at there.

> The prevalence
> of é's and õ's and so on these days is in my experience really growing,
> which means that documents which are ideally ISO-8859-1 but in fact
> some Microsoft codepage is really immense. -T.

Those characters are actually in ISO-8859-1, but I understand that Microsoft does cause real problems by its use of many characters that are reserved in 8859-1 ... look at the number of web pages with strange characters where you should have “ or ” (but hmm, not all browsers accept those entities anyway).

Assert that one of those documents is ASCII, and disproof is trivial: some character has the eighth bit set. (When was the last time you saw a document using it for parity? A LONG time ago, for me!) Since it's not ASCII, you clearly can't read it as UTF-8.

- Dave

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
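
The "eighth bit" test described in the message can be sketched in a few lines of Python. This is only an illustration, not anything posted to the list: the function names `first_non_ascii`, `is_ascii` and `is_valid_utf8` are made up for the example, and the sample bytes are assumed stand-ins for a Latin-1 "é" and the Windows-1252 curly quotes (0x93 and 0x94, which fall in the range ISO-8859-1 reserves). The UTF-8 decode check is an extra step added here to show what happens when such bytes are read as UTF-8.

```python
def first_non_ascii(data: bytes) -> int:
    """Return the index of the first byte with the eighth bit set, or -1."""
    for i, b in enumerate(data):
        if b & 0x80:  # ASCII uses only the low seven bits
            return i
    return -1


def is_ascii(data: bytes) -> bool:
    """True if no byte has the eighth bit set."""
    return first_non_ascii(data) == -1


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8 (strict mode)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical sample inputs, chosen only for illustration.
plain = b"Feeler for SML"        # pure seven-bit ASCII
latin1 = b"caf\xe9"              # "café" with é as the single byte 0xE9 (ISO-8859-1)
cp1252 = b"\x93quoted\x94"       # curly quotes as Windows-1252 bytes

for sample in (plain, latin1, cp1252):
    print(sample, is_ascii(sample), is_valid_utf8(sample), first_non_ascii(sample))
```

Pure ASCII passes both checks, which is the "ASCII is a subset of UTF-8" point. The two eight-bit samples fail the ASCII check at the first offending byte and are also rejected by a strict UTF-8 decoder, since those bytes are not well-formed UTF-8 sequences on their own.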