This is all scaring me a bit ... just as I was thinking that dealing
with text was easier than bit twiddling! I may need to scurry back into
debugging DCOM protocols.... perhaps I will just keep using Java chars
in blissful ignorance until some terrible calamity occurs :-)

Jonathan

> Elliotte Rusty Harold wrote:
> >
> > It could be worse, though. You could be using C, and trying to decode
> > UTF-8. :-)
> >
>
> ?? It's about 10 lines of code, and has been written lots of
> times now. Last time I needed it I couldn't find one with the
> exact buffer interface I needed so I coded it up from scratch
> sometime in the course of an afternoon and it worked first time.
> The spec is hardly unclear. And it's a set of shift/mask
> operations that are processor-friendly. You need to use a
> loop iterator rather than a for (i = 0; string[i]; i++) idiom,
> big deal.
>
> UTF-8 only really causes extra work when you want per-character
> addressing into big strings, because then you need an indirect
> table - the most common case I can think of is maintaining
> on-screen render state.
>
> But in most apps it's more common to point into text at a
> few places (tags, word-starts, search matches) in which case
> you needed that indirect array anyhow.
>
> Conclusion: somewhat to my surprise, I find that for a lot
> of C tasks, you can keep your text in UTF-8 and work with
> it that way very efficiently.
>
> Elliotte is right about the irritating fact that a Java
> "char" isn't an XML character. The nasty fact is that
> I suspect many Java application programmers will end up
> simply blowing off non-BMP text either through ignorance
> or based on a decision that it's not cost-effective.
> -Tim
>
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>
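
For readers wondering what the "set of shift/mask operations" and the "loop iterator" mentioned above might look like, here is a rough sketch in C. It is not Tim's code, which was not posted; the names utf8_decode and utf8_count and the (pointer, length) buffer interface are made up for illustration. It also assumes the current four-byte definition of UTF-8 (nothing above U+10FFFF, no surrogates, no overlong forms), which is stricter than some older decoders were.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one code point from s[0..len).
 * Returns the number of bytes consumed (1 to 4) and stores the code
 * point in *cp, or returns 0 if the input is truncated or malformed. */
size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    /* Smallest code point that legitimately needs n bytes (index = n). */
    static const uint32_t min_value[5] = {0, 0, 0x80, 0x800, 0x10000};
    size_t n, i;
    uint32_t c;

    assert(s != NULL && cp != NULL);
    if (len == 0) return 0;

    /* The lead byte gives the sequence length and the top payload bits. */
    if (s[0] < 0x80)                { c = s[0];        n = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { c = s[0] & 0x1F; n = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { c = s[0] & 0x0F; n = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { c = s[0] & 0x07; n = 4; }
    else return 0;                  /* stray continuation or invalid lead */

    if (n > len) return 0;          /* sequence runs off the buffer */

    /* Each continuation byte is 10xxxxxx and contributes six bits. */
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }

    if (c < min_value[n]) return 0;            /* overlong encoding */
    if (c > 0x10FFFF) return 0;                /* beyond Unicode range */
    if (c >= 0xD800 && c <= 0xDFFF) return 0;  /* surrogate code point */

    *cp = c;
    return n;
}

/* Hypothetical helper: walk a buffer one character at a time, advancing
 * by the decoded length instead of by one byte. Returns the number of
 * code points, or (size_t)-1 if the buffer is not valid UTF-8. */
size_t utf8_count(const unsigned char *s, size_t len)
{
    size_t pos = 0, count = 0;
    uint32_t cp;

    while (pos < len) {
        size_t n = utf8_decode(s + pos, len - pos, &cp);
        if (n == 0) return (size_t)-1;
        pos += n;
        count++;
    }
    return count;
}
```

Under these assumptions, the four bytes F0 9D 84 9E decode to the single code point U+1D11E, which lies outside the BMP and so would not fit in one Java char. Recording pos at each step of the loop in utf8_count is one way the indirect table for per-character addressing could be built.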