Re: XML Max Character Value
* 'Alan Gutierrez' <alan-xml-dev@e...> [2005-08-14 23:12]:
> * Michael Kay <mike@s...> [2005-08-14 17:24]:
> > > -----Original Message-----
> > > From: Alan Gutierrez [mailto:alan-xml-dev@e...]
> > > Sent: 13 August 2005 12:06
> > > To: Derek Denny-Brown
> > > Cc: xml-dev@l...
> > > Subject: Re: XML Max Character Value
> > >
> > > * Derek Denny-Brown <derekdb@m...> [2005-08-13 01:29]:
> > >
> > > > In java, 0xFFFE or 0xFFFF should work. They aren't strictly
> > > > the max Unicode character for XML, but since Java represents
> > > > Unicode as utf-16 but doesn't really provide much support for
> > > > surrogate pairs (last I checked), those should work. Hm..
> > > > Eclipse tells me that there is Character.MAX_VALUE. Use at
> > > > your own risk.
> > >
> > > I am using it to design the algorithm. Concerned about what to
> > > do if Unicode requires multiple characters for a single
> > > character. It's perplexing.
> > >
> > > > Reading up on Unicode is also recommended though...
> > > > internationalization is far, far more complicated than you
> > > > ever imagined. I know people who get the shakes if you just
> > > > mention "Turkish 'I'" in their presence. (mild
> > > > exaggeration...)
> > >
> > > I have no illusions about the complexity. I'd simply hoped that
> > > they would have made a hard and fast rule about min and
> > > max values.
> >
> > In XSLT 2.0, the collation used by xsl:key is not necessarily
> > Unicode codepoint order. To build an index, you need to store the
> > key value as a sequence of collation units, not as a sequence of
> > Java chars or Unicode codepoints. So I suspect that what you
> > really want is the highest collation unit in the particular
> > collation used for the key in question.
>
> This is a B-Tree implementation. The words 'collation unit' are
> heartening, I'm looking to advance the string comparison myself,
> using it to determine which branch to take in the B-Tree.
>
> I'm storing partial strings in tiers for branching. Partial
> means, just enough of the string to indicate which branch to
> take. My design stores a character and index pair as a branch
> node, so I bump along the search string branching along the way.

I've found CollationKey.toByteArray() in java.text.

http://java.sun.com/j2se/1.4.2/docs/api/java/text/CollationKey.html#toByteArray()

It seems to do what I need. Create a sequence of units along which
I can advance and compare.

-- 
Alan Gutierrez - alan@e...
    - http://engrm.com/blogometer/index.html
    - http://engrm.com/blogometer/rss.2.0.xml
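
A rough sketch of how the byte sequence from CollationKey.toByteArray() might be walked and compared is given here. It is only an illustration under stated assumptions, not the B-Tree design described in this thread: the class name CollationUnits and the helpers units, compareUnits and firstDifference are hypothetical, and an English Collator is assumed. The java.text documentation says the bytes are organized most significant byte first, so they are compared as unsigned values.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
class CollationUnits {

    // Returns the collation key of s under the given collator as a byte sequence.
    static byte[] units(Collator collator, String s) {
        return collator.getCollationKey(s).toByteArray();
    }

    // Compares two key byte sequences as unsigned bytes, most significant first.
    // A sequence that is a strict prefix of the other sorts first.
    static int compareUnits(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        return a.length - b.length;
    }

    // Returns the index of the first differing byte, or the shorter length
    // if one sequence is a prefix of the other. A branch node could keep
    // such an index so that a search advances along the key.
    static int firstDifference(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Assumption: an English collator; xsl:key may name another collation.
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        byte[] left = units(collator, "apple");
        byte[] right = units(collator, "banana");
        System.out.println("byte order:     " + compareUnits(left, right));
        System.out.println("collator order: " + collator.compare("apple", "banana"));
        System.out.println("first differing byte index: " + firstDifference(left, right));
    }
}
```

The two orderings printed by main should agree in sign, which is the property that lets an index store the bytes instead of Java chars or Unicode codepoints.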