Re: XML Max Character Value
* Michael Kay <mike@s...> [2005-08-14 17:24]:
> > -----Original Message-----
> > From: Alan Gutierrez [mailto:alan-xml-dev@e...]
> > Sent: 13 August 2005 12:06
> > To: Derek Denny-Brown
> > Cc: xml-dev@l...
> > Subject: Re: XML Max Character Value
> >
> > * Derek Denny-Brown <derekdb@m...> [2005-08-13 01:29]:
> >
> > > In java, 0xFFFE or 0xFFFF should work. They aren't strictly
> > > the max Unicode character for XML, but since Java represents
> > > Unicode as utf-16 but doesn't really provide much support for
> > > surrogate pairs (last I checked), those should work. Hm..
> > > Eclipse tells me that there is Character.MAX_VALUE. Use at
> > > your own risk.
> >
> > I am using it to design the algorithm. Concerned about what to
> > do if Unicode requires multiple characters for a single
> > character. It's perplexing.
> >
> > > Reading up on Unicode is also recommended though...
> > > internationalization is far, far more complicated than you
> > > ever imagined. I know people who get the shakes if you just
> > > mention "Turkish 'I'" in their presence. (mild
> > > exaggeration...)
> >
> > I have no illusions about the complexity. I'd simply hoped that
> > they would have made a hard and fast rule about min and
> > max values.

> In XSLT 2.0, the collation used by xsl:key is not necessarily
> Unicode codepoint order. To build an index, you need to store the
> key value as a sequence of collation units, not as a sequence of
> Java chars or Unicode codepoints. So I suspect that what you
> really want is the highest collation unit in the particular
> collation used for the key in question.

I don't need a sentry at this point. I've turned the equality tests
around so they start from an implicit zero. Thus, for the sake of
<xsl:key/>...

> (Actually, xsl:key only supports equality semantics, not ordering
> semantics. But I can see that you probably want to implement
> indexes that also support ordering semantics. It's likely that
> these too would need to be collation-sensitive.)

...I'm only using the sort in order to search and to find the
values in the tree. Any sort will do. Collation in <xsl:key/> is
only applied after the keyed nodes are recovered, or that's my
understanding.

Soon after, I'm going to want to support ordering as well, and
attempt to integrate that with <xsl:sort/>. (Perhaps, XQuery can
take advantage of ordered indices, I don't know.)

This is a B-Tree implementation. The words 'collation unit' are
heartening, I'm looking to advance the string comparison myself,
using it to determine which branch to take in the B-Tree.

I'm storing partial strings in tiers for branching. Partial means,
just enough of the string to indicate which branch to take. My
design stores a character and index pair as a branch node, so I
bump along the search string branching along the way.

This is FYI, for the group...

I've written a document object model that's file backed, and I'm
using it with Saxon for queries, and I've put together my own
XUpdate implementation for node surgery. I want to provide Saxon
with a file backed index.

--
Alan Gutierrez - alan@e...
    - http://engrm.com/blogometer/index.html
    - http://engrm.com/blogometer/rss.2.0.xml
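
For readers following the question raised earlier in the thread (what to do when one Unicode character needs more than one Java char), here is a minimal sketch using the code point methods that java.lang.Character and String gained in Java 5. The class name CodePointSketch and its two helpers are hypothetical names chosen for illustration; they are not part of the B-Tree described above, and the sketch walks raw code points, not the collation units Michael Kay mentions.

```java
// Hypothetical illustration, not code from the thread.
final class CodePointSketch {

    // How many Java chars (UTF-16 code units) one code point occupies:
    // 1 for U+0000..U+FFFF, 2 (a surrogate pair) for anything above.
    static int unitsFor(int codePoint) {
        return Character.charCount(codePoint);
    }

    // Walk a string one code point at a time instead of one char at a time,
    // so a surrogate pair is seen as a single value.
    static int[] codePoints(String s) {
        int[] out = new int[s.codePointCount(0, s.length())];
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            out[n++] = cp;
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return out;
    }

    public static void main(String[] args) {
        // Largest value a single Java char can hold.
        System.out.println(Integer.toHexString(Character.MAX_VALUE));
        // Largest Unicode code point, which is also the upper bound
        // of the XML 1.0 Char production.
        System.out.println(Integer.toHexString(Character.MAX_CODE_POINT));
        // As a String, that code point takes two chars.
        String top = new String(Character.toChars(Character.MAX_CODE_POINT));
        System.out.println(top.length());
    }
}
```

A search that "bumps along" the key one char at a time would split such a pair across two branch steps; stepping by Character.charCount keeps it together. Whether that matters depends on the comparison the index uses.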