Unicode and XSL (was substring())
In message <93CB64052F94D211BC5D0010A800133170EECF@xxxxxxxxxxxxxxxxxxxxx uk>, Kay Michael <Michael.Kay@xxxxxxx> writes
>
>We had this conversation a few weeks ago (regarding length()). As I learnt
>then, it's all due to the appalling decision to allow non-spacing
>diacriticals in Unicode, which makes it quite hard to define what you mean by
>"the first character" in a string.

It isn't just diacriticals. Unicode has a concept of "combining characters" which is used for a wide range of purposes, most of which I don't begin to understand. It divides them into combining character classes, which group together characters which appear over, under, around, (etc.!) the base character. It also has a detailed algorithm for combining multiple combining characters onto one base character.

The *semantics* of "the first character" might be a difficult one. However, if you are simply trying to count characters, surely all you have to do is to ignore any combining characters that occur within the string. (The first character should be a 'real one' - combining characters always follow the base character they qualify.)

Since XML adopts Unicode in an unqualified manner, I assume that XSL back-ends will support the rendering of these combined characters. Just like I assume that all XML editors will support Unicode. ;-(

Richard Light.

Richard Light
SGML/XML and Museum Information Consultancy
richard@xxxxxxxxxxxxxxxxx

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
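
To make the counting suggestion above concrete, here is a rough sketch in Python (not XSL, which has no direct access to character classes). It assumes that "combining character" means any code point with a non-zero canonical combining class, as reported by the standard `unicodedata.combining` function. The function names `count_base_characters` and `first_character` are made up for illustration. Marks that Unicode assigns class 0 (some enclosing marks, for example) would be counted as base characters here, so treat this as an approximation rather than a full answer.

```python
import unicodedata


def count_base_characters(text: str) -> int:
    """Count code points, skipping those with a non-zero combining class.

    Assumption: a non-zero canonical combining class is what marks a
    code point as a combining character for counting purposes.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


def first_character(text: str) -> str:
    """Return the first code point plus any combining characters after it.

    Relies on combining characters always following the base character
    they qualify, so the first code point is taken as the base.
    """
    if not text:
        return ""
    end = 1
    # Extend the slice while the next code point is a combining character.
    while end < len(text) and unicodedata.combining(text[end]) != 0:
        end += 1
    return text[:end]


if __name__ == "__main__":
    sample = "e\u0301a"  # 'e' + COMBINING ACUTE ACCENT + 'a'
    print(len(sample))                    # raw code point count
    print(count_base_characters(sample))  # count ignoring combining characters
    print(first_character(sample) == "e\u0301")
```

On this reading, a length() or substring() that wanted to respect combined characters would work in units like those returned by `first_character` rather than in single code points.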