Re: Here's how to process XML documents written in German
Hi Chris,

> The real lesson here is that you should never make contracts with
> Germans! Problem solved.
> ;-)

this asks for a response by a German ;-)

Please look into this BMP table (from [1]):
http://stamm-wilbrandt.de/en/blog/BMP.xsl.html

There are a LOT more Korean, Chinese, Japanese, ... characters than the
few German special characters.

If this (Japanese) XML sample does not show correctly, see [2]:

$ xsltproc identity.xsl interesting.xml
<?xml version="1.0"?>
<é¢ç½ã>ç´ å</é¢ç½ã>
$

[1] https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/entry/bmp_xsl_html_basic_multilingual_plane20
[2] http://stamm-wilbrandt.de/en/xsl-list/interesting.xml

Best wishes,

Hermann Stamm-Wilbrandt
Level 3 support for XML Compiler team and Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
https://twitter.com/HermannSW/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Chair of the Supervisory Board: Martina Koederitz
Management: Dirk Wittkopp
Registered office: Boeblingen
Court of registration: Amtsgericht Stuttgart, HRB 243294

From:    Chris Maloney <voldrani@gmail.com>
To:      Tony Graham <tgraham@mentea.net>
Cc:      "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
Date:    01/30/2013 10:55 PM
Subject: Re: Here's how to process XML documents written in German

The real lesson here is that you should never make contracts with
Germans! Problem solved.
;-)

On Wed, Jan 30, 2013 at 4:31 PM, Tony Graham <tgraham@mentea.net> wrote:
> On Wed, January 30, 2013 6:47 pm, Costello, Roger L. wrote:
> ...
>> This XPath expression does the job:
>>
>> sum(//Posten[@*[normalize-unicode(name(.)) eq
>> normalize-unicode('währung')][. eq 'EUR']])
>>
>> The normalize-unicode() function converts an attribute name into a
>> standard, canonical form.
>>
>> Lesson Learned:
>>
>> When processing markup with diacritical marks, beware that two characters
>> may visually appear the same but inside the computer they are represented
>> very differently. Design XPath expressions accordingly -- use
>> normalize-unicode() to convert markup into canonical form.
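The mismatch Roger describes can be reproduced outside XPath. The following is a minimal sketch in Python, not part of the original thread: it compares a precomposed and a decomposed spelling of the attribute name 'währung' before and after conversion to NFC, which is also the form normalize-unicode() uses when no second argument is given. The helper name same_name is an assumption made for illustration.

```python
import unicodedata

# Two spellings of the same attribute name that render identically.
precomposed = "w\u00e4hrung"   # U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
decomposed = "wa\u0308hrung"   # 'a' followed by U+0308 COMBINING DIAERESIS


def same_name(a: str, b: str) -> bool:
    """Compare two names after converting both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The raw comparison works on code points, so the two spellings differ.
print(precomposed == decomposed)
# The decomposed spelling is one code point longer than the precomposed one.
print(len(precomposed), len(decomposed))
# After normalization the two spellings compare equal.
print(same_name(precomposed, decomposed))
```

Normalizing both sides of the comparison mirrors what the XPath expression above does with name(.) and the string literal.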
>
> The truism "validate at trust boundaries" comes to mind: if you can't
> trust the encoding or normalization form of the XML that you receive, then
> normalise it as soon as you receive it so all of your XML is consistent
> and you don't have to make your XPaths unreadable.
>
> Your example is much like the example in Section 3.1.1, "Why do we need
> character normalization?" [1] of "Character Model for the World Wide Web
> 1.0: Normalization". That document discusses the advantages of early or
> late normalization as well as more aspects of normalization than most of
> us could think of on our own. Unfortunately its recommendations are in
> flux (and have been since May last year), but your scenario would best be
> handled by 'late normalization' where you normalize the data after it's
> transmitted to you.
>
> Regards,
>
>
> Tony Graham                                   tgraham@mentea.net
> Consultant                                 http://www.mentea.net
> Mentea       13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
>  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
>    XML, XSL-FO and XSLT consulting, training and programming
>
> [1] http://www.w3.org/TR/charmod-norm/#sec-WhyNormalization
>
> _______________________________________________________________________
>
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development. To minimize
> spam in the archives, you must subscribe before posting.
>
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
> subscribe: xml-dev-subscribe@lists.xml.org
> List archive: http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
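Tony's suggestion to "normalise it as soon as you receive it" can be sketched in Python as follows. This is an illustration under stated assumptions, not code from the thread: the Rechnung root element, the sample amounts, and the function names parse_normalized and sum_eur are hypothetical; only Posten and währung come from Roger's example. The sketch also assumes the document has already been decoded to text.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical sample: the first Posten spells the attribute name with
# 'a' + U+0308 (decomposed); the other two use precomposed U+00E4.
SAMPLE = (
    "<Rechnung>"
    '<Posten wa\u0308hrung="EUR">10</Posten>'
    '<Posten w\u00e4hrung="EUR">5</Posten>'
    '<Posten w\u00e4hrung="USD">7</Posten>'
    "</Rechnung>"
)


def parse_normalized(xml_text: str) -> ET.Element:
    """Convert the incoming text to NFC once, then parse it."""
    return ET.fromstring(unicodedata.normalize("NFC", xml_text))


def sum_eur(root: ET.Element) -> float:
    """Sum all Posten whose währung attribute is 'EUR', with a plain lookup."""
    return sum(
        float(posten.text)
        for posten in root.iter("Posten")
        if posten.get("w\u00e4hrung") == "EUR"
    )


if __name__ == "__main__":
    print(sum_eur(parse_normalized(SAMPLE)))
```

Because normalization happens once at the boundary, the query itself needs no normalize-unicode() calls. Normalizing the raw text is a simplification: a combining mark placed directly after a markup delimiter would compose with it (for instance '>' followed by U+0338 becomes U+226F), so a careful implementation may prefer to normalize names and values after parsing.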