Re: Here's how to process XML documents written in German
The real lesson here is that you should never make contracts with Germans! Problem solved. ;-)

On Wed, Jan 30, 2013 at 4:31 PM, Tony Graham <tgraham@mentea.net> wrote:
> On Wed, January 30, 2013 6:47 pm, Costello, Roger L. wrote:
> ...
>> This XPath expression does the job:
>>
>> sum(//Posten[@*[normalize-unicode(name(.)) eq
>> normalize-unicode('währung')][. eq 'EUR']])
>>
>> The normalize-unicode() function converts an attribute name into a
>> standard, canonical form.
>>
>> Lesson Learned:
>>
>> When processing markup with diacritical marks, beware that two characters
>> may visually appear the same but inside the computer they are represented
>> very differently. Design XPath expressions accordingly -- use
>> normalize-unicode() to convert markup into canonical form.
>
> The truism "validate at trust boundaries" comes to mind: if you can't
> trust the encoding or normalization form of the XML that you receive, then
> normalise it as soon as you receive it so all of your XML is consistent
> and you don't have to make your XPaths unreadable.
>
> Your example is much like the example in Section 3.1.1, "Why do we need
> character normalization?" [1] of "Character Model for the World Wide Web
> 1.0: Normalization". That document discusses the advantages of early or
> late normalization as well as more aspects of normalization that most of
> us could think of on our own. Unfortunately its recommendations are in
> flux (and have been since May last year), but your scenario would best be
> handled by 'late normalization' where you normalize the data after it's
> transmitted to you.
>
> Regards,
>
>
> Tony Graham                                   tgraham@mentea.net
> Consultant                                 http://www.mentea.net
> Mentea       13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
>  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
>    XML, XSL-FO and XSLT consulting, training and programming
>
> [1] http://www.w3.org/TR/charmod-norm/#sec-WhyNormalization
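
For what it's worth, the "normalise it as soon as you receive it" advice quoted above might look roughly like the sketch below. It is only an illustration under stated assumptions: it uses Python's standard `unicodedata` and `xml.etree.ElementTree` modules rather than XPath, and the `Rechnung` wrapper element, the sample amounts, and the function names are made up for the example. It also normalizes the whole document text (element content included, not just names) and would not catch combining characters written as character references, so treat it as a starting point rather than a complete solution.

```python
import unicodedata
import xml.etree.ElementTree as ET


def normalize_on_receipt(xml_text: str) -> str:
    """Normalize incoming XML text to NFC once, at the trust boundary."""
    return unicodedata.normalize("NFC", xml_text)


def sum_eur(xml_text: str) -> float:
    """Sum all Posten elements whose 'währung' attribute equals 'EUR'."""
    root = ET.fromstring(normalize_on_receipt(xml_text))
    # Normalize the lookup key too, in case this source file itself
    # was saved with a decomposed 'ä'.
    key = unicodedata.normalize("NFC", "w\u00e4hrung")
    return sum(
        float(posten.text)
        for posten in root.iter("Posten")
        if posten.get(key) == "EUR"
    )


# Hypothetical sample: the same attribute name spelled two ways.
decomposed = "wa\u0308hrung"  # 'a' followed by COMBINING DIAERESIS
composed = "w\u00e4hrung"     # precomposed LATIN SMALL LETTER A WITH DIAERESIS

sample = (
    "<Rechnung>"
    f'<Posten {decomposed}="EUR">10.5</Posten>'
    f'<Posten {composed}="EUR">4.5</Posten>'
    "<Posten>99</Posten>"
    "</Rechnung>"
)

total = sum_eur(sample)
```

Once the input has been normalized, both spellings of the attribute name compare equal, so the lookup itself stays as plain as `posten.get(key)` and the normalization logic lives in one place.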