RE: XML Schema union type is evil (for XPath 2.0 processing)
I think it's a bit strong to say "avoid unions", but you do need to be aware of their pitfalls.

Another problem with union types is that you can't use them in the type declarations of parameters or variables, for example you can have an attribute in the schema whose type is union of (xs:date, xs:gYearMonth, xs:gYear), but you can't declare a variable or parameter of that type - it has to be either atomic or a node.

Michael Kay
http://www.saxonica.com/

-----Original Message-----
From: Costello, Roger L. [mailto:costello@xxxxxxxxx]
Sent: 10 April 2008 13:50
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: XML Schema union type is evil (for XPath 2.0 processing)

Hi Folks,

In Michael Kay's book, XPath 2.0 (p. 259 and 289), he gives 3 cases where the use of a union type can lead to problems. Here are the 3 cases:

1. Consider this <prices> element which contains a list of prices (decimal values), and, if no price is available "N/A" is listed:

    <prices>40.99 19.00 N/A 23.80</prices>

Each list value is either a decimal or the string value, "N/A". That is, each list value is a union of:

- xs:decimal
- a simpleType with enumeration value of "N/A"

Now suppose that I want to write an XPath expression to see if there are some prices over 30.00. Here's one way to express it:

    if (some $i in data(prices) satisfies $i gt 30.00)
    then 'Expensive stuff'
    else 'Cheap stuff'

I ran this XPath using SAXON and got this output:

    Expensive stuff

Then I changed the input by swapping the first list value with the N/A value:

    <prices>N/A 19.00 40.99 23.80</prices>

I ran the same XPath against this input, using the same SAXON processor and I got an error message saying that I can't compare the string "N/A" against the decimal 30.00

So, depending on the "order" of the input data I get a successful result or an error!

Furthermore, even with the first version of the input:

    <prices>40.99 19.00 N/A 23.80</prices>

I may, or may not get an error. SAXON evaluates the list values from left to right, and stops as soon as it finds a true value (40.99 gt 30.00 is true, so it stops). XPath processors are free to evaluate the list values in any order. So, another XPath processor may evaluate the list values from right-to-left, and give an error.

Recap:

(a) You may, or may not, get an error depending on the order of the list values.
(b) You may, or may not, get an error depending on the XPath processor that you use.

The good news is that there is a way to protect yourself against this problem:

    if (some $i in data(prices)[. instance of xs:decimal] satisfies $i gt 30.00)
    then 'Expensive stuff'
    else 'Cheap stuff'

The predicate will filter the "N/A" list value, and so there will never arise the situation where "N/A" is compared against 30.00

2. There is the same problem when using the "every" expression, e.g.

    if (every $i in data(prices) satisfies $i lt 30.00)
    then 'Buy at this store'
    else 'Shop elsewhere'

With this input:

    <prices>40.99 19.00 N/A 23.80</prices>

SAXON gives this output:

    Shop elsewhere

With this input (swap the first list value with "N/A"):

    <prices>N/A 19.00 40.99 23.80</prices>

SAXON generates an error.

Again, it is possible to protect yourself:

    if (every $i in data(prices)[. instance of xs:decimal] satisfies $i lt 30.00)
    then 'Buy at this store'
    else 'Shop elsewhere'

3. Next, consider a <quantity> element whose value is either a number or the string "out-of-stock". Here are two examples:

    <quantity>out-of-stock</quantity>
    <quantity>20</quantity>

The value is either a number or the string value "out-of-stock". That is, the value is a union of:

- xs:nonNegativeInteger
- a simpleType with enumeration value of "out-of-stock"

Now, suppose I want to write an XPath expression to see if the quantity is out-of-stock:

    if (data(quantity) eq 'out-of-stock')
    then 'Bummer'
    else 'Buy them all!'

With the first example above as input, the output is:

    Bummer

With the second example as input, an error is generated.

Again, there is a way to protect yourself:

    if (data(quantity) instance of xs:string)
    then
        if (data(quantity) eq 'out-of-stock')
        then 'Bummer'
        else "Something is screwed up in the input"
    else 'Buy them all!'

SUMMARY

1. Input data that contains union values must be dealt with carefully.

2. If you don't design the XPath to protect yourself, then your XPath may succeed with some inputs and fail with others; it may succeed with some XPath processors and fail with others.

QUESTIONS

1. While it is possible to write XPath expressions to "protect yourself" it is, I think, likely that people will either:

- forget to do so
- not know how to do so
- not be aware of the problem with union types

What's Best Practice? Perhaps Best Practice is: "Avoid using union types." What do you think?

2. Are there other cases where the union type presents a problem? (I haven't yet read all of Michael's book, so there may be other cases he identifies in his book that I haven't yet read)

/Roger
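
For readers without an XPath 2.0 processor to hand, the order dependence in case 1 can be imitated in Python. This is only a rough analogue, not XPath: the function names are made up for this sketch, Decimal stands in for xs:decimal, a plain str stands in for the "N/A" member of the union, and Python's TypeError stands in for the XPath type error. any() short-circuits left to right, which is the behaviour the message reports for SAXON.

```python
from decimal import Decimal


def parse_prices(text):
    """Type each token as a union of (xs:decimal, 'N/A') would: Decimal or str."""
    return [tok if tok == "N/A" else Decimal(tok) for tok in text.split()]


def some_over(values, limit):
    # Analogue of: some $i in data(prices) satisfies $i gt 30.00
    # any() stops at the first true comparison. Comparing a str with a
    # Decimal raises TypeError, standing in for the XPath type error.
    return any(v > limit for v in values)


def some_over_guarded(values, limit):
    # Analogue of: data(prices)[. instance of xs:decimal]
    # Non-decimal members are filtered out before any comparison happens.
    return any(v > limit for v in values if isinstance(v, Decimal))


limit = Decimal("30.00")
first = parse_prices("40.99 19.00 N/A 23.80")
swapped = parse_prices("N/A 19.00 40.99 23.80")


def outcome(fn, values):
    """Return the result of fn, or the string 'error' if a type error occurs."""
    try:
        return fn(values, limit)
    except TypeError:
        return "error"


print(outcome(some_over, first))                  # 40.99 is tested first
print(outcome(some_over, swapped))                # "N/A" is tested first
print(outcome(some_over, list(reversed(first))))  # a right-to-left processor
print(outcome(some_over_guarded, swapped))        # guarded form
```

The unguarded form succeeds or fails depending on which value is reached first, while the guarded form gives the same answer for every ordering, which is the point of the "instance of" predicate in the message above.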