RE: regexs, grouping (?) and XSLT2?
> (Or, of course, you could create a schema for the entire document and
> make sure the source is validated against that schema, so that the
> <mods:dateIssued> element is annotated with the correct type from the
> start.)

I agree, using a schema-defined union type just for use within the
stylesheet, when you aren't using it to describe the source or result
documents, is probably over the top. It's interesting, though: there's
no intrinsic reason why casts from string to list or union types
shouldn't be allowed.

> Second, having the processor assign the correct type doesn't really
> buy you anything anyway, because there's precious little support for
> the xs:gHorribleKludge datatypes in XPath 2.0.

I think it is worthwhile treating your "union of YYYY-MM-DD, YYYY-MM,
or YYYY" as a user-defined data type (say m:date), and defining your
own function library to manipulate this type. For example, you can
define functions like m:get-year($p as m:date) to extract the year,
m:make-date($s as xs:string) to construct an instance of this type,
m:compare() to compare two instances, and so on.

I agree that you probably make life easier if you define this by
restricting xs:string rather than as a union over xs:date,
xs:gYearMonth, and xs:gYear. This is partly, as you point out, because
you can't cast to a union type, but also because you can then exploit
the fact that the three member types have a lot in common, for example
they all start with YYYY.
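A minimal sketch of such a function library, under some loudly stated assumptions: the namespace URI bound to the m prefix is made up for illustration, the parameters are typed as plain xs:string rather than a schema-defined m:date so that the code runs on a processor that is not schema-aware, and the collation URI is the codepoint collation from the final XPath 2.0 Recommendation. The pattern only checks the lexical shape (it does not reject month 13), and m:compare simply orders a less precise value before a more precise one with the same prefix, which is one possible policy among several.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:m="http://example.com/ns/m-date"
    exclude-result-prefixes="xs m">
  <!-- The m namespace URI above is hypothetical. -->

  <!-- m:make-date: accept YYYY, YYYY-MM or YYYY-MM-DD and return the
       whitespace-normalized string; raise an error for anything else. -->
  <xsl:function name="m:make-date" as="xs:string">
    <xsl:param name="s" as="xs:string"/>
    <xsl:variable name="v" select="normalize-space($s)"/>
    <xsl:sequence
        select="if (matches($v, '^\d{4}(-\d{2}(-\d{2})?)?$'))
                then $v
                else error((), concat('Not a valid m:date: ', $v))"/>
  </xsl:function>

  <!-- m:get-year: all three lexical forms start with YYYY. -->
  <xsl:function name="m:get-year" as="xs:integer">
    <xsl:param name="d" as="xs:string"/>
    <xsl:sequence select="xs:integer(substring($d, 1, 4))"/>
  </xsl:function>

  <!-- m:compare: returns -1, 0 or 1 using codepoint ordering, so
       '2004' sorts before '2004-05', which sorts before '2004-05-17'. -->
  <xsl:function name="m:compare" as="xs:integer">
    <xsl:param name="a" as="xs:string"/>
    <xsl:param name="b" as="xs:string"/>
    <xsl:sequence
        select="compare($a, $b,
                'http://www.w3.org/2005/xpath-functions/collation/codepoint')"/>
  </xsl:function>

</xsl:stylesheet>
```

With these definitions, m:get-year(m:make-date('2004-05')) returns the integer 2004. Codepoint comparison is enough here only because every form is zero-padded and shares the YYYY prefix.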
> If you were constructing a function to group the <mods> elements by
> year, it would look something like:
>
> <xsl:function name="mods:year" as="xs:integer">
>   <xsl:param name="mods" as="element(mods:mods)" />
>   <xsl:variable name="temp" as="element(*, mods:date)">
>     <mods:dateIssued xsl:type="mods:date">
>       <xsl:value-of select="$mods/mods:originInfo/mods:dateIssued" />
>     </mods:dateIssued>
>   </xsl:variable>
>   <xsl:variable name="date" as="xdt:anyAtomicType"
>                 select="data($temp)" />
>   <xsl:choose>
>     <xsl:when test="$date instance of xs:date">
>       <xsl:sequence select="year-from-date($date)" />
>     </xsl:when>
>     <xsl:otherwise>
>       <xsl:sequence
>         select="xs:integer(substring(string($date), 1, 4))" />
>     </xsl:otherwise>
>   </xsl:choose>
> </xsl:function>

Just to be clear, this is a function to extract the year component.
It's of course horribly heavy to have to convert the string into a
union type just so you can then extract an integer.

If mods:dateIssued were defined in the schema as an instance of the
union type m:date, you could do it like this:

  <xsl:function name="m:get-year" as="xs:integer">
    <xsl:param name="date" as="element(*, m:date)"/>
    <xsl:apply-templates select="$date" mode="get-year"/>
  </xsl:function>

  <xsl:template match="element(*, xs:date)" mode="get-year">
    <xsl:sequence select="year-from-date(.)"/>
  </xsl:template>

  <xsl:template match="element(*, xs:gYearMonth) | element(*, xs:gYear)"
                mode="get-year">
    <xsl:sequence select="xs:integer(substring(string(.), 1, 4))"/>
  </xsl:template>

It's a shame that you have to use template rules in order to get
polymorphism - but at least it's possible in XSLT, which it isn't in
XQuery!

> Another point to be made is that if you have a union type, there's no
> way to compare the values within that type with each other: you can't
> compare a xs:date with a xs:gYear, so you can't sort them into the
> order that you'd expect.

You can of course define a function that will compute a sort key, and
use this sort key for comparisons.
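One way such a sort key could look, as a sketch rather than a definitive implementation: pad each value out to a full xs:date so that all three forms become comparable. The function name m:sort-key, the m namespace URI and the <sorted> wrapper element are invented for illustration, the mods prefix is assumed to be bound to the MODS version 3 namespace, and the values are assumed to carry no timezone suffix. Padding with "-01" places a bare year at the start of that year, which is a policy choice, not the only sensible one.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mods="http://www.loc.gov/mods/v3"
    xmlns:m="http://example.com/ns/m-date"
    exclude-result-prefixes="xs m">
  <!-- The m namespace URI is hypothetical; the mods URI is assumed. -->

  <!-- m:sort-key: map YYYY to YYYY-01-01 and YYYY-MM to YYYY-MM-01,
       leave YYYY-MM-DD alone, then cast the result to xs:date. -->
  <xsl:function name="m:sort-key" as="xs:date">
    <xsl:param name="s" as="xs:string"/>
    <xsl:variable name="v" select="normalize-space($s)"/>
    <xsl:sequence
        select="xs:date(if (string-length($v) = 4) then concat($v, '-01-01')
                        else if (string-length($v) = 7) then concat($v, '-01')
                        else $v)"/>
  </xsl:function>

  <!-- Usage: copy the mods:mods elements in date order. This assumes
       each one has exactly one mods:originInfo/mods:dateIssued;
       otherwise the function call fails with a type error. -->
  <xsl:template match="/">
    <sorted>
      <xsl:for-each select="//mods:mods">
        <xsl:sort select="m:sort-key(mods:originInfo/mods:dateIssued)"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </sorted>
  </xsl:template>

</xsl:stylesheet>
```

Because the key is a genuine xs:date, xsl:sort compares the values as dates, and the same function could serve as a comparison basis elsewhere, for example m:sort-key($a) lt m:sort-key($b).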
> [FWIW, I thought that the schema-aware version would turn out to be
> simple, since Mike's been going on about how much easier life is with
> schema-awareness; I'm surprised at how complicated it turns out to be,
> and it's possible that I'm missing some easier schema-aware method.]

The biggest benefit I've seen from schema-aware processing is in
validating the result document: my experience so far is that this
definitely reduces the time it takes to produce a stylesheet that
delivers correct results: not because the code you write is any
different, but because you find the bugs more quickly.

I've also seen some benefits in processing source documents that have
been schema-validated, by exploiting the type information. This depends
very much on the particular schema, but if the type hierarchy has been
properly designed, then I think you get many opportunities to improve
the structure (and reduce the length) of the stylesheet code by making
template rules more generic (or specific!), and by defining functions
to handle common processing in the same way you normally use methods.

The scenario of creating a schema purely for the benefit of XSLT
processing is a much less likely one, and as I think this example
shows, types come into their own when associated with nodes rather than
atomic values.

Michael Kay