Re: Performance of predicate-based patterns
I don't think anyone at all familiar with normal DITA XSLT practice
would use anything other than [contains(@class, ' foo/bar ')] or the
DITA Community df:class() function:

  <xsl:function name="df:class" as="xs:boolean">
    <xsl:param name="elem" as="element()"/>
    <xsl:param name="classSpec" as="xs:string"/>
    <!-- '$' in the regex is a workaround for a bug in MarkLogic 3.x
         and for a common user error, where trailing space in class=
         attribute is dropped. -->
    <xsl:variable name="normalizedClassSpec" as="xs:string"
      select="normalize-space($classSpec)"/>
    <xsl:variable name="result"
      select="matches($elem/@class,
                concat(' ', $normalizedClassSpec, ' | ', $normalizedClassSpec, '$'))"
      as="xs:boolean"/>
    <xsl:sequence select="$result"/>
  </xsl:function>

The df:class() function handles the case where a @class attribute
value is missing its required trailing space (a problem that MarkLogic
used to cause but that was fixed in ML 4, I think).

If there's a more efficient way to match values in the @class
attribute, I'd certainly like to know about it.

Cheers,

E.
-----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 1/23/15, 8:19 AM, "Graydon graydon@xxxxxxxxx"
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

>On Fri, Jan 23, 2015 at 11:28:31AM -0000, Michael Kay mike@xxxxxxxxxxxx
>scripsit:
>> We've started doing some performance work in Saxon on the DITA
>> stylesheets, which use large numbers of match patterns in the form
>>
>> <xsl:template match="*[contains(@class, ' token ')]">
>
>If anybody ever starts using XSLT 2.0 for DITA processing, there are
>going to be things like
>
><xsl:template match="*[(tokenize(@class,'\p{Zs}+')[normalize-space()])[2]
>eq 'topic/li']">
>
>showing up. ("some $x in tokenize(@class,...." seems pretty likely,
>too.)
>
>> Currently these require a very inefficient sequential search to find
>> the matching rule for each element.
>>
>> Does anyone know of any other commonly-used stylesheets (or even,
>> uncommonly used ones) which show similar characteristics, that is,
>> large numbers of match patterns using predicate matching only, with no
>> explicit element names? We'd like any optimizations we implement to be
>> as general-purpose as possible.
>
>I've done some conversion work on legal documents where the goal was to
>get everything back on a single schema after a couple decades of
>evolution in the element names of various DTDs. Matches of the form
>
><xsl:template match="*[name() = ('P','NP','PARA')]">
>
>showed up a fair bit to match on the abstract "that's a paragraph"
>across the range of evolved element names.
>
>There was also a fair bit of
>
><xsl:template match="*[not(name() = ('PARA','LIST','TABLE'))]">
>
>used as general "we don't think there's anything but those in the data
>but let's not make rash assumptions" surprise handler templates.
>
>-- Graydon
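
To make the trailing-space case described at the top of this message concrete, here is a minimal sketch that runs the plain contains() test and df:class() side by side on two sample elements. The df namespace URI, the two sample <li> elements and the "main" template are assumptions made for illustration only; the function body is the one quoted above.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:df="urn:example:df-functions"
    exclude-result-prefixes="xs df">
  <!-- The df namespace URI above is a placeholder, not the real one. -->

  <xsl:output method="text"/>

  <!-- df:class() as quoted in the message above -->
  <xsl:function name="df:class" as="xs:boolean">
    <xsl:param name="elem" as="element()"/>
    <xsl:param name="classSpec" as="xs:string"/>
    <xsl:variable name="normalizedClassSpec" as="xs:string"
      select="normalize-space($classSpec)"/>
    <xsl:sequence
      select="matches($elem/@class,
                concat(' ', $normalizedClassSpec, ' | ', $normalizedClassSpec, '$'))"/>
  </xsl:function>

  <!-- Hypothetical entry point: no source document is needed. -->
  <xsl:template name="main">
    <xsl:variable name="samples" as="element()*">
      <!-- well-formed DITA @class value, with its trailing space -->
      <li class="- topic/li "/>
      <!-- same value with the trailing space dropped -->
      <li class="- topic/li"/>
    </xsl:variable>
    <xsl:for-each select="$samples">
      <!-- Report what each test returns for this @class value. -->
      <xsl:value-of select="concat('[', @class, '] contains(): ',
                                   contains(@class, ' topic/li '),
                                   ', df:class(): ',
                                   df:class(., 'topic/li'), '&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Run with an XSLT 2.0 processor using "main" as the initial template (with Saxon, the -it:main option). Both tests return true for the first element. For the second, contains() returns false while df:class() still returns true, because the ' topic/li$' branch of the regex matches at the end of the string.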