

Subject: RE: generating ID strings that are both readable and unique
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 14 Oct 2008 10:44:03 +0100
Quite hard to do in "pure" XSLT 1.0 without a node-set() extension, because
I think any solution that is moderately efficient is going to involve some
temporary data.

I would create a temporary document containing all distinct ids/titles like
this

<xsl:variable name="allids">
  <xsl:for-each-group select="//section" group-by="(@original-id, @title)[1]">
      <!-- count must be an attribute value template, otherwise the literal text is written -->
      <id id="{current-grouping-key()}" count="{count(current-group())}"/>
  </xsl:for-each-group>
</xsl:variable>
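
For illustration only: assuming a hypothetical input in which two sections are titled "Introduction" (appearing first) and a later section carries original-id="intro", and assuming the count attribute is written as the attribute value template {count(current-group())}, the temporary document would hold something like:

```xml
<!-- hypothetical content of $allids: one entry per distinct id or title -->
<id id="Introduction" count="2"/>
<id id="intro" count="1"/>
```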

Here's a function that derives an ID from an input string and a
disambiguating suffix (such as a sequence number), recursing until the
result does not clash with any entry in $allids:

<xsl:function name="f:unique" as="xs:string">
  <xsl:param name="input" as="xs:string"/>
  <xsl:param name="gid" as="xs:string"/>
  <xsl:choose>
    <!-- candidate already in use: append the suffix and try again -->
    <xsl:when test="exists($allids/id[@id=$input])">
      <xsl:sequence select="f:unique(concat($input, '_', $gid), $gid)"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:sequence select="$input"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>

And then, when processing an individual section:

<xsl:attribute name="id">
  <xsl:choose>
    <xsl:when test="@id">
      <xsl:value-of select="@id"/>
    </xsl:when>
    <xsl:when test="$allids/id[@id=current()/@title]/@count = 1">
      <xsl:value-of select="@title"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="f:unique(@title, generate-id())"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:attribute>

Instead of using generate-id() for disambiguation, you could use the result
of xsl:number level="any". This would mean that if there are two sections
titled "Introduction", one gets the id "Introduction_1", the other
"Introduction_2". In the rare event that "Introduction_1" is already in use,
you would get "Introduction_1_1" etc.
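
A minimal sketch of that variant, meant as a drop-in for the xsl:otherwise branch above and relying on f:unique and $allids as already defined. The variable names this-title and seq are made up for illustration. The count pattern is an assumption on my part: a bare xsl:number level="any" would number all sections in document order, so the pattern restricts the count to sections sharing the same title in order to get _1, _2, and so on.

```xml
<xsl:otherwise>
  <!-- hypothetical variable: capture the title so the count pattern can refer to it -->
  <xsl:variable name="this-title" select="string(@title)"/>
  <!-- position of this section among all sections with the same title, in document order -->
  <xsl:variable name="seq" as="xs:string">
    <xsl:number level="any" count="section[@title = $this-title]"/>
  </xsl:variable>
  <xsl:value-of select="f:unique(@title, $seq)"/>
</xsl:otherwise>
```

Sections that already have an id are still counted here, so the suffixes may have gaps; adding a [not(@id)] predicate to the count pattern would avoid that if it matters.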

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Trevor Nicholls [mailto:trevor@xxxxxxxxxxxxxxxxxx] 
> Sent: 14 October 2008 07:26
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject:  generating ID strings that are both readable and unique
> 
> Hi
> 
> In this particular application we have a set of XML documents 
> which are divided into nested sections; each section may 
> (down the track) give rise to a url. Currently that url is 
> generated by <xsl:number level="multiple"> but this produces 
> urls that change frequently. Some sections have been given an 
> ID attribute by the process which originally created the 
> documents, but most have not.
> Additionally, all sections must have exactly one title child, 
> along with their other content.
> 
> The requirement is to process an XML file and generate an ID 
> attribute for sections which lack them - deriving the ID 
> value from the title so that the url is comprehensible. 
> Providing we ignore the problem cases, this is a trivial exercise:
> 
> ----
> 
>  <xsl:variable name="upchars" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'" />
>  <xsl:variable name="lochars" select="'abcdefghijklmnopqrstuvwxyz'" />
> 
>  <!-- catchall -->
>  <xsl:template match="*">
>    <xsl:copy>
>      <xsl:apply-templates select="@*" />
>      <xsl:apply-templates />
>    </xsl:copy>
>  </xsl:template>
> 
>  <xsl:template match="@*">
>    <xsl:copy-of select="." />
>  </xsl:template>
> 
>  <xsl:template match="section[@id]">
>    <xsl:copy>
>      <xsl:apply-templates select="@*" />
>      <xsl:apply-templates />
>    </xsl:copy>
>  </xsl:template>
> 
>  <xsl:template match="section">
>    <xsl:copy>
>      <xsl:attribute name="id">
>        <xsl:apply-templates select="title" mode="id" />
>      </xsl:attribute>
>      <xsl:apply-templates select="@*" />
>      <xsl:apply-templates />
>    </xsl:copy>
>  </xsl:template>
> 
>  <xsl:template match="title" mode="id">
>    <xsl:value-of select="translate(translate(.,' ','_'),$upchars,$lochars)" />
>  </xsl:template>
> 
> ----
> 
> The problem cases are
> (a) duplicate titles (after the translations) which would 
> lead to duplicate IDs, and
> (b) existing IDs which might also duplicate a title.
> 
> If there were no IDs in the document to begin with, I think I 
> could have solved the first problem by using a key. But the 
> second problem complicates it, and I haven't got enough 
> experience with keys to figure out how to adjust the "id" 
> mode title template to take both issues into account.
> 
> Can anyone offer some helpful advice here?
> XSL 1.0 is preferred, although I would be interested to see 
> how XSL2 might handle this problem too. 
> 
> Thanks
> Trevor
