
Re: shuffling words in text content

Subject: Re: shuffling words in text content
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 7 Sep 2021 19:31:42 -0000
What's wrong with

tokenize(.) => (random-number-generator()?permute)() => string-join(" ")

Michael Kay
Saxonica
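
For anyone trying the one-liner above, here is a minimal sketch of how it might sit inside an XSLT 3.0 stylesheet. It is written as plain function calls rather than with arrows, and it reuses the `initial-seed` parameter name from the quoted stylesheet so that runs are repeatable; both are choices made for this sketch, not part of the original reply. It assumes an XSLT 3.0 processor that supports XPath 3.1. Note that `tokenize(.)` splits on whitespace only, so punctuation travels with its word and leading or trailing whitespace in each text node is dropped.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    version="3.0">

  <!-- identity transformation for everything not matched below -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- seed for the generator (name borrowed from the quoted stylesheet) -->
  <xsl:param name="initial-seed" as="xs:integer" select="123"/>

  <!-- shuffle the whitespace-separated tokens of each text node -->
  <xsl:template match="text()[not(ancestor::pre)]">
    <xsl:value-of select="
      string-join(
        random-number-generator($initial-seed)?permute(tokenize(.)),
        ' ')"/>
  </xsl:template>

</xsl:stylesheet>
```

Every text node starts from the same seed in this sketch, so the output is the same from run to run.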

> On 7 Sep 2021, at 20:20, Chris Papademetrious christopher.papademetrious@xxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi everyone,
>
> I recently needed to write a transformation to shuffle words in text
> content, but still keep the overall element structure intact. For example, I
> might want to transform
>
> <p>Hey, here is some text!</p>
>
> into
>
> <p>is, text Hey some here!</p>
>
> I didn't see anything exactly like this in the list archives or on
> StackOverflow, so I thought I'd share what I came up with:
>
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> 	xmlns:xs="http://www.w3.org/2001/XMLSchema"
> 	exclude-result-prefixes="#all"
> 	version="2.0">
>  <xsl:output indent="yes"/>
>
>
>  <!-- regex that defines what a "word" is -->
>  <xsl:param name="word-pattern" select="'(\w+)'"/>
>
>
>  <!-- identity transformation -->
>  <xsl:template match="@*|node()">
>    <xsl:copy>
>      <xsl:apply-templates select="@*|node()"/>
>    </xsl:copy>
>  </xsl:template>
>
>
>  <!-- shuffle words in each text() node (except inside <pre>) -->
>  <xsl:template match="text()[not(ancestor::pre)]">
>    <!-- get the list of words in this block of text -->
>    <xsl:variable name="words" as="node()*">
>      <xsl:analyze-string select="." regex="{$word-pattern}">
>        <xsl:matching-substring>
>          <word><xsl:value-of select="."/></word>
>        </xsl:matching-substring>
>      </xsl:analyze-string>
>    </xsl:variable>
>
>    <!-- perturb the word order -->
>    <xsl:variable name="shuffled-words" as="xs:string*">
>      <xsl:call-template name="pick-random-item">
>        <xsl:with-param name="items" select="$words"/>
>      </xsl:call-template>
>    </xsl:variable>
>
>    <!-- rebuild the string with the reordered words -->
>    <xsl:analyze-string select="." regex="{$word-pattern}">
>      <xsl:matching-substring>
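>        <!-- position() counts matching and non-matching substrings together;
>             they alternate, so floor((position + 1) div 2) is the word index -->
>        <xsl:variable name="this-position" select="position()"/>
>        <xsl:value-of select="$shuffled-words[floor(($this-position + 1) div 2)]"/>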
>      </xsl:matching-substring>
>      <xsl:non-matching-substring>
>        <xsl:value-of select="."/>
>      </xsl:non-matching-substring>
>    </xsl:analyze-string>
>  </xsl:template>
>
>
>  <!-- XSLT item shuffler, borrowed from
>       https://stackoverflow.com/questions/21953336/randomize-node-order-xslt -->
>  <xsl:param name="initial-seed" select="123"/>
>  <xsl:template name="pick-random-item">
>    <xsl:param name="items" />
>    <xsl:param name="seed" select="$initial-seed"/>
>    <xsl:if test="$items">
>      <!-- generate a random number using the "linear congruential generator" algorithm -->
>      <xsl:variable name="a" select="1664525"/>
>      <xsl:variable name="c" select="1013904223"/>
>      <xsl:variable name="m" select="4294967296"/>
>      <xsl:variable name="random" select="($a * $seed + $c) mod $m"/>
>      <!-- scale random to integer 1..n -->
>      <xsl:variable name="i" select="floor($random div $m * count($items)) + 1"/>
>      <!-- write out the corresponding item -->
>      <xsl:copy-of select="$items[$i]"/>
>      <!-- recursive call with the remaining items -->
>      <xsl:call-template name="pick-random-item">
>        <xsl:with-param name="items" select="$items[position()!=$i]"/>
>        <xsl:with-param name="seed" select="$random"/>
>      </xsl:call-template>
>    </xsl:if>
>  </xsl:template>
>
> </xsl:stylesheet>
>
>
> Link to XSLT Fiddle here: https://xsltfiddle.liberty-development.net/nbiE1aZ/1
>
> The approach is:
>
> 1. Call <xsl:analyze-string> to extract the words from a text() node.
> 2. Call a template that shuffles the words.
> 3. Call <xsl:analyze-string> (again) to substitute the shuffled words in
>    place of the original words.
>
> Hopefully this is helpful if someone needs to solve a similar problem in the
> future!
>
> -----
> Chris Papademetrious
> Tech Writer, Synopsys, Inc.
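
The three steps in the quoted message can also be sketched in XSLT 3.0 by combining them with the built-in shuffler, which keeps punctuation and whitespace in place as in the original example. This is only a sketch under stated assumptions: it needs an XSLT 3.0 processor with XPath 3.1 support, it reuses the `word-pattern` and `initial-seed` parameter names from the quoted stylesheet, and it swaps the hand-written linear congruential generator for `random-number-generator()?permute`, so the resulting word order will differ from the 2.0 version.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    exclude-result-prefixes="#all"
    version="3.0">

  <!-- identity transformation for everything not matched below -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- regex that defines what a "word" is -->
  <xsl:param name="word-pattern" as="xs:string" select="'\w+'"/>
  <xsl:param name="initial-seed" as="xs:integer" select="123"/>

  <xsl:template match="text()[not(ancestor::pre)]">
    <!-- step 1: split the text into fn:match / fn:non-match elements -->
    <xsl:variable name="parts" as="element()*"
        select="analyze-string(., $word-pattern)/*"/>

    <!-- step 2: shuffle the matched words -->
    <xsl:variable name="shuffled" as="xs:string*"
        select="random-number-generator($initial-seed)
                  ?permute($parts[self::fn:match]/string())"/>

    <!-- step 3: emit the parts in order, replacing the n-th word
         with the n-th shuffled word and copying everything else -->
    <xsl:value-of separator="" select="
        for $p in $parts return
          if ($p/self::fn:match)
          then $shuffled[count($p/preceding-sibling::fn:match) + 1]
          else string($p)"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the non-matching pieces are copied through untouched, commas, spaces and the closing "!" stay where they were, and only the words change position.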
