Re: tokenizing and counting with xsl:analyze-string
If you're really keen to avoid putting temporary results in memory, then with Saxon, I think you can do:

<xsl:variable name="temp_result" as="xs:boolean*">
  <xsl:analyze-string select="'abhello1cdehello2fghijklhello3hello4mhello5nhello6'"
                      regex="hello[1-9]">
    <xsl:matching-substring>
      <xsl:sequence select="true()"/>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:sequence select="false()"/>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:variable>
<xsl:iterate select="$temp_result">
  <xsl:param name="m" select="0" as="xs:integer"/>
  <xsl:param name="n" select="0" as="xs:integer"/>
  <xsl:on-completion>
    <result>
      <yes count="{$m}"/>
      <no count="{$n}"/>
    </result>
  </xsl:on-completion>
  <xsl:next-iteration>
    <xsl:with-param name="m" select="$m + xs:integer(.)"/>
    <xsl:with-param name="n" select="$n + xs:integer(not(.))"/>
  </xsl:next-iteration>
</xsl:iterate>

This relies on the fact that Saxon will always try to inline a variable that's only referenced once; and if the variable is a sequence, this means that the value will be pipelined rather than being materialized in memory. For a sequence containing a few dozen booleans, that's not going to give any bottom-line savings. But if the sequence contains millions of items, it might.

The `xsl:iterate` could also be replaced with a fold (this assumes the map prefix is bound to http://www.w3.org/2005/xpath-functions/map):

<xsl:variable name="counts"
              select="fold-left($temp_result,
                                map{true():0, false():0},
                                function($val, $next){map:put($val, $next, $val($next)+1)})"
              as="map(xs:boolean, xs:integer)"/>
<result>
  <yes count="{$counts(true())}"/>
  <no count="{$counts(false())}"/>
</result>

> On 17 Oct 2020, at 10:14, Michael Kay mike@xxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> You can construct a sequence of booleans, in which case you should use <xsl:sequence select="true()"/> in place of <xsl:value-of select="1"/>, and then you can use `count($temp_result[.])` and `count($temp_result[not(.)])` to count the number of true and false items respectively.
>
> If you want to construct the variable as a single string, you can use xsl:value-of as I suggested, but then you must declare the variable as="xs:string". But using a sequence of booleans is probably better.
>
> Michael Kay
> Saxonica
>
>> On 17 Oct 2020, at 10:04, Mukul Gandhi gandhi.mukul@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> On Sat, Oct 17, 2020 at 1:22 PM Michael Kay mike@xxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>> With xsl:analyze-string you would still need a variable, but it could be a simpler variable: for example it might just contain a "1" for a match, and a "0" for a non-match; at the end you then need to count the ones and zeros which you can do with string-length(translate(...)).
>>
>> With your suggestion, below is my new XSLT stylesheet,
>>
>> <xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>>                 xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>                 exclude-result-prefixes="xs">
>>
>>   <xsl:output method="xml" indent="yes"/>
>>
>>   <xsl:template match="/">
>>     <xsl:variable name="temp_result" as="xs:boolean*">
>>       <xsl:analyze-string select="'abhello1cdehello2fghijklhello3hello4mhello5nhello6'"
>>                           regex="hello[1-9]">
>>         <xsl:matching-substring>
>>           <xsl:value-of select="1"/>
>>         </xsl:matching-substring>
>>         <xsl:non-matching-substring>
>>           <xsl:value-of select="0"/>
>>         </xsl:non-matching-substring>
>>       </xsl:analyze-string>
>>     </xsl:variable>
>>     <result>
>>       <yes count="{count(index-of($temp_result, true()))}"/>
>>       <no count="{count(index-of($temp_result, false()))}"/>
>>     </result>
>>   </xsl:template>
>>
>> </xsl:stylesheet>
>>
>> The above stylesheet gives me the desired result.
>>
>> But the above stylesheet doesn't do exactly what you've suggested.
>>
>> I would prefer to declare my XSLT variable as follows,
>>
>> <xsl:variable name="temp_result" as="xs:string">
>>   <xsl:analyze-string ...
>> </xsl:variable>
>>
>> with the expectation that the content of this new kind of variable would be a string (i.e., an atomic xs:string value) of 1 and 0 characters, on which I could do string-length(translate(...)). Is this doable?
>>
>> --
>> Regards,
>> Mukul Gandhi
>> XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
>> EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/293509> (by email)
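
The single-string variant asked about in the quoted message could look roughly like the sketch below. It is only one possible reading of the suggestion: it assumes that wrapping xsl:analyze-string in an xsl:value-of with a sequence constructor (where the default separator is the zero-length string) is an acceptable way to obtain a single xs:string. The variable name `flags` is made up for illustration and does not appear in the thread.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- Hypothetical name "flags": one character per substring,
         "1" for a match and "0" for a non-match. -->
    <xsl:variable name="flags" as="xs:string">
      <!-- xsl:value-of joins the text produced inside it into a single
           text node, which is atomized to satisfy as="xs:string". -->
      <xsl:value-of>
        <xsl:analyze-string select="'abhello1cdehello2fghijklhello3hello4mhello5nhello6'"
                            regex="hello[1-9]">
          <xsl:matching-substring>1</xsl:matching-substring>
          <xsl:non-matching-substring>0</xsl:non-matching-substring>
        </xsl:analyze-string>
      </xsl:value-of>
    </xsl:variable>
    <result>
      <!-- Deleting every "0" leaves only the "1" characters, and vice versa. -->
      <yes count="{string-length(translate($flags, '0', ''))}"/>
      <no count="{string-length(translate($flags, '1', ''))}"/>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

For the sample string the flags come out as 01010110101, so this should report 6 matches and 5 non-matches. Zero-length gaps, such as the one between hello3 and hello4, do not produce a non-matching substring.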