Re: Parse a date - exslt:parse-date in Saxon 6
Following a follow-up question about the if-condition-return idiom, I managed to get something working that is sufficient for my requirements.

As promised, here's some XSL 1 code that can take a date expressed in any one of a number of formats and write the result in a given format. (XSL 1 because it fits in a DocBook customization, which needs to be XSLT 1.) The list of formats available is quite limited, but sufficient for my purposes. As it's XSL 1 there's no regular expression usage. The result is more verbose than I'd expect to be able to achieve in a language I'm more familiar with (e.g. C++).

The first template, 'format.meta.date', calls the two templates for the two general sets of formats supported. 'parse.date.1' handles dates with alphabetic months. 'parse.date.2' handles numeric months. All dates are in UK/European format (d/m/y). If you want US (m/d/y) it should be straightforward enough to change.

The output is a date in one of epub3's required formats: YYYY, YYYY-MM, YYYY-MM-DD.

Two-digit years are presumed to be in the past. They are considered to be 21st century if lower than the current year mod 100, 20th century if higher.

The original had some xsl:message lines in to aid debugging - I may have mis-edited them in the course of editing for this message, apologies if so.

Regards,
Richard.

<!-- EPUB3 meta date should be of the form: YYYY, YYYY-MM or YYYY-MM-DD -->
<xsl:template name="format.meta.date">
  <xsl:param name="string" select="''"/>
  <xsl:param name="node" select="."/>

  <!-- A quick search has shown the following formats in use:
         28 April 2009, 19 November 2003, 10 December 2003,
         16/05/2012, 10/06/2014, 22/7/2010, 12/8/2010,
         31 Mar 2011, 09 Dec 2010,
         04 Nov. 09, 29 Oct. 09, 14 Oct. 09,
         Feb 09
       Categorizing as follows (after normalize-space):
         "dd? mmm(m{0,6}).? yy(yy)?"
         "dd?/mm?/yy(yy)?"
         "mmm(m{0,6}).? yy(yy)?"
       Though XSLT 1 doesn't include regular expressions, so can't use it like this.
  -->

  <xsl:variable name="normalized" select="translate($string, '0123456789', '##########')"/>

  <xsl:variable name="date.ok">
    <xsl:choose>
      <xsl:when test="string-length($string) = 4 and $normalized = '####'">1</xsl:when>
      <xsl:when test="string-length($string) = 7 and $normalized = '####-##'">1</xsl:when>
      <xsl:when test="string-length($string) = 10 and $normalized = '####-##-##'">1</xsl:when>
      <xsl:otherwise>0</xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <!-- If it isn't one of the permitted formats, see if we can parse it as one of our own formats. -->
  <xsl:variable name="string.1">
    <xsl:call-template name="parse.date.1">
      <xsl:with-param name="string" select="$string"/>
    </xsl:call-template>
  </xsl:variable>

  <xsl:variable name="string.2">
    <xsl:call-template name="parse.date.2">
      <xsl:with-param name="string" select="$string"/>
    </xsl:call-template>
  </xsl:variable>

  <xsl:variable name="new.string">
    <xsl:choose>
      <!-- Already in a permitted format: pass it through unchanged. -->
      <xsl:when test="$date.ok = 1">
        <xsl:value-of select="$string"/>
      </xsl:when>
      <xsl:when test="string-length( $string.1 ) > 0">
        <xsl:value-of select="$string.1"/>
      </xsl:when>
      <xsl:when test="string-length( $string.2 ) > 0">
        <xsl:value-of select="$string.2"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message>
          <xsl:text>WARNING: wrong metadata date format: '</xsl:text>
          <xsl:value-of select="$string"/>
          <xsl:text>' in element </xsl:text>
          <xsl:value-of select="local-name($node/..)"/>
          <xsl:text>/</xsl:text>
          <xsl:value-of select="local-name($node)"/>
          <xsl:text>. It must be in one of these forms: </xsl:text>
          <xsl:text>YYYY, YYYY-MM, YYYY-MM-DD, </xsl:text>
          <xsl:text>DD MMM(...) (YY)YY, DD/MM/(YY)YY.</xsl:text>
        </xsl:message>
        <xsl:value-of select="''"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <!-- return the string anyway -->
  <xsl:value-of select="$new.string"/>
</xsl:template>

<xsl:template name="parse.date.1">
  <xsl:param name="string" select="''"/>

  <!-- Parse the following formats.
         "dd? mmm(m{0,6}).? yy(yy)?"
         "mmm(m{0,6}).? yy(yy)?"
  -->

  <!-- Months have three (May) to nine (September) letters. Optional dot. -->
  <xsl:variable name="normalized" select="translate($string, '0123456789', '##########')"/>

  <!-- normalize spaces. So " Dec 96 ", " 6 dec. 96 " etc all become "Dec 96" or "6 dec. 96" -->
  <xsl:variable name="normalized2" select="normalize-space($normalized)"/>

  <!-- force to lower case -->
  <xsl:variable name="normalized3" select="translate($normalized2, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz' )"/>

  <!-- strip numerics. Giving "may ", " dec. " etc. -->
  <xsl:variable name="normalized4" select="translate($normalized3, '#', '' )"/>

  <!-- normalize spaces again. Giving "may", "dec." -->
  <xsl:variable name="month-raw" select="normalize-space($normalized4)"/>

  <!-- remove trailing dot, if present. Giving "may", "sept" etc -->
  <xsl:variable name="month-dotless">
    <xsl:choose>
      <xsl:when test="substring($month-raw, string-length($month-raw), 1) = '.'">
        <xsl:value-of select="substring($month-raw, 1, string-length($month-raw) - 1)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$month-raw"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <!-- categorize. alphabetics become '%' -->
  <xsl:variable name="normalized7" select="translate($month-dotless, 'abcdefghijklmnopqrstuvwxyz', '%%%%%%%%%%%%%%%%%%%%%%%%%%' )"/>

  <!-- By this point we have month names in isolation, without dots, length as given.
       So expecting '%' only, three to nine times. -->
  <xsl:variable name="normalized8" select="translate($normalized7, '%', '' )"/>

  <!-- cleared alphabetics, so expect nothing left.
  -->
  <xsl:variable name="date.ok.1">
    <xsl:choose>
      <xsl:when test="string-length($normalized8) = 0">true</xsl:when>
      <xsl:otherwise>false</xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <!--
  <xsl:if test="$date.ok.1 = 'false'">
    <xsl:message>
      <xsl:text>WARNING: unrecognized month (non-alphabetics): '</xsl:text>
      <xsl:value-of select="$month-dotless"/>
      <xsl:text>' in '</xsl:text>
      <xsl:value-of select="$string"/>
      <xsl:text>'.</xsl:text>
    </xsl:message>
  </xsl:if>
  -->

  <!-- check range of lengths. -->
  <xsl:variable name="date.ok.2">
    <xsl:choose>
      <xsl:when test="string-length($normalized7) >= 3 and string-length($normalized7) &lt;= 9">1</xsl:when>
      <xsl:otherwise>0</xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <!-- extract three letter prefix of month name.
       month-dotless has the month in lower case, whatever length it was given. -->
  <xsl:variable name="normalized9" select="substring($month-dotless, 1, 3)"/>

  <!-- check three letter version is valid. Look it up in the reference set. -->
  <xsl:variable name="months">janfebmaraprmayjunjulaugsepoctnovdec</xsl:variable>
  <xsl:variable name="month-valid" select="contains($months, $normalized9)"/>

  <!-- Now we're saying "ok, we found it, but what was the index we found it at?"
  -->
  <xsl:variable name="month-before" select="substring-before($months, $normalized9)"/>
  <xsl:variable name="month-index" select="(string-length( $month-before ) div 3) + 1"/>

  <xsl:variable name="month-name">
    <xsl:choose>
      <xsl:when test="$month-index = 1">january</xsl:when>
      <xsl:when test="$month-index = 2">february</xsl:when>
      <xsl:when test="$month-index = 3">march</xsl:when>
      <xsl:when test="$month-index = 4">april</xsl:when>
      <xsl:when test="$month-index = 5">may</xsl:when>
      <xsl:when test="$month-index = 6">june</xsl:when>
      <xsl:when test="$month-index = 7">july</xsl:when>
      <xsl:when test="$month-index = 8">august</xsl:when>
      <xsl:when test="$month-index = 9">september</xsl:when>
      <xsl:when test="$month-index = 10">october</xsl:when>
      <xsl:when test="$month-index = 11">november</xsl:when>
      <xsl:when test="$month-index = 12">december</xsl:when>
    </xsl:choose>
  </xsl:variable>

  <!-- We now have the full name of the month.
       Check that if a longer form was given it matches the full name. -->
  <xsl:variable name="month-valid-full" select="$month-dotless = substring( $month-name, 1, string-length( $month-dotless ) )"/>

  <xsl:variable name="month-string-3" select="format-number( $month-index, '00' )"/>

  <!-- Now get the day and year.
  -->

  <!-- force to lower case -->
  <xsl:variable name="normalized10" select="translate($string, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz' )"/>

  <xsl:variable name="day-string" select="substring-before( $normalized10, $month-dotless )"/>
  <xsl:variable name="day-string-2" select="normalize-space( $day-string )"/>
  <xsl:variable name="day-num" select="number( $day-string-2 )"/>
  <xsl:variable name="day-string-3" select="format-number( $day-num, '00' )"/>

  <xsl:variable name="year-string" select="substring-after( $normalized10, $month-raw )"/>
  <xsl:variable name="year-string-2" select="normalize-space( $year-string )"/>

  <xsl:variable name="this-year" select="date:year()"/>
  <xsl:variable name="this-year-in-century" select="$this-year mod 100"/>

  <xsl:variable name="year-num" select="number( $year-string-2 )"/>
  <xsl:variable name="year-num-2">
    <xsl:choose>
      <!-- '&lt;=' so that a two-digit year equal to the current one is also expanded
           rather than being left as a year in the first century. -->
      <xsl:when test="$year-num &lt;= $this-year-in-century">
        <xsl:value-of select="$year-num + 2000"/>
      </xsl:when>
      <xsl:when test="$year-num > $this-year-in-century and $year-num &lt; 100">
        <xsl:value-of select="$year-num + 1900"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$year-num"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>
  <xsl:variable name="year-string-3" select="format-number( $year-num-2, '0000' )"/>

  <!-- Return something.
  -->
  <xsl:variable name="return">
    <!-- date.ok.1 and date.ok.2 are result tree fragments, which are always true
         in a boolean context, so compare them with the values they hold. -->
    <xsl:if test="$date.ok.1 = 'true' and $date.ok.2 = 1 and $month-valid and $month-valid-full">
      <xsl:choose>
        <xsl:when test="string-length( $day-string ) > 0">
          <xsl:variable name="result">
            <xsl:value-of select="$year-string-3"/>-<xsl:value-of select="$month-string-3"/>-<xsl:value-of select="$day-string-3"/>
          </xsl:variable>
          <xsl:value-of select="$result"/>
        </xsl:when>
        <xsl:when test="string-length( $day-string ) = 0">
          <xsl:variable name="result">
            <xsl:value-of select="$year-string-3"/>-<xsl:value-of select="$month-string-3"/>
          </xsl:variable>
          <xsl:value-of select="$result"/>
        </xsl:when>
      </xsl:choose>
    </xsl:if>
  </xsl:variable>

  <xsl:value-of select="$return"/>
</xsl:template>

<xsl:template name="parse.date.2">
  <xsl:param name="string" select="''"/>

  <!-- Parse the following formats.
         "dd?/mm?/yy(yy)?"
       ie. dd/mm/yyyy
           mm/yyyy
       (where dd may be d, mm may be m, yyyy may be yy)
  -->

  <!-- Turn numbers to # and remove all spaces. -->
  <xsl:variable name="normalized" select="translate($string, '0123456789 ', '##########')"/>

  <!-- should now be
         '#/#/##'   '#/#/####'
         '#/##/##'  '#/##/####'
         '##/#/##'  '##/#/####'
         '##/##/##' '##/##/####'
  -->

  <!-- strip numerics. Giving "//" or "/" -->
  <xsl:variable name="normalized2" select="translate($normalized, '#', '' )"/>

  <!-- cleared numerics, so expect "//" or "/". -->
  <xsl:variable name="date.check.1">
    <xsl:choose>
      <xsl:when test="$normalized2 = '//'">2</xsl:when>
      <xsl:when test="$normalized2 = '/'">1</xsl:when>
      <xsl:otherwise>0</xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <!--
  <xsl:if test="$date.check.1 = 0">
    <xsl:message>
      <xsl:text>WARNING: unrecognized format (n/n/n or n/n): '</xsl:text>
      <xsl:value-of select="$normalized2"/>
      <xsl:text>' in '</xsl:text>
      <xsl:value-of select="$string"/>
      <xsl:text>'.</xsl:text>
    </xsl:message>
  </xsl:if>
  -->

  <!-- strip slashes. Giving "####" to "########". -->
  <xsl:variable name="normalized3" select="translate($normalized, '/', '' )"/>

  <!-- check range of lengths.
  -->
  <xsl:variable name="date.ok.2">
    <xsl:choose>
      <xsl:when test="string-length($normalized3) >= 4 and string-length($normalized3) &lt;= 8">true</xsl:when>
      <xsl:otherwise>false</xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <xsl:variable name="before-slash-1" select="substring-before($string, '/')"/>
  <xsl:variable name="after-slash-1" select="substring-after($string, '/')"/>
  <xsl:variable name="before-slash-2" select="substring-before($after-slash-1, '/')"/>
  <xsl:variable name="after-slash-2" select="substring-after($after-slash-1, '/')"/>

  <!-- Work out which is which, ie dd/mm/yy(yy) or mm/yy(yy).
       date.ok.2 is a result tree fragment, always true in a boolean context,
       so it is compared with the string it holds. -->
  <xsl:variable name="year-num">
    <xsl:choose>
      <xsl:when test="($date.check.1 = 2) and $date.ok.2 = 'true'">
        <!-- dd/mm/yy(yy) -->
        <xsl:variable name="result">
          <xsl:value-of select="number( $after-slash-2 )"/>
        </xsl:variable>
        <xsl:value-of select="$result"/>
      </xsl:when>
      <xsl:when test="($date.check.1 = 1) and $date.ok.2 = 'true'">
        <!-- mm/yy(yy) -->
        <xsl:variable name="result">
          <xsl:value-of select="number( $after-slash-1 )"/>
        </xsl:variable>
        <xsl:value-of select="$result"/>
      </xsl:when>
    </xsl:choose>
  </xsl:variable>

  <xsl:variable name="month-num">
    <xsl:choose>
      <xsl:when test="($date.check.1 = 2) and $date.ok.2 = 'true'">
        <!-- dd/mm/yy(yy) -->
        <xsl:variable name="result">
          <xsl:value-of select="number( $before-slash-2 )"/>
        </xsl:variable>
        <xsl:value-of select="$result"/>
      </xsl:when>
      <xsl:when test="($date.check.1 = 1) and $date.ok.2 = 'true'">
        <!-- mm/yy(yy) -->
        <xsl:variable name="result">
          <xsl:value-of select="number( $before-slash-1 )"/>
        </xsl:variable>
        <xsl:value-of select="$result"/>
      </xsl:when>
    </xsl:choose>
  </xsl:variable>

  <xsl:variable name="this-year" select="date:year()"/>
  <xsl:variable name="this-year-in-century" select="$this-year mod 100"/>

  <xsl:variable name="year-num-2">
    <xsl:choose>
      <!-- '&lt;=' so that a two-digit year equal to the current one is also expanded. -->
      <xsl:when test="$year-num &lt;= $this-year-in-century">
        <xsl:value-of select="$year-num + 2000"/>
      </xsl:when>
      <xsl:when test="$year-num > $this-year-in-century and $year-num &lt; 100">
        <xsl:value-of
          select="$year-num + 1900" />
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$year-num"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <!-- date.ok.2 is a result tree fragment, always true in a boolean context,
       so it is compared with the string it holds. -->
  <xsl:variable name="day-num">
    <xsl:choose>
      <xsl:when test="($date.check.1 = 2) and $date.ok.2 = 'true'">
        <!-- dd/mm/yy(yy) -->
        <xsl:variable name="result">
          <xsl:value-of select="number( $before-slash-1 )"/>
        </xsl:variable>
        <xsl:value-of select="$result"/>
      </xsl:when>
      <xsl:when test="($date.check.1 = 1) and $date.ok.2 = 'true'">
        <!-- mm/yy(yy) -->
        <xsl:variable name="result" select="0"/>
        <xsl:value-of select="$result"/>
      </xsl:when>
    </xsl:choose>
  </xsl:variable>

  <xsl:variable name="day-string" select="format-number( $day-num, '00' )"/>
  <xsl:variable name="month-string" select="format-number( $month-num, '00' )"/>
  <xsl:variable name="year-string" select="format-number( $year-num-2, '0000' )"/>

  <!-- Return something. We've already worked out which is year and which is the num. -->
  <xsl:variable name="return">
    <xsl:choose>
      <xsl:when test="($date.check.1 = 2) and $date.ok.2 = 'true' and $day-num > 0">
        <!-- dd/mm/yy(yy) -->
        <xsl:variable name="result">
          <xsl:value-of select="$year-string"/>-<xsl:value-of select="$month-string"/>-<xsl:value-of select="$day-string"/>
        </xsl:variable>
        <xsl:value-of select="$result"/>
      </xsl:when>
      <xsl:when test="($date.check.1 = 1) and $date.ok.2 = 'true' and $day-num = 0">
        <!-- mm/yy(yy) -->
        <xsl:variable name="result">
          <xsl:value-of select="$year-string"/>-<xsl:value-of select="$month-string"/>
        </xsl:variable>
        <xsl:value-of select="$result"/>
      </xsl:when>
    </xsl:choose>
  </xsl:variable>

  <xsl:value-of select="$return"/>
</xsl:template>

Richard Kerry
BNCS Engineer, SI SOL Telco & Media Vertical Practice
T: +44 (0)20 3618 2669
M: +44 (0)7812 325518
G300, Stadium House, Wood Lane, London, W12 7TA
richard.kerry@xxxxxxxx
________________________________________
From: Martin Honnen martin.honnen@xxxxxx [xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx]
Sent: 12 June 2014 19:14
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: Parse a date - exslt:parse-date in Saxon 6

Kerry, Richard richard.kerry@xxxxxxxx wrote:

> I can see that a suitable parser function (parse-date) is defined in
> Exslt but it isn't clear whether it is already available to me or how to
> get it into use if not. Actually according to the exslt documentation
> it is definitely not available in any XSLT processor but there
> are JavaScript and Msxsl implementations available.
>
> Can someone advise how I can get this to work ?
>
> Can I get Saxon 6 to call a JavaScript function ?

As far as I know there is no way with Saxon 6 to use Javascript to
implement extension functions.
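
For readers unfamiliar with the translate() technique that the templates use in place of regular expressions, here is a minimal, self-contained sketch. It is not part of the original message: the template name 'date.shape' and the sample string are made up for illustration. It maps every digit to '#' and every letter to '%', so that the shape of a date string can be tested with ordinary string comparison.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical helper (not in the original message): reduce a string
       to its "shape" by replacing digits with '#' and letters with '%'. -->
  <xsl:template name="date.shape">
    <xsl:param name="string" select="''"/>
    <!-- force to lower case so only one alphabet needs mapping -->
    <xsl:variable name="lower"
                  select="translate($string, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')"/>
    <!-- trim and collapse whitespace -->
    <xsl:variable name="trimmed" select="normalize-space($lower)"/>
    <!-- digits become '#' -->
    <xsl:variable name="digits" select="translate($trimmed, '0123456789', '##########')"/>
    <!-- letters become '%' (the replacement string must have 26 characters) -->
    <xsl:value-of select="translate($digits, 'abcdefghijklmnopqrstuvwxyz', '%%%%%%%%%%%%%%%%%%%%%%%%%%')"/>
  </xsl:template>

  <!-- Demonstration: any input document will do, only the literal is used. -->
  <xsl:template match="/">
    <xsl:call-template name="date.shape">
      <xsl:with-param name="string" select="' 04 Nov. 09 '"/>
    </xsl:call-template>
    <!-- writes: ## %%%. ## -->
  </xsl:template>

</xsl:stylesheet>
```

A caller could then compare the shape against a fixed pattern, for example test="$shape = '## %%%. ##'". Note that translate() deletes any character of its second argument that has no counterpart in the third, which is how the templates in the message strip digits or slashes by passing an empty or shorter replacement string.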