
Re: String hashing code

Subject: Re: String hashing code
From: Sascha Mantscheff <922492@xxxxxx>
Date: Fri, 14 Dec 2007 09:07:58 +0100
I don't have an XSLT solution. For a similar problem I use a Saxon extension function that returns MD5 hashes of the serialized content. The source code is below. I call it from within XSLT like this:

<xsl:variable name="serialized_content">
  <xsl:value-of select="saxon:serialize(current-group()[1],'')"/>
</xsl:variable>
<xsl:variable name="hash">
  <xsl:value-of select="md5:md5($serialized_content)"/>
</xsl:variable>


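--- file Md5.java ---
import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/* Saxon extension for generating MD5 hash values as hex strings. */

public class Md5 {
    /* Format each byte as two upper-case hex digits. */
    public static String hex(byte[] array) {
        StringBuilder sb = new StringBuilder(array.length * 2);
        for (int i = 0; i < array.length; ++i) {
            // OR-ing with 0x100 forces three hex digits, so substring(1, 3) keeps a leading zero.
            sb.append(Integer.toHexString((array[i] & 0xFF) | 0x100).toUpperCase().substring(1, 3));
        }
        return sb.toString();
    }

    /* Return the MD5 digest of the message as 32 hex digits. */
    public static String md5(String message) throws NoSuchAlgorithmException, UnsupportedEncodingException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        // UTF-8 can encode every character of the serialized content;
        // a single-byte code page such as CP1252 cannot.
        return hex(md.digest(message.getBytes("UTF-8")));
    }
}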
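
A quick way to sanity-check a helper like this outside Saxon is to run it against the published MD5 test vectors from RFC 1321. The sketch below is standalone and only illustrative: the class name Md5Check and the method md5Hex are hypothetical, and it encodes the input as UTF-8, which yields the same bytes as CP1252 for the plain ASCII inputs used here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical standalone check; not part of the extension posted above.
class Md5Check {
    // MD5 over the UTF-8 bytes of the message, as 32 upper-case hex digits.
    static String md5Hex(String message) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(message.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // %02X prints a byte as two unsigned upper-case hex digits.
                sb.append(String.format("%02X", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every conforming Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex(""));    // D41D8CD98F00B204E9800998ECF8427E
        System.out.println(md5Hex("abc")); // 900150983CD24FB0D6963F7D28E17F72
    }
}
```

If the two printed values match the RFC 1321 vectors, the hex formatting and the digest call are wired up correctly.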



Deborah Pickett wrote:
A challenge to the XSLT demigods...

I am processing a number of separate XML documents using an Ant <xslt>
task, pulling out the MathML that is embedded inside them into their own
XML files using xsl:result-document (where I render them using Batik).
I want to make sure that the result document names don't clash, but
because they are across several source files, generate-id() isn't going
to suffice.  There are thousands of source files, all with
English-sounding names spread across many directories.

I was thinking of hashing document-uri(/) to produce a probably-unique
string that I can then append generate-id(.) to.  I rejected
encode-for-uri() as producing strings that are too long, and for not
anonymizing the document uri enough.  All the hashing algorithms I know
(MD5, for instance) happen to be heavy on bitwise operations, and I feel
dirty doing bitwise operations with arithmetic.

I prefer not to escape to non-XSLT, because I am providing this as part
of a library that needs to run on almost any XSLT 2.0 platform.

Any clever ideas?
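
One possible direction, sketched here in Java only to make the arithmetic concrete, is a polynomial hash that needs nothing beyond multiplication, addition and modulus. XSLT 2.0 offers all three on xs:integer, plus string-to-codepoints() to obtain the code points. Nothing below comes from the thread: the class ArithmeticHash, the multiplier 31, the modulus 2^31 - 1 and the base-36 rendering are all assumptions chosen for illustration. A 31-bit value also leaves a small but real chance of collisions across thousands of URIs, so a larger modulus may be preferable.

```java
// Hypothetical sketch: a hash that avoids bitwise operations entirely.
class ArithmeticHash {
    // Assumed constants: a small multiplier and the prime 2^31 - 1 as modulus.
    static final long MULTIPLIER = 31L;
    static final long MODULUS = 2147483647L;

    // Fold the code points of s into one number using only *, + and %.
    static long hash(String s) {
        long h = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            // h stays below 2^31, so h * 31 + cp cannot overflow a long.
            h = (h * MULTIPLIER + cp) % MODULUS;
            i += Character.charCount(cp);
        }
        return h;
    }

    // Build a short name: base-36 hash of the document URI, then a per-node id.
    static String resultName(String documentUri, String nodeId) {
        return Long.toString(hash(documentUri), 36) + "-" + nodeId;
    }

    public static void main(String[] args) {
        // Made-up URI and node id, standing in for document-uri(/) and generate-id(.).
        System.out.println(resultName("file:/docs/chapter1.xml", "d1e5"));
    }
}
```

The same loop could be written in XSLT 2.0 as a recursive function over string-to-codepoints($uri), with `mod` in place of `%`.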
