Using XSLT to build an index

Subject: Using XSLT to build an index
From: "Mark" <mark@xxxxxxxxxxxx>
Date: Sun, 30 Oct 2011 14:47:34 -0700

The list archives did not seem to contain an XSLT stylesheet that could index an XML file, but I may have missed it. Is it practical to write my own XSLT 2 indexing stylesheet? If so, I have a bilingual XML file that I want to index. My assumptions are that I must get rid of the punctuation properly, then isolate the words, sort them, remove stop words, and so on. To get started, I need a bit of help. All of the phrases are found in two attributes: @czech and @eng.

Three questions: (1) I am aware from Michael's book that regular expressions may be used in the replace() function, but I do not know how to write the one I need. I would like to remove all the punctuation from a phrase as follows: every punctuation character except a hyphen [-] should be replaced with an empty string; the hyphen should be replaced with a single space.

(2) I assume that to get rid of extra spaces (if any), I can use a construct like: normalize-space(replace(@czech, 'some regex expression')).

(3) I assume that tokenize(normalize-space(replace(@czech, 'some regex expression'))) will permit me to write out a list of the words found in those attributes to an XML document. I am not completely clear as to what tokenize() returns, or how to access that return.

I would appreciate any comments, and especially the construction of the regex expression needed. Thanks, Mark
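
A possible starting point for the three questions above, offered as a minimal sketch and not a definitive answer. It assumes an XSLT 2.0 processor, that the phrases may sit on any element in the input, and that a flat list of distinct words is the desired output. The stop-word list and the output element names (index, word) are hypothetical placeholders. The Unicode category \p{P} matches punctuation, including the hyphen, so the hyphen is turned into a space first. Note that \p{P} does not cover symbols such as $ or +, which fall under \p{S}. tokenize() returns a sequence of xs:string values, which can be held in a variable and walked with xsl:for-each.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical stop-word list; substitute a real one for each language. -->
  <xsl:variable name="stop-words" as="xs:string*"
                select="('a', 'an', 'the', 'of', 'and')"/>

  <xsl:template match="/">
    <index>
      <!-- Collect every word from every @eng attribute as a sequence of strings.
           Steps, innermost first:
             lower-case()       fold case so 'The' and 'the' become one entry
             replace(-, ' ')    a hyphen becomes a single space
             replace(\p{P}, '') all remaining punctuation is deleted
             normalize-space()  collapse runs of whitespace and trim the ends
             tokenize(.., ' ')  split on the single spaces that remain -->
      <xsl:variable name="words" as="xs:string*">
        <xsl:for-each select="//@eng">
          <xsl:sequence select="tokenize(
                                  normalize-space(
                                    replace(
                                      replace(lower-case(.), '-', ' '),
                                      '\p{P}', '')),
                                  ' ')"/>
        </xsl:for-each>
      </xsl:variable>

      <!-- Drop stop words, remove duplicates, sort, and write one element per word. -->
      <xsl:for-each select="distinct-values($words[not(. = $stop-words)])">
        <xsl:sort select="."/>
        <word><xsl:value-of select="."/></word>
      </xsl:for-each>
    </index>
  </xsl:template>

</xsl:stylesheet>
```

The same pipeline should work for @czech by changing the select to //@czech, although sorting Czech words in dictionary order would probably need a lang or collation attribute on xsl:sort, and support for that depends on the processor.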


