Re: Muenchian keys ... plus a bit?

Subject: Re: Muenchian keys ... plus a bit?
From: "Thomas B. Passin" <tpassin@xxxxxxxxxxxx>
Date: Wed, 22 Aug 2001 11:05:46 -0400

Dave, here's what I would try.

1) Create a list of all items and assign it to a variable "all-items".

2) Create a list of all unique items (based on their PCDATA - that is, all
<item>content</item>
elements get represented by one element in this list).  Assign it to a
variable "unique-items".  This is the "Muenchian" part, of course.

3) Do a for-each on $unique-items.  At each iteration, output that item's
header (e.g., "content"), then find all the item nodes with that same text
(compare the string value; name() would just give "item" every time):
<xsl:variable name='this-items-name' select='string(.)'/>
<xsl:variable name='these-items' select='$all-items[.=$this-items-name]'/>

4) Do a for-each over $these-items.  You could sort them, too.  This is
where you output the pages.
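
Steps 1-4 could be sketched roughly as below. This is only a sketch for the sample <idx> document quoted further down: it picks the unique items with a plain preceding-sibling comparison rather than keys, writes plain text rather than HTML, and ignores the key="t" bolding and the A-Z letter headings.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Step 1: every item element in the index -->
  <xsl:variable name="all-items" select="/idx/ent/item"/>

  <!-- Step 2: keep only the first item for each distinct text value
       (no keys yet; this is a simple preceding-sibling test) -->
  <xsl:variable name="unique-items"
      select="$all-items[not(. = ../preceding-sibling::ent/item)]"/>

  <xsl:template match="/">
    <!-- Step 3: one output line per distinct item -->
    <xsl:for-each select="$unique-items">
      <xsl:sort select="."/>
      <xsl:variable name="this-items-name" select="string(.)"/>
      <xsl:variable name="these-items"
          select="$all-items[. = $this-items-name]"/>
      <xsl:value-of select="$this-items-name"/>
      <xsl:text>, page </xsl:text>
      <!-- Step 4: pages from every ent whose item has this text -->
      <xsl:for-each select="$these-items/../pge">
        <xsl:sort select="." data-type="number"/>
        <xsl:value-of select="."/>
        <xsl:if test="position() != last()">,</xsl:if>
      </xsl:for-each>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the sample data this should give one line each for "another", "content" and "zero", with the two "content" entries merged into "content, page 98,108,110".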

Once this is working, you could create some keys if your files are big and
you need some speed-up action.
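
A keyed version of the same grouping might look like this. The key name "items-by-text" is made up for the example; the generate-id() comparison is the usual Muenchian test for "first item with this text". As above, it is a text-only sketch against the sample <idx> document, not a finished index stylesheet.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every item by its text content (key name is hypothetical) -->
  <xsl:key name="items-by-text" match="item" use="."/>

  <xsl:template match="/">
    <!-- An item is "unique" if it is the first node the key returns
         for its own text value -->
    <xsl:for-each select="/idx/ent/item[generate-id() =
                          generate-id(key('items-by-text', .)[1])]">
      <xsl:sort select="."/>
      <xsl:value-of select="."/>
      <xsl:text>, page </xsl:text>
      <!-- The key lookup replaces the scan over $all-items -->
      <xsl:for-each select="key('items-by-text', .)/../pge">
        <xsl:sort select="." data-type="number"/>
        <xsl:value-of select="."/>
        <xsl:if test="position() != last()">,</xsl:if>
      </xsl:for-each>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The output should match the variable-based version; the difference is that the processor can build the key index once instead of rescanning every item for each group.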

I didn't try this so some details may need tuning up, but it should work
nicely.

Cheers,

Tom P

[<DPawson@xxxxxxxxxxx>]
> Given
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE idx [
> <!ELEMENT idx (ent+)>
> <!ELEMENT ent (item, pge+)>
> <!ELEMENT item (#PCDATA)>
> <!ELEMENT pge  (#PCDATA)>
> <!ATTLIST pge key (t|f) 'f'>
>
>
> ]>
>
> <idx>
>  <ent>
>   <item>content</item>
>   <pge key="f">98</pge>
>  </ent>
>  <ent>
>   <item>content</item>
>   <pge key="f">108</pge>
>   <pge>110</pge>
>  </ent>
>  <ent>
>   <item>another</item>
>   <pge key="f">100</pge>
>  </ent>
>  <ent>
>   <item>zero</item>
>   <pge key="t">210</pge>
>  </ent>
> </idx>
>
>
> An indexing DTD.
>
> I want to present it as
>
> A  B  C .... Z
> (each hotlinked to the start of that letter).
>
> Then
>
> A  (the anchor)
>
> aardvark, page 1,67,79
>   (say with page 67
> -------------------
> B
>
> bathtub, page 3,5,7
>
> ------------------
>
> Z
>
> zero, page 210
>    (210 in bold, it's the main entry)
> etc.
>
> Two pass solution, first sorting, to make data entry easy.
> Being lazy, I don't always remember that I've already made
> an entry for a particular element, so there are duplicates.
> the <item> is duplicated, but the page numbers are not,
> hence the 'remove duplicates' approach of keys only partially works.
> Hence the Muenchian plus (I think :-).
>
> Question, how to remove the duplicate entries without losing
> the page numbers associated with the duplicate?
>
> I found this quite an interesting stylesheet, till I couldn't
> figure out the key definitions/usage, then I was stopped.
>
> I have everything except the 'remove duplicates' bit.
>



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
