
Re: sorting titles w stopwords but w/o value in every

Subject: Re: sorting titles w stopwords but w/o value in every title node
From: "Susan Campbell" <SCampbell@xxxxxxxxxxxxxx>
Date: Wed, 1 Sep 2004 16:44:33 -0400
stopword code
Anton and Bruce,
Thanks for your help.  I'm sorry for the delay in responding.  A large tree
fell on my house about 1 AM Tuesday morning and I have been away from work
finding a tree service, contractors, etc.  It's quite a challenge.

I cannot do a triple sort using doc-number as the first sort.  That just puts
things in doc-number order.  I don't think I can group on doc-number and then
sort by title within that group. I think xsl:sort needs a path name.

Anton says it succinctly: I need to treat records that don't have a title as
if they do have a title. The link is that they have the same document number.
I need the records with the same doc number to show up with the corresponding
title, in arrival-date order.

The processor is Saxon but it's being called from within another application.
I do not believe I can do a two-step process.  That's why I'm calling the
stopwords with document() from this stylesheet.
sc
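
[Editor's sketch] xsl:sort does not require a path: its select attribute takes any XPath expression, so a single pass can sort each record by a title borrowed from another record with the same doc-number. The following is a minimal sketch under assumptions, not the poster's stylesheet: the key name title-by-doc is made up for illustration, the stop-word stripping is left out to keep it short, and dates are sorted ascending to match the desired output shown further down.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical key: every non-empty title, indexed by the doc-number
       of the record it belongs to. -->
  <xsl:key name="title-by-doc"
           match="section-02/title[normalize-space()]"
           use="../doc-number"/>

  <xsl:template match="/">
    <table>
      <xsl:for-each select="//section-02">
        <!-- 1. Title looked up through the shared doc-number, so records
             with an empty title sort as if they carried that title. -->
        <xsl:sort select="key('title-by-doc', doc-number)"/>
        <!-- 2. Keeps one document's records together if two titles tie. -->
        <xsl:sort select="doc-number" data-type="number"/>
        <!-- 3. Arrival date, MM/DD/YYYY rearranged to YYYYMMDD. -->
        <xsl:sort select="concat(substring(arrival-date, 7, 4),
                                 substring(arrival-date, 1, 2),
                                 substring(arrival-date, 4, 2))"
                  data-type="number"/>
        <tr>
          <td><xsl:value-of select="doc-number"/></td>
          <td><xsl:value-of select="title"/></td>
          <td><xsl:value-of select="description"/></td>
          <td><xsl:value-of select="arrival-date"/></td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

One caveat if the stop-word expression is combined with this lookup: key() searches the document that contains the context node, so it probably has to be evaluated outside the $stop-words[...] predicate, where the context node is a word element in the stylesheet itself.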
------------------------------

Date: Mon, 30 Aug 2004 09:01:10 -0400
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
From: "Susan Campbell" <SCampbell@xxxxxxxxxxxxxx>
Subject:  Re:  sorting titles w stopwords but w/o value in every title
node
Message-ID: <D44554884CB7D74B87423B62952F369901BD1DF1@xxxxxxxxxxxxxxxxxxxxxx>

Thanks for the help. (I am still referring to the stop-words variable with
document('')/xsl:stylesheet/sw:stop/word because that does give me the sort
order. Because of our setup, that may be my only option.)

The problem I still have is that entries without a value in the title sort
first. I need to group by title when the doc-number is the same. It may be both
a sorting and grouping problem, but I don't know how to go about it.

(The doc number is included only for testing. I left out imprint and ISBN from
this sample for clarity. It is possible to have the same issue or different
issue arrive on the same or different days as there are multiple
subscriptions.)

The output I need is:
doc#   Title                         Description              Arrived date
53690  American Artist               v.68:no.738(2004:Jan.)   02/26/2004
57769  The American city & country   v.119:no.1(2004:Jan.)    02/11/2004
57769                                v.119:no.3(2004:Mar.)    03/25/2004
58345  American demographics         v.26:no.1(2004:Feb.)     02/05/2004
58345                                v.26:no.1(2004:Feb.)     02/26/2004
58345                                v.26:no.2(2004:Mar.)     02/26/2004
58345                                v.26:no.2(2004:Mar.)     02/26/2004

Sample of the problem-causing XML:
-------------
<section-02>
<title>Forbes.</title>
<isbn-issn>0015-6914</isbn-issn>
<doc-number>58615</doc-number>
<description>v.173:no.5(2004:Mar.15)</description>
<arrival-date>03/15/2004</arrival-date>
</section-02>

<section-02>
<title></title>
<isbn-issn-code></isbn-issn-code>
<doc-number>58615</doc-number>
<description>v.173:no.1(2004:Jan. 12)</description>
<arrival-date>01/12/2004</arrival-date>
</section-02>

<section-02>
<title></title>
<isbn-issn-code></isbn-issn-code>
<doc-number>58615</doc-number>
<description>v.173:no.2(2004:Feb. 02)</description>
<arrival-date>01/21/2004</arrival-date>
</section-02>

My stylesheet:
-------------
<xsl:stylesheet
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
   xmlns:sw="mailto:bubba@xxxxxxx"
   exclude-result-prefixes="sw">
<xsl:include href="funcs.xsl"/>
<sw:stop>
	<word>the</word>
	<word>a</word>
	<word>an</word>
</sw:stop>
<xsl:variable name="stop-words"
   select="document('')/xsl:stylesheet/sw:stop/word"/>
<xsl:variable name="lowercase" select="'abcdefghijklmnopqrstuvwxyz'"/>
<!-- W-Z were missing as posted; without them translate() leaves those
     letters in upper case. -->
<xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

<xsl:template match="/">
<table border="1">
<tr><th colspan="6">Arrived Issues sorted without stop words</th></tr>
<tr>
<td align="center"><b>number</b></td>
<td align="center"><b>Title</b></td>
<td align="center"><b>ISBN-ISSN</b></td>
<td align="center"><b>Imprint</b></td>
<td align="center"><b>Description</b></td>
<td align="center"><b>Arrived</b></td>
</tr>
<xsl:for-each select="//section-02/title">
<!-- Primary key: the title minus a leading stop word.
     substring(x, 0 div true()) is substring(x, 0), the whole string;
     substring(x, 0 div false()) is substring(x, NaN), the empty string. -->
<xsl:sort select="concat(
   substring(substring-after(., ' '), 0 div boolean(
     $stop-words[starts-with(translate(current(), $uppercase, $lowercase),
                             concat(translate(., $uppercase, $lowercase), ' '))])),
   substring(., 0 div not(
     $stop-words[starts-with(translate(current(), $uppercase, $lowercase),
                             concat(translate(., $uppercase, $lowercase), ' '))])))"/>

<!-- Secondary key: arrival-date MM/DD/YYYY rearranged to YYYYMMDD. -->
<xsl:sort select="number(concat(substring(../arrival-date, 7, 4),
                                substring(../arrival-date, 1, 2),
                                substring(../arrival-date, 4, 2)))"
          order="descending"/>

<tr>
<td width="10%"><xsl:value-of select="../doc-number"/></td>
<td width="30%"><xsl:value-of select="../title"/></td>
<td width="10%"><xsl:value-of select="../isbn-issn"/></td>
<td width="20%"><xsl:value-of select="../imprint"/></td>
<td width="20%"><xsl:value-of select="../description"/></td>
<td width="10%"><xsl:value-of select="../arrival-date"/></td>
</tr>
</xsl:for-each>
</table>
</xsl:template>
</xsl:stylesheet>

Thanks,
Susan Campbell
College Center for Library Automation
1753 W. Paul Dirac Drive
Tallahassee, FL 32310
850-922-6044

------------------------------

Date: Mon, 30 Aug 2004 09:17:01 -0400
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
From: Bruce D'Arcus <bdarcus@xxxxxxxxxxxxx>
Subject: Re:  sorting titles w stopwords but w/o value in every title
node
Message-Id: <E0EAD541-FA86-11D8-B6E0-000A959F0E52@xxxxxxxxxxxxx>

On Aug 30, 2004, at 9:01 AM, Susan Campbell wrote:

> The problem I still have is that entries without a value in the title
> sort first.
> I need to group by title when the doc-number is the same. It may be
> both a sorting
> and grouping problem, but I don't know how to go about it.

So is it the case that if two records -- one with a title and one
without -- share the same doc-number, then they share the same title,
even if not explicitly coded?

If that were true, I guess logically you'd group by doc-number, and
then take a title from one among the group and sort on that for the
groups?

Bruce
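
[Editor's sketch] The grouping described above can be done in one XSLT 1.0 pass with a key (Muenchian grouping). This is a sketch under assumptions rather than a tested answer from the thread: the key name by-doc and the variable names are invented for illustration, stop-word handling is omitted, and dates are sorted ascending within each group.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical key: all records that share a doc-number. -->
  <xsl:key name="by-doc" match="section-02" use="doc-number"/>

  <xsl:template match="/">
    <table>
      <!-- Outer loop: the first record of each doc-number represents
           its group. -->
      <xsl:for-each select="//section-02[generate-id() =
                            generate-id(key('by-doc', doc-number)[1])]">
        <!-- Groups are ordered by the first non-empty title in the group. -->
        <xsl:sort select="key('by-doc', doc-number)/title[normalize-space()]"/>
        <xsl:variable name="group" select="key('by-doc', doc-number)"/>
        <xsl:variable name="group-title"
                      select="$group/title[normalize-space()]"/>
        <!-- Inner loop: the group's records by arrival date (YYYYMMDD). -->
        <xsl:for-each select="$group">
          <xsl:sort select="concat(substring(arrival-date, 7, 4),
                                   substring(arrival-date, 1, 2),
                                   substring(arrival-date, 4, 2))"
                    data-type="number"/>
          <tr>
            <td><xsl:value-of select="doc-number"/></td>
            <!-- The title is printed on the first row of the group only. -->
            <td>
              <xsl:if test="position() = 1">
                <xsl:value-of select="$group-title"/>
              </xsl:if>
            </td>
            <td><xsl:value-of select="description"/></td>
            <td><xsl:value-of select="arrival-date"/></td>
          </tr>
        </xsl:for-each>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

Printing the title from the group rather than from the current record means it lands on the first row even when the record that carries it is not the earliest arrival.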

------------------------------

Date: Mon, 30 Aug 2004 18:34:32 +0200
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
From: "cking" <cking@xxxxxxxxxx>
Subject: Re:  sorting titles w stopwords but w/o value in every title
node
Message-ID: <002901c48eaf$3a7e6740$408876d5@xxxxxxxxxx>

Hi Susan,

> Thanks for the help. (I am still referring to the stop-words variable with
> document('')/xsl:stylesheet/sw:stop/word because that does give me the sort
> order. Because of our setup, that may be my only option.)

I found out why it didn't work for me: it's a namespace issue. I had put your
template inside an XHTML-output stylesheet (with
xmlns="http://www.w3.org/1999/xhtml"), and then
"document('')/xsl:stylesheet/sw:stop/word" didn't return anything. If I change
the <word> elements to <sw:word>, it works.
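
[Editor's sketch] The reason is that a default namespace declaration on the stylesheet also applies to the unprefixed word elements, while an unprefixed name in an XSLT 1.0 path only matches elements in no namespace. A minimal sketch of the prefixed variant, with the lookup path changed to match; the namespace URI urn:example:stop-words is a placeholder chosen for this example, not the one from the original post:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:sw="urn:example:stop-words"
    exclude-result-prefixes="sw">

  <!-- Prefixed children are in the sw namespace, so the default XHTML
       namespace declared above no longer captures them. -->
  <sw:stop>
    <sw:word>the</sw:word>
    <sw:word>a</sw:word>
    <sw:word>an</sw:word>
  </sw:stop>

  <!-- The last step names sw:word explicitly. -->
  <xsl:variable name="stop-words"
      select="document('')/xsl:stylesheet/sw:stop/sw:word"/>

  <xsl:template match="/">
    <!-- Quick check: reports how many stop words the path found. -->
    <p><xsl:value-of select="count($stop-words)"/></p>
  </xsl:template>
</xsl:stylesheet>
```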

> The problem I still have is that entries without a value in the title sort
> first. I need to group by title when the doc-number is the same. It may be
> both a sorting and grouping problem, but I don't know how to go about it.
>
> (The doc number is included only for testing. I left out imprint and ISBN
> from this sample for clarity. It is possible to have the same issue or
> different issue arrive on the same or different days as there are multiple
> subscriptions.)

Maybe I don't fully understand what you're trying to get (esp. that last
sentence), but can't you simply perform a triple-sort instead of double-sort?
First sort by doc-number, then by title and finally by date?

> <xsl:for-each select="//section-02/z13-title">

I guess you're only using "//" in your sample code, because you know this can
seriously slow down the transform process (esp. with large input files)? Unless
of course your input files are organized with <section-02> elements that can
appear anywhere in the document...
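
[Editor's sketch] If the records sit at a known depth, an explicit path lets the processor skip the full-tree scan that "//" implies. The input layout was not posted, so the sketch below simply assumes that every section-02 element is a direct child of the document element:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <table>
      <!-- Assumption: section-02 elements are children of the document
           element, so /*/section-02 reaches all of them without
           visiting every descendant. -->
      <xsl:for-each select="/*/section-02/title">
        <tr><td><xsl:value-of select="."/></td></tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
```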

Best regards
Anton Triest

------------------------------

Date: Tue, 31 Aug 2004 03:37:28 +0200
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
From: "cking" <cking@xxxxxxxxxx>
Subject: Re:  sorting titles w stopwords but w/o value in every title
node
Message-ID: <010401c48efb$13a60780$408876d5@xxxxxxxxxx>

Susan,

I wrote:
> but can't you simply perform a triple-sort instead of double-sort?
> First sort by doc-number, then by title and finally by date?

By rereading your message (desired output, and Bruce's reply), I think I
understand your point. You don't want to sort by doc-number. You want to treat
the records that don't have a title as if they do have a title, taken from
another record with the same doc-number. Is that correct?

What processor are you using? I mean, would it be OK to do a transformation in
two steps?

Greetings
Anton Triest
