
Aw: Re: Using 'collection'

Subject: Aw: Re: Using 'collection'
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 29 Aug 2015 20:05:50 -0000
Try
-xsl:read1.xsl -it:runit
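
The reason, as far as I understand Saxon 9's command line: arguments without an option name are read positionally (source document first, then stylesheet), so in the original command read1.xsl was taken as the source and -it:runit as the stylesheet file name, which matches the error message. Naming the stylesheet with -xsl: lets -it: be parsed as the initial-template option, and no source document is needed.

A minimal sketch of a stylesheet that fits this invocation, trimmed from the one quoted below. The <results> wrapper and the empty text() template are assumptions added for illustration, not part of the original; processing each document only once, inside the for-each, is also my reading of the intent. I have not checked which Saxon editions provide saxon:discard-document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    xmlns:mets="http://www.loc.gov/METS/"
    xmlns:blprocess="http://bl.uk/namespaces/blprocess"
    exclude-result-prefixes="saxon mets blprocess" version="2.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Entry point selected with -it:runit; no source document is supplied -->
  <xsl:template name="runit">
    <!-- Hypothetical wrapper element so the output has a single root -->
    <results>
      <!-- Visit each file once and release it after processing -->
      <xsl:for-each select="collection('docs?select=*.xml')">
        <xsl:apply-templates select="saxon:discard-document(.)"/>
      </xsl:for-each>
    </results>
  </xsl:template>

  <!-- Assumption: suppress text that the built-in rules would otherwise copy -->
  <xsl:template match="text()"/>

  <!-- Copy only the metadata inside the amdSec with ID amd0002 -->
  <xsl:template match="mets:amdSec[@ID eq 'amd0002']">
    <xsl:copy-of select="descendant::blprocess:processMetadata"
        copy-namespaces="no"/>
  </xsl:template>
</xsl:stylesheet>
```

Run from the directory that contains read1.xsl and the docs folder, the full command would then be: java -jar c:\saxon\saxon9.jar -xsl:read1.xsl -it:runit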
--
This message was sent from my Android mobile phone with GMX Mail.

"Mark Wilson pubs@xxxxxxxxxxxx"
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

  Not sure what I am doing wrong here.

  Using this batch file:
  set SAXON_HOME=C:\saxon
  set SAXON_JAR=%SAXON_HOME%\saxon9.jar
  java -jar c:\saxon\saxon9.jar read1.xsl -it:runit

  I get this error.
  P:\British Library>set SAXON_HOME=C:\saxon
  P:\British Library>set SAXON_JAR=C:\saxon\saxon9.jar
  P:\British Library>java -jar c:\saxon\saxon9.jar read1.xsl -it:runit
  Stylesheet file -it:runit does not exist

  Using this stylesheet:
  <?xml version="1.0" encoding="UTF-8"?>
  <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:saxon="http://saxon.sf.net/"
  xmlns:mets="http://www.loc.gov/METS/"
  xmlns:blprocess="http://bl.uk/namespaces/blprocess"
  exclude-result-prefixes="xs" version="2.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template name="runit">
  <xsl:apply-templates select="collection('docs?select=*.xml')"/>
  <xsl:for-each select="collection('docs?select=*.xml')">
  <xsl:apply-templates select="saxon:discard-document(.)"/>
  </xsl:for-each>
  </xsl:template>

  <xsl:template match="/">
  <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="mets:amdSec">
  <xsl:if test="@ID eq 'amd0002'">
  <xsl:copy-of select="descendant::blprocess:processMetadata"
  copy-namespaces="no"/>
  </xsl:if>
  </xsl:template>
  </xsl:stylesheet>

  On 8/29/2015 8:59 AM, Michael Kay mike@xxxxxxxxxxxx wrote:
  > It's worth putting the data in an XML database such as BaseX
  if you're going to use it often enough to justify the cost of
  database loading. If you just want to use it once, e.g. to extract a
  subset of the data, then collection() should do the job - either in
  XQuery or XSLT.
  >
  > To keep memory usage down, assuming you're implementing with
  Saxon, the simplest way is to ensure that each document is unloaded
  from memory as soon as it has been processed, which you can do with
  saxon:discard-document:
  >
  > <xsl:for-each select="collection('docs?select=*.xml')">
  >   <xsl:apply-templates select="saxon:discard-document(.)"/>
  > </xsl:for-each>
  >
  > discard-document() is a pseudo-function that returns a document
  unchanged, but with the side effect that it is marked as available
  for garbage collection.
  >
  > Streamed processing is an alternative - but unfortunately in Saxon
  (until the next release) streaming can't be used together with
  collection().
  >
  > Michael Kay
  > Saxonica
  >
  >
  >> On 29 Aug 2015, at 15:25, Mark Wilson pubs@xxxxxxxxxxxx
  <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
  >>
  >> Hi Eliot,
  >> I have never used XQuery or BaseX and will look into that, but
  what you have said about the XSLT looks good. I will try to sort this
  out and see where it goes. Thanks for taking the time.
  >> Regards,
  >> Mark
  >>
  >> On 8/29/2015 7:13 AM, Eliot Kimber ekimber@xxxxxxxxxxxx wrote:
  >>> This sounds like a job better done using XQuery. A quick solution
  would be
  >>> to install BaseX and use its GUI to load your XML files and then
  apply the
  >>> query you need to the loaded docs. If you have to do complex
  >>> transformations on the things you find you can have the XQuery
  emit an XML
  >>> file that you can then apply an XSLT to, rather than trying to
  implement
  >>> the transform entirely in XQuery.
  >>>
  >>> With XSLT and Saxon you could do something like:
  >>>
  >>> <xsl:stylesheet ...>
  >>>
  >>> <xsl:template name="run">
  >>> <xsl:apply-templates select="collection('docs?select=*.xml')"/>
  >>> </xsl:template>
  >>>
  >>> <xsl:template match="/">
  >>> <!-- do stuff to find what you want in each doc -->
  >>> </xsl:template>
  >>> </xsl:stylesheet>
  >>>
  >>> Then use the -it option for Saxon to specify the initial template to
  run
  >>> ("run"), i.e. -it:run.
  >>>
  >>> The size of the documents shouldn't be a big issue, especially if
  you can
  >>> allocate sufficient memory to the processor. You could probably
  take
  >>> advantage of new streaming features in XSLT 3 and implemented in
  the
  >>> latest Saxon versions.
  >>>
  >>> For something like this you might have to see how much virtual
  memory the
  >>> process requires by running it and if it fails with an
  out-of-memory
  >>> error, give it more until it either runs or you've run out of
  available
  >>> real memory.
  >>>
  >>> Cheers,
  >>>
  >>> Eliot
  >>>
  >>> ----
  >>> Eliot Kimber, Owner
  >>> Contrext, LLC
  >>> http://contrext.com
  >>>
  >>>
  >>>
  >>>
  >>> On 8/29/15, 8:36 AM, "Mark Wilson pubs@xxxxxxxxxxxx"
  >>> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
  >>>
  >>>> I have been asked to isolate two elements each from a set of
  individual
  >>>> xml files containing hundreds of elements. I thought collection()
  would
  >>>> work, but each individual file is very large (36,000+ lines)
  and there
  >>>> are 8000 of them. I have no idea how to begin. I would
  include a
  >>>> sample file, but as I said, they are very large. Where might I
  look to
  >>>> get ideas?
  >>>> Thanks,
  >>>> Mark
  >>>>
  >>>>
  >
  >
