Re: [xsl] [xslt performance for big xml files]


Subject: Re: [xslt performance for big xml files]
From: Robert Koberg <rob@xxxxxxxxxx>
Date: Sun, 26 Apr 2009 06:54:09 -0400

On Apr 25, 2009, at 10:05 PM, Liam Quin wrote:

On Sat, Apr 25, 2009 at 07:16:04PM -0400, Robert Koberg wrote:
Of all the deployed real-world applications that use XQuery (I suppose I could be more specific and say "as recommended by Liam", but that is probably not necessary), how many do you think would work on more than one XQuery processor?
I think quite a few, although yes, you generally will have to change
the collection() and document() arguments.  Try creating a SQL
database and querying it in Oracle, DB2, MySQL, PostgreSQL and you'll
generally find you have to change the code at least a little, but
that does not make SQL completely non-interoperable.  It's a case
of managing expectations, and of "the application was ported in
a week" vs "we would need to rewrite millions of lines of code from
scratch".

Yes, but people aren't being encouraged to write webapps in SQL. And there are tools that allow you to abstract away the differences.

[...] XQuery as used/promoted by the XML DBs tends to favor their
own extensions in documentation and on the lists (though there seem to be
more caveats on the lists lately).


I don't actually remember which implementations I suggested -- most
likely MarkLogic, Qizx and dbxml, since I've used them.  I've not
had major problems moving queries between them, though, once the
files are indexed, which is a separate (although not unfair) question.

We didn't standardise collection() -- at some point you have to
say, "this is the scope of our spec" and stop.  Maybe for XQuery 1.1
we could consider an optional directory-of-files-as-collection()
function, but then people would say they needed options to say whether
to re-run indexes, what collation sequences and file encodings to
assume, whether to follow shortcuts and symbolic links... and pretty
soon it'd be a huge mess.  Or at least that's been a difficulty in
the past.  Relational database schemas aren't entirely portable
either, and neither are filenames (e.g. between MS Windows and
Solaris and OS X the character sets, lengths, and default encodings
differ).
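
To make the option creep concrete, here is a minimal Python sketch of what a directory-of-files-as-collection function might look like. The name `directory_collection` and all of its parameters are hypothetical, chosen only for illustration; nothing here comes from the XQuery spec or from any particular processor. Even this small version already has to take a position on symbolic links, recursion, and ordering, and it leaves indexing and collations out entirely.

```python
import os
import xml.etree.ElementTree as ET
from pathlib import Path


def directory_collection(root, pattern="*.xml", follow_symlinks=False, recurse=True):
    """Hypothetical helper: parse every matching file under `root`.

    Returns the root elements in a deterministic (sorted) order.
    File encodings are left to the XML declaration in each file.
    """
    docs = []
    for dirpath, dirnames, filenames in os.walk(root, followlinks=follow_symlinks):
        if not recurse:
            dirnames.clear()  # stop os.walk from descending further
        dirnames.sort()       # make the traversal order predictable
        for name in sorted(filenames):
            path = Path(dirpath) / name
            if not path.match(pattern):
                continue
            if path.is_symlink() and not follow_symlinks:
                continue      # skip symlinked files unless asked to follow
            docs.append(ET.parse(path).getroot())
    return docs
```

Each keyword argument is one of the decisions a standardised version would have to pin down, which is roughly the difficulty described above.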

You're right that extension functions are a problem -- that's true
for XSLT as well, of course, and XPath, and for that matter C and
Perl and Python....

But with XSL and XPath the extensions are really not needed, and you are not steered away from the standard right away.

best,
-Rob

Liam

--
Liam Quin, W3C XML Activity Lead, http://www.w3.org/People/Quin/
http://www.holoweb.net/~liam/ * http://www.fromoldbooks.org/
