Re: XSLT compiler and syntax extensions

Subject: Re: XSLT compiler and syntax extensions
From: Brandon Ibach <brandon.ibach@xxxxxxxxxxxxxxxxxxx>
Date: Thu, 18 Nov 2010 10:47:20 -0500
Thanks to everyone for the wonderful comments and ideas.  Keep them
coming!  It's great to get some fresh thoughts and ideas about
something that has been rolling around almost exclusively in my own
mind for so long.

Sorry for the length of this, but I wanted to pull the many excellent
comments together and address some of the recurring points in one
place.

On Wed, Nov 17, 2010 at 9:30 PM, Dimitre Novatchev
<dnovatchev@xxxxxxxxx> wrote:
>> So, the question is, would anyone here be interested in a tool like
>> this?
>
> Not really. The most valuable features of XPath 2.0 and XSLT 2.0 are
> extremely difficult to implement in XPath 1.0/XSLT and without these
> features it is meaningless to use a much weaker language. Why should I
> be interested in a language lacking such features, just to name a few,
>  as sequences, rich types and type checking, the enormous standard
> function library, the convenient support for regular expressions, the
> much higher composability of XPath expressions, ...

Yes, those are all very valuable features and yes, they would not be
trivial to implement, but isn't that what makes it fun? :)

However, I disagree that it would be meaningless to have a language
without all of those, since this project would be aimed at those who
currently only have XSLT 1.0 at their disposal, so any features from
XSLT 2.0 would be an enhancement.

> Also, such a project would add very little to the already available
> EXSLT implementations.

Perhaps, but not everyone has an EXSLT implementation available to
them.  Even if they do, wouldn't it be better for them to use standard
functionality from XSLT 2.0, instead, and have it compiled to EXSLT or
whatever proprietary extensions may be available in their environment,
as necessary (see below)?

> Even the current XPath 2.0/XSLT 2.0 languages are already getting
> "older" as XPath 3.0/XSLT 3.0 are about to appear.

Surely this doesn't mean that efforts to get more people using 2.0
are pointless?

> We are on the verge of XSLT 2.0 being made available (compiled to
> Javascript) to any of the five major browsers (see Michael Kay's
> latest blog post on this topic). Given that this effort would be
> successful, I don't see any compelling reasons for developers to
> continue using XSLT 1.0

To be honest, I'm really not concerned with browsers.  My own use of
XSLT in the documentation/publishing world rarely ever involves a
browser, other than for displaying HTML generated offline or
server-side.

There is a wide range of XSLT 1.0 environments out there and I want to
build something that can be used in any of them.  I also want
something that can be used with minimal effort by any XSLT user, even
if they came to it with little to no development background and would
be lost trying to piece together a more capable processor by
integrating even pre-written extensions like an EXSLT library.

I especially want something that can be used in restricted
environments where the user has no control over their processor
choice, like tightly-controlled corporate or government environments
or COTS applications that allow the user to customize by writing XSLT,
but completely dictate the processor and its configuration.  If I
happen to end up with something that could also be useful in
browser-based environments, that's a bonus. :)

On Wed, Nov 17, 2010 at 10:00 PM, Liam R E Quin <liam@xxxxxx> wrote:
> I think quite a few people would be interested, and that it might
> be a way to experiment with some XSL 2.0 features for example in
> Web browsers.
>
> Release it and see :-)

I hope to.  It's been on the shelf without attention for a while, so
I'm going to have to find some time to dust it off and clean it up.
Asking about and getting shows of interest here is part of finding the
motivation. :)

On Wed, Nov 17, 2010 at 11:30 PM, Andriy Gerasika
<andriy.gerasika@xxxxxxxxx> wrote:
> Speaking of this "xsl:if+xsl:attribute" extension, I do not think it is
> very practical.

This was just one simple example of syntax sugar I've been playing
with, but why do you think it isn't practical?  It's a pattern I found
myself using frequently in XSLT 1.0, so I wanted a more concise way to
express it.

> a) It will not support all XSLT 2.0 functionality, because of obvious
> limitations in XSLT 1.0

XSLT 1.0 certainly wouldn't be the best language for implementing
these features, but it has been proven to be Turing-complete, so the
sky is (theoretically) the limit, with caveats regarding things that
the parser doesn't make available and so on.  That said, I don't
really expect full compliance for a long time, if ever.  This project
will have met its goals if it helps people write better, more
maintainable XSLT by embracing XSLT 2.0 features and maybe encourages
them to move to an XSLT 2.0 environment, if they can, or encourages
vendors to implement those features because more of their users are
using them.

> b) Using it will make XSLT code difficult to debug. If you'll make an
> error, it will report line location distorted for all code that follows

Yes, this is a valid concern.  I think there are certain things that
could be done to help deal with it, but no foolproof solutions.  I
hardly think it's a deal-killer, though.

> Generally, the best possible way to experiment w/ XSLT syntax sugar is to
> use URIResolver coupled w/ SAX ContentHandler.

That may be true for Java environments (and, with some tweaks,
possibly others), but is hardly universal to all XSLT environments.
Also, as noted above, I'm hoping to make this tool available to XSLT
users of any level, many of whom wouldn't know where to start with any
other programming language, even if their environment could
accommodate it.

On Thu, Nov 18, 2010 at 4:44 AM, Michael Kay <mike@xxxxxxxxxxxx> wrote:
> My advice would be to keep the two projects separate.

Yes, this is a good idea.  My extensions are currently an integral
part of the compiler, but I plan to pull them out into a library of
extension elements built on the "stylesheet elements" feature provided
by the compiler.  With that feature, an element in any declared
extension namespace can be used to invoke a named template.  The
element's attributes provide the parameters (with support for
designating which attributes should be interpreted as expressions or
AVTs, as in XSLT instructions), and a mechanism makes the element
content available to the template in a designated parameter.  Certain
templates can also be marked (or maybe even automatically selected,
based on certain criteria) for inlining during the compilation
process, enabling, for instance, an extension that compiles into an
<xsl:when> instruction.
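
As a rough illustration of the "stylesheet elements" idea described
above, here is a minimal sketch in Python (chosen only for brevity; it
is not the actual compiler).  It rewrites each element in an extension
namespace into an xsl:call-template.  The namespace URI
"urn:example:ext", the parameter name "content" and the function name
are all assumptions made up for the example, and every attribute is
treated as an XPath expression, with no AVT handling and no inlining.

```python
import xml.etree.ElementTree as ET

XSL = "http://www.w3.org/1999/XSL/Transform"
EXT = "urn:example:ext"  # hypothetical extension namespace


def expand_extension_elements(parent, ext_ns=EXT, content_param="content"):
    """Rewrite extension elements below `parent` into xsl:call-template."""
    for index, child in enumerate(list(parent)):
        # Expand nested extension elements first (post-order).
        expand_extension_elements(child, ext_ns, content_param)
        if not (isinstance(child.tag, str)
                and child.tag.startswith("{%s}" % ext_ns)):
            continue
        # The element's local name selects the named template.
        call = ET.Element("{%s}call-template" % XSL,
                          {"name": child.tag.split("}", 1)[1]})
        # Each attribute becomes a parameter (assumed to be an expression).
        for name, value in child.attrib.items():
            ET.SubElement(call, "{%s}with-param" % XSL,
                          {"name": name, "select": value})
        # The element content is passed in one designated parameter.
        if child.text or len(child):
            body = ET.SubElement(call, "{%s}with-param" % XSL,
                                 {"name": content_param})
            body.text = child.text
            body.extend(list(child))
        call.tail = child.tail
        parent[index] = call
    return parent
```

Given a template containing <x:field label="@title">text</x:field>,
with x bound to the assumed namespace, the sketch produces a call to
the named template "field" with a "label" parameter selecting @title
and a "content" parameter holding the text.  An extension element at
the document root is not handled.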

> An XSLT 2.0 processor written in XSLT 1.0 could be a very attractive
> product. It needs to be close to 100% conformant to XSLT 2.0 to get
> adoption, and this will not be easy to achieve. You might be able to use
> Javascript extensions if you can find a way to make them work
> cross-browser.
> Implementing the XSLT 2.0 data model on top of the 1.0 model is the main
> challenge. I can see how one would implement any atomic sequence as a
> string, for example, but it's less easy to see how to implement a sequence
> containing both nodes and atomic values, or a node-sequence that contains
> duplicate nodes. Perhaps you would have to represent such sequences using
> the generate-id() value of the node, dereferencing the IDs using keys.

For completely generic support of sequences, eliminating
result-tree-fragment restrictions, etc., I figure string manipulation
will be the way to go, but many details certainly remain to be worked
out.  I definitely want every supported feature to be available with a
pure XSLT 1.0 implementation, but I also think there's value in the
compiler eventually having options to generate output for specific
"targets" that support extensions such as the various node-set()
implementations.  This could provide more performant results while
still being able to write standard, non-proprietary XSLT for the input
to the compiler.
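
To make the string-manipulation idea concrete, here is one possible
encoding of a sequence of atomic values as a single string, sketched
in Python.  The record format (a one-letter type tag, a length, then
the text) is an assumption for illustration only and not necessarily
what the compiler will use; nodes and mixed sequences are not covered.

```python
def encode_sequence(items):
    """Encode atomic values as concatenated 'type:length:text' records."""
    parts = []
    for item in items:
        if isinstance(item, bool):  # check bool before int
            kind, text = "b", "true" if item else "false"
        elif isinstance(item, (int, float)):
            kind, text = "n", repr(item)
        else:
            kind, text = "s", str(item)
        parts.append("%s:%d:%s" % (kind, len(text), text))
    return "".join(parts)


def decode_sequence(encoded):
    """Walk the string record by record, using the length prefixes."""
    items = []
    pos = 0
    while pos < len(encoded):
        kind = encoded[pos]
        colon = encoded.index(":", pos + 2)   # end of the length field
        length = int(encoded[pos + 2:colon])
        text = encoded[colon + 1:colon + 1 + length]
        pos = colon + 1 + length
        if kind == "b":
            items.append(text == "true")
        elif kind == "n":
            is_int = text.lstrip("-").isdigit()
            items.append(int(text) if is_int else float(text))
        else:
            items.append(text)
    return items
```

Because each record carries its own length, values may contain the
delimiter character.  In XSLT 1.0 the same walk could be written as a
recursive named template using substring-before(), substring() and
string-length().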

> As for XSLT extensions, there's masses of scope using a pre-processor; the
> challenge is to make your extensions powerful and attractive enough that
> people adopt them and start using them in preference to the "standard"
> primitives underneath.

Most of my extensions are driven by my desire to be able to write more
concise XSLT.  XSLT 2.0's "stylesheet functions" can do a lot on the
XPath side (and I look forward to being able to use them on my XSLT
1.0 projects) and I think my "stylesheet elements" idea will enable a
lot in much of the rest of XSLT by providing a convenient, concise
syntax for invoking named templates.

I admit, though, that my desire for conciseness can go a bit too far,
sometimes, so I hope to get more feedback on where that line should
fall.

2010/11/18 Emmanuel Bigui <eb@xxxxxxxxxx>:
> While I understand that it's interesting to have a compiler that would
> run in a pure XSLT 1.0 processor, would it not be also useful, and
> much easier, to write the compiler in XSLT 2.0? (or, actually, in any
> language...)

Yes, it certainly would be easier.  That's part of the plan (see below).

> The need is to be able to write XSLT 2.0 stylesheets and have them run
> in a 1.0 processor; I'm not sure the fact that they run "as is", ie,
> that the compiling happens "on the fly" in the production environment,
> is the same level of priority.

It's not so much that the compiling needs to happen "on the fly"; it's
that I don't want users to have to download and install any extra
tools, which might require administrative privileges or other
permissions that are hard to come by in some environments.  If
this can run in any XSLT 1.0 environment, then I know they already
have what they need to use it.

> If I can write an XSLT 2.0 stylesheet, test it in an XSLT 2.0
> processor, and then be guaranteed the same result when compiled in 1.0
> and run in a pure 1.0 processor, then the extra compiling step (that
> can be part of the development process) does not seem like a great
> impediment?

Yes, this is a good point.  For users with the ability to work in an
XSLT 2.0 environment for development, but having to deploy to an XSLT
1.0 production environment, this pattern could work very well and
avoid the "offset line numbers in error messages" problem mentioned
earlier, among other issues.  The compiler will just need to make it
very clear when they've used a feature that their development tool
supports but the compiler does not (yet).
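
A check of that kind could look something like the following Python
sketch, which lists the XSLT instructions in a stylesheet that fall
outside a supported set.  The SUPPORTED set and the function name are
purely hypothetical placeholders, not a statement of what the compiler
handles today, and only element names are checked (not attributes or
XPath 2.0 expressions).

```python
import xml.etree.ElementTree as ET

XSL = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical list of instructions the compiler can translate.
SUPPORTED = {
    "stylesheet", "transform", "template", "apply-templates",
    "call-template", "with-param", "param", "variable", "value-of",
    "if", "choose", "when", "otherwise", "for-each", "attribute",
    "element", "text", "copy", "copy-of", "function",
}


def unsupported_instructions(stylesheet_text, supported=SUPPORTED):
    """Return sorted local names of XSLT elements not in `supported`."""
    root = ET.fromstring(stylesheet_text)
    prefix = "{%s}" % XSL
    found = set()
    for element in root.iter():
        tag = element.tag
        if isinstance(tag, str) and tag.startswith(prefix):
            local = tag[len(prefix):]
            if local not in supported:
                found.add(local)
    return sorted(found)
```

Run against a stylesheet that uses xsl:analyze-string, this would
report that instruction (and its xsl:matching-substring child) so the
user knows before deployment that the compiled output is incomplete.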

On Thu, Nov 18, 2010 at 6:15 AM, Michael Kay <mike@xxxxxxxxxxxx> wrote:
> You're right of course - this is how compilers get bootstrapped. You write
> a compiler in XSLT 2.0 that compiles XSLT 2.0 stylesheets into XSLT 1.0
> stylesheets, then you use an existing XSLT 2.0 processor to compile this
> compiler into XSLT 1.0, and then you have a compiler that's written in XSLT
> 1.0 and can execute in browsers. Easy really. Except for the data model
> issues!

Isn't bootstrapping wonderful? :)  I started compiling the compiler
with itself early on.  It sometimes makes for some tricky debugging,
but I'm sort of a glutton for that kind of punishment. :P

2010/11/18 Emmanuel Bigui <eb@xxxxxxxxxx>:
> On Thu, Nov 18, 2010 at 12:15 PM, Michael Kay <mike@xxxxxxxxxxxx> wrote:
>> Easy really.
>
> If the compiler needs to be able to compile itself then basically it
> amounts to rewriting Saxon in XSLT: very easy, probably, yes (I wonder
> why it has not yet been done...?)

Too easy... boring, even... just not enough of a challenge, I'm sure. ;)

> A more developer-friendly XSLT 1.0 would be interesting and would not
> need to be 100% 2.0 conformant; for instance, 2.0-grouping and getting
> rid of the dreaded "result tree fragments" would be a nice step
> forward. But then of course it would amount to creating yet another XSLT
> version (maybe close to 1.1 ...?) and would only add to the confusion.

Right, I think some XSLT 2.0 is better than none.  The goal here is to
enable people to use the more advanced standard features instead of
possibly proprietary extensions, regardless of their processing
environment.  I don't look at it as another XSLT version, but as a
partial implementation of XSLT 2.0 that provides a stepping-stone
toward a full implementation.  It can do that by motivating a switch
(either on the part of the user, who likes what they've tasted and
wants more, or on the part of those in charge of their environment,
who like the user's enhanced productivity) or by putting some pressure
on the vendor of the user's processor to implement the new features.

> And who is the target audience?

Anyone currently using XSLT 1.0, by choice or otherwise.

> Besides browsers, what other environments have good XSLT 1.0 support
> and zero XSLT 2.0 support?

See comments above, but any number of environments limited to XSLT 1.0
by architectural choices, policy choices, vendor choices, COTS
support, etc.

2010/11/18 COUTHURES Alain <alain.couthures@xxxxxxxxxxxxx>:
> Having manually translated a complex XSLT 2.0 stylesheet into XSLT 1.0, my
> point of view is that the main difficulty is concerning XPath 2.0. Complex
> XPath 2.0 expressions require extra XSLT 1.0 instructions and frequent use
> of node-set() function to allow to re-transform results. Have a look at
> http://www.svgopen.org/2010/papers/14-Dynamic_SVG_graphs_for_XForms/#d4e184
>
> XSLTForms has its own XPath 1.0 parser written in XSLT 1.0. It is clearly
> its most complex part and surely XSLT 2.0/XPath 2.0 would have helped to
> simplify it.

Yes, XPath 2.0 parsing and processing is a big part of the challenge
which I have only begun to tackle.  I have a working LALR
(table-driven) parser in XSLT 1.0 and have begun putting XPath 2.0
grammar into a form that can be read by the grammar compiler (by hand,
for the small parts I've done so far... probably should automate a
conversion from the W3C-supplied grammar for the rest).  I have a
start on a framework for processing the results (string-manipulation
based), but it needs a fair bit of work, yet.
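
For readers unfamiliar with table-driven parsing, the following Python
sketch shows the general shape of the driver loop: a stack of states,
an ACTION table and a GOTO table.  The toy grammar (E -> E + n | n) and
its hand-built tables are assumptions for illustration; they are not
the XPath 2.0 grammar or the tables produced by the grammar compiler
mentioned above.

```python
# Rule 0: E -> E + n (3 symbols)    Rule 1: E -> n (1 symbol)
RULES = [("E", 3), ("E", 1)]

# ACTION[(state, token)] -> (kind, argument); "$" marks end of input.
ACTION = {
    (0, "n"): ("shift", 2),
    (1, "+"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "+"): ("reduce", 1),
    (2, "$"): ("reduce", 1),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 0),
    (4, "$"): ("reduce", 0),
}

# GOTO[(state, nonterminal)] -> state entered after a reduction.
GOTO = {(0, "E"): 1}


def parse(tokens):
    """Return the list of rule numbers applied, in reduction order."""
    tokens = list(tokens) + ["$"]
    stack = [0]           # stack of parser states
    reductions = []
    pos = 0
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:
            raise SyntaxError("unexpected %r at position %d"
                              % (tokens[pos], pos))
        kind, arg = entry
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, length = RULES[arg]
            del stack[-length:]                     # pop the rule body
            stack.append(GOTO[(stack[-1], lhs)])    # push the goto state
            reductions.append(arg)
        else:  # accept
            return reductions
```

The driver itself is independent of the grammar; only the tables
change, which is what makes the approach workable in a language like
XSLT 1.0, where the tables can be plain lookup data.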

On Thu, Nov 18, 2010 at 9:52 AM, Hermann Stamm-Wilbrandt
<STAMMW@xxxxxxxxxx> wrote:
> Btw, XSLT pointers might be useful as Michael posted this in another email:
>> ... for example, but it's less easy to see how
>> to implement a sequence containing both nodes and atomic values, or a
>> node-sequence that contains duplicate nodes. Perhaps you would have to
>> represent such sequences using the generate-id() value of the node,
>> dereferencing the IDs using keys.
>
> I did "XSLT pointers" because one of the seven node types, namespace nodes,
> cannot be matched with xsl:key -- XSLT pointers work around that
> limitation.

That's an interesting approach.  Thanks for the tip!

-Brandon :)
