Subject: Re: grouping + global variable (?) (was re: regexs, grouping (?) and XSLT2?)
From: Deirdre Saoirse Moen <deirdre@xxxxxxxxxxx>
Date: Fri, 13 Aug 2004 20:05:26 -0700 (PDT)
On Fri, 13 Aug 2004, Wendell Piez wrote:

> Your project sounds very ambitious. Up-conversion is a challenging and
> fascinating business, which we're all going to learn much more about.
> You have several conference papers' worth of material here, I bet.

I'm hoping so.

Quite frankly, I hadn't realized we were so cutting edge. :)

Ultimately, my goal is to provide an application that offers integration
between the text file (written using the user's text processor of choice).

When a user wants to submit a manuscript, the application performs all the
necessary generation of the document (including cover letter) using
user-specific information about how they want the document to appear,
including any market- or genre-specific styles. Press a button, out pops
the PDF or RTF. For now, I'll settle for PDF. :)

I'd already written the submission manager and am now trying to integrate
another person's work into the project. Hence my struggle to understand.

> At 08:15 PM 8/12/2004, you wrote:

>> But I've been thinking, based on the comments from the list, that a
>> better process might be eliminating the perl script entirely.
>
> Maybe: but you'll need something at least as good to do the work it's
> doing, and Perl is really good at regular-expressions and string processing
> generally.
>
> (Personally I might have tried it in Python, but that's mainly because I
> can count the lines of Perl I've written in my life on one hand. Of course,
> I can count in binary on my hands, which gets me higher than five.)

I didn't write the perl script, thus my frustration (as a Python person).
My partner-in-crime and I have come at the problem from entirely different
directions.

> Now it has some regexp support, XSLT 2.0 should be at least a credible
> option here, but its features have yet to be stress-tested TMK and
> tools support is still somewhat up in the air. (I believe Mike Kay is
> speaking on this very topic at XML 2004 this November in Washington
> DC.)

OK, that's what I'd been beginning to understand based on list comments. I
wasn't aware of the tool support problem.

> A split-down-the-middle option could be to write a little function
> library in the language of your choice to do the upconversion
> string-processing, and call out to it from your XSLT using extension
> functions. (This is what I kind of imagined would happen five years
> ago, but it turns out processor-dependent extension functions are
> unfashionable these days.)

This is an intriguing option.
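
As a rough sketch of what one helper in such a function library might look
like in Python: the function below turns blank-line-separated plain text into
a flat run of paragraph elements. The function name, the "doc" and "p" element
names, and the blank-line rule are all assumptions made for illustration; this
is not what the existing Perl script does, and wiring it up as an XSLT
extension function is processor-specific and not shown.

```python
import re
from xml.sax.saxutils import escape


def paragraphs_to_xml(text, root_tag="doc", para_tag="p"):
    """Wrap blank-line-separated paragraphs in XML elements.

    Hypothetical helper: the tag names and the paragraph rule are
    illustrative assumptions, not part of any existing toolchain.
    """
    # One or more blank (or whitespace-only) lines separate paragraphs.
    blocks = re.split(r"\n\s*\n", text.strip())
    paras = []
    for block in blocks:
        # Collapse hard line wraps inside a paragraph to single spaces.
        joined = " ".join(block.split())
        if joined:
            # escape() handles &, < and > so the output stays well-formed.
            paras.append(f"<{para_tag}>{escape(joined)}</{para_tag}>")
    return f"<{root_tag}>" + "".join(paras) + f"</{root_tag}>"
```

Keeping the string handling in one small, separately testable function is the
point of the split: the stylesheet would only ever see the resulting markup.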

99% of the problem comes from documents saved in the native platform that
aren't correctly tagged. I'm not quite certain what to do about this so
that the editing is transparent. Yet.

I feel moderately confident that this might make it a more contiguous
process, which would also require fewer installed pieces in order to work.

> >I'm not sure I'd
> >want to eliminate the intermediate XML file, though.

> I think having the intermediate format will prove to be good design in
> any case.

OK.

> >Option 3 seems to be ruled out based on my current toolchain
> >(apache-FOP), which probably eliminates #2 as well. (I could easily be
> >wrong on this)
>
> Apache Xalan-J has support for a node-set function, so you could use
> option 2 if you wanted. It will even recognize it in the exslt.org
> namespace, which is nice.

Neat.

> >So, my question (you knew there was one): can someone give me a
> >description of how to accomplish #4, given the workflow I've got, using
> >something like Saxon? I see that it's an XSLT processor, but I don't get
> >the map of how all the pieces fit together. Right now, I know (after
> >having looked) that I'm using xalan for the simple reason that it came
> >with my apache-fop install.
>
> Saxon is well-liked by developers (it runs well, it's conformant, and
> it has good error messages), and can be switched in for Xalan in your
> toolchain if you prefer it. Saxon also supports exslt:node-set, so you
> can use option #2 with it as well.

Well, I can see if it offers me more options. I know enough to figure out
how to wrest it into the toolchain.

> As I mentioned, it has an extension attribute, saxon:next-in-chain, that
> can be invoked for pipelining. IIRC it passes SAX events between processor
> invocations (Mike?), so it's much faster than writing a file and reparsing,
> though perhaps not quite as fast as passing unserialized trees, as options
> 2 and 3 would do.

Right now, I'm running a script daily that re-generates XML files from any
changed text files in a given directory tree. The generation of a PDF is
upon-request, with re-generation of XML if it's needed. So part A
(txt->xml) doesn't necessarily happen when part B (xml->pdf) does.
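
A minimal sketch of the "is the XML stale?" check behind that workflow,
assuming each text file sits next to an XML file with the same base name and
that file modification times are a good enough signal. The function names and
the .txt/.xml naming convention are assumptions for illustration, not a
description of the actual script.

```python
from pathlib import Path


def needs_regeneration(txt_path, xml_path):
    """Return True when the XML is missing or older than its text source."""
    if not xml_path.exists():
        return True
    return txt_path.stat().st_mtime > xml_path.stat().st_mtime


def stale_sources(root):
    """List every .txt file under root whose sibling .xml is out of date.

    Assumes the hypothetical convention that foo.txt converts to foo.xml
    in the same directory.
    """
    return sorted(
        txt
        for txt in Path(root).rglob("*.txt")
        if needs_regeneration(txt, txt.with_suffix(".xml"))
    )
```

The same check can serve both the daily run (convert everything that
stale_sources() returns) and the on-request PDF step (call
needs_regeneration() for the one file being rendered).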

Nevertheless, you've given me another idea, which I'll try over this
weekend.

> I am reasonably sure Xalan offers similar features, however, or the Cocoon
> framework does.

Cocoon seems very interesting, but I don't quite get where it fits into
the overall picture of things, though I am reading up on it.

> >I'd also eventually like to get a decent RTF output. Standard manuscript
> >prose is not terribly complex, so something that supported basic features
> >should suffice for that. Unfortunately, the commercial options are too
> >expensive for the intended audience. Is jfor likely to be my best
> >available option?
>
> I'd be interested to hear myself from the list on this question. I haven't
> yet myself seen a really nice route to RTF. I think two passes to this
> (analogous to the way IBM deployed a "TeXML" which could be targeted as a
> route to TeX) might be the best way to do it: have yet another tag set that
> describes only the formatting primitives supported by RTF and a utility
> stylesheet to make RTF out of that. Or use XSL-FO, if any of the formatters
> can make decent RTF yet.

jfor hasn't been updated at all in over a year, so it seems like a dead
project. And jfor.org is down.

I should add that I *do* need API access rather than a standalone
application.

-- 
_Deirdre  web: http://deirdre.net        blog: http://deirdre.org/blog/
yarn: http://fuzzyorange.com    cat's blog: http://fuzzyorange.com/vsd/
"Memes are a hoax! Pass it on!"
