Streaming and mapping plain text

Subject: Streaming and mapping plain text
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxx>
Date: Tue, 17 Sep 2013 15:25:09 -0400
Hi,

Like Roger, I have some questions about streaming in XSLT 3.0.

Consider the problem of the classic mapping of CSV into XML. Assume we
have files over 1GB in size, so we wish to stream.

Assume also that the lines in the CSV input need to be grouped --
the outputs will be XML documents containing data sets from adjacent
lines, based on common values in a designated field in those lines.
But which field this is has to be parameterized, because not every
CSV input will have this cell in the same place in the row.

We can easily map each line to a sequence of cell elements:

<line><cell>1</cell><cell>2</cell><cell>3</cell></line>

Since we know the mapping we wish to use, we can also mark the cell we
wish to use to group:

<line><keycell>1</keycell><cell>2</cell><cell>3</cell></line>

(Maybe next time the second cell, not the first, will be keycell.)

Then we can use xsl:for-each-group over the lines, with
group-adjacent="keycell", to collect our sequences of lines.
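
To make the grouping step concrete, here is a minimal sketch of it in isolation. It assumes the mapped lines sit inside a wrapper element, here called lines, and that each line has exactly one keycell; the wrapper and the groups/group output names are hypothetical, chosen only for illustration. It says nothing about streamability, only about the grouping itself.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <!-- Hypothetical wrapper: assumes the mapped <line> elements
       are children of a <lines> element. -->
  <xsl:template match="lines">
    <groups>
      <!-- Adjacent lines sharing the same keycell value form one group.
           string() assumes exactly one keycell per line. -->
      <xsl:for-each-group select="line" group-adjacent="string(keycell)">
        <group key="{current-grouping-key()}">
          <xsl:copy-of select="current-group()"/>
        </group>
      </xsl:for-each-group>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

Because the grouping key is the element named keycell rather than a position, the same template works whichever cell was marked in the first pass.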

My question is: how can this be streamed most effectively?

If I can stream to a stream, maybe the best way is first to stream the
lines, with the mapping I need to generate XML, and then stream the
lines into the sequences of grouped lines.

If, however, I can only stream the plain text input through, and
cannot stream the lines I generate in my first pass (with cells
marked) into the second pass (to group the lines), then I need to
collect the lines first, using group-adjacent not on the value of
'keycell' (which isn't known yet) but on (say)
tokenize(., $delimiter)[$pos], where $pos is the position of the
'keycell' cell among the cells for this mapping.
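
A minimal sketch of this single-pass alternative, grouping the raw text lines and marking the cells as they are written out. The parameter csv-uri, its default value, and the groups/group output names are assumptions made for illustration; $delimiter and $pos are as described above. Note that tokenize() treats the delimiter as a regular expression, so a delimiter such as | would need escaping, and that quoted CSV fields containing the delimiter are not handled. Whether a given processor reads the lines lazily rather than holding them all in memory is implementation-dependent, so this shows the logic only, not guaranteed streaming.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical parameters; the defaults are placeholders. -->
  <xsl:param name="csv-uri" as="xs:string" select="'input.csv'"/>
  <xsl:param name="delimiter" as="xs:string" select="','"/>
  <xsl:param name="pos" as="xs:integer" select="1"/>

  <xsl:output indent="yes"/>

  <xsl:template name="xsl:initial-template">
    <groups>
      <!-- Group adjacent text lines on the value of the $pos-th field.
           string() yields '' when a line has no such field. -->
      <xsl:for-each-group select="unparsed-text-lines($csv-uri)"
          group-adjacent="string(tokenize(., $delimiter)[$pos])">
        <group key="{current-grouping-key()}">
          <xsl:for-each select="current-group()">
            <line>
              <!-- Map each field to a cell, marking the key field. -->
              <xsl:for-each select="tokenize(., $delimiter)">
                <xsl:element name="{if (position() eq $pos)
                                    then 'keycell' else 'cell'}">
                  <xsl:value-of select="."/>
                </xsl:element>
              </xsl:for-each>
            </line>
          </xsl:for-each>
        </group>
      </xsl:for-each-group>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

The cost of this approach is that each line is tokenized twice, once for the grouping key and once for the output; the benefit is that no intermediate XML has to be passed from one phase to the next.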

Any advice or ideas would be welcome.

Thanks!
Wendell

Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^