Re: I processed a 3GB XML file ... using XSLT streaming
And because Peter mentioned this, here are two facts one should be aware of:

1. When streaming, one doesn't know until the end whether the stream is
"well-formed" or not. If a well-formedness error turns up after many hours
of processing, it is likely that some side effect has already been produced
(such as sending or posting) that cannot be undone, or that it is too late
to undo. Are we missing the "(long) transaction" concept here?

2. People usually forget that an XML document is two-dimensional. There are
XML documents that aren't so big in size but can still choke any streaming
processor. One only needs to construct an XML document with big enough
depth. As a streaming processor needs to keep track of all ancestors of the
current node, it will crash at a certain depth.

So, "size is not all" :), or do we need to redefine what "huge" means with
respect to an XML document?

Cheers,
Dimitre

On Fri, Sep 13, 2013 at 12:47 PM, Peter Hunsberger <peter.hunsberger@gmail.com> wrote:
> It might be worth noting that streaming XML has been around for some many
> years now (in spite of the W3C's belief in what makes a well formed
> document). I think the first custom XML parser I ever wrote was for the
> Sports Ticker data feed probably not long after they first started up 15
> years ago... The thing that has changed, and I think maybe the point of
> Roger's original post (whether he intended it or not), is the ability to use
> XSLT to handle the streaming data without having to resort to custom
> software.
>
> Peter Hunsberger
>
>
> On Fri, Sep 13, 2013 at 2:28 PM, Dimitre Novatchev <dnovatchev@gmail.com>
> wrote:
>>
>> > The point of streaming is (of course) being able to do such things
>> > without using much memory, even if it's slower. Not everyone has 96G, or
>> > even 16G of memory available... :-)
>>
>>
>> Absolutely true.
>>
>> Also, there could be cases when the XML data is generated in real time
>> continuously and non-stop, and must be processed again in real time.
>> In such scenarios even terabytes of RAM wouldn't help.
>>
>>
>> Cheers,
>> Dimitre
>>
>> On Fri, Sep 13, 2013 at 12:14 PM, Liam R E Quin <liam@w3.org> wrote:
>> > On Fri, 2013-09-13 at 20:53 +0200, Hermann Stamm-Wilbrandt wrote:
>> >> I did give your non-streaming stylesheet (B) a try (A).
>> >> Slight modifications were necessary to get back to XSLT 1.0
>> >> (eq -> = , doc -> document).
>> >> 16GB of memory were used (17851470K-1067694K), (D).
>> >
>> > The point of streaming is (of course) being able to do such things
>> > without using much memory, even if it's slower. Not everyone has 96G, or
>> > even 16G of memory available... :-)
>> >
>> > Liam
>> >
>> > --
>> > Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
>> > Pictures from old books: http://fromoldbooks.org/
>> > Ankh: irc.sorcery.net irc.gnome.org freenode/#xml
>> >
>> >
>> > _______________________________________________________________________
>> >
>> > XML-DEV is a publicly archived, unmoderated list hosted by OASIS
>> > to support XML implementation and development. To minimize
>> > spam in the archives, you must subscribe before posting.
>> >
>> > [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
>> > Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
>> > subscribe: xml-dev-subscribe@lists.xml.org
>> > List archive: http://lists.xml.org/archives/xml-dev/
>> > List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>> >

--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
Typing monkeys will write all Shakespeare's works in 200yrs. Will they
write all patents, too? :)
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.
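
The two facts at the top of this message can be illustrated with a minimal sketch. It is not taken from the thread: it uses Python's standard xml.sax module rather than an XSLT streaming processor, and the class name DepthAndBufferHandler, the function process, and the "item" element with its "id" attribute are all hypothetical. The handler keeps a stack of open ancestors (fact 2) and holds back its would-be side effects until the parser has reached the end of the input without a well-formedness error (fact 1).

```python
import xml.sax


class DepthAndBufferHandler(xml.sax.ContentHandler):
    """Tracks open ancestors and buffers side effects until the parse ends."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # ancestors of the current node (fact 2)
        self.max_depth = 0       # deepest nesting seen so far
        self.pending = []        # side effects held back, not yet performed

    def startElement(self, name, attrs):
        self.open_elements.append(name)
        self.max_depth = max(self.max_depth, len(self.open_elements))
        # Hypothetical rule: every <item id="..."> would trigger a side effect.
        if name == "item":
            self.pending.append(attrs.get("id"))

    def endElement(self, name):
        self.open_elements.pop()


def process(xml_bytes):
    """Return (committed_effects, max_depth).

    Nothing is committed if the input turns out not to be well-formed,
    which is only known once the parser has seen the whole input (fact 1).
    """
    handler = DepthAndBufferHandler()
    try:
        xml.sax.parseString(xml_bytes, handler)
    except xml.sax.SAXParseException:
        return [], handler.max_depth  # discard the buffered effects
    return handler.pending, handler.max_depth


if __name__ == "__main__":
    good = b"<root><item id='1'/><item id='2'/></root>"
    bad = b"<root><item id='1'/><item id='2'/><oops></root>"
    deep = b"<a>" * 1000 + b"</a>" * 1000  # 7000 bytes, 1000 levels deep

    print(process(good))
    print(process(bad))
    print(process(deep))
```

Note that the buffer of pending effects grows with the input, so this kind of "commit at the end" trades away some of the memory savings that streaming is meant to provide. Likewise, the ancestor stack grows with nesting depth and not with document size, which is why a document of only a few kilobytes can demand as much bookkeeping as its depth allows.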