Streaming XML and SAX
I've been following the discussions on streaming (on both XML-Dev and XSL-list, which has been interesting to compare) with lots of interest. Unfortunately, it's been through the haze of just having finished (yesterday) another book, so my observations may be tilted.

I started my thinking about the subject with the XP (Extensible Protocol) proposal at the IETF (it's about three weeks old now):

http://www.ietf.org/internet-drafts/draft-harding-extensible-protocol-00.txt

(I've cc'd the author of the draft since I don't know if he's one of our lurkers, and figure he might be interested in knowing that we've got a live discussion going on here. I'm not saying XP is the cure to all our troubles, but it's a good place to start.)

XP proposes a pretty simple mechanism for sending requests and responses as streams of XML documents. As the draft puts it,

>To extend XML from a class of data objects into a protocol is to
>extend the rules for constructing a single document into rules for
>constructing two interrelated streams of documents. Accordingly, we
>introduce mechanisms for handling both the sequential and
>interrelated aspects of the document streams.

Requests are prefaced with a processing instruction (PI) that uses the form:

> RequestPI ::= '<?xp' S 'Request' Eq Nmtoken '?>'

Responses are prefaced with a PI using the form:

> ResponseToPI ::= '<?xp' S 'ResponseTo' Eq Nmtoken '?>'

A 'terminator PI' is used to mark the end of a document, using the form:

> TerminatorPI ::= '<?xp' S '/?>'

It's a pretty simple mechanism, using Nmtokens to keep two streams of processing and information in sync with each other.

XP doesn't directly address the issues that seem to be bedeviling this list. The issue of associating DTDs with documents, for instance, is left untouched, and the examples use simple well-formed XML. It does, however, suggest a fairly simple approach to stream processing that might be appropriate in a number of situations.
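To make the three PI forms concrete, here is a rough sketch (in Python) of how a receiver might recognize them. It is only an illustration of the productions quoted above, not anything taken from the draft: the function name classify_pi is made up, and the Nmtoken pattern is an ASCII-only simplification of the real XML production.

```python
import re

# Assumption: ASCII-only stand-in for XML's Nmtoken; the real production
# also admits many non-ASCII name characters.
NMTOKEN = r"[A-Za-z0-9._:\-]+"
S = r"[ \t\r\n]+"                 # XML whitespace, one or more
EQ = r"[ \t\r\n]*=[ \t\r\n]*"     # XML's Eq production: S? '=' S?

XP_PI = re.compile(
    r"<\?xp" + S
    + r"(?:(?P<kind>Request|ResponseTo)" + EQ
    + r"(?P<token>" + NMTOKEN + r")\?>"
    + r"|(?P<end>/\?>))"
)

def classify_pi(text):
    """Return (kind, token) for an XP-style PI, or None if it is not one.

    Hypothetical helper: kind is 'Request', 'ResponseTo' or 'Terminator';
    token is the Nmtoken linking a response to its request (None for the
    terminator).
    """
    match = XP_PI.fullmatch(text)
    if match is None:
        return None
    if match.group("end"):
        return ("Terminator", None)
    return (match.group("kind"), match.group("token"))
```

A response can then be paired with its request by comparing tokens: classify_pi("<?xp Request=a1?>") and classify_pi("<?xp ResponseTo=a1?>") carry the same token, which is all the synchronization the mechanism needs.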
Basically, rather than arguing about documents and streams and how they should relate to each other within the context of XML, maybe it's time to step outside the tight XML framework and start thinking of streams as a set of XML documents presented in some kind of sequence with meaningful delimiters. The stream itself may not be a valid or even well-formed XML document - since the end element may appear a long way in the future, or possibly never appear at all - but the stream can be decomposed into a set of valid XML documents.

Some folks on this list have suggested mechanisms like control characters - ^L or ^C - to manage these streams. While that might work, it doesn't provide very much flexibility of expression. For example, it provides no information about the relation of the documents in the stream except their sequence. In many cases, relating documents in the stream to each other - or, like XP, to an entirely separate stream - may be important.

The use of processing instructions (or, if you want to be grouchy, markup that uses a PI-like syntax) seems appropriate. This might also reduce the need for preprocessing, or for parsers that look specifically for control characters, and would allow the reuse of mechanisms we've already got. A SAX parser might be able to carry out stream parsing, sending standard SAX events to multiple threads representing different document components of the stream, for example. The PIs could be sent as part of the prolog - it might mean rearranging the prolog so <?xml?> comes before the PI, but that I think is doable - so the application could get the information. It could give startDocument and endDocument some real work to do that isn't just the province of the first startElement and the last endElement. (Yes, I know startDocument is important for catching stuff that appears before the root element.)

Defining this in a general way doesn't seem like it would be too painful.
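As a rough sketch of what that decomposition could look like with existing tools, the following Python uses the standard xml.sax module. It cuts a buffered stream at each XP-style terminator PI and hands each piece to an ordinary SAX parser. The names DocumentLogger and parse_stream are made up for illustration, and splitting on raw text is a shortcut: it assumes the whole stream is in memory and that the terminator never appears inside a comment or CDATA section.

```python
import re
import xml.sax

# Terminator PI in the XP style: '<?xp' S '/?>'
TERMINATOR = re.compile(r"<\?xp\s+/\?>")

class DocumentLogger(xml.sax.ContentHandler):
    """Hypothetical handler that records the SAX events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append(("startDocument",))

    def processingInstruction(self, target, data):
        # A PI in the prolog (for example the request PI) arrives here.
        self.events.append(("pi", target, data))

    def startElement(self, name, attrs):
        self.events.append(("startElement", name))

    def endDocument(self):
        self.events.append(("endDocument",))

def parse_stream(text, handler):
    """Split text at each terminator PI and SAX-parse every piece.

    Returns the number of documents found in the stream.
    """
    count = 0
    for chunk in TERMINATOR.split(text):
        chunk = chunk.strip()
        if not chunk:
            continue  # nothing between two terminators, or trailing space
        xml.sax.parseString(chunk.encode("utf-8"), handler)
        count += 1
    return count

stream = (
    "<?xp Request=a1?><order><item/></order><?xp /?>"
    "<?xp Request=a2?><ping/><?xp /?>"
)
handler = DocumentLogger()
document_count = parse_stream(stream, handler)
```

Each piece gets its own startDocument and endDocument, and the prefacing PI shows up as a normal processingInstruction event, so the application sees the linking token without any preprocessing beyond the split.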
It might be a general description of a mechanism that XP applies in a particular request/response situation, or it might be something else. In any event, defining XML streams and rules for dealing with them is an important issue, one with very important implications for interchange. If we could hammer this down, we might be able to ensure that all kinds of developers will be able to share XML streams as easily as they share XML documents. If we define streams cleanly, we might even be able to nest streams within streams, (hopefully) avoiding the next round of multiple-container processing battles.

It'd be worth fleshing out, and I could see adding two new events to SAX - beginStream and endStream or something like that.

On the other hand, maybe I've just been working too hard too long and it's time for a nice long vacation. If folks think this is worthwhile, though, I'd be happy to put some work into it.

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com








