Re: XML API spec
In message <4109@u...>, Peter Murray-Rust <Peter@u...> writes
>In message <5lx6vCAzGtFzEwII@l...> Richard Light writes:
>[...]
>> Operation
>> ---------
>> Presumably the XML processor is a 'slave' to the application, and
>> only does what it's told to.
>
>I think that's right. OTOH it may be that it's possible to build a parser
>that only does one thing and that the application decides what use to make
>of the output. sgmls is rather like this - you either get the ESIS stream
>or nothing (except error messages :-).

I think that the "ESIS stream or nothing" case can (and should) still be seen in API terms. Essentially such a parser can have a very simple API with three commands:

    "open this XML document/fragment"
    "deliver me the whole tree structure in ESIS format"
    "close this XML document/fragment"

Looking at it this way, I'm confident that the existing implementations can be developed to have an 'API', and we'll be on our way.

The advantage of this approach is that it is easy to extend the command set to match the capabilities of the parser. For example, if the parser becomes capable of deciding whether or not to include marked sections or comments in its "ESIS" stream, then the "deliver me the whole tree structure in ESIS format" command can be refined to have arguments that determine which features the user wants delivered. (And in fact, this is exactly what a 'grove plan' is, as I understand it: "Give me elements, attributes, external entities, data content." It's a pretty obvious concept: shame about the air of mystery around it!)

The other big issue to resolve at this stage is in what format the parser ("XML processor") should deliver information to the application. And that leads us (me, anyway!) to consider the "division of labour" issue. ESIS gives us a rather strange precedent, which perhaps we shouldn't take too much as gospel, even if we are all very used to it.
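Going back to the three-command API above, here is a minimal sketch of how it might look, with the "deliver" command taking an argument that selects which features come back. Everything named here (XMLProcessor, open, deliver, close, the feature strings) is a hypothetical illustration rather than an agreed interface, and the ESIS-like output is simplified: whitespace-only data is dropped and every attribute is reported as CDATA.

```python
import xml.etree.ElementTree as ET

# Hypothetical names throughout: nothing here is an agreed XML-DEV API.
ALL_FEATURES = frozenset({"elements", "attributes", "data"})


class XMLProcessor:
    """Toy 'slave' processor: it only does what the application asks."""

    def __init__(self):
        self._root = None

    def open(self, xml_text):
        """'Open this XML document/fragment' (a read-only parse)."""
        self._root = ET.fromstring(xml_text)

    def deliver(self, features=ALL_FEATURES):
        """'Deliver me the whole tree structure' as ESIS-like text events."""
        if self._root is None:
            raise RuntimeError("no document is open")
        events = []
        self._walk(self._root, frozenset(features), events)
        return events

    def close(self):
        """'Close this XML document/fragment'."""
        self._root = None

    def _walk(self, elem, features, events):
        # Attribute events come before the start-tag event, as in sgmls output.
        if "attributes" in features:
            for name, value in elem.attrib.items():
                events.append(f"A{name} CDATA {value}")
        if "elements" in features:
            events.append(f"({elem.tag}")
        if "data" in features and elem.text and elem.text.strip():
            events.append(f"-{elem.text.strip()}")
        for child in elem:
            self._walk(child, features, events)
            # Data that follows a child element belongs to the parent.
            if "data" in features and child.tail and child.tail.strip():
                events.append(f"-{child.tail.strip()}")
        if "elements" in features:
            events.append(f"){elem.tag}")


proc = XMLProcessor()
proc.open('<doc id="d1"><p>Hello</p></doc>')
events = proc.deliver(features={"elements", "data"})
proc.close()
# events == ["(doc", "(p", "-Hello", ")p", ")doc"]
```

Leaving "attributes" out of the feature set is the crude equivalent of a grove plan that does not ask for them; a real processor would presumably offer a richer vocabulary (comments, marked sections, external entities).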
In the most general terms, the parser ("XML processor") has to deliver information about the XML document to the application. In ESIS a sequence of textually-represented tokens indicates parsing events from which an application can deduce the tree structure that is the XML document: element start, element end, data content, new line in source file, e.g.:

    (SOURCEDESC
    AID IMPLIED
    AN IMPLIED
    ALANG IMPLIED
    AREND IMPLIED
    ATEIFORM CDATA p
    (P
    -Generated from ASCII file by an OmniMark script
    )P
    )SOURCEDESC
    L8

This approach means that the application has to stay on its toes if it wants to get the structure right. And, fundamentally, it means that the _application_ has the job of building the tree, whether it wants/needs to or not.

In the SGML world, this is perhaps a reasonable division of labour, since the parser has already done a lot of work for the application by inferring omitted end-tags, shortrefs, etc, etc. However, the whole point of XML is to _remove_ all of this complexity. I would therefore argue that in the XML world it is reasonable to ask the parser ("XML processor") to do a bit more: to "build the tree" and then let the application cherry-pick the bits it requires.

Having resolved that (which we haven't - comments please!), we still have the delivery issue. I think a valuable aspect of the ESIS approach is that the output is textual in nature. In principle, we could have a sequence of (binary) "objects" with "properties", splurging out of the parser, but to do so would in my view limit the usefulness of the output to a specific application environment. Bad thing!

So, what does our "textual" output look like? As I said above, ESIS is a rather strange precedent. It uses a set of conventions all of its own:

- a newline for certain events (but not for all);
- first character of the line is an ESIS-specific code for the event type ("(" = start-tag; "-" = data content, etc.);
- character entities represented (uselessly) by their mapping;
- etc.
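To make the "application has the job of building the tree" point concrete, here is a rough sketch of what an application has to do with an ESIS-style stream like the sample above. The names Node and build_tree are hypothetical, and the sketch assumes that attribute lines precede the start-tag they belong to (as in sgmls output), ignores any event code it does not recognise (such as "L"), and does not decode escape sequences in data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # Node objects or data strings


def build_tree(esis_lines):
    """Rebuild the element tree from ESIS-style event lines (sketch only)."""
    pending_attrs = {}  # attributes seen since the last start-tag
    stack = []          # currently open elements, innermost last
    root = None
    for line in esis_lines:
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            name, _, tail = rest.partition(" ")
            decl_type, _, value = tail.partition(" ")
            if decl_type != "IMPLIED":  # IMPLIED means no value was supplied
                pending_attrs[name] = value
        elif code == "(":
            node = Node(rest, pending_attrs)
            pending_attrs = {}
            if stack:
                stack[-1].children.append(node)
            else:
                root = node
            stack.append(node)
        elif code == "-":
            if not stack:
                raise ValueError("data content outside any element")
            stack[-1].children.append(rest)
        elif code == ")":
            if not stack or stack[-1].name != rest:
                raise ValueError(f"unexpected end-tag event: {rest}")
            stack.pop()
        # Any other event code (for example "L") is skipped in this sketch.
    if stack:
        raise ValueError(f"unclosed element: {stack[-1].name}")
    return root


sample = [
    "(SOURCEDESC",
    "AID IMPLIED",
    "AN IMPLIED",
    "ALANG IMPLIED",
    "AREND IMPLIED",
    "ATEIFORM CDATA p",
    "(P",
    "-Generated from ASCII file by an OmniMark script",
    ")P",
    ")SOURCEDESC",
    "L8",
]
tree = build_tree(sample)
```

Even for this tiny sample the application needs a stack, a holding area for attributes, and its own error checks, which is the bookkeeping a tree-building parser could do once on everyone's behalf.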
A much simpler approach, which I _think_ is what would happen in a DSSSL-style transformation, is for the parser simply to output tidied-up XML. In which case, you might ask, what the heck is the parser doing for us? To which I would reply "about the same as what ESIS is doing for you!" The value of the parser will be apparent once it is able to filter out and deliver:

- exactly those properties of the XML that the application is interested in;
- any required subtree from the full document

>> View of the XML document
>> ------------------------
>> What does the application 'see' of the XML document it has asked the
>> XML processor to open? The spec implies that it should have pretty
>> direct access, e.g.:
>>
>> "An XML processor must inform the application of the length of
>> comments if they are not passed through, to enable the application
>> to keep track of the correct location of objects in the XML
>> document."
>>
>> This fills me with gloom. Shouldn't there be a level of abstraction
>                ^^^^^^^^^^
>It would fill me with gloom _if I had to write the parser_ :-). If someone
>else has done this, and didn't mind doing it, and if the result made it
>trivial to discard comments (or other information) then it's not a problem.

Sorry, Peter, I didn't make my point clearly. The "gloom" related to the phrase: "... to enable the application to keep track of the correct location of objects in the XML document". In my view of things, the application should _never_ have, or need, direct access to the actual XML document. It should get _everything_ it needs through the API.

In the context of an editing application, where one might think the application needed to "poke" new or changed data directly into the XML document, I would argue that the parser should be performing a read-only operation on the source XML.
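A rough sketch of that read-only arrangement, using Python's standard xml.etree.ElementTree as a stand-in for the "XML processor". The function names load_for_editing and save are hypothetical, and the throwaway file exists only to show that the source on disc is untouched until the application chooses to write.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def load_for_editing(path):
    """Parse the source once, read-only, and return the in-memory copy."""
    return ET.parse(path)  # the file itself is not modified


def save(tree, path):
    """Serialise the in-memory copy over the source; no parsing involved."""
    tree.write(path, encoding="utf-8", xml_declaration=True)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "doc.xml"
    source.write_text("<doc><p>old</p></doc>", encoding="utf-8")

    tree = load_for_editing(source)
    tree.getroot().find("p").text = "new"  # edit the copy, not the file

    # The edit lives only in memory until the application saves.
    unchanged_on_disc = "old" in source.read_text(encoding="utf-8")

    save(tree, source)
    saved_text = source.read_text(encoding="utf-8")
```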
If an editor is letting the user make changes, it is on an _in-memory copy_ of the source document (which clearly, as several of us have noted, needs to be a complete copy). When the user of the XML editing application decides to save their edited result, they will be overwriting the source XML document on disc with their in-memory copy, just as you do with any word processor. There is no need for the parser ("XML processor") to be involved in this stage of the process at all.

Richard Light.

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)