Re: Simple approaches to XML implementation
[from PeterMR]
>
> Thanks Ingo,
> This is very useful, because it shows that a great deal can be done quite
> simply.
>
> In message <199703010216.DAA00533@f...> Ingo Macherius writes:
> [...]
> > I have made up a perl5 module which models a very simple forest-like structure,
> > that holds Perl5 objects. The objects are created by reading nsgmls' ESIS
>
> I believe that ESIS has potentially a useful role in producing XML documents
> from SGML documents - this was certainly my own strategy until recently.
> ESIS is the normalised output from a parser (especially sgmls or NSGMLS from
> James Clark - these are freely available.) It's trivial to transform
> ESIS into XML, but not the other way round, since XML is richer.
>
> ESIS doesn't retain everything from the original document(s) and I've been
> asking the experts what gets lost. My rough summary is that XML->ESIS
> loses:
> - comments (this matters if you want to edit the document or have
>   it read by humans. However comments should not be used
>   by machines - simply passed through)
> - entities. If your document includes entities such as &chapter1;
>   these may be expanded and replaced by their contents. In
>   this way some of the structure may be less clear
> - conditional markup. If you use INCLUDE and/or IGNORE then the
>   IGNORE'd sections won't come through and the INCLUDE'd
>   ones won't be marked as such
> [I think that processing instructions come through OK? And that you can
> determine whether an attribute value was defaulted or not?]
>
> If you use this simple level of markup (and _I_ do for molecular science)
> then XML WF documents are equivalent to ESIS output from sgmls or nsgmls.
> [Query: Are there plans for nsgmls/sgmls to output XML as an alternative
> to ESIS? I expect it's straightforward].
>
> >
> > and putting anything between certain named tags into a hash, which
> > basically is the object content. The objects can be inserted as a root or into
> > another object, which yields a forest-like structure.
> > The tree-relations between objects are stored outside in a libdbm database,
> > one per tree. It holds three tables,
> > - id -> hashed data
> > - id -> id of father object, or NULL
> > - id -> ids of all sons
> > Obviously any object must have a method giving a unique id within the forest.
> > I think this may be called a poor-mans-grove :) I made up a simple API:
>                                ^^^^^^^^^^^^^^^
> It's still very powerful, and you have recognised the importance of
> structured documents. The good news is that this will all be addressed
> (literally and metaphorically) in the discussion of addressing within
> XML documents. The TEI project has developed a pointer scheme which
> covers most aspects of structure and extends the metaphor to descendants,
> ancestors, siblings and navigation by attributes and their values. I
> am expecting one or more 'black boxes' to be developed which support this,
> so that you don't have to write perl scripts any more. I'm waiting to hear
> from another thread :-)
>
> [... code deleted ...]
> >
> > I found this sufficient to solve small problems for which ESIS is not enough
>                                                                      ^^^^^^^^^^
> I think you were operating _on_ the ESIS stream. You mean that simple
> 'grep' or other tools weren't powerful enough?
>
> > and a grove is overkill. I must admit, albeit I read most of ISO 10179, I
>          ^^^^^^^^^^^^^^^^^
> This is one of the points at issue. Is it going to be possible to produce
> software quickly, and easy enough to read and use. I'm waiting to find out:-)
>
> > really didn't get the details. But what I found valuable is the choice
>                      ^^^^^^^^^^^
> I think it's very important not to be frightened by 10179. What you have
> done is very similar to what I and many others have done - devising
> home-grown tools for searching structured documents. 10179 has an
> implementation in Scheme (am I right?) but not in more procedural or
> object-oriented languages.
>
> > between navigating (father/son) and id-based lookups (fetch).
>
> [...]
> P.
>
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)
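
The three-table layout quoted in the message (id -> hashed data, id -> id of father or NULL, id -> ids of all sons) can be illustrated with a small sketch. This is not the Perl 5 module under discussion, whose code was deleted from the thread. It is a guess at the shape of such a structure, written in Python with in-memory dicts standing in for the libdbm tables. The class name `Forest` and the method names `insert`, `fetch`, `father_of`, `sons_of` and `roots` are all hypothetical.

```python
from typing import Any, Dict, List, Optional


class Forest:
    """Sketch of a 'poor-mans-grove': three tables keyed by object id.

    All names here are hypothetical; the original module used libdbm files,
    whereas this sketch keeps the tables in plain dicts.
    """

    def __init__(self) -> None:
        self.data: Dict[str, Dict[str, Any]] = {}    # id -> hashed data
        self.father: Dict[str, Optional[str]] = {}   # id -> father id, None for a root
        self.sons: Dict[str, List[str]] = {}         # id -> ids of all sons, in order

    def insert(self, obj_id: str, content: Dict[str, Any],
               father_id: Optional[str] = None) -> None:
        """Insert an object as a root (no father) or under an existing object."""
        if obj_id in self.data:
            raise KeyError(f"duplicate id: {obj_id}")
        if father_id is not None and father_id not in self.data:
            raise KeyError(f"unknown father: {father_id}")
        self.data[obj_id] = dict(content)
        self.father[obj_id] = father_id
        self.sons[obj_id] = []
        if father_id is not None:
            self.sons[father_id].append(obj_id)

    def fetch(self, obj_id: str) -> Dict[str, Any]:
        """Id-based lookup of an object's content."""
        return self.data[obj_id]

    def father_of(self, obj_id: str) -> Optional[str]:
        """Navigate upwards; returns None for a root."""
        return self.father[obj_id]

    def sons_of(self, obj_id: str) -> List[str]:
        """Navigate downwards; returns a copy of the list of son ids."""
        return list(self.sons[obj_id])

    def roots(self) -> List[str]:
        """Ids of all trees' root objects in the forest."""
        return [i for i, f in self.father.items() if f is None]


# Example use with made-up molecular data.
forest = Forest()
forest.insert("mol1", {"name": "water"})
forest.insert("atom1", {"element": "O"}, father_id="mol1")
forest.insert("atom2", {"element": "H"}, father_id="mol1")
```

Keeping the father and sons tables separate from the content table is what gives the choice mentioned in the message between navigating (father/son) and id-based lookups (fetch): either kind of access is a single table lookup.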