
Re: Simple approaches to XML implementation

  • From: Peter@u... (Peter Murray-Rust)
  • To: xml-dev@i...
  • Date: Sat, 01 Mar 1997 11:13:37 GMT

[from PeterMR]
> 
> Thanks Ingo,
> This is very useful, because it shows that a great deal can be done quite 
> simply.
> 
> In message <199703010216.DAA00533@f...> Ingo Macherius writes:
> [...]
> > I have made up a perl5 module which models a very simple forest-like structure,
> > that holds Perl5 objects. The objects are created by reading nsgmls' ESIS
> 
> I believe that ESIS has potentially a useful role in producing XML documents
> from SGML documents - this was certainly my own strategy until recently.
> ESIS is the normalised output from a parser (especially sgmls or NSGMLS from
> James Clark - these are freely available.)  It's trivial to transform
> ESIS into XML, but not the other way round, since XML is richer.
> 
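[Editor's sketch, not part of the original message.] The ESIS-to-XML direction described above can be illustrated roughly as follows. This is a minimal sketch in Python, assuming only the most common ESIS record types from sgmls/nsgmls output: `(` and `)` for element start and end, `-` for data, `A` for an attribute that precedes its start tag, and `?` for a processing instruction. The function names and the sample input are hypothetical, and entity, notation and SDATA records are simply skipped.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def unescape_data(text):
    """Undo the two most common ESIS data escapes: \\n and \\\\ (assumed subset)."""
    return re.sub(r"\\(n|\\)", lambda m: "\n" if m.group(1) == "n" else "\\", text)


def esis_to_xml(lines):
    """Convert an iterable of ESIS records into an XML string (subset only)."""
    out = []
    pending_attrs = []  # attribute records arrive before the start tag they belong to
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            # Record shape assumed here: "Aname TYPE value" or "Aname IMPLIED"
            name, kind, *value = rest.split(" ", 2)
            if kind != "IMPLIED":
                text = unescape_data(value[0]) if value else ""
                pending_attrs.append((name, text))
        elif code == "(":
            attrs = "".join(f" {n}={quoteattr(v)}" for n, v in pending_attrs)
            out.append(f"<{rest}{attrs}>")
            pending_attrs = []
        elif code == ")":
            out.append(f"</{rest}>")
        elif code == "-":
            out.append(escape(unescape_data(rest)))
        elif code == "?":
            out.append(f"<?{rest}?>")
        # Any other record type (for example the final "C") is ignored in this sketch.
    return "".join(out)


# Hypothetical ESIS input, written by hand for illustration
sample = [
    "ATITLE CDATA Water & ice",
    "(MOL",
    "(ATOM",
    "-H<2>\\nO",
    ")ATOM",
    ")MOL",
    "C",
]
xml_text = esis_to_xml(sample)
```

Element and attribute names are passed through unchanged, so whatever case the SGML parser reports is the case that ends up in the XML. Comments, unexpanded entities and marked sections cannot be reconstructed this way, because they are not present in the ESIS stream to begin with.
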
> ESIS doesn't retain everything from the original document(s) and I've been
> asking the experts what gets lost.  My rough summary is that XML->ESIS
> loses:
> 	- comments (this matters if you want to edit the document or have
> 		it read by humans.  However comments should not be used
> 		by machines - simply passed through)
> 	- entities.  If your document includes entities such as &chapter1;
> 		these may be expanded and replaced by their contents.  In
> 		this way some of the structure may be less clear
> 	- conditional markup.  If you use INCLUDE and/or IGNORE then the
> 		IGNORE'd sections won't come through and the INCLUDE'd 
> 		ones won't be marked as such
> [I think that processing instructions come through OK?  And that you can
> determine whether an attribute value was defaulted or not?]
> 
> If you use this simple level of markup (and _I_ do for molecular science)
> then XML WF documents are equivalent to ESIS output from sgmls or nsgmls.
> [Query: Are there plans for nsgmls/sgmls to output XML as an alternative
> to ESIS?  I expect it's straightforward].
> 
> 
> > and putting anything between certain named tags into a hash, which
> > basically is the object content. The objects can be inserted as a root or into
> > another object, which yields a forest-like structure.
> > The tree-relations between objects are stored outside in a libdbm database,
> > one per tree. It holds three tables,	
> > - id -> hashed data
> > - id -> id of father object, or NULL
> > - id -> ids of all sons
> > Obviously any object must have a method giving a unique id within the forest.
> > I think this may be called a poor-mans-grove :) I made up a simple API:
>                                ^^^^^^^^^^^^^^^
> It's still very powerful, and you have recognised the importance of
> structured documents.  The good news is that this will all be addressed
> (literally and metaphorically) in the discussion of addressing within
> XML documents.  The TEI project has developed a pointer scheme which
> covers most aspects of structure and extends the metaphor to descendants,
> ancestors, siblings and navigation by attributes and their values.  I
> am expecting one or more 'black boxes' to be developed which support this,
> so that you don't have to write perl scripts any more.  I'm waiting to hear
> from another thread :-)
> 
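[Editor's sketch, not part of the original message.] The three-table layout Ingo describes (id to data, id to father, id to sons) might look roughly like this. It is a minimal Python sketch rather than his Perl module: plain dictionaries stand in for the libdbm tables, and the class and method names (`Forest`, `insert`, `fetch`, `father`, `sons`, `roots`) are hypothetical, since the original API was deleted from the quoted message.

```python
class Forest:
    """A forest of objects kept in three id-keyed tables (in-memory stand-ins)."""

    def __init__(self):
        self._data = {}    # id -> hashed data (the object content)
        self._father = {}  # id -> id of father object, or None for a root
        self._sons = {}    # id -> ids of all sons, in insertion order

    def insert(self, node_id, data, father_id=None):
        """Add an object as a new root, or as a son of an existing object."""
        if node_id in self._data:
            raise KeyError(f"id {node_id!r} is already in the forest")
        if father_id is not None and father_id not in self._data:
            raise KeyError(f"unknown father id {father_id!r}")
        self._data[node_id] = dict(data)
        self._father[node_id] = father_id
        self._sons[node_id] = []
        if father_id is not None:
            self._sons[father_id].append(node_id)

    def fetch(self, node_id):
        """Id-based lookup: return the content stored for one object."""
        return self._data[node_id]

    def father(self, node_id):
        """Navigation: return the father's id, or None for a root."""
        return self._father[node_id]

    def sons(self, node_id):
        """Navigation: return a copy of the list of son ids."""
        return list(self._sons[node_id])

    def roots(self):
        """Return the ids of all objects that have no father."""
        return [i for i, f in self._father.items() if f is None]


# Hypothetical usage with made-up ids and content
forest = Forest()
forest.insert("mol1", {"title": "water"})
forest.insert("atom1", {"element": "O"}, father_id="mol1")
forest.insert("atom2", {"element": "H"}, father_id="mol1")
forest.insert("mol2", {"title": "methane"})
```

Keeping the relations in separate tables means that navigation (`father`, `sons`) and id-based lookup (`fetch`) never touch each other's storage, which is the distinction the original poster singles out as valuable.
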
> [... code deleted ...]
> > 
> > I found this sufficient to solve small problems for which ESIS is not enough
>                                                       ^^^^^^^^^^
> I think you were operating _on_ the ESIS stream.  You mean that simple
> 'grep' or other tools weren't powerful enough?
> 
> > and a grove is overkill. I must admit, albeit I read most of ISO 10179, I
>         ^^^^^^^^^^^^^^^^^
> This is one of the points at issue.  Is it going to be possible to produce
> software quickly that is easy enough to read and use?  I'm waiting to find out :-)
> 
> > really didn't get the details. But what I found valuable is the choice 
>                     ^^^^^^^^^^^
> I think it's very important not to be frightened by 10179.  What you have 
> done is very similar to what I and many others have done - devising
> home-grown tools for searching structured documents.  10179 has an 
> implementation in Scheme (am I right?) but not in more procedural or 
> object-oriented languages.
> 
> > between navigating (father/son) and id-based lookups (fetch).
> 
> [...]
> 		P.
> 

-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)