Re: Preserving the structure of the XML file


oleg dulin
At 12:01 PM 10/10/2003 -0400, Oleg Dulin wrote:
>Does anyone know if there are less-lossy XML parsers and serializers that 
>can capture and reproduce the structure of the input XML file, including 
>tabulation, whitespace, etc.? We would love to know about 
>experiences with parse/serialization approaches that have a greater 
>infoset than that provided by SAX and DOM, especially related to ignorable 
>whitespace and attribute ordering/whitespace.  We are editing XML and 
>want to preserve the file as much as possible.

I've written what I call a half-parser, available (in Java) as part of my 
Gorille project. It reports every character in the document and stays away 
(for now) from entity expansion, attribute defaulting, and other infoset 
excitement. It also has a context object which makes it easier to handle 
issues like entity values and namespaces.
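
As a rough illustration of the idea (not Gorille's actual API; the class 
HalfParseSketch, its Token type, and the tokenize/serialize methods are all 
hypothetical names made up for this sketch), a lossless tokenizer might keep 
the exact source characters of every piece, so that writing the pieces back 
out reproduces the file, including whitespace inside tags, attribute order, 
and unexpanded entity references. It assumes the whole document fits in a 
String and does not treat DOCTYPE declarations specially:

```java
import java.util.ArrayList;
import java.util.List;

public class HalfParseSketch {
    /** One lexical piece of the document; raw holds the exact source characters. */
    public static final class Token {
        public final String kind;
        public final String raw;
        Token(String kind, String raw) { this.kind = kind; this.raw = raw; }
    }

    /** Splits xml into tokens whose raw text, concatenated in order, equals the input. */
    public static List<Token> tokenize(String xml) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < xml.length()) {
            String kind;
            int end;
            if (xml.startsWith("<!--", i)) {
                kind = "comment";
                end = closeOf(xml, "-->", i + 4);
            } else if (xml.startsWith("<![CDATA[", i)) {
                kind = "cdata";
                end = closeOf(xml, "]]>", i + 9);
            } else if (xml.startsWith("<?", i)) {
                kind = "pi";
                end = closeOf(xml, "?>", i + 2);
            } else if (xml.charAt(i) == '<') {
                kind = "tag";
                end = endOfTag(xml, i + 1);
            } else {
                // Character data is kept as-is: no entity expansion, no whitespace trimming.
                kind = "text";
                int lt = xml.indexOf('<', i);
                end = lt < 0 ? xml.length() : lt;
            }
            out.add(new Token(kind, xml.substring(i, end)));
            i = end;
        }
        return out;
    }

    /** Index just past the closer, or end of input if the construct is unterminated. */
    private static int closeOf(String xml, String closer, int from) {
        int at = xml.indexOf(closer, from);
        return at < 0 ? xml.length() : at + closer.length();
    }

    /** Finds the '>' that ends a tag, skipping any '>' inside quoted attribute values. */
    private static int endOfTag(String xml, int from) {
        char quote = 0;
        for (int j = from; j < xml.length(); j++) {
            char c = xml.charAt(j);
            if (quote != 0) {
                if (c == quote) quote = 0;
            } else if (c == '"' || c == '\'') {
                quote = c;
            } else if (c == '>') {
                return j + 1;
            }
        }
        return xml.length();
    }

    /** Writes the tokens back out; unedited tokens reproduce the source exactly. */
    public static String serialize(List<Token> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Token t : tokens) sb.append(t.raw);
        return sb.toString();
    }
}
```

An editor built this way would replace only the tokens it actually changes 
and leave every other token's raw text untouched.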

Gorille is at:
http://simonstl.com/projects/gorille/

Details on the API of Ripper, Gorille's half-parser, which should give you 
a good idea of what's included, are at:
http://simonstl.com/projects/gorille/docs/com/simonstl/gorille/DocProcI.html
http://simonstl.com/projects/gorille/docs/com/simonstl/gorille/ContextI.html
http://simonstl.com/projects/gorille/docs/com/simonstl/gorille/Ripper.html

A paper explaining this more thoroughly is at:
http://www.mulberrytech.com/Extreme/Proceedings/html/2003/StLaurent01/EML2003StLaurent01.html

A presentation on it in English is at:
http://simonstl.com/articles/halfparse/

A presentation on it in Playmobil (requires SMIL, in RealPlayer One) is at:
http://simonstl.com/articles/halfparse-smil/

I'm planning a lot more work surrounding this parser, but have a painfully 
serious shortage of time at the moment.  There should be a lot more in 2004.


