Re: Another way to present XML data

  • From: Ihe Onwuka <ihe.onwuka@gmail.com>
  • To: jf.larvoire@free.fr
  • Date: Tue, 12 Sep 2017 13:58:04 -0400

What value (or readability) does this offer over and above a scheme that simply transforms elements designated as atomic into attributes?

    <format name="XML"
            advantage="Can define formal syntaxes, that can be verified; Widely adopted, many XML-based standards"
            drawback="Hard to read by humans"/>
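For comparison, the attribute-based scheme can be sketched in a few lines of Python with the standard `xml.etree.ElementTree` module. This is only an illustration, not anything from the thread: the helper name `fold_atomic_children` and the `"; "` separator are assumptions. Because XML does not allow an attribute name to appear twice on one element, repeated children such as the two `advantage` elements have to be merged into a single value.

```python
import xml.etree.ElementTree as ET

def fold_atomic_children(elem, sep="; "):
    """Move text-only, attribute-free children of elem into attributes of elem.

    Hypothetical helper: repeated children are joined with `sep`, because an
    element cannot carry two attributes with the same name.
    """
    merged = {}
    for child in list(elem):  # iterate over a copy, since children are removed
        is_atomic = len(child) == 0 and not child.attrib
        if is_atomic and child.tag not in elem.attrib:
            merged.setdefault(child.tag, []).append((child.text or "").strip())
            elem.remove(child)
        else:
            fold_atomic_children(child, sep)  # recurse into structured children
    for name, values in merged.items():
        elem.set(name, sep.join(values))
    return elem

doc = ET.fromstring(
    '<format name="XML">'
    "<advantage>Can define formal syntaxes, that can be verified</advantage>"
    "<advantage>Widely adopted, many XML-based standards</advantage>"
    "<drawback>Hard to read by humans</drawback>"
    "</format>"
)
print(ET.tostring(fold_atomic_children(doc), encoding="unicode"))
```

Note that this direction is not reversible without extra conventions: once the two `advantage` values are joined, nothing records that they were separate elements.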

On Tue, Sep 12, 2017 at 10:31 AM, <jf.larvoire@free.fr> wrote:
Following last month's thread about the relative merits of XML and JSON, I think there's a way to get the advantages of both:

Instead of creating yet another data format incompatible with XML, it's actually possible to _reversibly_ transform XML into something that is as human-friendly as JSON.
This way, we can present the same data as "normal" XML to be consumed by programs, or as "simplified" XML to be reviewed and edited by humans, and convert it back and forth between the two presentations when going from programs to humans and back.
I developed several years ago a Tcl script that did exactly that.

The script, and my proposed "Simplified XML" format, aka. SML, were presented at the 2013 Tcl conference:
This script is open-sourced and available there:
To use it in Windows, you'll need to install a Tcl interpreter. See instructions on this page if needed:

Note that I've been told that another data format called sml was proposed in 1999.
Mine has no relationship whatsoever with the other.
If this homonymy is a problem, I'm open to any alternative name!

Here's an example of what the transformation does:

`cat sample.xml`

<?xml version="1.0"?>
<formats>
    <format name="XML">
        <author>W3C</author>
        <standardized>2008</standardized>
        <advantage>Can define formal syntaxes, that can be verified</advantage>
        <advantage>Widely adopted, many XML-based standards</advantage>
        <drawback>Hard to read by humans</drawback>
    </format>
    <format name="JSON">
        <author>Douglas Crockford</author>
        <standardized>2013</standardized>
        <advantage>Easy to read by humans</advantage>
        <drawback>Incompatible with XML</drawback>
    </format>
    <format name="SML">
        <author>Jean-Francois larvoire</author>
        <advantage>Same advantages as XML. It is XML presented differently.</advantage>
        <advantage>Easy to read by humans</advantage>
        <drawback>No I/O libraries available yet</drawback>
    </format>
</formats>

Then `cat sample.xml | sml` displays:

?xml version="1.0"
formats {
    format name="XML" {
        author W3C
        standardized 2008
        advantage "Can define formal syntaxes, that can be verified"
        advantage "Widely adopted, many XML-based standards"
        drawback "Hard to read by humans"
    }
    format name="JSON" {
        author "Douglas Crockford"
        standardized 2013
        advantage "Easy to read by humans"
        drawback "Incompatible with XML"
    }
    format name="SML" {
        author "Jean-Francois larvoire"
        advantage "Same advantages as XML. It is XML presented differently."
        advantage "Easy to read by humans"
        drawback "No I/O libraries available yet"
    }
}

And finally `cat sample.xml | sml | sml` outputs an identical copy of the initial XML file.
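To give a feel for the forward direction, here is a minimal Python sketch that renders a small XML tree in a brace-based layout like the one shown above. It is not the sml.tcl algorithm: the function name `to_simplified` and the quoting rule (quote a text value only when it contains whitespace) are assumptions inferred from the sample output, and it ignores comments, mixed content, processing instructions and embedded quotes.

```python
import xml.etree.ElementTree as ET

def to_simplified(elem, depth=0):
    """Return the lines of a brace-based rendering of elem (illustrative only)."""
    pad = "    " * depth
    # Element name followed by its attributes, kept in XML attribute syntax.
    head = " ".join([elem.tag] + [f'{k}="{v}"' for k, v in elem.attrib.items()])
    if len(elem):  # element with child elements: open a brace block
        lines = [f"{pad}{head} {{"]
        for child in elem:
            lines.extend(to_simplified(child, depth + 1))
        lines.append(f"{pad}}}")
        return lines
    text = (elem.text or "").strip()
    if not text:
        return [f"{pad}{head}"]
    # Assumed rule: quote only values containing whitespace.
    # Embedded double quotes are not handled in this sketch.
    if any(c.isspace() for c in text):
        text = f'"{text}"'
    return [f"{pad}{head} {text}"]

sample = """<formats>
  <format name="JSON">
    <author>Douglas Crockford</author>
    <standardized>2013</standardized>
  </format>
</formats>"""

print("\n".join(to_simplified(ET.fromstring(sample))))
```

A real converter also has to preserve whitespace and every other XML construct exactly to be reversible, which this sketch does not attempt.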

Try it out with any of your XML files, and you'll be surprised at how much easier it is to edit them!
XSLT files transformed this way even become pleasant for C programmers!

Of course, there's nothing that prevents programs from actually producing or consuming SML directly. There are no libraries available for doing it yet, but in simple cases such as the example above, the parsing is relatively trivial.
The nice thing is that this transition can be done progressively, as the new SML-aware programs will remain compatible with your old programs that only know about standard XML.
Simply pipe the data through the sml.tcl script to make them understand each other!
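As an illustration of how simple the parsing can be for data shaped like the example, the following Python sketch reads a line-oriented subset of the format and builds an ElementTree. Everything here is an assumption made for the sketch, not the SML grammar: one element per line, attributes written as `name="value"`, an opening `{` at the end of the line, a closing `}` on its own line, and an optional text value that may be wrapped in double quotes. The names `parse_simplified`, `LINE` and `ATTR` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

LINE = re.compile(r'''
    (?P<tag>[\w:.-]+)                      # element name
    (?P<attrs>(?:\s+[\w:.-]+="[^"]*")*)    # zero or more name="value" attributes
    \s*(?P<rest>.*)                        # "{", a text value, or nothing
''', re.VERBOSE)
ATTR = re.compile(r'([\w:.-]+)="([^"]*)"')

def parse_simplified(text):
    """Parse the assumed one-element-per-line subset into an Element tree."""
    root, stack = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("?"):  # skip blanks and the ?xml line
            continue
        if line == "}":                       # end of the current block
            stack.pop()
            continue
        m = LINE.fullmatch(line)
        if m is None:
            raise ValueError(f"unsupported line: {raw!r}")
        elem = ET.Element(m["tag"], dict(ATTR.findall(m["attrs"])))
        if stack:
            stack[-1].append(elem)
        else:
            root = elem
        rest = m["rest"]
        if rest == "{":                       # element with children
            stack.append(elem)
        elif rest:                            # element with a text value
            quoted = len(rest) >= 2 and rest[0] == '"' and rest[-1] == '"'
            elem.text = rest[1:-1] if quoted else rest
    return root

sample = '''?xml version="1.0"
formats {
    format name="SML" {
        author "Jean-Francois larvoire"
        advantage "Easy to read by humans"
    }
}'''

tree = parse_simplified(sample)
print(ET.tostring(tree, encoding="unicode"))
```

Anything outside that subset (comments, mixed content, escaped quotes, values spanning several lines) would need a real tokenizer, so for general files piping through the sml.tcl script remains the safer route.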

Limitations: The sml.tcl program is well tested (I've used it for years), but it has known limitations:
- It will not convert UTF-16 or EBCDIC or MBCS files correctly.
- I've done very little testing with Unicode characters > \u00FF, so I'm not sure it will work fine with these in UTF-8 files.
- I've tested the reversibility with the whole libxml2 test suite. Still, my parsing definitely does not cover all corner cases of the XML specification. There surely are bugs still hiding there, but hopefully only for the least used features of XML.

Any feedback welcome!

Please report any bugs on the GitHub issues list:
