
Re: Feeler for SML (Simple Markup Language)

  • From: Ken MacLeod <ken@b...>
  • To: <xml-dev@i...>
  • Date: 11 Nov 1999 11:56:34 -0600

"Don Park" <donpark@d...> writes:

> I have been thinking that there are applications out there that can
> benefit from using XML yet do not need all of its features.  The
> Canonical XML spec goes quite a distance in cutting away some of the
> features, for different purpose, but I still feel that more can be
> cut away.  Rick Jelliffe's rather funny message along with some
> WebDAV papers I read over the weekend got my mind buzzing enough for
> me to share this idea with the rest of XML-DEV.

There have been many threads along the lines of "XML could have/should
have been much simpler if they had just done XYZ", often referring to
Scheme or TeX syntax, for example.

I too have had one of those brainfarts, because I like to hand-edit XML
and it's way too bulky in most cases.  I've been playing with an "XML
shorthand" that follows the Scheme camp's proposals (I originally
started out with a TeX style ;-).  One of the key requirements,
though, is the need to be interoperable with XML 1.0.  Here's what I
came up with.  The basic syntax only sees what XML people would call
"elements" and "character data"; all other XML structures are built
from XML-SH "elements".  XML-SH "element names" may therefore contain
characters that are special in XML (`?', `!', `=', `&' and so on).
First an example with minimal XML markup:

  {p I've been playing with an {quote XML shorthand} that follows the
  {language Scheme} camp's proposals {note{=type parenthetical} I
  originally started out with a {language TeX} style {wink}}.  One of
  the key requirements, though, is the need to be interoperable with
  {standard{=version 1.0} XML}.}

With more XML markup:

  {?xml{=version 1.0}}
  {!DOCTYPE{=PUBLIC -//blah/}{=SYSTEM foo.dtd}
   {!ELEMENT foo (#PCDATA|subfoo)*}}
  {foo this is a foo, {subfoo this is subfoo.}
   This is {subfoo{=bar with an attribute}}.}

And a very rough draft of the basic syntax:

  Document ::= (Element | S)*

  Char ::= #x9 | #xA | #xD | [#x20-#xD7FF]
          | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
        /* as defined in XML 1.0 */

  S ::= (#x20 | #x9 | #xD | #xA)+
        /* as defined in XML 1.0 */

  Element ::= StartDelim (Char - S)*
              (S | StartDelim | EndDelim) Char* EndDelim
        /* This is meant to state that an element starts with a
           StartDelim, followed by zero or more non-whitespace
           characters (the name), may contain mixed content
           (introduced by S or another StartDelim), and ends with
           EndDelim.  An empty element is written {TAG}; content that
           must follow the tag with no intervening whitespace is
           written {TAG{}content}. */

StartDelim and EndDelim are charset-dependent.  For ASCII I'd use `{'
and `}' as in the examples above.  In Unicode I'd really like to see
two characters dedicated to this purpose so markup can never conflict
with character data.
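
To make the draft grammar a bit more concrete, here is a minimal
sketch of a reader for it in Python.  It is only an illustration, not a
reference implementation: the names Element, parse_element and
parse_document are made up for this example, it hard-codes the ASCII
delimiters, and it assumes that exactly one whitespace character after
the name acts as the separator (the draft does not pin that down).

```python
from dataclasses import dataclass, field
from typing import List, Union

START, END = "{", "}"    # ASCII delimiters used in the examples
WHITESPACE = " \t\r\n"   # S in the draft grammar


@dataclass
class Element:
    """Hypothetical tree node: a name plus mixed content."""
    name: str
    children: List[Union["Element", str]] = field(default_factory=list)


def parse_element(text: str, pos: int):
    """Parse one element starting at text[pos]; return (Element, next_pos)."""
    assert text[pos] == START
    pos += 1
    start = pos
    # The name is zero or more characters up to whitespace or a delimiter.
    while pos < len(text) and text[pos] not in WHITESPACE + START + END:
        pos += 1
    elem = Element(text[start:pos])
    # Assumption: a single whitespace character after the name is a separator.
    if pos < len(text) and text[pos] in WHITESPACE:
        pos += 1
    buf = []
    while pos < len(text):
        ch = text[pos]
        if ch == END:
            if buf:
                elem.children.append("".join(buf))
            return elem, pos + 1
        if ch == START:
            if buf:
                elem.children.append("".join(buf))
                buf = []
            child, pos = parse_element(text, pos)
            elem.children.append(child)
        else:
            buf.append(ch)
            pos += 1
    raise ValueError("unterminated element: " + elem.name)


def parse_document(text: str):
    """Document ::= (Element | S)*"""
    pos, elems = 0, []
    while pos < len(text):
        if text[pos] in WHITESPACE:
            pos += 1
        elif text[pos] == START:
            elem, pos = parse_element(text, pos)
            elems.append(elem)
        else:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
    return elems
```

Note that the reader knows nothing about XML: {=version 1.0} simply
comes back as an element named "=version", and {} as an element with an
empty name.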

Interoperability with XML 1.0 is implemented using a transform and
well-formedness constraints in XML-SH.  "Well formed" in XML-SH would
mean "well formed when converted to XML 1.0".
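
As a rough illustration of that definition, the check can be delegated
to any XML 1.0 parser once the transform has run.  The sketch below
uses Python's standard xml.etree.ElementTree; the function name
is_well_formed is made up for this example, and the strings passed to
it stand in for the output of a transform that is not shown here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML 1.0."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hand-written stand-ins for converted XML-SH documents.
print(is_well_formed("<foo>this is a foo, <subfoo>this is subfoo.</subfoo></foo>"))
print(is_well_formed("<foo><subfoo></foo>"))
```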

The transform is fairly guessable from the above.  The XML-SH parser
reads everything as elements and character data; the transformer
recognizes elements that are actually XML structures and converts them
appropriately.  An XML-SH compliant parser should not need to generate
XML events directly; it can depend on a transform filter to do that
instead.  A minimal filter should be provided that prevents element
names that are not well-formed in XML from being passed on in
startElement()/endElement() events.  This may (should) also involve
gathering attribute-elements into startElement() attributes.  More
examples of transformable XML-SH:

 {?TARGET DATA}                          -- processing instruction
 {?TARGET {=NAME VALUE}}                 -- a PI using attr form
 {!ELEMENT NAME CONTENTSPEC}             -- element declaration
 {!-- COMMENT}                           -- comment
 {&ENTITYREF}                            -- entity reference
 {&#123}                                 -- character reference
 {&#xabcd}                               -- hex character reference
 {NAME ...}                              -- an element with content
 {NAME{=ATTR VALUE} ...}                 -- an element with attributes
 {NAME{}...}                             -- element w/o extra whitespace
 {NAME followed by many lines of text
   {/NAME}}                              -- a no-op used as a comment
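
The mapping in the list above can be sketched as a small recursive
function.  This is only one possible reading of it, in Python: the
tree is assumed to be nested (name, children) pairs with plain strings
for character data, the function names to_xml and text_of are made up
for this example, declarations such as {!ELEMENT ...} and {!DOCTYPE ...}
are left out, and element names are not checked for XML
well-formedness.

```python
from xml.sax.saxutils import escape, quoteattr


def text_of(children):
    """Concatenate the character data in children, ignoring elements."""
    return "".join(c for c in children if isinstance(c, str))


def is_attr(node):
    """True for an attribute-element such as ("=version", ["1.0"])."""
    return not isinstance(node, str) and node[0].startswith("=")


def to_xml(node):
    """Convert one XML-SH node (a str or a (name, children) pair) to XML text."""
    if isinstance(node, str):
        return escape(node)                 # character data
    name, children = node
    if name == "" or name.startswith("/"):
        return ""                           # {} and {/NAME} are no-ops
    if name.startswith("&"):
        return name + ";"                   # entity or character reference
    if name == "!--":
        return "<!--" + text_of(children) + "-->"
    if name.startswith("!"):
        raise NotImplementedError("declarations are not covered by this sketch")
    # Gather attribute-elements into attributes of the enclosing construct.
    attrs = "".join(
        " %s=%s" % (c[0][1:], quoteattr(text_of(c[1])))
        for c in children if is_attr(c)
    )
    rest = [c for c in children if not is_attr(c)]
    if name.startswith("?"):
        data = text_of(rest)
        return "<" + name + attrs + (" " + data if data else "") + "?>"
    body = "".join(to_xml(c) for c in rest)
    if body:
        return "<%s%s>%s</%s>" % (name, attrs, body, name)
    return "<%s%s/>" % (name, attrs)


print(to_xml(("?xml", [("=version", ["1.0"])])))
print(to_xml(("standard", [("=version", ["1.0"]), "XML"])))
```

Keeping this step separate from the reader is what lets the reader stay
ignorant of XML; a stricter or looser transformer can be swapped in
without touching it.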

Some messages in this thread refer to "contracts".  Several of these
contracts could be based on limiting which features the XML transformer
will allow.  For example, there is no support in the XML-SH parser for
external entities; that's left to the transformer.

  Ken MacLeod
