
Re: Schemas and Other Crucial XML Questions

  • From: Tyler Baker <tyler@i...>
  • To: "XML Developers' List" <xml-dev@i...>
  • Date: Mon, 10 Aug 1998 15:16:04 -0400

Tyler Baker wrote:

> David Megginson wrote:
>
> > Sam Gentile writes:
> >
> >  > > Also, we have been hearing rumors of a "short" XML notation. Is
> >  > > there one?  We have a need to reduce the size of our buffers.
> >
> > No, there is no such thing.  XML's parent, SGML, included extensive
> > facilities for markup minimisation and has suffered badly for it,
> > since SGML tools are far too difficult to write (there is still not a
> > single Java-based SGML parser, beside probably more than a dozen
> > Java-based XML parsers).
> >
> > There are, however, alternatives: for example, you could compile the
> > XML to a compact binary format for internal storage then decompile it
> > back to a verbose format for export -- there's no requirement to store
> > it internally as text.
>
> Some very simple compression algorithms, Huffman encoding for instance, do very
> well with XML documents, because each string matching the Name production (used
> for identifying tags, among other things) is converted to a binary symbol that
> serves as an index for looking up the actual name.  In fact, you could do all of
> this with entities by simply taking all of the Names specified in the DTD,
> putting them into a List, and then declaring an entity for each one.
>
> You could index all of this by using base 10 digits or else use something as high
> as base 64 to encode the array references.
>
> <!ENTITY % 0 "Foo">
> <!ENTITY % 1 "Bar">
>
> Then for a document which had element types with names "Foo" and "Bar", occurrences
> of:
>
> <Foo></Foo>
> <Bar></Bar>
>
> would be converted to:
>
> <0></0>
> <1></1>

Please forgive any confusion I may have caused, but this is not valid XML, as digits
are not allowed as the first character in a name (only Letters and the characters
'_' and ':' are).  In this case, instead of using numeric digits, use characters that
are Letters and simply map each letter to a corresponding digit value.
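
A minimal sketch of that letter mapping, written in Python purely for illustration.
It assumes the lowercase ASCII letters serve as the 26 "digits" (so 0 becomes "a",
1 becomes "b", and 26 becomes "ba"). The function names are hypothetical and do not
come from any XML library.

```python
import string

# Assumption: lowercase ASCII letters act as base-26 "digits".
# Every one of them is a legal first character of an XML name.
ALPHABET = string.ascii_lowercase
BASE = len(ALPHABET)


def index_to_name(index: int) -> str:
    """Encode a non-negative list index as a letters-only name."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = []
    while True:
        index, remainder = divmod(index, BASE)
        letters.append(ALPHABET[remainder])
        if index == 0:
            break
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(letters))


def name_to_index(name: str) -> int:
    """Decode a letters-only name back to the list index it stands for."""
    value = 0
    for ch in name:
        value = value * BASE + ALPHABET.index(ch)
    return value


def build_name_table(names):
    """Map each original Name to its short replacement, in list order."""
    return {name: index_to_name(i) for i, name in enumerate(names)}


if __name__ == "__main__":
    table = build_name_table(["Foo", "Bar"])
    for original, short in table.items():
        print(f"<{original}></{original}>  ->  <{short}></{short}>")
```

With the two names from the example, <Foo></Foo> would become <a></a> and
<Bar></Bar> would become <b></b>. One caveat: names beginning with "xml" (in any
case) are reserved by the XML specification, so a real encoder would want to skip
any index whose encoding starts that way.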

Tyler


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)

