Re: Suggestions for a slightly less verbose (and easier to author) XML

  • To: xml-dev@l...
  • Subject: Re: Suggestions for a slightly less verbose (and easier to author) XML
  • From: Sean McGrath <sean.mcgrath@p...>
  • Date: Mon, 24 Jun 2002 11:42:00 +0100
  • In-reply-to: <1024877339.13461.ezmlm@l...>

[Paul Prescod]:
 >>If the instances are generated under your control by a machine, then by
 >>definition they won't use the short-tag feature if your regexps don't
 >>support it. The complexity argument also does not wash: entities and
 >>CDATA sections easily add the most complexity to XML of any feature.

[Tim Bray]
 >Machine-generated XML usually doesn't do entities or CDATA.  It does do

 ><someTag>
 > ..stuff..
 > ..stuff..
 ></someTag>

 >and perl is just the ticket.

The problem of course is that there is no way to tell whether or not
the 1 Gig XML instance you are about to process contains any entities,
CDATA sections etc.

So you need to make assumptions about the processing environment in your
code. Such assumptions make me nervous and make Walter Perry very
nervous indeed (they are tantamount to XML vocabulary semantics assumptions).

I see three possibilities to make this work reliably:

1) An XML-Lint type utility that would flag the presence of such things
so that assumption-laden Perl is protected from making erroneous
processing decisions. Such lint-like utilities would make excellent
components in XPipe or Schemamachine or Ant or Cocoon or DSDL
pipelines.

2) A canonical XML representation guaranteed to have resolved away
all the funnies e.g. canonical XML or PYX.

3) A manifest mechanism in XML to allow a human/machine to declare
what features the XML instance uses e.g. XFM. This would be of the
hint variety - subject to formal confirmation by an XML-Lint type
utility - but very useful in stopping "grep" and Perl etc. in their tracks
if the manifest asserts something that contradicts the processing
assumptions.

4) A PSVI that  .... (only joking!!!!!!)
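
To make option 1 concrete, here is a minimal sketch of such a lint check
in Python, using the standard library's expat bindings. The function name
and the set of flagged features are my own assumptions for illustration;
a real utility would likely flag more (external entities, notations and
so on).

```python
import xml.parsers.expat


def lint_xml_features(data: bytes) -> dict:
    """Report which assumption-breaking features an instance uses.

    Hypothetical helper: flags a DOCTYPE declaration, entity
    declarations and CDATA sections, nothing else.
    """
    found = {"doctype": False, "entity_decl": False, "cdata": False}
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found["doctype"] = True

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        found["entity_decl"] = True

    def on_cdata_start():
        found["cdata"] = True

    parser.StartDoctypeDeclHandler = on_doctype
    parser.EntityDeclHandler = on_entity_decl
    parser.StartCdataSectionHandler = on_cdata_start

    # Parse the whole instance in one call; raises ExpatError if the
    # input is not well-formed.
    parser.Parse(data, True)
    return found


if __name__ == "__main__":
    sample = b'<!DOCTYPE a [<!ENTITY e "x">]><a><![CDATA[<b>]]>&e;</a>'
    print(lint_xml_features(sample))
```

A pipeline stage could run this first and refuse to hand the instance to
regexp-based code if any flag comes back true. For a 1 Gig instance the
same handlers would be fed in chunks rather than from one bytes object.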

Personally (surprise, surprise) I think the lint utility in a *pipeline*
is the way to go. That way, people can re-invent all of SGML's tag
minimization features in a layered way without heaping them
all into a monolithic morass with trickle down complexity
to all XML tools. This trickle down effect is what made
SGML such an exasperatingly powerful pain in the ass.
Let's not re-invent it.
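
As an illustration of the layering, a separate stage can resolve the
funnies into a line-oriented form (option 2 above) before grep or Perl
ever sees the data. The sketch below emits PYX-like lines from Python's
expat bindings. The function name is hypothetical and the escaping rules
are my assumption of what such a stage needs, not a complete PYX
implementation (processing instructions are ignored, for instance).

```python
import xml.parsers.expat


def to_pyx_lines(data: bytes) -> list:
    """Convert an XML instance to PYX-like lines (hypothetical helper).

    "(" starts an element, "A" is an attribute, "-" is character data,
    ")" ends an element.
    """
    lines = []
    parser = xml.parsers.expat.ParserCreate()
    # Ask expat to coalesce adjacent character data where it can.
    parser.buffer_text = True

    def escape(text):
        # Assumed escaping: keep every event on one physical line.
        return text.replace("\\", "\\\\").replace("\n", "\\n")

    def on_start(name, attrs):
        lines.append("(" + name)
        for key, value in attrs.items():
            lines.append("A" + key + " " + escape(value))

    def on_end(name):
        lines.append(")" + name)

    def on_chars(text):
        lines.append("-" + escape(text))

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_chars
    parser.Parse(data, True)
    return lines


if __name__ == "__main__":
    for line in to_pyx_lines(b'<a x="1"><![CDATA[<b>]]> &amp; c</a>'):
        print(line)
```

The CDATA section and the entity reference both come out as ordinary "-"
lines, so the downstream line-oriented tool never has to know they
existed.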

Sean