XML-DEV Mailing List Archive

Re: The subsetting has begun


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

/ ari@c... (K. Ari Krupnikov) was heard to say:
| Elliotte Rusty Harold <elharo@m...> writes:
|
|> I suspect part of the problem is that the members of the expert group
|> did not have a clear understanding of the difference between
|> validation and reading the DTD, between the DTD and the document type
|> declaration, and between the internal and external DTD subsets.
|> 
|> These are common areas of confusion for a lot of developers. However,
|> if you're going to write specs, you need to understand such matters
|> better than the average developer.
|
| It would be interesting to hear what people like Norman Walsh think
| about it.

Gulp. I'm not sure precisely what you want my opinion on. For the
record, I work for Sun. (Someone replied privately to a message I sent
a few weeks ago suggesting that "my cover was blown" when in fact I
never intended to have a cover at all. I've just been subscribed to
this list longer than I've worked for Sun :-)

I've spoken to the folks working on JSR 172 and I think they
understand the distinctions to which Elliotte Rusty Harold alludes.
They're building a SOAP processor for devices with a code footprint of
something like 25KB (*kilo*bytes). I think there's room for their
spec to be clearer about the decisions they've made, why they've made
them, and the ways in which the API they're exposing is intended to be
used. And I think they're going to make those changes.

I think it's more valuable to look at the broader issues here.

As it happens, I'm giving a presentation for the TAG on the
xmlProfiles-29 issue on Wednesday at the technical plenary. My rough
draft slides are in a public space[1], so feel free to peek at them.
But I may change them before Wednesday.

As far as I can see, the following statements are true:

1. People will subset XML. They already have.

2. Developers will write code that only processes those subsets.

3. The result will be reduced interoperability if developers think
   that they can use that code for general-purpose XML processing.
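
To make point 3 concrete, here is a minimal sketch, assuming Python's
standard xml.parsers.expat module as a stand-in for a processor that
reads the internal subset. The sample document and the read() helper
are made up for illustration. The attribute default and the entity
replacement text both come from the declarations in the internal
subset, not from the document body:

```python
import xml.parsers.expat

# Hypothetical sample: the internal subset supplies a default attribute
# value and declares a general entity used in the content.
DOC = (
    b'<!DOCTYPE note ['
    b'<!ATTLIST note lang CDATA "en">'
    b'<!ENTITY who "norm">'
    b']>'
    b'<note>from &who;</note>'
)


def read(data):
    """Return (attributes, text) as reported by a DTD-aware parse."""
    attrs = {}
    text = []
    parser = xml.parsers.expat.ParserCreate()
    # Attributes defaulted from the ATTLIST are reported along with
    # the attributes that were actually written in the start-tag.
    parser.StartElementHandler = lambda name, a: attrs.update(a)
    # Internal entity references are expanded into character data.
    parser.CharacterDataHandler = text.append
    parser.Parse(data, True)
    return attrs, "".join(text)


attrs, text = read(DOC)
print(attrs, text)
```

A processor that skips the document type declaration has no way to
learn about the lang default or the replacement text of &who;, so it
hands the application a different result for the same document, or
fails on the entity reference.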

It looks to me like the single biggest hunk-o-stuff that people want
to get rid of in subsets is the DTD processing. I can even imagine a
world in the distant future where schema processors are widespread,
well-understood, and fast enough that documents don't often have
document type declarations. That's a world in which we all might
benefit from smaller parsers.

So when I first started thinking about this issue, I thought that it
might make sense to define a single new subset of XML. Basically, XML
1.1 without DTDs. I even wrote a spec for it:

  1 Introduction

  Extensible Markup Language Kernel, abbreviated XMLK, describes a
  subset of the class of data objects called XML documents defined by
  [XML], as amended by [XML 1.1].

  The design goals for XMLK are:

   1. XMLK documents shall be backwards compatible with XML 1.1.
   2. XMLK documents shall be standalone.

  This specification, together with [XML 1.1], provides all the
  information necessary to understand XMLK Version 1.0 and construct
  computer programs to process it.

  2 Definition

  XMLK 1.0 is identical to XML 1.1 with the following single,
  normative change. Production 22 is replaced with:

  [22] prolog ::= XMLDecl? Misc? [WFC: Document Type Declaration]

  Well-formedness constraint: Document Type Declaration

    A document type declaration must not occur. XMLK documents cannot
    contain an internal or external subset.

  With this change, a number of validity and well-formedness
  constraints are trivially satisfied, but they hold nonetheless.
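
The constraint above can be sketched as a check, assuming Python's
standard xml.parsers.expat module. Note the assumptions: expat
implements XML 1.0 rather than XML 1.1, and the names
DoctypeNotAllowed and is_xmlk_like are made up for this illustration,
so this only approximates the "no document type declaration" rule and
is not a conforming XMLK processor:

```python
import xml.parsers.expat


class DoctypeNotAllowed(ValueError):
    """Raised when a document carries a document type declaration."""


def is_xmlk_like(data: bytes) -> bool:
    """Return True if data is well-formed and has no DOCTYPE."""
    parser = xml.parsers.expat.ParserCreate()

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Called when expat encounters "<!DOCTYPE ...". Raising here
        # aborts the parse and propagates out of Parse().
        raise DoctypeNotAllowed(name)

    parser.StartDoctypeDeclHandler = reject_doctype
    try:
        parser.Parse(data, True)
    except DoctypeNotAllowed:
        return False  # violates the XMLK-style constraint
    except xml.parsers.expat.ExpatError:
        return False  # not well-formed in the first place
    return True


print(is_xmlk_like(b"<a/>"))
print(is_xmlk_like(b"<!DOCTYPE a [<!ELEMENT a EMPTY>]><a/>"))
```

A real kernel parser would presumably not contain the DTD machinery at
all, which is where the footprint saving comes from. This sketch only
shows the acceptance rule, using a full parser to enforce it.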

As time has passed and there's been more pushback against the idea of
a new subset, my conviction has wavered.

Perhaps the right answer is simply to say that a processor for the
subset of XML defined by "foo" should be called a "foo processor" and
not an XML processor.

The argument that "foo" isn't XML probably isn't very interesting from
a purely practical standpoint. But maybe we can get everyone to agree
to call a spade a spade.

                                        Be seeing you,
                                          norm

[1] http://www.w3.org/2003/03/05/tag/xmlProfiles-29/

- -- 
Norman.Walsh@S...    | To the man who is afraid everything
XML Standards Architect | rustles.--Sophocles
Web Tech. and Standards |
Sun Microsystems, Inc.  | 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.7 <http://mailcrypt.sourceforge.net/>

iD8DBQE+X9GzOyltUcwYWjsRArCoAJ4mwiiN2vxh4Cna+7ftqtTSAmcm5gCgrXq2
8wXRXKEtQBIsg2bEUp0tnec=
=VZDO
-----END PGP SIGNATURE-----
