
RE: MicroXML

  • From: Amelia A Lewis <amyzing@talsever.com>
  • To: David Lee <dlee@calldei.com>
  • Date: Mon, 13 Dec 2010 13:56:51 -0500

On Mon, 13 Dec 2010 13:26:56 -0500, David Lee wrote:
> Filesystems often use the file extension as a magic number.
> I find this convenient, but it shouldn't be counted on (particularly on systems
> where you can pipe via stdin).
> I'd presume that the app has to take care of using the right processor, just
> as it does today if you have a mix of text, image, HTML, XML and JSON data
> in the same directory.

Not congruent problems.

By JC's design, uXML is XML 1.0.  Namespace handling, in particular, 
could be problematic, if a uXML document is handed to a processor 
expecting XML 1.0 + namespaces.  The same is true in reverse if an XML 
1.0 + namespaces document were handed to a uXML parser (it's possible 
that the parser writer could build in a fallback).  With adoption and 
tool development, the problem would eventually mostly go away, but a 
five-year-old document with extension .xyzzy that starts with <xyzzy 
xmlns="http://great.underground.empire/"> is immediately attributable 
to XML 1.0 or XML 1.0 + namespaces or uXML, but it *ought* to be 
possible to distinguish more quickly--it's immediately distinguishable 
from <html> (or <!DOCTYPE html) as initial characters, or the magic 
numbers for PNG, JPEG, GIF, etc.  Distinguishing from text is harder, 
but a text file that starts with an SGML/XML/uXML-like 
tag-containing-general-identifier is "reasonably" misidentified.  I've 
no clue what JSON looks like, or if it's detectable.
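
A minimal sketch of the sniffing described above, in Python. The PNG, JPEG and GIF signatures are the published ones; the function name `sniff`, the labels it returns, and the regular expressions are illustrative assumptions, not part of any uXML proposal. It ignores byte-order marks and encodings that are not ASCII-compatible.

```python
import re

# Well-known binary signatures (magic numbers).
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
]

# Text-based formats are recognized by their initial characters.
HTML_START = re.compile(rb"\s*<(?:!doctype\s+html|html)\b", re.IGNORECASE)
XML_DECL = re.compile(rb"<\?xml\s")       # must be the very first bytes
TAG_START = re.compile(rb"\s*<[A-Za-z_:]")  # something that looks like a start tag


def sniff(head: bytes) -> str:
    """Guess a format label from the first bytes of a file or stream."""
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    if HTML_START.match(head):
        return "html"
    if XML_DECL.match(head):
        return "xml-1.0"
    if TAG_START.match(head):
        # XML 1.0, XML 1.0 + namespaces, or uXML: the prefix cannot say which.
        return "xml-family"
    return "unknown"


print(sniff(b'<xyzzy xmlns="http://great.underground.empire/">'))
```

The last call is the case at issue: the initial bytes place the document in the XML family, but nothing in them separates the three candidates.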

If you don't care that the heavier-weight parser is (always?) going to 
be used, fine ... but I can't see much impetus for adoption; this 
becomes again a best-practices proposal.  Optimizations are possible if 
you know it's uXML.  If you see <?xml version="1.0"?> you know it's XML 
1.0, not uXML.  A lot of XML documents lack the declaration, though, 
and absence of the declaration means that such XML documents are UTF-8 
(like uXML), but these documents may have namespace fun (a problem for 
a uXML parser), may contain PIs, etc.  You could always try lightweight 
and fall back to heavy, but this may be problematic.  It would be 
better, in my opinion, to have something recognizable.
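
A rough sketch of "try lightweight and fall back to heavy", in Python. No uXML parser is assumed to exist: `toy_light_parse`, `NotMicroXML` and `parse_with_fallback` are hypothetical names, and the "lightweight" parser is only a stand-in that rejects anything containing `<?` (an XML declaration or a PI) and otherwise delegates to the standard library. The byte search is deliberately crude; it would also trip on `<?` inside a comment or CDATA section.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Tuple


class NotMicroXML(Exception):
    """Raised by the lightweight parser when input falls outside its subset."""


def toy_light_parse(data: bytes) -> ET.Element:
    # Hypothetical stand-in for a uXML parser: refuse declarations and PIs.
    if b"<?" in data:
        raise NotMicroXML("XML declaration or processing instruction found")
    return ET.fromstring(data)


def parse_with_fallback(
    data: bytes,
    light: Callable[[bytes], ET.Element] = toy_light_parse,
    heavy: Callable[[bytes], ET.Element] = ET.fromstring,
) -> Tuple[str, ET.Element]:
    """Try the lightweight parser first; re-parse with the full one on rejection."""
    try:
        return "light", light(data)
    except NotMicroXML:
        # Any work done by the lightweight pass is thrown away here.
        return "heavy", heavy(data)


kind, root = parse_with_fallback(b'<?xml version="1.0"?><a/>')
print(kind, root.tag)
```

The cost shows up when the offending construct sits late in a large document: the lightweight pass is mostly spent before the fallback starts. A marker recognizable in the first bytes would let the caller choose once, up front.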

Amelia A. Lewis                    amyzing {at} talsever.com
    Songs and fame are vain endeavor--
    only two things fail us never,
    only two things last forever--
    sorrow and love, sorrow and love ....
                -- The Last Song of Sirit Byar
