
RE: Well-formed Blueberry

  • From: Julian Reschke <julian.reschke@g...>
  • To: Elliotte Rusty Harold <elharo@m...>, xml-dev@l...
  • Date: Fri, 13 Jul 2001 16:59:24 +0200

Although I like the idea of not producing "blueberry" when "1.0" would have
done: how would you *produce* these documents?

For instance, how would an XSLT processor know whether certain name characters
will appear later in the output or not?

> -----Original Message-----
> From: Elliotte Rusty Harold [mailto:elharo@m...]
> Sent: Friday, July 13, 2001 4:49 PM
> To: xml-dev@l...
> Cc: www-xml-blueberry-comments@w...
> Subject: Re: Well-formed Blueberry
>
>
> At 3:25 PM +0100 7/13/01, Rob Lugt wrote:
>
> >I can see a good reason for doing what you suggest, and I sympathise with
> >your comments but the fact is that your proposal would turn a trivial
> >implementation change into something much more difficult.  It could also
> >have a performance impact, so is unlikely to be popular with Parser
> >developers.
> >
>
> Not necessarily. Most correct parsers and other APIs already have
> to check whether each character is legal in an XML name.
> Blueberry doesn't really change that. It changes the list, but it
> doesn't change the fact that parsers need to maintain and consult
> against very large tables of characters and code points.
>
> I can see a number of ways to efficiently implement my proposal
> without a great deal of effort. One is to maintain two tables,
> one for XML 1.0 legal characters and one for the extra characters
> in Blueberry. This is probably necessary anyway to allow parsers
> to handle both kinds of documents.
>
> A typical parser would first check if a character was legal
> according to XML 1.0. Only if that failed, would it then check to
> see if the character was a legal Blueberry character. This is
> quite natural. At least one API (JDOM) and probably others
> already carefully choose which characters are checked in which
> order to improve efficiency for the common characters vs. the
> uncommon characters.
>
> In fact, to ease the handling of both kinds of documents I'd
> expect there to be two separate method calls, one like
> isXMLNameCharacter() and one like isXMLBlueberryNameCharacter().
> The second method would only be called if the first returned
> false. (This is hardly the only way to do it, but it is one possibility.)
>
> Before parsing, the parser could set a boolean variable such as
> usesBlueberryCharacters to false. The
> isXMLBlueberryNameCharacter() could set this variable to true the
> first and every time it saw a Blueberry character. When the
> parsing was done, the parser would signal a well-formedness error
> if the variable was still false.
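
The scheme described above could be sketched in Java roughly as follows. This is an illustration only, not code from JDOM or any real parser. The two method names and the usesBlueberryCharacters flag come from the message; the class name, checkName(), endOfDocumentOk() and its declaredBlueberry parameter are hypothetical. Both character tables are simplified stand-ins: ASCII name characters for XML 1.0, and the Ethiopic block (U+1200 to U+137F) as a placeholder for whatever extra characters Blueberry would allow. The sketch also ignores the distinction between name-start and other name characters.

```java
final class BlueberryNameCheck {
    // Set to true the first and every time a Blueberry-only character is seen.
    private boolean usesBlueberryCharacters = false;

    // Simplified stand-in for the XML 1.0 name-character table (ASCII only).
    static boolean isXMLNameCharacter(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9')
            || c == '_' || c == ':' || c == '-' || c == '.';
    }

    // Placeholder table: the Ethiopic block stands in for the extra
    // Blueberry characters. Records that such a character was seen.
    boolean isXMLBlueberryNameCharacter(int c) {
        boolean legal = c >= 0x1200 && c <= 0x137F;
        if (legal) {
            usesBlueberryCharacters = true;
        }
        return legal;
    }

    // Checks the common XML 1.0 table first; the Blueberry table is
    // consulted only when the first check fails.
    boolean checkName(String name) {
        for (int i = 0; i < name.length(); ) {
            int c = name.codePointAt(i);
            if (!isXMLNameCharacter(c) && !isXMLBlueberryNameCharacter(c)) {
                return false;
            }
            i += Character.charCount(c);
        }
        return true;
    }

    boolean usesBlueberryCharacters() {
        return usesBlueberryCharacters;
    }

    // Assumed end-of-document rule: a document declared as Blueberry must
    // actually use a Blueberry character, and a document not declared as
    // Blueberry must not use one.
    boolean endOfDocumentOk(boolean declaredBlueberry) {
        return declaredBlueberry == usesBlueberryCharacters;
    }
}
```

A caller would run checkName() on every name while parsing and call endOfDocumentOk() once at the end, reporting a well-formedness error when it returns false.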
>
> Anyway, that's a very rough sketch, but you get the idea. The
> storage of the one extra boolean, and the setting of it each time
> a Blueberry character is seen is trivial compared to the table
> lookup overhead that parsers do at this stage anyway.
>
> If I were revising JDOM to handle Blueberry (I pick JDOM just
> because it's the only API whose internals I'm familiar with)
> setting up the tables for the new Blueberry characters would take
> as long or longer than implementing the scheme I just described.
> (JDOM isn't a parser but it does perform parser-like name checks.)
>
> >Wouldn't a better solution be one of education and market forces?  Just like
> >most people write backwards-compatible HTML today, most people will continue
> >to write backwards-compatible XML tomorrow for the simple reason that they
> >want it to be interoperable.
>
>
> As somebody who spends most of my time educating people about XML
> and related technologies, I don't want to leave to education
> anything we can enforce in the code. I will most certainly warn
> people through my books and seminars not to mark their documents
> as Blueberry when they don't need to. But I still know I'll
> encounter masses of developers who've half-read the specs, and
> skimmed some half-accurate books or articles. Lord knows I've
> encouraged enough brain damage in my earlier books that I don't
> want to rely on books or any other form of education as being the
> sole solution to a potentially nasty problem.
> --
>
> +-----------------------+------------------------+-------------------+
> | Elliotte Rusty Harold | elharo@m... | Writer/Programmer |
> +-----------------------+------------------------+-------------------+
> |          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
> |              http://www.ibiblio.org/xml/books/bible2/              |
> |   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
> +----------------------------------+---------------------------------+
> |  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
> |  Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/     |
> +----------------------------------+---------------------------------+
>
> ------------------------------------------------------------------
> The xml-dev list is sponsored by XML.org, an initiative of OASIS
> <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To unsubscribe from this elist send a message with the single word
> "unsubscribe" in the body to: xml-dev-request@l...
>

