
Re: Schemas and mixed content with Relax NG and W3C XML Sc

  • From: rjelliffe@a...
  • To: "Philippe Poulard" <philippe.poulard@s...>
  • Date: Thu, 17 Jul 2008 00:42:19 +1000 (EST)

> hi,
>
> this is a question about schemas
>
> I know that with DTDs, when a text is allowed with elements, the best we
> can do is to allow it everywhere between other elements that can be
> repeated at any place in the text :
>
> <!ELEMENT p (#PCDATA|a|ul|b|i|em)*>
>
> unfortunately, we can't enforce the text to be at a given place :
>
> <person>Mr <firstname>John</firstname><lastname>Doe</lastname></person>
>
> the following DTD is invalid, but explain what we'd like to have :
> <!ELEMENT person (#PCDATA,firstname,lastname)>
>
> I wonder if there are also similar limitations with Relax NG and W3C XML
> Schema and why ?

SGML DTDs do allow that kind of structure.

Unfortunately, it exposed a logical flaw that proved very difficult to
deal with. It was called the pernicious mixed content problem.

Say you have a content model like this:
    <!ELEMENT person ( (title | #PCDATA) , firstname, lastname)>
where the title can either be marked up as an element or appear as plain text.

Now we have a document
    <person><title>Mr</title><firstname>John</firstname><lastname>Doe</lastname></person>

That is fine.

But now we take that same document and pretty print it.

<person>
    <title>Mr</title>
    <firstname>John</firstname>
    <lastname>Doe</lastname>
</person>

This is invalid!  Why? Because the whitespace after the <person> start tag
is taken to match the #PCDATA, and then the <title> element is unexpected.
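
To make the failure concrete, here is a rough sketch in Python. It is not a
real SGML validator: it assumes every text run, including whitespace-only
runs, counts as #PCDATA, and it flattens the content model into a regular
expression over child names. The names content_tokens, MODEL and matches are
made up for this illustration.

```python
import re
import xml.etree.ElementTree as ET

COMPACT = ("<person><title>Mr</title><firstname>John</firstname>"
           "<lastname>Doe</lastname></person>")
PRETTY = """<person>
    <title>Mr</title>
    <firstname>John</firstname>
    <lastname>Doe</lastname>
</person>"""


def content_tokens(xml_text):
    """Return the root's children in document order.

    Assumption for this sketch: any text run, even one that is only
    whitespace, is reported as '#PCDATA'.
    """
    root = ET.fromstring(xml_text)
    tokens = []
    if root.text:
        tokens.append("#PCDATA")
    for child in root:
        tokens.append(child.tag)
        if child.tail:
            tokens.append("#PCDATA")
    return tokens


# ((title | #PCDATA), firstname, lastname) as a regex over child names
MODEL = re.compile(r"(title|#PCDATA) firstname lastname")


def matches(xml_text):
    """True if the child sequence fits the sequential content model."""
    return MODEL.fullmatch(" ".join(content_tokens(xml_text))) is not None


print(content_tokens(COMPACT), matches(COMPACT))
print(content_tokens(PRETTY), matches(PRETTY))
```

The compact document yields exactly title, firstname, lastname and fits the
model. The pretty-printed one starts with a whitespace text run, which uses
up the (title | #PCDATA) slot before <title> is ever seen.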

This problem could happen for all sorts of strange reasons, such as if you
were using a system with automatic line breaking and the start tag for
person was at the end of the line.

So in the end, XML dropped this as too problematic. Only the
(#PCDATA | ...)* form is allowed, which corresponds to XSD's mixed="true".
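
As a rough illustration of the trade-off, the Python sketch below checks a
document against XML's mixed-content form, (#PCDATA | title | firstname |
lastname)*, by simply testing that every child is in the allowed set. It is
a simplification, not a DTD validator, and the names ALLOWED, content_tokens
and fits_mixed_model are invented for this example.

```python
import xml.etree.ElementTree as ET

# (#PCDATA | title | firstname | lastname)* : any of these, in any order
ALLOWED = {"#PCDATA", "title", "firstname", "lastname"}


def content_tokens(xml_text):
    """Return the root's children in order; every text run is '#PCDATA'."""
    root = ET.fromstring(xml_text)
    tokens = []
    if root.text:
        tokens.append("#PCDATA")
    for child in root:
        tokens.append(child.tag)
        if child.tail:
            tokens.append("#PCDATA")
    return tokens


def fits_mixed_model(xml_text):
    """True if every child is allowed; order and count are not checked."""
    return all(token in ALLOWED for token in content_tokens(xml_text))


pretty = """<person>
    <title>Mr</title>
    <firstname>John</firstname>
    <lastname>Doe</lastname>
</person>"""
swapped = ("<person>Mr <lastname>Doe</lastname>"
           "<firstname>John</firstname></person>")

print(fits_mixed_model(pretty))   # whitespace no longer matters
print(fits_mixed_model(swapped))  # but neither does element order
```

Whitespace from pretty printing is now harmless, but the price is the one
the original question complains about: the model can no longer say where
the text goes or in what order the elements appear.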

However, with RELAX NG it was realized that the problem does not occur for
tokens. So mixing tokens with elements, as in
   ( "Mr" | "Mrs" ), firstname, lastname
will not trigger this problem.

Cheers
Rick Jelliffe

