Re: Questions on XML syntax and conformance issues
[ crossposting to the OASIS XML conformance WG, which should be
  seeing and dealing with such issues ]

Takuki Kamiya wrote:
>
> Morus Walter <morus.walter@g...> wrote:
> >
> > The test suite says test 'valid-sa-094' (from James Clark's test cases) to be
> > not well-formed.
> > .... Hence I would not regard this
> > as an entity reference. Any comments on that?
>
> I agree.

I think that's widely believed, and I know that OASIS has this on
a list of things to resolve. I don't know when it'll be done though.

Note that in section 2.8, after the [29] markupdecl production and
right before the "PEs-in-internal-subset" WFC that's relevant there,
the spec is quite explicit:

    ... individual nonterminals describe the declarations
    <em>after</em> all the parameter entities have been included.

That conflicts with section 4.4 saying that PE handling is really a
boatload of special cases, including the one being used to claim this
test case isn't malformed. Seems like maybe W3C should change that
first (and most formative -- since it's the only one with a simple
model behind it!) description of PE processing.

> > Attribute normalization: ...
>
> But what it says must be accepted as what it means when we are dealing with
> conformance tests; I mean the tests are not to be conformant otherwise.

That's the ideal. However, earlier discussions on this topic turned
up quite a number of internal inconsistencies in the XML 1.0 spec in
this area. (See my post, appended.) The issue is related to the one
above: entity processing for attributes is fuzzy.

> See http://www.w3.org/XML/xml-19980210-errata#E61
>
> Therefore I believe that now test case sa02 needs to be corrected.

I glanced at that a while back, and think that maybe _one_ of those
inconsistencies I noted has now been addressed. Maybe. Until that
family of issues is fully addressed, I think it'd be best to think
of that issue as open ... lest some fixes cause later unfixes.

(Also, for the record, if one takes XP 0.5 as the best effort "by W3C"
to support XML conformance, I seem to recall that it was used to
create that particular output test.)

- Dave

APPENDED: my xml-dev post, seemingly not archived, identifying
several seeming internal inconsistencies in the XML spec.

======================================================================

Subject: Re: Attribute normalisation and character entities
Date: Thu, 27 Jan 2000 15:00:58 -0800
From: David Brownell <david-b@p...>
Organization: Yoyodyne Systems Labs
To: Richard Tobin <richard@c...>
CC: xml-dev@i...

Richard Tobin wrote:
>
> How is an attribute containing a character reference to a whitespace
> character (other than space) supposed to be normalised?
>
> Section 3.3.3 seems to me to say that character references are not
> subject to the translation to #x20 - the four bulleted points are
> an exhaustive disjunction.
>
> However the Oasis test suite, in tests sa02 and not-sa02, requires
> that they are replaced with spaces.
>
> Which is correct?

As a data point, those output tests were originally generated using
the then-current version of XP.

I suspect Tom Passim's observation is close: except for CDATA,
_whitespace_ should be replaced with just one space.

As I've commented elsewhere, I find that much of the entity processing
in the XML spec seems to be specified as a collection of special cases
(updated via errata as inconsistencies turn up) rather than being based
on simple and consistent rules. This is another place that it seems to
be happening.

There are two curious points in 3.3.3 ... first, that character and
entity refs may appear, and second that CRLF sequences may appear
(line endings already having been normalized). How would these appear?

If we assume that 4.4 applies first, then those OASIS cases are correct,
and they'd appear "doubly escaped" as:

    <element
        char-ref-attr = "foo &#9; bar"
        ent-ref-attr1 = "AT&amp;T"
        ent-ref-attr2 = "AT&amp;T"
        crlf-attr = "a
b"
        />

If we assume that 3.3.3 has needless duplication of 4.4 then I can't see
how the literal CRLF can ever show up as input to the normalization,
since line-ends have already been normalized.

On the other hand, I don't think anyone actually writes what ent-ref-attr2
has -- "AT&T" is it. Perhaps 4.4 applies first, _and_ there is needless
duplication (for entity refs). Or 3.3.3 has both duplication and several
errors.

- Dave

***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@x...&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************
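
For readers who want to poke at the sa02 question themselves, here is a
minimal Python sketch of one reading of section 3.3.3 for CDATA
attributes: the reading Richard Tobin describes, where a character
reference is appended as-is and only literal whitespace becomes #x20.
The function name normalize_cdata is hypothetical, entity references
and the CRLF special case are deliberately left out, and line ends are
assumed to be normalized already. The second half asks Python's own
xml.etree.ElementTree parser (expat-backed in CPython) what it reports,
purely as a data point about one implementation, not as a ruling on
what the spec means.

```python
import re
import xml.etree.ElementTree as ET

# One token is a hex char ref, a decimal char ref, or any single character.
_TOKEN = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|(.)", re.DOTALL)


def normalize_cdata(raw):
    """Sketch of the 3.3.3 bullets for a CDATA attribute (assumption:
    no entity references, line ends already normalized to #xA)."""
    out = []
    for hexref, decref, ch in _TOKEN.findall(raw):
        if hexref:
            # Character reference: append the referenced character unchanged.
            out.append(chr(int(hexref, 16)))
        elif decref:
            out.append(chr(int(decref)))
        elif ch in " \t\r\n":
            # Literal whitespace character: append a single #x20.
            out.append(" ")
        else:
            # Any other character: append it as-is.
            out.append(ch)
    return "".join(out)


def attr_value(xml_text, name="a"):
    """Return attribute `name` of the root element as the parser reports it."""
    return ET.fromstring(xml_text).get(name)


if __name__ == "__main__":
    # Hand-rolled reading of 3.3.3.
    print(repr(normalize_cdata("foo&#9;bar")))  # char ref to TAB is kept
    print(repr(normalize_cdata("foo\tbar")))    # literal TAB becomes a space

    # What the bundled parser does with the same inputs.
    print(repr(attr_value('<e a="foo&#9;bar"/>')))
    print(repr(attr_value('<e a="foo\tbar"/>')))
    print(repr(attr_value('<e a="a\nb"/>')))
```

Under this reading the two forms differ: the referenced TAB survives
normalization while the literal TAB does not, which is the opposite of
what the sa02 output test is said to require.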