4.4 XML Processor Treatment of Entities and References

The table below summarizes the contexts in which character references, entity references, and invocations of unparsed entities might appear and the REQUIRED behavior of an XML Processor in each case. The labels in the leftmost column describe the recognition context:

Reference in Content

as a reference anywhere after the Start-Tag and before the End-Tag of an element; corresponds to the nonterminal content.

Reference in Attribute Value

as a reference within either the value of an attribute in a Start-Tag, or a default value in an Attribute-List Declaration; corresponds to the nonterminal AttValue.

Occurs as Attribute Value

as a Name, not a reference, appearing either as the value of an attribute which has been declared as type ENTITY, or as one of the space-separated tokens in the value of an attribute which has been declared as type ENTITIES.

Reference in Entity Value

as a reference within a parameter or internal entity's Literal Entity Value in the entity's declaration; corresponds to the nonterminal EntityValue.

Reference in DTD

as a reference within either the internal or external subsets of the Document Type Declaration, but outside of an EntityValue, AttValue, PI, Comment, SystemLiteral, PubidLiteral, or the contents of an ignored conditional section (see [Conditional Sections]).

Entity type/reference matrix (columns are entity types)

| Recognition context | Parameter | Internal General | External Parsed General | Unparsed | Character |
|---|---|---|---|---|---|
| Reference in Content | Not recognized | Included | Included if validating | Forbidden | Included |
| Reference in Attribute Value | Not recognized | Included in literal | Forbidden | Forbidden | Included |
| Occurs as Attribute Value | Not recognized | Forbidden | Forbidden | Notify | Not recognized |
| Reference in EntityValue | Included in literal | Bypassed | Bypassed | Error | Included |
| Reference in DTD | Included as PE | Forbidden | Forbidden | Forbidden | Forbidden |
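
For readers who want to consult the matrix programmatically, it can be written as a small lookup table. This is only an illustrative sketch: the names `ENTITY_TYPES`, `MATRIX`, and `required_behavior` are invented here and are not part of the specification.

```python
# Column order of the matrix: the five entity types.
ENTITY_TYPES = (
    "Parameter",
    "Internal General",
    "External Parsed General",
    "Unparsed",
    "Character",
)

# One row per recognition context; values follow the column order above.
MATRIX = {
    "Reference in Content": (
        "Not recognized", "Included", "Included if validating", "Forbidden", "Included"),
    "Reference in Attribute Value": (
        "Not recognized", "Included in literal", "Forbidden", "Forbidden", "Included"),
    "Occurs as Attribute Value": (
        "Not recognized", "Forbidden", "Forbidden", "Notify", "Not recognized"),
    "Reference in EntityValue": (
        "Included in literal", "Bypassed", "Bypassed", "Error", "Included"),
    "Reference in DTD": (
        "Included as PE", "Forbidden", "Forbidden", "Forbidden", "Forbidden"),
}


def required_behavior(context, entity_type):
    """Return the matrix cell for a recognition context and an entity type."""
    return MATRIX[context][ENTITY_TYPES.index(entity_type)]
```

For example, `required_behavior("Occurs as Attribute Value", "Unparsed")` returns `"Notify"`.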

Not Recognized

Outside the DTD, the % character has no special significance; thus, what would be parameter entity references in the DTD are not recognized as markup in content. Similarly, the names of unparsed entities are not recognized except when they appear in the value of an appropriately declared attribute.
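
A quick way to observe this is to parse content that merely looks like a parameter-entity reference. The sketch below uses Python's standard `xml.etree.ElementTree` (a non-validating, Expat-based parser); the helper name `content_text` and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def content_text(xml):
    """Return the character data directly inside the root element."""
    return ET.fromstring(xml).text


# "%pe;" would be a parameter-entity reference inside a DTD, but in
# content the percent sign is ordinary character data.
SAMPLE = "<doc>%pe; is 100% plain text here</doc>"
```

Here `content_text(SAMPLE)` yields the string unchanged, percent signs included.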

Included

An entity is included when its Replacement Text is retrieved and processed, in place of the reference itself, as though it were part of the document at the location the reference was recognized. The replacement text MAY contain both Character Data and (except for parameter entities) Markup, which MUST be recognized in the usual way. (The string "AT&amp;T;" expands to "AT&T;" and the remaining ampersand is not recognized as an entity-reference delimiter.) A character reference is included when the indicated character is processed in place of the reference itself.
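
The following sketch shows inclusion as seen through Python's standard `xml.etree.ElementTree`. The entity name `greet` and the helper `expand` are hypothetical; the document combines an internal general entity whose replacement text contains markup, a character reference, and the "AT&amp;T;" case.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE doc [
<!ENTITY greet "Hello <b>world</b>">
]>
<doc>&greet; &#65; AT&amp;T;</doc>"""


def expand(xml):
    """Parse the document and return what the application sees after inclusion."""
    root = ET.fromstring(xml)
    child = root[0]  # the <b> element that came from the replacement text
    return root.text, child.tag, child.text, child.tail


result = expand(DOC)
```

The markup inside the replacement text is recognized as a real `b` element, `&#65;` becomes the character `A`, and the ampersand produced by `&amp;` is not re-read as the start of a new reference.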

Included If Validating

When an XML processor recognizes a reference to a parsed entity, in order to validate the document, the processor MUST include its replacement text. If the entity is external, and the processor is not attempting to validate the XML document, the processor MAY, but need not, include the entity's replacement text. If a non-validating processor does not include the replacement text, it MUST inform the application that it recognized, but did not read, the entity.

This rule is based on the recognition that the automatic inclusion provided by the SGML and XML entity mechanism, primarily designed to support modularity in authoring, is not necessarily appropriate for other applications, in particular document browsing. Browsers, for example, when encountering an external parsed entity reference, might choose to provide a visual indication of the entity's presence and retrieve it for display only on demand.
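
As an illustration of a non-validating processor that reports an external parsed entity instead of reading it, the sketch below uses the `ExternalEntityRefHandler` callback of Python's `xml.parsers.expat`. The file name `chapter1.xml` and the function name `recognized_external_entities` are assumptions made for the example; nothing is fetched.

```python
import xml.parsers.expat

DOC = (b'<!DOCTYPE doc [<!ENTITY chap SYSTEM "chapter1.xml">]>'
       b'<doc>before &chap; after</doc>')


def recognized_external_entities(data):
    """Return (system identifiers reported, character data seen)."""
    system_ids = []
    text = []
    parser = xml.parsers.expat.ParserCreate()

    def on_external_ref(context, base, system_id, public_id):
        # The parser recognized the reference; this sketch records it and
        # declines to load the entity. A non-zero return lets parsing continue.
        system_ids.append(system_id)
        return 1

    parser.ExternalEntityRefHandler = on_external_ref
    parser.CharacterDataHandler = text.append
    parser.Parse(data, True)
    return system_ids, "".join(text)


refs, seen_text = recognized_external_entities(DOC)
```

An application such as a browser could use the recorded identifier to show a placeholder and retrieve the entity only on demand.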

Forbidden

The following are forbidden, and constitute fatal errors:

  • the appearance of a reference to an Unparsed Entity, except in the EntityValue in an entity declaration.

  • the appearance of any character or general-entity reference in the DTD except within an EntityValue or AttValue.

  • a reference to an external entity in an attribute value.
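
The three cases above can be checked against a real parser. The sketch below relies on Python's standard `xml.etree.ElementTree`, which raises `ParseError` on well-formedness errors; the entity names, file names, and the helper `is_well_formed` are invented for illustration.

```python
import xml.etree.ElementTree as ET

PROLOG = """<!DOCTYPE doc [
<!NOTATION gif SYSTEM "image/gif">
<!ENTITY pic SYSTEM "pic.gif" NDATA gif>
<!ENTITY ext SYSTEM "ext.xml">
<!ENTITY int "text">
]>"""

# Reference to an unparsed entity in content.
UNPARSED_IN_CONTENT = PROLOG + "<doc>&pic;</doc>"
# Reference to an external entity in an attribute value.
EXTERNAL_IN_ATTRIBUTE = PROLOG + '<doc a="&ext;"/>'
# General-entity reference in the DTD, outside any EntityValue or AttValue.
GENERAL_REF_IN_DTD = '<!DOCTYPE doc [<!ENTITY int "text"> &int; ]><doc/>'
# Control case: internal general entity used where it is allowed.
ALLOWED = PROLOG + '<doc a="&int;">&int;</doc>'


def is_well_formed(xml):
    """Return True if the parser accepts the document, False on a fatal error."""
    try:
        ET.fromstring(xml)
        return True
    except ET.ParseError:
        return False
```

Only `ALLOWED` is accepted; the other three documents are rejected.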

Included in Literal

When an Entity Reference appears in an attribute value, or a parameter entity reference appears in a literal entity value, its Replacement Text MUST be processed in place of the reference itself as though it were part of the document at the location the reference was recognized, except that a single or double quote character in the replacement text MUST always be treated as a normal data character and MUST NOT terminate the literal. For example, this is well-formed:

<!ENTITY % YN '"Yes"' >
<!ENTITY WhatHeSaid "He said %YN;" >

while this is not:

<!ENTITY EndAttr "27'" >
<element attribute='a-&EndAttr;>
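
The attribute case can be tried with Python's standard `xml.etree.ElementTree`. In this sketch, `GOOD` closes the attribute literal itself, while `BAD` reproduces the ill-formed example, wrapped in a complete document; the names `GOOD`, `BAD`, and `attribute_value` are illustrative only.

```python
import xml.etree.ElementTree as ET

DTD = """<!DOCTYPE element [<!ENTITY EndAttr "27'">]>"""

# The literal is closed by its own quote; the quote inside the
# replacement text is ordinary data.
GOOD = DTD + "<element attribute='a-&EndAttr;'/>"

# The literal is never closed, because the quote that comes from the
# entity does not count as a delimiter.
BAD = DTD + "<element attribute='a-&EndAttr;></element>"


def attribute_value(xml):
    """Return the normalized value of the 'attribute' attribute."""
    return ET.fromstring(xml).get("attribute")
```

Parsing `GOOD` gives an attribute value that ends in the quote character from the entity, while parsing `BAD` raises `ET.ParseError`.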

Notify

When the name of an Unparsed Entity appears as a token in the value of an attribute of declared type ENTITY or ENTITIES, a validating processor MUST inform the application of the system and public (if any) identifiers for both the entity and its associated Notation.
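
Expat is not a validating processor, so it does not perform this notification itself. The sketch below approximates it at the application level using declaration callbacks from Python's `xml.parsers.expat`. The notation, entity, and attribute names, the identifiers, and the function `notifications` are all assumptions made for the example.

```python
import xml.parsers.expat

DOC = b"""<!DOCTYPE doc [
<!NOTATION gif PUBLIC "-//EXAMPLE//NOTATION GIF//EN" "image/gif">
<!ENTITY logo SYSTEM "logo.gif" NDATA gif>
<!ELEMENT doc EMPTY>
<!ATTLIST doc img ENTITY #IMPLIED>
]>
<doc img="logo"/>"""


def notifications(data):
    """Collect (entity name, entity system id, entity public id,
    notation system id, notation public id) for each ENTITY/ENTITIES token."""
    notations, entities, entity_attrs, found = {}, {}, set(), []
    parser = xml.parsers.expat.ParserCreate()

    def on_notation(name, base, system_id, public_id):
        notations[name] = (system_id, public_id)

    def on_unparsed(name, base, system_id, public_id, notation_name):
        entities[name] = (system_id, public_id, notation_name)

    def on_attlist(elname, attname, type_, default, required):
        # Only attributes declared ENTITY or ENTITIES trigger notification.
        if type_ in ("ENTITY", "ENTITIES"):
            entity_attrs.add((elname, attname))

    def on_start(tag, attrs):
        for attname, value in attrs.items():
            if (tag, attname) not in entity_attrs:
                continue
            for token in value.split():
                sys_id, pub_id, notation = entities[token]
                n_sys, n_pub = notations[notation]
                found.append((token, sys_id, pub_id, n_sys, n_pub))

    parser.NotationDeclHandler = on_notation
    parser.UnparsedEntityDeclHandler = on_unparsed
    parser.AttlistDeclHandler = on_attlist
    parser.StartElementHandler = on_start
    parser.Parse(data, True)
    return found


report = notifications(DOC)
```

The sketch does no error handling: a token that names an undeclared entity would raise `KeyError`, whereas a validating processor would report a validity error.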

Bypassed

When a general entity reference appears in the EntityValue in an entity declaration, it MUST be bypassed and left as is.
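
One visible consequence of bypassing is that an entity value may refer to a general entity that is declared later, because the reference is resolved only when the outer entity is eventually included. A minimal sketch with Python's standard `xml.etree.ElementTree`, using the made-up entity names `outer` and `inner`:

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE doc [
<!ENTITY outer "[&inner;]">
<!ENTITY inner "one">
]>
<doc>&outer;</doc>"""


def expanded_text(xml):
    """Return the root element's character data after entity inclusion."""
    return ET.fromstring(xml).text


text = expanded_text(DOC)
```

The replacement text of `outer` keeps `&inner;` as is; it is expanded only when `&outer;` is included in content.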

Included as PE

Just as with external parsed entities, parameter entities need only be included if validating. When a parameter-entity reference is recognized in the DTD and included, its Replacement Text MUST be enlarged by the attachment of one leading and one following space (#x20) character; the intent is to constrain the replacement text of parameter entities to contain an integral number of grammatical tokens in the DTD. This behavior MUST NOT apply to parameter entity references within entity values; these are described in [Included in Literal].
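
The padding rule can be pictured with a deliberately simplified textual substitution. This is not how a real processor works: the sketch uses a reduced Name pattern, does not expand recursively, and does not skip literals, comments, or ignored sections. The names `PE_REF` and `include_as_pe` are hypothetical.

```python
import re

# Simplified pattern for a parameter-entity reference (ASCII-style names only).
PE_REF = re.compile(r"%([A-Za-z_:][\w.:-]*);")


def include_as_pe(dtd_text, parameter_entities):
    """Replace each %name; with its replacement text padded by one space on each side."""
    def replace(match):
        return " " + parameter_entities[match.group(1)] + " "
    return PE_REF.sub(replace, dtd_text)


expanded = include_as_pe("<!ELEMENT doc (%model;)>", {"model": "a|b"})
adjacent = include_as_pe("%a;%b;", {"a": "x", "b": "y"})
```

Because of the added spaces, two adjacent references such as `%a;%b;` can never fuse into a single token.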

Error

It is an Error for a reference to an unparsed entity to appear in the EntityValue in an entity declaration.