4.2 Entity Declarations
The Name identifies the entity in an Entity Reference or, in the case of an unparsed entity, in the value of an ENTITY or ENTITIES attribute. If the same entity is declared more than once, the first declaration encountered is binding; at user option, an XML processor MAY issue a warning if entities are declared multiple times.

Internal Entities

If the entity definition is an EntityValue, the defined entity is called an internal entity. There is no separate physical storage object, and the content of the entity is given in the declaration. Note that some processing of entity and character references in the Literal Entity Value may be required to produce the correct Replacement Text: see [Construction of Entity Replacement Text]. An internal entity is a Text Entity.

Example of an internal entity declaration:

<!ENTITY Pub-Status "This is a pre-release of the specification.">

External Entities

If the entity is not internal, it is an external entity, declared as follows:

External Entity Declaration
If the NDataDecl is present, this is a general Unparsed Entity; otherwise it is a parsed entity.

Validity Constraint: Notation Declared

The Name MUST match the declared name of a Notation.
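The distinction between internal, external parsed, and unparsed entities can be observed with a non-validating parser. The following is a minimal sketch, assuming Python's standard `xml.parsers.expat` module and its `EntityDeclHandler` callback; the function name `classify_entities`, the `SAMPLE` document, and the `image/gif` notation identifier are illustrative assumptions, not part of this specification.

```python
import xml.parsers.expat

# Sample document (an assumption for illustration): one declaration of each kind.
SAMPLE = """<!DOCTYPE doc [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY Pub-Status "This is a pre-release of the specification.">
  <!ENTITY open-hatch SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
  <!ENTITY hatch-pic SYSTEM "../grafix/OpenHatch.gif" NDATA gif>
]>
<doc/>"""


def classify_entities(document):
    """Return a dict mapping each declared entity name to its kind."""
    kinds = {}

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation_name):
        if value is not None:
            # An EntityValue was given: internal entity.
            kind = "internal"
        elif notation_name is not None:
            # An NDataDecl was present: unparsed entity.
            kind = "external unparsed"
        else:
            kind = "external parsed"
        # Keep only the first kind recorded for a name, mirroring the rule
        # that the first declaration encountered is binding.
        kinds.setdefault(name, kind)

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(document, True)
    return kinds


if __name__ == "__main__":
    for name, kind in classify_entities(SAMPLE).items():
        print(name, "->", kind)
```

Declaring an external entity does not cause it to be retrieved in this sketch; no external entity handler is installed, and none of the entities is referenced in the document body.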
The SystemLiteral is called the entity's system identifier. It is meant to be converted to a URI reference (as defined in [rfc2396], updated by [rfc2732]), as part of the process of dereferencing it to obtain input for the XML processor to construct the entity's replacement text. It is an error for a fragment identifier (beginning with a # character) to be part of a system identifier.

System identifiers (and other XML strings meant to be used as URI references) MAY contain characters that, according to [rfc2396] and [rfc2732], must be escaped before a URI can be used to retrieve the referenced resource. The characters to be escaped are the control characters #x0 to #x1F and #x7F (most of which cannot appear in XML), space #x20, the delimiters '<' #x3C, '>' #x3E and '"' #x22, the unwise characters '{' #x7B, '}' #x7D, '|' #x7C, '\' #x5C, '^' #x5E and '`' #x60, as well as all characters above #x7F. Since escaping is not always a fully reversible process, it MUST be performed only when absolutely necessary and as late as possible in a processing chain. In particular, neither the process of converting a relative URI to an absolute one nor the process of passing a URI reference to a process or software component responsible for dereferencing it SHOULD trigger escaping. When escaping does occur, it MUST be performed as follows:
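A minimal Python sketch of such escaping, under the assumption that each affected character is encoded as UTF-8 and every resulting byte is written as %HH (the usual URI escaping convention). The function name `escape_system_identifier` is hypothetical; the set of characters to escape is taken from the list in the preceding paragraph.

```python
# Delimiters and "unwise" characters named in the text.
_ESCAPED_PUNCTUATION = set('<>"{}|\\^`')


def escape_system_identifier(system_id):
    """Percent-escape only the characters the text lists as needing escape."""
    pieces = []
    for ch in system_id:
        code = ord(ch)
        needs_escape = (
            code <= 0x20                    # controls #x0-#x1F and space #x20
            or code >= 0x7F                 # #x7F and everything above it
            or ch in _ESCAPED_PUNCTUATION   # delimiters and unwise characters
        )
        if needs_escape:
            # Assumed scheme: UTF-8 bytes, each written as %HH.
            pieces.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
        else:
            pieces.append(ch)
    return "".join(pieces)


if __name__ == "__main__":
    print(escape_system_identifier("../grafix/Open Hatch.gif"))
```

Because existing `%` characters are left untouched, the sketch does not re-escape sequences that were already escaped. In line with the "as late as possible" requirement, a caller would apply it only at the point of retrieval, not when resolving a relative reference.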
In addition to a system identifier, an external identifier MAY include a public identifier. An XML processor attempting to retrieve the entity's content MAY use any combination of the public and system identifiers as well as additional information outside the scope of this specification to try to generate an alternative URI reference. If the processor is unable to do so, it MUST use the URI reference specified in the system literal. Before a match is attempted, all strings of white space in the public identifier MUST be normalized to single space characters (#x20), and leading and trailing white space MUST be removed.

Examples of external entity declarations:

<!ENTITY open-hatch SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY open-hatch PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN" "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY hatch-pic SYSTEM "../grafix/OpenHatch.gif" NDATA gif >
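The public identifier normalization and the fallback to the system literal described above might be sketched as follows. This is an illustration only: the `catalog` mapping from public identifiers to alternative URI references is a hypothetical stand-in for whatever "additional information" a processor uses, and both function names are assumptions.

```python
import re

# XML white space: space, tab, carriage return, line feed.
_XML_WHITE_SPACE_RUN = re.compile(r"[\x20\x09\x0D\x0A]+")


def normalize_public_id(public_id):
    """Collapse runs of white space to one #x20 and trim both ends."""
    return _XML_WHITE_SPACE_RUN.sub(" ", public_id).strip(" ")


def resolve_external_id(system_id, public_id=None, catalog=None):
    """Return an alternative URI reference if one is known, else the system literal.

    `catalog` is a hypothetical dict keyed by normalized public identifier.
    """
    if public_id is not None and catalog:
        alternative = catalog.get(normalize_public_id(public_id))
        if alternative is not None:
            return alternative
    # No alternative could be generated: use the system literal as given.
    return system_id


if __name__ == "__main__":
    catalog = {
        "-//Textuality//TEXT Standard open-hatch boilerplate//EN":
            "file:///local/boilerplate/OpenHatch.xml",  # hypothetical local copy
    }
    print(resolve_external_id(
        "http://www.textuality.com/boilerplate/OpenHatch.xml",
        "-//Textuality//TEXT  Standard\n open-hatch boilerplate//EN ",
        catalog,
    ))
```

Normalizing before the lookup means that differences in line breaks or spacing inside the declaration do not prevent a match.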