4.2 Entity Declarations

Entities are declared thus:

Entity Declaration
4.2    EntityDecl   ::=   GEDecl | PEDecl
4.2    GEDecl       ::=   '<!ENTITY' S Name S EntityDef S? '>'
4.2    PEDecl       ::=   '<!ENTITY' S '%' S Name S PEDef S? '>'
4.2    EntityDef    ::=   EntityValue | (ExternalID NDataDecl?)
4.2    PEDef        ::=   EntityValue | ExternalID

The Name identifies the entity in an Entity Reference or, in the case of an unparsed entity, in the value of an ENTITY or ENTITIES attribute. If the same entity is declared more than once, the first declaration encountered is binding; at user option, an XML processor MAY issue a warning if entities are declared multiple times.
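The first-declaration-wins rule can be observed with a short sketch. It assumes Python's standard `xml.etree.ElementTree` parser (backed by Expat), which expands internal entities declared in the internal subset. The document text and the entity name `status` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity "status" is declared twice.
DOC = """<!DOCTYPE note [
  <!ENTITY status "first declaration">
  <!ENTITY status "second declaration">
]>
<note>&status;</note>"""


def expand(document):
    """Parse the document and return the root element's text after
    entity references have been replaced."""
    return ET.fromstring(document).text


if __name__ == "__main__":
    # The reference resolves to the first declaration encountered.
    print(expand(DOC))
```

This parser does not issue the optional warning for the duplicate declaration; the spec leaves that behaviour to the processor.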

4.2.1 Internal Entities

If the entity definition is an EntityValue, the defined entity is called an internal entity. There is no separate physical storage object, and the content of the entity is given in the declaration. Note that some processing of entity and character references in the Literal Entity Value may be required to produce the correct Replacement Text: see [Construction of Entity Replacement Text].

An internal entity is a Text Entity.

Example of an internal entity declaration:

<!ENTITY Pub-Status "This is a pre-release of the
specification.">

4.2.2 External Entities

If the entity is not internal, it is an external entity, declared as follows:

External Entity Declaration
4.2.2    ExternalID   ::=   'SYSTEM' S SystemLiteral
                          | 'PUBLIC' S PubidLiteral S SystemLiteral
4.2.2    NDataDecl    ::=   S 'NDATA' S Name    [VC: Notation Declared]

If the NDataDecl is present, this is a general Unparsed Entity; otherwise it is a parsed entity.

Validity Constraint: Notation Declared

The Name MUST match the declared name of a Notation.

The SystemLiteral is called the entity's system identifier. It is meant to be converted to a URI reference (as defined in [rfc2396], updated by [rfc2732]), as part of the process of dereferencing it to obtain input for the XML processor to construct the entity's replacement text. It is an error for a fragment identifier (beginning with a # character) to be part of a system identifier.

Unless otherwise provided by information outside the scope of this specification (e.g. a special XML element type defined by a particular DTD, or a processing instruction defined by a particular application specification), relative URIs are relative to the location of the resource within which the entity declaration occurs. This is defined to be the external entity containing the '<' which starts the declaration, at the point when it is parsed as a declaration. A URI might thus be relative to the Document Entity, to the entity containing the Document Type Declaration, or to some other External Entity.

Attempts to retrieve the resource identified by a URI MAY be redirected at the parser level (for example, in an entity resolver) or below (at the protocol level, for example, via an HTTP Location: header). In the absence of additional information outside the scope of this specification within the resource, the base URI of a resource is always the URI of the actual resource returned. In other words, it is the URI of the resource retrieved after all redirection has occurred.
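As a rough illustration of the base-URI rule, the sketch below resolves a system identifier against the URI of the entity that contains the declaration, using `urllib.parse.urljoin` from the Python standard library. The function name `resolve_system_identifier` is hypothetical. Raising an exception for a fragment identifier is one possible policy, not something the specification requires, since it only calls the fragment an error.

```python
from urllib.parse import urljoin


def resolve_system_identifier(system_literal, declaring_entity_uri):
    """Resolve a system identifier relative to the (post-redirection) URI
    of the external entity in which the declaration was parsed."""
    if "#" in system_literal:
        # The spec calls a fragment identifier an error; rejecting it
        # outright is this sketch's choice, not a mandated behaviour.
        raise ValueError("system identifier contains a fragment identifier")
    return urljoin(declaring_entity_uri, system_literal)


if __name__ == "__main__":
    # Assumed base: the declaration sits in the boilerplate entity itself.
    base = "http://www.textuality.com/boilerplate/OpenHatch.xml"
    print(resolve_system_identifier("../grafix/OpenHatch.gif", base))
```

If retrieval of the declaring entity was redirected, the caller would pass the final URI as `declaring_entity_uri`, in line with the rule that the base URI is that of the resource actually returned.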

System identifiers (and other XML strings meant to be used as URI references) MAY contain characters that, according to [rfc2396] and [rfc2732], must be escaped before a URI can be used to retrieve the referenced resource. The characters to be escaped are the control characters #x0 to #x1F and #x7F (most of which cannot appear in XML), space #x20, the delimiters '<' #x3C, '>' #x3E and '"' #x22, the unwise characters '{' #x7B, '}' #x7D, '|' #x7C, '\' #x5C, '^' #x5E and '`' #x60, as well as all characters above #x7F. Since escaping is not always a fully reversible process, it MUST be performed only when absolutely necessary and as late as possible in a processing chain. In particular, neither the process of converting a relative URI to an absolute one nor the process of passing a URI reference to a process or software component responsible for dereferencing it SHOULD trigger escaping. When escaping does occur, it MUST be performed as follows:

  1. Each character to be escaped is represented in UTF-8 [Unicode] as one or more bytes.

  2. The resulting bytes are escaped with the URI escaping mechanism (that is, converted to %HH, where HH is the hexadecimal notation of the byte value).

  3. The original character is replaced by the resulting character sequence.
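The three steps above can be sketched in Python as follows. The function names are hypothetical, and the set of characters to escape is taken directly from the list in the preceding paragraph. An existing `%` is left alone, because it is not in that list.

```python
# ASCII characters named in the text: space, the delimiters, and the
# "unwise" characters.
_ASCII_TO_ESCAPE = set(' <>"{}|\\^`')


def needs_escape(ch):
    """True for control characters #x0-#x1F, #x7F, the listed ASCII
    characters, and every character above #x7F."""
    cp = ord(ch)
    return cp <= 0x1F or cp >= 0x7F or ch in _ASCII_TO_ESCAPE


def escape_system_identifier(identifier):
    """Apply the escaping steps to each disallowed character."""
    out = []
    for ch in identifier:
        if needs_escape(ch):
            # Step 1: represent the character as UTF-8 bytes.
            # Step 2: write each byte as %HH.
            # Step 3: substitute the result for the original character.
            out.append("".join("%{:02X}".format(b) for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(escape_system_identifier("../grafix/Open Hatch.gif"))
    print(escape_system_identifier("caf\u00e9.xml"))
```

Following the text, a processor would call such a function only at the last possible moment, just before the URI is used for retrieval, and not while resolving relative references or handing the identifier to another component.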

In addition to a system identifier, an external identifier MAY include a public identifier. An XML processor attempting to retrieve the entity's content MAY use any combination of the public and system identifiers as well as additional information outside the scope of this specification to try to generate an alternative URI reference. If the processor is unable to do so, it MUST use the URI reference specified in the system literal. Before a match is attempted, all strings of white space in the public identifier MUST be normalized to single space characters (#x20), and leading and trailing white space MUST be removed.
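A minimal sketch of public-identifier matching follows, assuming a simple in-memory lookup table. The catalog contents, the local file URI, and the function names are all hypothetical, since the specification does not say how an alternative URI is generated. Only the white space normalization and the fallback to the system literal come from the text.

```python
import re


def normalize_public_id(public_id):
    """Collapse each run of white space to one space (#x20) and strip
    leading and trailing white space. A PubidLiteral can only contain
    #x20, #xD and #xA as white space."""
    return re.sub(r"[\x20\x0D\x0A]+", " ", public_id).strip(" ")


# Hypothetical mapping from normalized public identifiers to local copies.
EXAMPLE_CATALOG = {
    "-//Textuality//TEXT Standard open-hatch boilerplate//EN":
        "file:///usr/local/share/xml/OpenHatch.xml",
}


def choose_uri(public_id, system_literal, catalog):
    """Return an alternative URI if the catalog knows the public
    identifier; otherwise fall back to the system literal."""
    if public_id is not None:
        alternative = catalog.get(normalize_public_id(public_id))
        if alternative is not None:
            return alternative
    return system_literal


if __name__ == "__main__":
    messy = "  -//Textuality//TEXT   Standard\n open-hatch boilerplate//EN "
    print(choose_uri(messy,
                     "http://www.textuality.com/boilerplate/OpenHatch.xml",
                     EXAMPLE_CATALOG))
```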

Examples of external entity declarations:

<!ENTITY open-hatch
SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY open-hatch
PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"
"http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY hatch-pic
SYSTEM "../grafix/OpenHatch.gif"
NDATA gif >
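To see how a processor can tell these kinds of declaration apart, the following sketch uses the `EntityDeclHandler` callback of Python's `xml.parsers.expat` module. The wrapper document, the `NOTATION` declaration, and its `image/gif` system identifier are assumptions added so that the example is a complete document. Expat is non-validating, so it does not check the Notation Declared constraint.

```python
import xml.parsers.expat

# Hypothetical document wrapping declarations like the ones shown above.
DOC = """<?xml version="1.0"?>
<!DOCTYPE doc [
<!NOTATION gif SYSTEM "image/gif">
<!ENTITY Pub-Status "This is a pre-release of the specification.">
<!ENTITY open-hatch
  SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY hatch-pic
  SYSTEM "../grafix/OpenHatch.gif"
  NDATA gif >
]>
<doc/>
"""


def classify_entities(document):
    """Return a dict mapping each declared entity name to its kind."""
    kinds = {}

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        if value is not None:
            kinds[name] = "internal"            # EntityValue given inline
        elif notation is not None:
            kinds[name] = "external unparsed"   # NDataDecl present
        else:
            kinds[name] = "external parsed"     # ExternalID only

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(document, True)
    return kinds


if __name__ == "__main__":
    for name, kind in classify_entities(DOC).items():
        print(name, "->", kind)
```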