2.2 XML Schema Abstract Data Model

This specification builds on [ref-xml] and [ref-xml-namespaces]. The concepts and definitions used herein regarding XML are framed at the abstract level of information items as defined in [ref-xmlinfo]. By definition, this use of the infoset provides a priori guarantees of well-formedness (as defined in [ref-xml]) and namespace conformance (as defined in [ref-xml-namespaces]) for all candidates for assessment and for all schema documents.

Just as [ref-xml] and [ref-xml-namespaces] can be described in terms of information items, XML Schemas can be described in terms of an abstract data model. In defining XML Schemas in terms of an abstract data model, this specification rigorously specifies the information which must be available to a conforming XML Schema processor. The abstract model for schemas is conceptual only, and does not mandate any particular implementation or representation of this information. To facilitate interoperation and sharing of schema information, a normative XML interchange format for schemas is provided.

Schema component is the generic term for the building blocks that comprise the abstract data model of the schema. An XML Schema is a set of schema components. There are 13 kinds of component in all, falling into three groups. The primary components, which may (type definitions) or must (element and attribute declarations) have names, are as follows:

  • Simple type definitions

  • Complex type definitions

  • Attribute declarations

  • Element declarations

The secondary components, which must have names, are as follows:

  • Attribute group definitions

  • Identity-constraint definitions

  • Model group definitions

  • Notation declarations

Finally, the "helper" components provide small parts of other components; they are not independent of their context:

  • Annotations

  • Model groups

  • Particles

  • Wildcards

  • Attribute Uses

During validation, declaration components are associated by (qualified) name to information items being validated.

On the other hand, definition components define internal schema components that can be used in other schema components.

Declarations and definitions may have and be identified by names, which are NCNames as defined by [ref-xml-namespaces].

Several kinds of component have a target namespace, which is either absent or a namespace name, also as defined by [ref-xml-namespaces]. The target namespace serves to identify the namespace within which the association between the component and its name exists. In the case of declarations, this in turn determines the namespace name of, for example, the element information items it may validate.

NOTE: 

At the abstract level, there is no requirement that the components of a schema share a target namespace. Any schema for use in assessment of documents containing names from more than one namespace will of necessity include components with different target namespaces. This contrasts with the situation at the level of the XML representation of components, in which each schema document contributes definitions and declarations to a single target namespace.
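
As an informal illustration of the XML representation mentioned in this note, the following sketch shows a schema document whose top-level components all share one target namespace. The namespace URI, the po prefix, and the component names are hypothetical and chosen only for this example.

```xml
<!-- Hypothetical schema document: every top-level component declared here
     belongs to the assumed target namespace http://example.com/po. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:po="http://example.com/po"
           targetNamespace="http://example.com/po"
           elementFormDefault="qualified">

  <!-- Element declaration named {http://example.com/po}comment -->
  <xs:element name="comment" type="xs:string"/>

  <!-- Type definition named {http://example.com/po}NoteType; it refers to
       the declaration above by qualified name -->
  <xs:complexType name="NoteType">
    <xs:sequence>
      <xs:element ref="po:comment"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

A schema covering several namespaces would typically be assembled from several such documents, one per target namespace.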

Validation, defined in detail in [Schema Component Details], is a relation between information items and schema components. For example, an attribute information item may validate with respect to an attribute declaration, a list of element information items may validate with respect to a content model, and so on. The following sections briefly introduce the kinds of components in the schema abstract data model, other major features of the abstract model, and how they contribute to validation.

Type Definition Components

The abstract model provides two kinds of type definition component: simple and complex.

This specification uses the phrase type definition in cases where no distinction need be made between simple and complex types.

Type definitions form a hierarchy with a single root. The subsections below first describe characteristics of that hierarchy, then provide an introduction to simple and complex type definitions themselves.

Type Definition Hierarchy

Except for a distinguished ur-type definition, every type definition is, by construction, either a restriction or an extension of some other type definition. The graph of these relationships forms a tree known as the Type Definition Hierarchy.

A type definition whose declarations or facets are in a one-to-one relation with those of another specified type definition, with each in turn restricting the possibilities of the one it corresponds to, is said to be a restriction. The specific restrictions might include narrowed ranges or reduced alternatives. Members of a type, A, whose definition is a restriction of the definition of another type, B, are always members of type B as well.

A complex type definition which allows element or attribute content in addition to that allowed by another specified type definition is said to be an extension.

A distinguished ur-type definition is present in each XML Schema, serving as the root of the type definition hierarchy for that schema. The ur-type definition, whose name is anyType, has the unique characteristic that it can function as a complex or a simple type definition, according to context. Specifically, restrictions of the ur-type definition can themselves be either simple or complex type definitions.

A type definition used as the basis for an extension or restriction is known as the base type definition of that definition.

Simple Type Definition

A simple type definition is a set of constraints on strings and information about the values they encode, applicable to the normalized value of an attribute information item or of an element information item with no element children. Informally, it applies to the values of attributes and the text-only content of elements.

Each simple type definition, whether built-in (that is, defined in [ref-xsp2]) or user-defined, is a restriction of some particular simple base type definition. For the built-in primitive types, this is the simple version of the ur-type definition, whose name is anySimpleType. This is in turn understood to be a restriction of the ur-type definition. Simple types may also be defined whose members are lists of items themselves constrained by some other simple type definition, or whose membership is the union of the memberships of some other simple type definitions. List and union simple type definitions are also understood as restrictions of the simple ur-type definition.

For detailed information on simple type definitions, see [Simple Type Definitions] and [ref-xsp2]. The latter also defines an extensive inventory of pre-defined simple types.
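
The three ways of deriving a simple type described above (restriction, list, and union) might be written as follows in the XML representation. This is a sketch only; the type names Percent, PercentList, UnknownToken, and PercentOrUnknown are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Restriction: narrows the value range of the built-in xs:integer -->
  <xs:simpleType name="Percent">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- List: a whitespace-separated sequence of Percent items -->
  <xs:simpleType name="PercentList">
    <xs:list itemType="Percent"/>
  </xs:simpleType>

  <!-- Restriction with reduced alternatives: a single permitted token -->
  <xs:simpleType name="UnknownToken">
    <xs:restriction base="xs:token">
      <xs:enumeration value="unknown"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Union: membership is that of Percent together with UnknownToken -->
  <xs:simpleType name="PercentOrUnknown">
    <xs:union memberTypes="Percent UnknownToken"/>
  </xs:simpleType>
</xs:schema>
```

Under these definitions, "42" would be a member of Percent, "10 20 30" of PercentList, and both "42" and "unknown" of PercentOrUnknown.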

Complex Type Definition

A complex type definition is a set of attribute declarations and a content type, applicable to the [attributes] and [children] of an element information item respectively. The content type may require the [children] to contain neither element nor character information items (that is, to be empty), to be a string which belongs to a particular simple type or to contain a sequence of element information items which conforms to a particular model group, with or without character information items as well.

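Each complex type definition is one of the following:

  • a restriction of a complex base type definition;

  • an extension of a simple or complex base type definition;

  • a restriction of the ur-type definition.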

A complex type which extends another does so by having additional content model particles at the end of the other definition's content model, or by having additional attribute declarations, or both.

NOTE: 

This specification allows only appending, and not other kinds of extensions. This decision simplifies application processing required to cast instances from derived to base type. Future versions may allow more kinds of extension, requiring more complex transformations to effect casting.

For detailed information on complex type definitions, see [Complex Type Definitions].
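
A minimal sketch of extension by appending, assuming hypothetical type names Address and USAddress: the derived type adds one particle to the end of the base content model and one attribute declaration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Base type: a content model of two elements in sequence -->
  <xs:complexType name="Address">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Extension: the effective content model is street, city, zip;
       the new attribute is added alongside any inherited ones -->
  <xs:complexType name="USAddress">
    <xs:complexContent>
      <xs:extension base="Address">
        <xs:sequence>
          <xs:element name="zip" type="xs:string"/>
        </xs:sequence>
        <xs:attribute name="country" type="xs:NMTOKEN" fixed="US"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

Because the added particle follows the base content, an instance of USAddress begins with content that matches Address, which is what keeps casting from derived to base type simple.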

Declaration Components

There are three kinds of declaration component: element, attribute, and notation. Each is described in a section below. Also included is a discussion of element substitution groups, which is a feature provided in conjunction with element declarations.

Element Declaration

An element declaration is an association of a name with a type definition, either simple or complex, an (optional) default value and a (possibly empty) set of identity-constraint definitions. The association is either global or scoped to a containing complex type definition. A top-level element declaration with name 'A' is broadly comparable to a pair of DTD declarations as follows, where the associated type definition fills in the ellipses:

<!ELEMENT A . . .>
<!ATTLIST A . . .>

Element declarations contribute to validation as part of model group validation, when their defaults and type components are checked against an element information item with a matching name and namespace, and by triggering identity-constraint definition validation.

For detailed information on element declarations, see [Element Declarations].
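
For comparison with the DTD declarations above, the following sketch shows one global element declaration with a simple type and a default value, and one global declaration whose anonymous complex type contains a locally scoped declaration. The element names are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global declaration: name, simple type definition, and default value -->
  <xs:element name="priority" type="xs:string" default="normal"/>

  <!-- Global declaration with an anonymous complex type definition -->
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <!-- Local declaration: scoped to the containing complex type -->
        <xs:element name="item" type="xs:string" maxOccurs="unbounded"/>
        <!-- Reference to the global declaration above -->
        <xs:element ref="priority" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```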

Element Substitution Group

In XML 1.0, the name and content of an element must correspond exactly to the element type referenced in the corresponding content model.

Through the new mechanism of element substitution groups, XML Schemas provides a more powerful model supporting substitution of one named element for another. Any top-level element declaration can serve as the defining element, or head, for an element substitution group. Other top-level element declarations, regardless of target namespace, can be designated as members of the substitution group headed by this element. In a suitably enabled content model, a reference to the head validates not just the head itself, but elements corresponding to any member of the substitution group as well.

All such members must have type definitions which are either the same as the head's type definition or restrictions or extensions of it. Therefore, although the names of elements can vary widely as new namespaces and members of the substitution group are defined, the content of member elements is strictly limited according to the type definition of the substitution group head.

Note that element substitution groups are not represented as separate components. They are specified in the property values for element declarations (see [Element Declarations]).
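
As a sketch of how this appears in the XML representation, the hypothetical declarations below make shipComment and customerComment members of the substitution group headed by comment. Membership is recorded on each member's own declaration, not in a separate component.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Head of the substitution group -->
  <xs:element name="comment" type="xs:string"/>

  <!-- Members: their type is the same as the head's type -->
  <xs:element name="shipComment" type="xs:string"
              substitutionGroup="comment"/>
  <xs:element name="customerComment" type="xs:string"
              substitutionGroup="comment"/>

  <!-- A reference to the head also admits the members -->
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="comment" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With these declarations, an order element could contain any mix of comment, shipComment, and customerComment children.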

Attribute Declaration

An attribute declaration is an association between a name and a simple type definition, together with occurrence information and (optionally) a default value. The association is either global, or local to its containing complex type definition. Attribute declarations contribute to validation as part of complex type definition validation, when their occurrence, defaults and type components are checked against an attribute information item with a matching name and namespace.

For detailed information on attribute declarations, see [Attribute Declarations].
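
The distinction between global and local attribute declarations can be sketched as follows. The attribute names lang and draft and the element name title are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global attribute declaration: can be referenced from any type -->
  <xs:attribute name="lang" type="xs:language"/>

  <xs:element name="title">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <!-- Reference to the global declaration -->
          <xs:attribute ref="lang"/>
          <!-- Local declaration with a default value -->
          <xs:attribute name="draft" type="xs:boolean" default="false"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```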

Notation Declaration

A notation declaration is an association between a name and an identifier for a notation. For an attribute information item to be valid with respect to a NOTATION simple type definition, its value must have been declared with a notation declaration.

For detailed information on notation declarations, see [Notation Declarations].
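
A minimal sketch, with hypothetical notation, type, and element names: two notation declarations, a simple type restricting NOTATION to those names, and an attribute that uses it.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Notation declarations: a name associated with a public identifier -->
  <xs:notation name="jpeg" public="image/jpeg"/>
  <xs:notation name="png" public="image/png"/>

  <!-- NOTATION-based simple type limited to the declared notations -->
  <xs:simpleType name="ImageFormat">
    <xs:restriction base="xs:NOTATION">
      <xs:enumeration value="jpeg"/>
      <xs:enumeration value="png"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="picture">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:hexBinary">
          <xs:attribute name="format" type="ImageFormat"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```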

Model Group Components

The model group, particle, and wildcard components contribute to the portion of a complex type definition that controls an element information item's content.

Model Group

A model group is a constraint in the form of a grammar fragment that applies to lists of element information items. It consists of a list of particles, i.e. element declarations, wildcards and model groups. There are three varieties of model group:

  • Sequence (the element information items match the particles in sequential order);

  • Conjunction (the element information items match the particles, in any order);

  • Disjunction (the element information items match one of the particles).

For detailed information on model groups, see [Model Groups].
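
In the XML representation, the three varieties listed above correspond to the sequence, all, and choice elements. The sketch below uses hypothetical type and element names.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Sequence: first, then last, in that order -->
  <xs:complexType name="Name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Conjunction: width and height, in either order -->
  <xs:complexType name="Size">
    <xs:all>
      <xs:element name="width" type="xs:int"/>
      <xs:element name="height" type="xs:int"/>
    </xs:all>
  </xs:complexType>

  <!-- Disjunction: exactly one of email or phone -->
  <xs:complexType name="Contact">
    <xs:choice>
      <xs:element name="email" type="xs:string"/>
      <xs:element name="phone" type="xs:string"/>
    </xs:choice>
  </xs:complexType>
</xs:schema>
```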

Particle

A particle is a term in the grammar for element content, consisting of either an element declaration, a wildcard or a model group, together with occurrence constraints. Particles contribute to validation as part of complex type definition validation, when they allow anywhere from zero to many element information items or sequences thereof, depending on their contents and occurrence constraints.

A particle can be used in a complex type definition to constrain the validation of the [children] of an element information item; such a particle is called a content model.

NOTE: 

XML Schema: Structures content models are similar to but more expressive than [ref-xml] content models; unlike [ref-xml], XML Schema: Structures applies content models to the validation of both mixed and element-only content.

For detailed information on particles, see [Particles].
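
Occurrence constraints are written with the minOccurs and maxOccurs attributes, both of which default to 1. The following sketch, using hypothetical element names, attaches them to element declarations and to a nested model group.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="playlist">
    <xs:complexType>
      <!-- The outer sequence is itself the content model particle -->
      <xs:sequence>
        <!-- Exactly once (minOccurs and maxOccurs default to 1) -->
        <xs:element name="title" type="xs:string"/>
        <!-- Zero or one -->
        <xs:element name="description" type="xs:string" minOccurs="0"/>
        <!-- One or more repetitions of a nested model group -->
        <xs:choice minOccurs="1" maxOccurs="unbounded">
          <xs:element name="track" type="xs:string"/>
          <xs:element name="interlude" type="xs:string"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```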

Attribute Use

An attribute use plays a role similar to that of a particle, but for attribute declarations: an attribute declaration within a complex type definition is embedded within an attribute use, which specifies whether the declaration requires or merely allows its attribute, and whether it has a default or fixed value.
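
A sketch of the three aspects of an attribute use, assuming a hypothetical Product type: whether the attribute is required, and whether it carries a default or a fixed value.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Product">
    <!-- Required: the attribute must appear -->
    <xs:attribute name="id" type="xs:ID" use="required"/>
    <!-- Optional (the default for use), with a default value -->
    <xs:attribute name="currency" type="xs:string" default="USD"/>
    <!-- Optional, but if present its value must be exactly 1.0 -->
    <xs:attribute name="version" type="xs:string" fixed="1.0"/>
  </xs:complexType>
</xs:schema>
```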

Wildcard

A wildcard is a special kind of particle which matches element and attribute information items dependent on their namespace name, independently of their local names.

For detailed information on wildcards, see [Wildcards].
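
As an illustration, the hypothetical envelope element below uses an element wildcard to admit children from other namespaces and an attribute wildcard to admit arbitrary attributes. The choice of processContents values here is only an example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="envelope">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="header" type="xs:string"/>
        <!-- Element wildcard: matched by namespace name, not local name -->
        <xs:any namespace="##other" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- Attribute wildcard: any attribute from any namespace -->
      <xs:anyAttribute namespace="##any" processContents="skip"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```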

Identity-constraint Definition Components

An identity-constraint definition is an association between a name and one of several varieties of identity-constraint related to uniqueness and reference. All the varieties use [bib-xpath] expressions to pick out sets of information items relative to particular target element information items which are unique, or a key, or a valid reference, within a specified scope. An element information item is only valid with respect to an element declaration with identity-constraint definitions if those definitions are all satisfied for all the descendants of that element information item which they pick out.

For detailed information on identity-constraint definitions, see [Identity-constraint Definitions].
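
The following sketch shows a key and a reference to it, assuming a hypothetical catalog element. The selector picks out a set of elements relative to catalog, and the field picks out the value that must be unique (for the key) or must match an existing key value (for the keyref).

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="catalog">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="product" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="sku" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
        <xs:element name="orderLine" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="sku" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>

    <!-- Key: every product must have a sku, unique within this catalog -->
    <xs:key name="productKey">
      <xs:selector xpath="product"/>
      <xs:field xpath="@sku"/>
    </xs:key>

    <!-- Reference: each orderLine sku must match some product sku -->
    <xs:keyref name="orderLineRef" refer="productKey">
      <xs:selector xpath="orderLine"/>
      <xs:field xpath="@sku"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```

The scope of both constraints is the catalog element on which they are declared, so the same sku may appear in two different catalog elements without conflict.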

Group Definition Components

There are two kinds of convenience definitions provided to enable the re-use of pieces of complex type definitions: model group definitions and attribute group definitions.

Model Group Definition

A model group definition is an association between a name and a model group, enabling re-use of the same model group in several complex type definitions.

For detailed information on model group definitions, see [Model Group Definitions].
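
A minimal sketch of such re-use, with hypothetical names: the named group AddressLines is defined once and referenced from two complex type definitions.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Model group definition: a name associated with a model group -->
  <xs:group name="AddressLines">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
    </xs:sequence>
  </xs:group>

  <xs:complexType name="Customer">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:group ref="AddressLines"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="Warehouse">
    <xs:sequence>
      <xs:group ref="AddressLines"/>
      <xs:element name="capacity" type="xs:int"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```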

Attribute Group Definition

An attribute group definition is an association between a name and a set of attribute declarations, enabling re-use of the same set in several complex type definitions.

For detailed information on attribute group definitions, see [Attribute Group Definitions].
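
By analogy with model group definitions, the sketch below defines a hypothetical attribute group named Audit and references it from a complex type definition.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Attribute group definition: a name associated with a set of
       attribute declarations -->
  <xs:attributeGroup name="Audit">
    <xs:attribute name="createdBy" type="xs:string"/>
    <xs:attribute name="createdOn" type="xs:date"/>
  </xs:attributeGroup>

  <!-- Invoice gets both attributes through the reference -->
  <xs:complexType name="Invoice">
    <xs:sequence>
      <xs:element name="total" type="xs:decimal"/>
    </xs:sequence>
    <xs:attributeGroup ref="Audit"/>
  </xs:complexType>
</xs:schema>
```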

Annotation Components

An annotation is information for human and/or mechanical consumers. The interpretation of such information is not defined in this specification.

For detailed information on annotations, see [Annotations].
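
As a sketch, an annotation in the XML representation can carry documentation for human readers and appinfo for tools. The price element, the tools namespace URI, and the hint element below are hypothetical; a schema processor does not interpret their content.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="price" type="xs:decimal">
    <xs:annotation>
      <!-- For human consumers -->
      <xs:documentation xml:lang="en">Unit price, excluding tax.</xs:documentation>
      <!-- For mechanical consumers: content is application-defined -->
      <xs:appinfo source="http://example.com/tools">
        <tool:hint xmlns:tool="http://example.com/tools">currency</tool:hint>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>
</xs:schema>
```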