5.1 The Influence of Serialization Parameters upon the XML Output Method
XML Output Method: the version Parameter
The version parameter specifies the version of XML and the version of Namespaces in XML to be used for outputting the instance of the data model. If the serializer does not support this version of XML, it MUST signal a serialization error. The version output in the XML declaration (if an XML declaration is output) MUST correspond to the version of XML that the serializer used for outputting the instance of the data model. The value of the version parameter MUST match the [NT-VersionNum] production of the XML Recommendation [XML].
If the serialized result would contain an [NT-NCName] that contains a character that is not permitted by the version of Namespaces in XML specified by the version parameter, a serialization error results. The serializer MUST signal the error.
If the serialized result would contain a character that is not permitted by the version of XML specified by the version parameter, a serialization error results. The serializer MUST signal the error.
For example, if the version parameter has the value 1.0, and the instance of the data model contains a non-whitespace control character in the range #x1 to #x1F, a serialization error results. If the version parameter has the value 1.1 and a comment node in the instance of the data model contains a non-whitespace control character in the range #x1 to #x1F or a control character other than NEL in the range #x7F to #x9F, a serialization error results.
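
The comment-node example above can be sketched as a character check. This is only an illustration: the function names and the `SerializationError` class are hypothetical, and the ranges are taken from the Char production of XML 1.0 and the Char and RestrictedChar productions of XML 1.1. It covers only contexts such as comments, where a character has to appear literally because character references are not recognized.

```python
class SerializationError(Exception):
    """Hypothetical error type standing in for a serialization error."""


def is_literal_char_permitted(ch: str, version: str) -> bool:
    """Return True if ch may appear literally in a document of this XML version."""
    cp = ord(ch)
    if version == "1.0":
        # Char production of XML 1.0.
        return (cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF
                or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)
    if version == "1.1":
        # Char production of XML 1.1, minus RestrictedChar (NEL, #x85, stays legal).
        in_char = (0x1 <= cp <= 0xD7FF or 0xE000 <= cp <= 0xFFFD
                   or 0x10000 <= cp <= 0x10FFFF)
        restricted = (0x1 <= cp <= 0x8 or 0xB <= cp <= 0xC
                      or 0xE <= cp <= 0x1F or 0x7F <= cp <= 0x84
                      or 0x86 <= cp <= 0x9F)
        return in_char and not restricted
    raise SerializationError(f"unsupported XML version: {version}")


def check_comment(text: str, version: str) -> None:
    """Raise SerializationError if a comment's text holds a forbidden character."""
    for ch in text:
        if not is_literal_char_permitted(ch, version):
            raise SerializationError(
                f"character #x{ord(ch):X} is not permitted in XML {version}")
```

Note that #x80 passes the check for version 1.0 but fails it for version 1.1, which matches the second half of the example.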
XML Output Method: the encoding Parameter
The encoding parameter specifies the encoding to use for outputting the instance of the data model. Serializers are REQUIRED to support values of UTF-8 and UTF-16. A serialization error occurs if an output encoding other than UTF-8 or UTF-16 is requested and the serializer does not support that encoding. The serializer MUST signal the error. The serializer MUST NOT use an encoding whose name does not match the [NT-EncName] production of the XML Recommendation [XML].
If no encoding parameter is specified, then the serializer MUST use either UTF-8 or UTF-16.
When outputting a newline character in the instance of the data model, the
serializer is free to represent it using any character sequence
that will be normalized to a newline character by an XML parser,
unless a specific mapping for the newline character is
provided in a character map: see [character-maps].
When outputting any other character that is defined in the selected encoding, the character MUST be output using the correct representation of that character in the selected encoding.
It is possible that the instance of the data model will contain a character that cannot be represented in the encoding that the serializer is using for output. In this case, if the character occurs in a context where XML recognizes character references (that is, in the value of an attribute node or text node), then the character MUST be output as a character reference. A serialization error occurs if such a character appears in a context where character references are not allowed (for example if the character occurs in the name of an element). The serializer MUST signal the error.
For example, if a text node contains the character LATIN SMALL LETTER E WITH ACUTE (#xE9), and the value of the encoding parameter is US-ASCII, the character MUST be serialized as a character reference. If a comment node contained the same character, a serialization error would result.
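
The #xE9 example can be sketched as follows. The helper names and the `SerializationError` class are hypothetical, and the sketch assumes that Python's codec for the requested encoding is an adequate test of whether a character is representable.

```python
class SerializationError(Exception):
    """Hypothetical error type standing in for a serialization error."""


def _representable(ch: str, encoding: str) -> bool:
    # Assumption: Python's codec machinery decides representability.
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def serialize_text(text: str, encoding: str) -> str:
    """Serialize text node content, using character references where needed."""
    out = []
    for ch in text:
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif _representable(ch, encoding):
            out.append(ch)
        else:
            # Text nodes are a context where character references are recognized.
            out.append(f"&#x{ord(ch):X};")
    return "".join(out)


def serialize_comment(text: str, encoding: str) -> str:
    """Serialize a comment; character references are not recognized here."""
    for ch in text:
        if not _representable(ch, encoding):
            raise SerializationError(
                f"#x{ord(ch):X} cannot be represented in {encoding}")
    return f"<!--{text}-->"
```

With these definitions, `serialize_text("caf\u00e9", "US-ASCII")` yields `caf&#xE9;`, while `serialize_comment("caf\u00e9", "US-ASCII")` raises the error.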
XML Output Method: the indent Parameter
If the indent parameter has the value yes, then the xml output method MAY output whitespace in addition to the whitespace in the instance of the data model (possibly based on whitespace stripped from either the source document or the stylesheet, in the case of XSLT, or guided by other means that might depend on the host language, in the case of an instance of the data model created using some other process) in order to indent the result nicely; if the indent parameter has the value no, it MUST NOT output any additional whitespace. If the xml output method does output additional whitespace, it MUST use an algorithm to output additional whitespace that satisfies the following constraints:
- Whitespace characters MUST NOT be added adjacent to a text node that contains non-whitespace characters.
- Whitespace MAY only be added adjacent to an element node, that is, immediately before a start tag or immediately after an end tag.
- The new whitespace characters MAY replace existing whitespace characters in the same position, for example a tab MAY be inserted as a replacement for existing spaces. However, existing whitespace MUST NOT be removed without such a replacement.
- Whitespace characters MUST NOT be inserted in a part of the result document that is controlled by an xml:space attribute with value preserve. (See [XML] for more information about the xml:space attribute.)
- Whitespace characters SHOULD NOT be added in places where the characters would be significant, for example in the content of an element whose content model is known to be mixed.
NOTE:
The effect of these rules is to ensure that whitespace is only added in places where (a) XSLT's <xsl:strip-space> declaration could cause it to be removed, and (b) it does not affect the string value of any element node with simple content. It is usually not safe to indent document types that include elements with mixed content.
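
One possible indentation algorithm that respects these constraints is sketched below using Python's `xml.etree.ElementTree`. The `indent` function and its parameters are hypothetical. The sketch cannot know content models, so it treats any element whose children are separated only by whitespace as having element-only content; a schema-aware serializer could do better.

```python
import xml.etree.ElementTree as ET

XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
XML_WS = " \t\r\n"


def _is_blank(s):
    """True for missing text or text made only of XML whitespace."""
    return s is None or s.strip(XML_WS) == ""


def indent(elem, level=0, preserve=False, step="  "):
    """Add indentation in place, only where the constraints above allow it."""
    # xml:space is inherited until an inner element overrides it.
    space = elem.get(XML_SPACE)
    if space == "preserve":
        preserve = True
    elif space == "default":
        preserve = False

    children = list(elem)
    # Never touch content where a text node holds non-whitespace characters.
    element_only = (bool(children) and _is_blank(elem.text)
                    and all(_is_blank(c.tail) for c in children))

    for child in children:
        indent(child, level + 1, preserve, step)

    if element_only and not preserve:
        inner = "\n" + step * (level + 1)
        elem.text = inner                 # before the first child's start tag
        for child in children[:-1]:
            child.tail = inner            # after each child's end tag
        children[-1].tail = "\n" + step * level
```

Given `<a><b><c/></b><p>text <i>x</i></p></a>`, the sketch indents `a` and `b` but leaves the mixed content of `p` untouched.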
XML Output Method: the cdata-section-elements Parameter
The cdata-section-elements parameter contains a list of expanded-QNames. If the expanded-QName of the parent of a text node is a member of the list, then the text node MUST be output as a CDATA section, except in those circumstances described below.
If the text node contains the sequence of characters ]]>, then the currently open CDATA section MUST be closed following the ]] and a new CDATA section opened before the >.
If the text node contains characters that are not representable in the character encoding being used to output the instance of the data model, then the currently open CDATA section MUST be closed before such characters, the characters MUST be output using character references or entity references, and a new CDATA section MUST be opened for any further characters in the text node.
CDATA sections MUST NOT be used except where they have been explicitly requested by the user, either by using the cdata-section-elements parameter, or by using some other implementation-defined mechanism.
NOTE:
This is phrased to permit an implementor to provide an option that
attempts to preserve CDATA sections present in the source
document.
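
The two splitting rules can be sketched in a few lines. The function name `cdata_sections` is hypothetical, and the sketch again assumes that Python's codec for the output encoding decides which characters are representable.

```python
def cdata_sections(text: str, encoding: str = "UTF-8") -> str:
    """Serialize text as CDATA, splitting at ]]> and at unencodable characters."""
    out = []
    run = []  # characters collected for the currently open CDATA section

    def flush():
        if run:
            # Close the section after "]]" and reopen it before ">".
            chunk = "".join(run).replace("]]>", "]]]]><![CDATA[>")
            out.append("<![CDATA[" + chunk + "]]>")
            run.clear()

    for ch in text:
        try:
            ch.encode(encoding)
            run.append(ch)
        except UnicodeEncodeError:
            # Close the open section and fall back to a character reference.
            flush()
            out.append(f"&#x{ord(ch):X};")
    flush()
    return "".join(out)
```

For instance, `cdata_sections("a]]>b")` gives `<![CDATA[a]]]]><![CDATA[>b]]>`, and `cdata_sections("x\u00e9y", "US-ASCII")` gives `<![CDATA[x]]>&#xE9;<![CDATA[y]]>`.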
XML Output Method: the omit-xml-declaration and standalone Parameters
The xml output method MUST output an XML declaration if the omit-xml-declaration parameter has the value no. The XML declaration MUST include both version information and an encoding declaration. If the standalone parameter has the value yes or the value no, the XML declaration MUST include a standalone document declaration with the same value as the value of the standalone parameter. If the standalone parameter has the value none, the XML declaration MUST NOT include a standalone document declaration; this ensures that it is both an XML declaration (allowed at the beginning of a document entity) and a text declaration (allowed at the beginning of an external general parsed entity).
A serialization error results if the omit-xml-declaration parameter has the value
yes, and
The serializer MUST signal the error.
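
A minimal sketch of how the XML declaration might be assembled from these parameters follows. The function name and its defaults are hypothetical, and the error conditions tied to omit-xml-declaration are deliberately not modelled.

```python
def xml_declaration(version: str = "1.0",
                    encoding: str = "UTF-8",
                    standalone: str = "none",
                    omit_xml_declaration: str = "no") -> str:
    """Build the XML declaration, or return "" when it is omitted."""
    if omit_xml_declaration == "yes":
        return ""
    # Version information and an encoding declaration are always included.
    parts = [f'version="{version}"', f'encoding="{encoding}"']
    # A standalone document declaration appears only for yes or no;
    # with the value none it is left out.
    if standalone in ("yes", "no"):
        parts.append(f'standalone="{standalone}"')
    return "<?xml " + " ".join(parts) + "?>"
```

Leaving out the standalone document declaration keeps the result usable as a text declaration as well, which is the property the paragraph above describes.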
XML Output Method: the doctype-system and doctype-public Parameters
If the doctype-system parameter is specified, the xml output method MUST output a document type declaration immediately before the first element. The name following <!DOCTYPE MUST be the name of the first element, if any. If the doctype-public parameter is also specified, then the xml output method MUST output PUBLIC followed by the public identifier and then the system identifier; otherwise, it MUST output SYSTEM followed by the system identifier. The internal subset MUST be empty. The doctype-public parameter MUST be ignored unless the doctype-system parameter is specified.
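
These rules reduce to a small decision, sketched here with a hypothetical `doctype_declaration` function. The sketch assumes that neither identifier contains a double quote, so both can be written in double quotes.

```python
def doctype_declaration(root_name: str,
                        doctype_system: str = None,
                        doctype_public: str = None) -> str:
    """Build the document type declaration, or "" if none is required."""
    if doctype_system is None:
        # doctype-public is ignored unless doctype-system is specified.
        return ""
    if doctype_public is not None:
        return (f'<!DOCTYPE {root_name} PUBLIC '
                f'"{doctype_public}" "{doctype_system}">')
    # No internal subset is ever written.
    return f'<!DOCTYPE {root_name} SYSTEM "{doctype_system}">'
```

For example, `doctype_declaration("doc", "doc.dtd")` gives `<!DOCTYPE doc SYSTEM "doc.dtd">`; the names `doc` and `doc.dtd` are placeholders.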
XML Output Method: the undeclare-namespaces Parameter
The Data Model allows an element node that binds a non-empty prefix to have a child element node that does not bind that same prefix. In Namespaces in XML 1.1 ([XMLNAMES11]), this can be represented accurately by undeclaring namespaces. If the undeclare-namespaces parameter has the value yes and the output method is XML and the version is greater than 1.0, the serializer MUST undeclare namespaces.
Consider an element x:foo with four in-scope namespaces that associate prefixes with URIs as follows:
- x is associated with http://example.org/x
- y is associated with http://example.org/y
- z is associated with http://example.org/z
- xml is associated with http://www.w3.org/XML/1998/namespace
Suppose that it has a child element x:bar with three in-scope namespaces:
- x is associated with http://example.org/x
- y is associated with http://example.org/y
- xml is associated with http://www.w3.org/XML/1998/namespace
If namespace undeclaration is in effect, it will be serialized this way:
<x:foo xmlns:x="http://example.org/x"
xmlns:y="http://example.org/y"
xmlns:z="http://example.org/z">
<x:bar xmlns:z="">...</x:bar>
</x:foo>
In Namespaces in XML ([XMLNAMES]), namespace undeclaration is not possible. If the output method is xml, the value of the undeclare-namespaces parameter is yes, and the value of the version parameter is 1.0, a serialization error results; the serializer MUST signal the error.
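
The x:foo and x:bar example can be reproduced with a short sketch that compares the in-scope namespaces of a parent and a child. The function name, the dictionary representation, and the `SerializationError` class are hypothetical, and only non-empty prefixes are considered.

```python
class SerializationError(Exception):
    """Hypothetical error type standing in for a serialization error."""


def namespace_attributes(parent_ns: dict, child_ns: dict,
                         version: str, undeclare_namespaces: str) -> dict:
    """Return the xmlns:* attributes to write on the child's start tag."""
    if undeclare_namespaces == "yes" and version == "1.0":
        raise SerializationError(
            "namespace undeclaration is not possible with version 1.0")

    attrs = {}
    # Declare bindings that are new or changed relative to the parent.
    for prefix, uri in child_ns.items():
        if prefix != "xml" and parent_ns.get(prefix) != uri:
            attrs[f"xmlns:{prefix}"] = uri
    # Undeclare bindings the parent has but the child does not.
    if undeclare_namespaces == "yes":
        for prefix in parent_ns:
            if prefix != "xml" and prefix not in child_ns:
                attrs[f"xmlns:{prefix}"] = ""
    return attrs
```

Called with the namespaces of x:foo as the parent, those of x:bar as the child, version 1.1 and undeclare-namespaces set to yes, the sketch returns only `xmlns:z=""`, matching the serialized result shown above.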
XML Output Method: the normalization-form Parameter
The normalization-form parameter is applicable for the xml output method. The values NFC and none MUST be supported by the serializer. A serialization error results if the value of the normalization-form parameter specifies a normalization form that is not supported by the serializer; the serializer MUST signal the error.
It is a serialization error if the value of the parameter is fully-normalized and any relevant construct of the result begins with a combining character. The serializer MUST signal the error. See Section 2.13 of [XML11] for the definition of the relevant constructs of XML.
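
A rough sketch of this behaviour, using Python's `unicodedata` module, is given below. The function name and `SerializationError` are hypothetical, and the sketch supports only the values named in this section. It treats a single string as one relevant construct and approximates "combining character" by a non-zero canonical combining class; both are simplifying assumptions rather than the definitions in [XML11].

```python
import unicodedata


class SerializationError(Exception):
    """Hypothetical error type standing in for a serialization error."""


def apply_normalization_form(text: str, normalization_form: str) -> str:
    """Normalize the content of one construct according to the parameter."""
    if normalization_form == "none":
        return text
    if normalization_form == "NFC":
        return unicodedata.normalize("NFC", text)
    if normalization_form == "fully-normalized":
        result = unicodedata.normalize("NFC", text)
        # Approximation: a non-zero combining class marks a combining character.
        if result and unicodedata.combining(result[0]) != 0:
            raise SerializationError(
                "construct begins with a combining character")
        return result
    raise SerializationError(
        f"unsupported normalization form: {normalization_form}")
```

For example, `apply_normalization_form("e\u0301", "NFC")` returns the single character #xE9, while a string that starts with U+0301 is rejected under fully-normalized.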
XML Output Method: Other Parameters
The media-type parameter is applicable for the
xml output method.
See [serparam] for more
information.
The use-character-maps parameter is applicable for the
xml output method.
See [character-maps] for
more information.
The byte-order-mark parameter is
applicable for the xml output method. See
[serparam] for more information.