3.7 Lexical Structure
When tokenizing, the longest possible token is always returned.
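As an illustration of the longest-token rule, the following minimal sketch scans a string made only of punctuation and operator symbols. The `LITERALS` list is a hand-picked subset of the expression tokens, and the names `longest_literal` and `split_literals` are hypothetical helpers introduced for this example, not part of the specification.

```python
# Hypothetical subset of the punctuation and operator symbols (an assumption
# made for this sketch; names, numbers, and string literals are left out).
LITERALS = ["/", "//", ".", "..", "<", "<=", ">", ">=", "!=", "::", "=",
            "|", "+", "-", "@", "(", ")", "[", "]", ",", "*"]


def longest_literal(text, pos):
    """Return the longest literal that starts at text[pos], or None."""
    best = None
    for lit in LITERALS:
        if text.startswith(lit, pos) and (best is None or len(lit) > len(best)):
            best = lit
    return best


def split_literals(text):
    """Split a string consisting only of literals and whitespace into tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace separates tokens but is not a token itself
            continue
        lit = longest_literal(text, pos)
        if lit is None:
            raise ValueError(f"no literal matches at offset {pos}")
        tokens.append(lit)
        pos += len(lit)
    return tokens


print(split_literals("//..<="))  # ['//', '..', '<=']
print(split_literals("/ /"))     # ['/', '/']
```

Because the longest candidate always wins, `//` is never read as two `/` tokens unless whitespace separates them.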
For readability, whitespace may be used in expressions even though not explicitly allowed by the grammar: [ExprWhitespace] may be freely added within patterns before or after any [ExprToken].
The following special tokenization rules must be applied in the order specified to disambiguate the [ExprToken] grammar:
- If there is a preceding token and the preceding token is not one of @, ::, (, [, , or an [Operator], then a * must be recognized as a [MultiplyOperator] and an NCName must be recognized as an [OperatorName].
- If the character following an NCName (possibly after intervening [ExprWhitespace]) is (, then the token must be recognized as a [NodeType] or a [FunctionName].
- If the two characters following an NCName (possibly after intervening [ExprWhitespace]) are ::, then the token must be recognized as an [AxisName].
- Otherwise, the token must not be recognized as a [MultiplyOperator], an [OperatorName], a [NodeType], a [FunctionName], or an [AxisName].
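The four rules above can be sketched as a small tokenizer. This is an illustrative approximation under stated assumptions, not a conforming implementation: `NAME` is an ASCII-only stand-in for NCName, prefixed names, string literals, and variable references are not handled, and the kind labels `Punct`, `Number`, and `NameTest`, as well as the function name `tokenize`, are choices made for this example. Raising `ValueError` when a name in operator position is not `and`, `or`, `mod`, or `div` is likewise an assumption about how an error might be reported.

```python
import re

NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")  # ASCII approximation of NCName
NUMBER = re.compile(r"\d+(?:\.\d*)?|\.\d+")
# Two-character literals come first so the longest possible token is returned.
LITERALS = ["//", "::", "..", "!=", "<=", ">=", "/", "|", "+", "-", "=", "<",
            ">", "(", ")", "[", "]", ".", "@", ",", "*"]
OPERATOR_SYMBOLS = {"/", "//", "|", "+", "-", "=", "!=", "<", "<=", ">", ">="}
OPERATOR_NAMES = {"and", "or", "mod", "div"}
NODE_TYPES = {"comment", "text", "processing-instruction", "node"}
OPERATOR_KINDS = {"Operator", "OperatorName", "MultiplyOperator"}
NO_OPERATOR_AFTER = {"@", "::", "(", "[", ","}  # punctuation listed in rule 1
WHITESPACE = " \t\r\n"


def tokenize(expr):
    """Return a list of (kind, text) pairs for a simplified XPath expression."""
    tokens = []
    pos = 0
    while True:
        while pos < len(expr) and expr[pos] in WHITESPACE:
            pos += 1
        if pos >= len(expr):
            return tokens

        # Rule 1 precondition: a preceding token exists and is neither one of
        # @ :: ( [ , nor an operator.
        prev = tokens[-1] if tokens else None
        operator_expected = (
            prev is not None
            and prev[0] not in OPERATOR_KINDS
            and not (prev[0] == "Punct" and prev[1] in NO_OPERATOR_AFTER)
        )

        match = NUMBER.match(expr, pos)
        if match:
            tokens.append(("Number", match.group()))
            pos = match.end()
            continue

        match = NAME.match(expr, pos)
        if match:
            text = match.group()
            pos = match.end()
            rest = expr[pos:].lstrip(WHITESPACE)  # look past optional whitespace
            if operator_expected:                  # rule 1
                if text not in OPERATOR_NAMES:
                    raise ValueError(f"expected an operator name, got {text!r}")
                kind = "OperatorName"
            elif rest.startswith("("):             # rule 2
                kind = "NodeType" if text in NODE_TYPES else "FunctionName"
            elif rest.startswith("::"):            # rule 3
                kind = "AxisName"
            else:                                  # rule 4
                kind = "NameTest"
            tokens.append((kind, text))
            continue

        for lit in LITERALS:
            if expr.startswith(lit, pos):
                if lit == "*":
                    # Rule 1 for *, otherwise rule 4 (a wildcard name test).
                    kind = "MultiplyOperator" if operator_expected else "NameTest"
                elif lit in OPERATOR_SYMBOLS:
                    kind = "Operator"
                else:
                    kind = "Punct"
                tokens.append((kind, lit))
                pos += len(lit)
                break
        else:
            raise ValueError(f"unexpected character {expr[pos]!r} at offset {pos}")


print([kind for kind, _ in tokenize("* * *")])
# ['NameTest', 'MultiplyOperator', 'NameTest']
print([kind for kind, _ in tokenize("div div div")])
# ['NameTest', 'OperatorName', 'NameTest']
```

The two sample expressions show why the rules depend on the preceding token: the same text `*` or `div` is a name test at the start of an expression or after an operator, and an operator after an operand.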
Expression Lexical Structure