Re: An approach to let XML 2.n resources hold multiple entitie
From: "Jeni Tennison" <jeni@j...>
> Here's a third possibility: in XPath 2.0, the collection() function
> returns a sequence of nodes from a particular URL. In the case of a
> file that contains multiple documents, collection("first example")
> could return multiple document nodes, so you would use
> collection("first example")/* to get the <y> element.
...
> Such a document could also act as the input to a transformation. In
> XSLT 2.0, the input() function returns a sequence of nodes as an
> input, and in the case of a "resource"/"collection document" in the
> format you suggest, this could be a sequence of document nodes. So if
> your document was acting as the input to a transformation then
> input()/* would return the <y> element.

Great! Also, I am not suggesting files containing multiple documents, but [the result of dereferencing a URL]'s containing multiple XML entities.

"Bill de hÓra" wrote:
> This proposal will probably result in encoding weirdness unless it
> offers some guidance in that area.

There would be a WF constraint that only the first entity had an encoding declaration; the subsequent ones would have an XML declaration only. The whole file/stream would be in a single encoding. (I would prefer "approach" to "proposal": much too early!)

From: <AndrewWatt2000@a...>
> Would you care to lay out the use case for this suggested change?

1) Log files, or documents where you want to continually append fragments, without wanting to keep the context to end them, for example in a stream. Using that XPath2 input() function Jeni mentioned (without trying to get a proper XPath2 path):

<?xml ...?>
<!DOCTYPE logs [
<!ENTITY contents SYSTEM "#input()/*[position()!=1]">
]>
<logs>&contents;</logs>
<?xml ...?>
log 1
<?xml ...?>
log 2
...

2) Incremental or lazy parsing of documents. The parser reads the first document (e.g. into a DOM).
When the user agent requests elements in a subsequent entity, the parser continues parsing (or fast scans) to that entity, then parses it. (This shows the downside of using text rather than a control character: super fast skipping over entities is not really available--you need to have some simple delimiter-aware/element-stack-aware skipping.)

3) Transmitting a Post Schema Validation Infoset without altering the original document: the PSVI augmentations are added as extra entities to the same resource, thereby not altering the original document's XPaths at all. Or any document where we want to have out-of-line annotations to an existing document, preserve the original document intact, and transmit the whole thing as one resource.

3a) Transmit a RELAX NG, Schematron or XSD schema or XSLT stylesheet along with the document.

3b) A RDDL document, together with the XML resources it references.

Or any time we want to bundle together different "documents" which each use a different standard schema but act as a whole, and where we want to name and access them as a single resource.

4) To suit transmission of documents over the web, where we want to be able to start rendering the document as soon as we receive it. This is problematic in XML, because if a WF error is found, the document is supposed to fail. Using this multi-entity XML, the user agent does not need to wait till the whole document is received before rendering that top-level chunk. If datacoms are bursty, then starting to render the document does not need to wait until the end. Consider how Acrobat makes pages available as soon as they are received, or HTML's progressive rendering. Any time we have a sequence of pages and once the user has started at the first one, we want to have subsequent ones pre-fetched and available ASAP.

Note that the use of the term "entity" here does not imply that they are in any way tied to XML entity declarations (syntax) or XML entity reference syntax.
They need not be declared, but could be referenced using XInclude. Indeed this kind of XInclude use might be more appropriate for this use case, to avoid WF controversies.

5) To provide a way out of the signing problem, to make it trivial for a document to be sent along with metadata about itself such as checksums. Or where the metadata is not part of the document but application specific: such as where the document does not have Dublin Core elements in its schema, but we want to ship along a Dublin Core metadata file with the document.

6) To decrease the amount of buffering required at the server side. This is the table of contents (TOC) problem. If we want to progressively accumulate elements when transforming a document, then place these first in our document, we have to suspend transmitting our document at that point until we have harvested all the information that is supposed to go there. Instead, with this we can transmit the document first with a reference at the TOC point, then transmit the TOC as a subsequent entity in the same resource. That they are in the same resource means we don't need to make the TOC available at some temporary URI, and the user agent does not need to make a second request nor open another connection. The recipient puts the information in sequence, not the sender.

7) Because entities (reading this to mean "fragments" and avoiding kneejerks based on XML markup declarations) are a form of modularity which gives programmers more flexibility. A Good Thing.

8) Where we want to add information as part of a document, but we don't want to check for ID clashes (the other entities can act as SUBDOCs and have their own ID scope if they are not referenced with & as entities.)

9) So that application settings can be saved along with a document. For example editor settings: rather than pollute the document with extraneous PIs or elements, the information is tacked on to the end.

10) To allow the storage of deltas, out-of-band with a document.
The first entity is the document, and some subsequent unreferenced entity gives deltas. This is for editing and version control.

Probably there are more, and I don't know whether any of these are particularly compelling.

Cheers
Rick
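
The incremental parsing idea in use case 2 might be sketched roughly as below. This is only an illustration under stated assumptions: the multi-entity layout is the hypothetical one discussed in this thread (each entity introduced by its own XML declaration), and the names split_entities and parse_lazily are invented for the example. The scan is a plain text search, so it would be fooled by a declaration-like string inside a comment or CDATA section, which is the delimiter-aware skipping problem noted under use case 2.

```python
import re
import xml.etree.ElementTree as ET
from typing import Iterator

# Assumption: every entity in the resource starts with an XML declaration.
# "<?xml" must be followed by whitespace so that processing instructions
# such as "<?xml-stylesheet ...?>" are not mistaken for a delimiter.
_DECL = re.compile(r"<\?xml\s[^?]*\?>")


def split_entities(resource: str) -> Iterator[str]:
    """Yield the body of each entity (declaration removed), one at a time."""
    start = None
    for match in _DECL.finditer(resource):
        if start is not None:
            yield resource[start:match.start()].strip()
        start = match.end()
    if start is not None:
        yield resource[start:].strip()


def parse_lazily(resource: str) -> Iterator[ET.Element]:
    """Parse each entity only when the caller asks for it."""
    for body in split_entities(resource):
        yield ET.fromstring(body)


# Hypothetical three-entity resource: a document followed by two log entries.
resource = (
    '<?xml version="1.0" encoding="UTF-8"?>\n<logs/>\n'
    '<?xml version="1.0"?>\n<log>first</log>\n'
    '<?xml version="1.0"?>\n<log>second</log>\n'
)

entities = parse_lazily(resource)
first = next(entities)                # only the first entity is parsed here
rest = [e.text for e in entities]     # later entities are parsed on demand
```

Because both functions are generators, a well-formedness error in a later entity would only surface when that entity is requested, which is the behaviour use case 4 relies on for progressive rendering.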