Re: Is XML only half finished? The X Refactor
Oops, format got lost. Try again.

Where is XML thriving?
Where is XML not thriving?
What have been the pain points for XML over the years?
So... is there a way to address these pain points and evolve XML? I think there is, one that claws back many features lost from XML while keeping a neat, simple pipeline that causes the least disruption to current APIs.

Here is what I am thinking. XML is evolved into a notional pipeline of up to five steps: XML Macro Processor, Fully Resolved XML Processor, Notation Expander, Validation Processing, and Decorating Post-Processor. Let's call it "The X Refactor".

1) "XML Macro Processor"

Full-featured macro processor, taking the features of M4: text substitution, file insertion, conditional text. Just before the advent of XML, Dave Peterson had proposed to the ISO committee enhancing the marked section mechanism with better conditional logic (not, and, or, etc.), so this is not a left-field idea. (This is an enhanced standalone version of what SGML calls the "Entity Manager".)

Suggestion: Input: bytes. Output: fully resolved XML Unicode text.
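To make the macro step concrete, here is a minimal Python sketch of a bytes-in, text-out pass with text substitution and conditional sections. Everything specific in it is an assumption for illustration: the `<![IF expr[ ... ]]>` syntax, the `expand` and `evaluate` names, and the `flags` and `macros` parameters are hypothetical and not part of any proposal or standard. Sections do not nest, and file insertion is left out.

```python
import re

# Hypothetical syntax: <![IF draft AND NOT print[ ...text... ]]> (no nesting).
SECTION = re.compile(r"<!\[IF\s+(.*?)\[(.*?)\]\]>", re.DOTALL)
TOKEN = re.compile(r"[A-Za-z_][\w.-]*|[()]")
REFERENCE = re.compile(r"&([A-Za-z_][\w.-]*);")


def evaluate(expr, flags):
    """Evaluate a NOT/AND/OR expression; a name is true if it is in flags."""
    tokens = TOKEN.findall(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while peek() == "OR":
            take()
            rhs = parse_and()  # always parse the right side before combining
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "AND":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "NOT":
            take()
            return not parse_not()
        if peek() == "(":
            take()
            value = parse_or()
            take()  # closing parenthesis
            return value
        return take() in flags

    return parse_or()


def expand(raw, flags, macros, encoding="utf-8"):
    """Bytes in, fully resolved Unicode text out."""
    text = raw.decode(encoding)
    # Keep or drop each conditional section.
    text = SECTION.sub(
        lambda m: m.group(2) if evaluate(m.group(1), flags) else "", text
    )
    # Substitute named macros; unknown references are left untouched.
    return REFERENCE.sub(lambda m: macros.get(m.group(1), m.group(0)), text)
```

For example, `expand(b"<doc>&product;</doc>", set(), {"product": "Widget"})` yields text with no references left in it, which is what the next stage expects.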
Incompatibilities:
2) Fully Resolved XML Processor

Stripped-back XML processor without encoding handling, DOCTYPE declaration, CDATA sections, entity references, or numeric character references.

Suggestion: Input: Unicode text. Output: XML event stream.
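A rough Python sketch of how small such a processor could be: Unicode text in, a stream of event tuples out, with no DOCTYPE, CDATA, or reference handling. The `events` function name and the tuple shapes (`"start"`, `"end"`, `"text"`) are assumptions made for this illustration. Comments, processing instructions, well-formedness checks, and error recovery are all omitted.

```python
import re

# A start tag, end tag, or empty-element tag with quoted attributes.
MARKUP = re.compile(
    r"""<(/?)([A-Za-z_][\w:.-]*)((?:\s+[\w:.-]+\s*=\s*(?:"[^"]*"|'[^']*'))*)\s*(/?)>"""
)
ATTR = re.compile(r"""([\w:.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def events(text):
    """Yield ("start", name, attrs), ("end", name), and ("text", data) tuples."""
    pos = 0
    for m in MARKUP.finditer(text):
        if m.start() > pos:
            yield ("text", text[pos:m.start()])
        closing, name, attrs, empty = m.groups()
        if closing:
            yield ("end", name)
        else:
            # findall gives "" for the quote style that did not match.
            attributes = {k: d or s for k, d, s in ATTR.findall(attrs)}
            yield ("start", name, attributes)
            if empty:
                yield ("end", name)  # <b/> becomes a start/end pair
        pos = m.end()
    if pos < len(text):
        yield ("text", text[pos:])
```

Because the input is already fully resolved, the tokenizer never has to look anything up: every `<` starts markup and everything else is text.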
Benefits:
Incompatibilities:
3) Notation Expander

Process the contents of some element and replace delimiters with tags. The processor uses a Notation Definition Specification, which uses regular expressions and reuses the same tag implication fixup as the Error Handling Specification of the Fully Resolved XML processor above. The elements generated are synchronized with the containing element. Element markup inside the notation is allowed or rejected (as a kind of validation).

Specialist notation processors are also possible: namely for JSON, for the QuickFixes (Schematron parse and fixup), and to reconstruct the XML SHORT REF mechanism. Stretching it a bit, an HTML 5 style element hoisting might go in this stage too.

Input: XML Event Stream. Output: XML Event Stream.

Benefits: This is to reconstruct the idea of the SHORT-REF->ENTITY-REF->MARKUP mechanism in XML, where in a context you can define that a character like * should be shorthand for entity reference &XXX; and that this entity could contain a start tag <XXX> which would then be closed off by implication or explicitly or by some other shortreffed character.
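As an illustration of the event-stream-in, event-stream-out shape of this step, here is a small Python sketch in which `*...*` inside a chosen element is turned into an element. The rule table, the `expand_notation` name, and the event tuples are hypothetical stand-ins for a real Notation Definition Specification. Tag implication and the allow-or-reject check on embedded markup are not modelled.

```python
import re


def expand_notation(events, rules):
    """Replace delimiters in text with tags, per containing element.

    rules maps an element name to (compiled pattern, generated tag name);
    group 1 of the pattern becomes the content of the generated element.
    """
    stack = []  # names of the currently open elements
    for ev in events:
        kind = ev[0]
        if kind == "start":
            stack.append(ev[1])
            yield ev
        elif kind == "end":
            stack.pop()
            yield ev
        elif kind == "text" and stack and stack[-1] in rules:
            pattern, tag = rules[stack[-1]]
            text, pos = ev[1], 0
            for m in pattern.finditer(text):
                if m.start() > pos:
                    yield ("text", text[pos:m.start()])
                # Generated elements open and close inside the container,
                # so they stay synchronized with it.
                yield ("start", tag, {})
                yield ("text", m.group(1))
                yield ("end", tag)
                pos = m.end()
            if pos < len(text):
                yield ("text", text[pos:])
        else:
            yield ev


# Hypothetical rule: inside <para>, *word* is shorthand for <em>word</em>.
RULES = {"para": (re.compile(r"\*(.+?)\*"), "em")}
```

Text in any element without a rule passes through untouched, which is the context-sensitivity that the SHORT REF idea depends on.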
4) Validation Processing

Input: XML stream. Output: Enhanced XML Event Stream (PSVI), or [XML input stream, XML validation report language].

This can use any subsequent DTD stage, or XSD, or any combination of the DSDL family (RELAX NG, Schematron, CRDL for character range validation, NVDL for namespace remapping, and so on).

Benefits:
* The technology for this part of the tool chain is available.
* Except that there needs to be an "XML" output from validation. Consequently, this needs either a type-enhanced standard SAX (for a Post Schema Validation Infoset), or a dual stream of the input plus an event stream of the validation report linking properties and errors to the original document (i.e. ISO SVRL).

5) Decorating Post-Processor

This would perform simple transformations and streamable insertions into the event stream. (It could also be run before validation if needed.)

Suggestion: Input: (enhanced) XML Event Stream. Output: (enhanced) XML Event Stream.

Benefits:
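One possible reading of the decorating step, sketched in Python: a generator that passes events through and inserts configured events just after a matching start tag. The `decorate` name, the `insertions` table, and the event tuples are assumptions for illustration only; no decoration language is defined in this proposal.

```python
def decorate(events, insertions):
    """Pass events through, inserting extra events after matching start tags.

    insertions maps an element name to a list of events to insert.
    """
    for ev in events:
        yield ev
        if ev[0] == "start" and ev[1] in insertions:
            # Streamable: needs no lookahead and no buffering.
            yield from insertions[ev[1]]
```

For instance, `decorate(stream, {"note": [("text", "Note: ")]})` would prefix the content of every `note` element with generated text while leaving the rest of the stream unchanged.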
What would it take?

1) Split apart an XML Processor into two parts. Dump DOCTYPE processing. Define and add marked section logic expressions (AND | OR | etc.) to the Macro processor. Implement as a text pipe or as an InputStream. Add the error recovery. (An existing XML processor will accept Fully Resolved XML as is.)

2) Make some generic notation processor (annotated BNF + tag implication). A standard language should be adopted. Make specialist processors for math and for XML QuickFix. Allow invocation either by a PI as the first child of the parent to flag the notation, or by some config file. Implement as text pipeline or SAX stream processor.

3) Validation technology exists. But how to sequence it is an open question (that DSDL punted): please not XProc. But does SAX support the PSVI?

4) A simple streaming substitution language would be trivial to define and implement as a SAX Stream. It would be a processing decision to add this, but there is no harm in notating this with a PI. A standard language should be adopted.

So I don't see this as very disruptive at the API level.

Afterthought: 20 years ago, when we were chopping up SGML to formulate XML, the thought was that we could afford to remove much useful functionality either because (such as with schemas) it could be upgraded into a different stage in the pipeline or (such as with conditional marked sections) because it was a back-end task suited inside servers rather than the wire format (SGML-on-the-Web). We left the job unfinished: the pipeline is incomplete, and the back-end uses turned out to be the main use case and have been neglected.

The aim is not to reconstruct all of SGML, and certainly not to make a monolithic system with lots of feedback: we don't need an SGML Declaration 2.0! But I suggest that filling out the pipeline would support many use cases.

On Mon, Feb 12, 2018 at 2:32 PM, Rick Jelliffe <rjelliffe@allette.com.au> wrote: