Re: PI target names
From: Don Park <donpark@q...>

>> One can wonder why PIs were specified in the first place, if W3C doesn't
>> want anyone to use them.
>
>What I would like to know is WHY W3C does not want to encourage folks to use
>PI. Perhaps I'll agree with them once I hear all the arguments, perhaps
>not. I would like to know what the arguments are.

We should be careful not to create a myth that everyone involved in W3C is anti-PI. Some people perhaps feel that XML is a technology preview to prototype ideas for HTML; that is a credible view to have (though I don't hold it personally). For them PIs represent a dead end because, they think, it is too late to retrofit them into HTML.

* To me PIs represent, above all, a proposal of humility by SGML/XML's designers: to admit that even within the most carefully constructed schema, there may in practice be kinds of tags required that the schema designer has not anticipated or expected. These tags may represent kinds of structures which simply do not fit into the element structure. (And the people who detect these different structures may not have the authority or capability to alter the schema.)

* Furthermore, there may be conflicting ideas about interesting points within a document, and such different ideas should be allowed, but only by using the "low-hanging fruit" of point-based tags, not a full concurrent or asynchronous (wrong word) element tree. For example, in what schema language can you say "a document entity must start with this encoding header"? Or, "before the top element you must associate a stylesheet"? If entities were forced to start with elements and always to contain at least one element, then we could do away with these kinds of PIs: we could use attributes on elements.
There is always the problem that in most DTDs (and in some of the schemas; EDD is an exception) there are many possible root element types, and it is not possible to define attribute requirements based on tree locations (actually, SGML's attribute LINKing allowed some kinds of variant attribute requirements based on tree position). This creates a kind of aberrant category of attributes which belong to tree locations rather than element types.

* PIs also represent a method of extensibility in which the PI tags do not alter validation against the DTD. It would be nice to have a schema declaration language which allowed kinds of validity (or at least some kind of notation well-formedness) of PIs. But we should not think that extensibility was entirely missing from SGML: the trouble with PIs as traditionally practised in SGML was that there was no "target" convention enforced or defined, so the extensibility was never able to get organized.

* In the absence of a standard way of pointing to individual character positions (numerical character indexing), there is no standard way to have out-of-line markup inside entities which does not disrupt element structures. PIs provide a way of tagging positions, either to accomplish inline parallel structures, if needed, or as targets for out-of-line markup. Unnormalized Unicode has a big ambiguity problem which makes numerical character indexing unreliable or problematic for many written scripts, so it may be useful to have the back-up of being able to index to particular PIs within an element instead.

* The classic use of PIs is as a tag to hang publication dependencies on. (SGML also had another kind of attribute, the PI attribute, which allowed you to hang PIs off elements too. I don't know whether this can be simulated by ENTITY attributes, where the entity contains a marked-up PI, but I doubt it.) So, for example, you might decide that all pagebreaks and newlines should be signified by PIs.
This simplifies content models no end (I can show examples, but they are for Chinese documents). If HTML had PIs, these could have been used to hide scripts (instead of <!-- which is just plain wrong) and for Server-Side Includes: not having this form of markup meant that comments had to be abused.

* The other big justification for PIs is an analytical one. Of course all the structures in XML can be reduced to LISP S-expressions, or RDF graphs and other intricate webs of arcs. But then the pieces need to be reassembled for the sake of comprehensibility and usability: characters are kept as strings, not individual numbers each in their own tag, for example; some arcs are labelled to make a tree structure (the other arcs are made links or attributes). So PIs represent part of a theory of document structure or construction that says that element structure != entity structure != notation structure != processing instruction structure (perhaps "!=" is too strong: "need not equal" is better; and of course, XML simplifies this in the interests of parseability). Note that this is not a theory about the data itself; it is about documents/serializations.

Rick Jelliffe

P.S. It would be nice if the Schema group made declarations to allow numeric character references inside PIs and comments.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)