Re: What are the characteristics of a good type system for XML?
Heylas, Andrew,

On Tue, 13 May 2003 04:21:16 EDT
AndrewWatt2000@a... wrote:

> In a message dated 13/05/2003 04:37:12 GMT Daylight Time,
> amyzing@t... writes:
>
> > Slogan version: complete, consistent, comprehensible.
>
> <disclaimer>Comments from here on are very much the late night musings
> of a neophyte.</disclaimer>
>
> My late night list of headline wants was:
>
> 1. Easy to understand
> 2. Practical
> 3. Modular / Layered
> 4. Facilities to derive new types

I'm not quite certain what "practical" means in context.  Perhaps "most
commonly used types are already defined"?

> > Consistent: the system must be rules based.  The rules must
> > logically follow one another.  A good start might be to restrict the
> > characterization of types to that relevant to XML (that is, to the
> > realm of data transmission and storage, excluding data
> > manipulation).
>
> Isn't that possibly favouring one use case over another?

I don't *think* so.  I could be wrong, of course.

> > (I also think that a good start is to make xmlstring the ur-type,
> > such that all "primitive" types are constrained from that starting
> > point; clearly a debatable position)
>
> Interesting. In my late night whiteboard ruminations I got to the same
> spot. Fundamentally all XML data is string data (no surprise there)
> but to treat it as fundamentally string data and work from there is,
> seen from a W3C XML Schema perspective, a little radical perhaps.

It seems to me that the core XML 1.0 spec provides a definition of
base-xml-string validation, in its well-formedness constraints for text
and attribute nodes.  That is, base type validation is equivalent to
well-formedness for text and attribute nodes.  My sense is that when the
spec authors begin talking about "value space", then the discussion may
already have strayed out of XML's yard and onto a busy street.  Is it
important to be able to manipulate types?  Sure.  Is it something that
XML can do?  No.
But as Bob Foster points out elsewhere in this thread, it is something
that transformation and query languages can do.  Perhaps this means more
modularization in type definitions: simple definitions for
validation/equality, plus more detail on how this can be plugged into
[some language that allows type libraries to be plugged in].

> Of course, I guess there could be those that consider that anything
> that can be grasped in 30 minutes is nowhere near "sophisticated"
> enough. :)

*laugh*  Fish on them, then.

> > I suppose I could provide a still-more-detailed version, but I'd
> > rather pursue my own agenda (is that full disclosure, or the
> > magician's wave distracting attention?).
>
> If I knew what you were trying to say, I could respond. :)

Just a warning that I was about to derail the discussion more toward my
own ends than directly to the goal of responding to your questions.
Either I was admitting it, or by saying that I was doing so, I was
distracting attention from my digression.  I'm not sure which, myself.

> > From my perspective: the base type is xmlstring, a sequence of a
> > subset of unicode characters (excluding C1, C0 except HT CR LF, the
> > character '<', and the character '&' except as the start of an
> > escape sequence).
>
> So, you are basically applying a regular expression to a string?

Well, you certainly *could* do that as a regular expression (which is a
form of algorithm), but I don't see why you'd *have* to.  You could also
do it with BNF (as in XML 1.0 spec), or even do a full unicode lookup
table, decorating each position with a boolean "permitted" or "not
permitted".

> Fundamentally isn't that what types in string-based data are about?
> Or, in W3C XML Schema jargon, isn't that what the lexical space is
> about?

We clearly need more than just lexical space in order to do equality and
comparison testing.  For the base xmlstring, though, lex is all.
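
To make the xmlstring check concrete, here is a minimal Python sketch of
two of the options mentioned above: a single regular expression, and a
per-character "permitted / not permitted" predicate driven by a scan.
It follows the definition given in this message (no C1, no C0 except HT
CR LF, no '<', no bare '&'), which is narrower than the Char production
in XML 1.0.  The function names and the simplified, ASCII-only entity
name pattern are assumptions made for illustration only.

```python
import re

# Escape sequences allowed after '&': a named entity reference or a
# decimal/hex character reference. The name pattern is a simplified,
# ASCII-only assumption; the XML 1.0 Name production is wider.
ESCAPE = r"&(?:[A-Za-z_:][A-Za-z0-9_.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);"
ESCAPE_RE = re.compile(ESCAPE)

# Option 1: one regular expression over the whole string.
# Excluded: '<', bare '&', C0 controls other than HT/LF/CR, C1 controls.
XMLSTRING_RE = re.compile(
    r"(?:[^<&\x00-\x08\x0b\x0c\x0e-\x1f\x80-\x9f]|" + ESCAPE + r")*"
)


def is_xmlstring_regex(text: str) -> bool:
    """Return True if the whole string matches the xmlstring pattern."""
    return XMLSTRING_RE.fullmatch(text) is not None


# Option 2: a per-character predicate plus an explicit scan.
def permitted(ch: str) -> bool:
    """Per-character rule; '&' is handled separately by the scan."""
    cp = ord(ch)
    if cp in (0x09, 0x0A, 0x0D):          # HT, LF, CR
        return True
    if cp < 0x20 or 0x80 <= cp <= 0x9F:   # remaining C0, all of C1
        return False
    return ch != "<"


def is_xmlstring_scan(text: str) -> bool:
    """Walk the string, accepting '&' only as the start of an escape."""
    i = 0
    while i < len(text):
        if text[i] == "&":
            m = ESCAPE_RE.match(text, i)
            if m is None:
                return False
            i = m.end()
        elif permitted(text[i]):
            i += 1
        else:
            return False
    return True
```

Both functions are meant to accept and reject the same strings: for
instance "a &amp; b" passes, while "a & b" and "a < b" do not.  Nothing
here touches value space; it is purely a lexical check.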

> > I would suggest that a standard set of derivations (similar to, but
> > perhaps a little more systematic than W3C XML Schema's 'facets'),
> > and a standard set of combinators (please more than the extremely
> > weak 'list' defined by W3C XML Schema) would also be important, so
> > that primitive types may become derived types.
>
> Maybe this is just another heretical thought but aren't facets just
> tightening up the regular expression a little?

I think that facets are.  If you'd like, though, I can offer an example
of the use of the concepts of "atom" and "composed types" (primitive
types composed from atoms) to show more clearly what I mean.  You can
use such a thing, for instance, to provide a single type that can
substitute for W3C XML Schema's dateTime, date, gYearMonth, and gYear
types (all the "time period" types), but is more expressive.  However,
defining recurring periods (unifying time, gMonth, gMonthDay, and
whatever the other privileged one is) requires setting up a
composed-from-atoms abstract base type, and then defining certain sorts
of special derivations that can be applied to it to create concrete
types (and in so doing, you end up realizing just how incredibly
poverty-stricken the recurrent dates in WXS are, btw).

> Now, if we had named and hierarchical regular expressions .... :)

I think that chasing the regex hare may be misleading you from the
algorithmically defined types fox hunt.

> It's interesting to explore where W3C "requirements" come from.
> Somehow they often tend to be rather complete and firm when they are
> first made available for public discussion.

Actually, for W3C XML Schema, many of the requirements were perfectly
understandable.  They were building on work that had gone before, so
they *had* to be able to do certain things.  DTD types: required.  As
much expressiveness as XDR: required.  Some form of inheritance, such as
defined in SOX: required.  I think the term is "second system effect."
That's what I see, looking at WXS.
I think that we can gain valuable lessons from discussing it, and
deciding *what* *went* *wrong*.  That is, I'm not out to needle folks
for the fun of it.  I want to see if we can't figure out a way to define
types in a more minimalist and more complete fashion by examining the
problems that those who tried before ended up stuck with.

> I think we agree on some points. For example, that W3C XML Schema
> datatypes are not a long-term solution, in part because of poor
> layering / modularisation in design.

Yup.

> I guess we are taking slightly different pragmatic approaches to XSLT
> / XPath / XQuery. I see them as "happening anyway", unless the
> "revolution" gathers a

Oh, I think so too.  I think it *might* be possible to convince the
committee that it ought to abandon the complete reliance on WXS by
pointing out some possible solutions, but given the work that the
working group has already put in, I suspect that opening such an
architectural issue would be nearly impossible, at this point.

Just bad timing, maybe.  RNG provided a different way to think about
types (even if some of us were fed up with WXS types before, we had
great difficulty in expressing why, or what an alternative might look
like; the pluggability of types in RNG is a broadening of the horizons).
But XPath/XSLT/XQuery already had a path laid out, and is more
interested in solving the particular problems found on that path than in
finding a different path (even if the alternate path is shorter).

Amy!
--
Amelia A. Lewis          amyzing {at} talsever.com
Boxing is a lot like ballet, except that they don't dance, there isn't
any music, and they hit each other.