Re: Why XML data typing is hard (was Re: Internal subset equivalent in
"G. Ken Holman" <gkholman@c...> writes:

>> but not
>>   <value>4,50</value>

> Then your example proposed range of values is inappropriate because "4,50"
> is a valid float from an I18N point of view.

I want to specify, in my DTD, what kind of data my processing system knows how to deal with. Apparently, in my example, the processing system does not know how to deal with commas in "value"s. This may or may not be an inadequacy with respect to I18N. Who said anything about "float"?

> In Canada

Yes, yes. Here too we use commas as decimal ..uh.. points.

> And I suppose your regular expression example could be changed to
>   <!element value #REGEXP:"-?[0-9]*(\.|,)[0-9][0-9]">

(whoops, forgot to escape the dot, didn't I)

> I gather from Michael S-McQ in a presentation in Chicago that the regular
> expression for a valid date (taking into account days of the month and leap
> years) is 4801 characters long.

Yes. Some may want to build all of this into a type system that XML parsers need to handle, I suppose with mappings to the various programming languages and machine architectures that may or may not support that type natively.

As an alternative, I suggested restricting the *form* instead of the *type* of the content, since

a) it's a *lot* simpler to implement, almost trivial
b) gives the application a clear indication of what data it needs to understand
c) catches errors in data early, avoiding potential run-time errors (Y2K?)
d) avoids a lot of complexity that you probably don't need in 90% of the cases

Look at the date example. First you need to embed your 4801 character regular expression into parsers that understand xml:type="date". Then you need the parser to provide something useful, a "struct date", a time_t or perhaps a reasonable s-expression, or perhaps some machine specific stuff on your embedded system. And *then* you worry about what to do when people type "01/02/03".

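To make the form-versus-type distinction concrete, here is a minimal Python sketch. It assumes the pattern quoted above (with the dot escaped) is applied to the whole element content. The names VALUE_FORM, has_valid_form and to_number are hypothetical and do not come from any XML tool; the point is only that checking the form is one line, while turning the text into a native number is a separate, application-specific step.

```python
import re

# Pattern from the message, with the dot escaped. The name is hypothetical.
VALUE_FORM = re.compile(r"-?[0-9]*(\.|,)[0-9][0-9]")


def has_valid_form(text: str) -> bool:
    """Form check only: does the whole string match the declared pattern?"""
    return VALUE_FORM.fullmatch(text) is not None


def to_number(text: str) -> float:
    """Separate translation step, here assuming the application wants a float.

    The comma is normalised to a dot because Python's float() does not
    accept "4,50"; another language or architecture would need its own rule.
    """
    if not has_valid_form(text):
        raise ValueError(f"content does not match the declared form: {text!r}")
    return float(text.replace(",", "."))


if __name__ == "__main__":
    for sample in ["4.50", "4,50", "4.5", "four"]:
        print(sample, has_valid_form(sample))
```

Dropping the "," alternative from the pattern gives the stricter form of the earlier example, in which case "4,50" is rejected before any conversion is attempted.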
Alternatively, you could force people to use "YYYY-MM-DD" by forcing conformance to a regular expression, and have your applications only have to deal with that.

And, I think it's pretty obvious that there are a lot of very complex data types out there. What's the format for version numbers, for instance? Or license plates? Are you ready to come up with an xml:type that covers all cases? (And I bet the 4801-character definition doesn't even cover Chinese or Mayan calendars, or deal correctly with Muslim dates, or seamlessly integrate Julian and Gregorian.)

>> What would the point of using xml:type be?

> Perhaps to abstract what is being expressed in markup to allow different
> lexical expressions of the same value to be considered valid.

My point is that you cannot do that without also providing a correct translation from the lexical expressions of that data type into a native representation of that data type. And that translation may not make a lot of sense; there are architectures and languages without concepts such as "float", for instance.

Having all documents be universally understandable and unambiguous is a laudable goal, of course. But I don't see it happening.

Sorry to be so negative, but at least I didn't mention how I think XML is going to destroy the WWW. Whoops. :-)

~kzm
--
If I haven't seen further, it is by standing in the footprints of giants

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
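The "YYYY-MM-DD" restriction discussed in the message can be sketched in Python as follows. This is an illustration under assumptions, not anything proposed for the DTD syntax itself: DATE_FORM, has_date_form and parse_date are hypothetical names, and the use of the proleptic Gregorian calendar in datetime.date is a choice made here. The short pattern checks the form only; deciding whether the day really exists is left to the application, which is the work the 4801-character expression would otherwise have to do.

```python
import re
from datetime import date

# Form only: four digits, two digits, two digits. The name is hypothetical.
DATE_FORM = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def has_date_form(text: str) -> bool:
    """Form check: rejects "01/02/03" but knows nothing about calendars."""
    return DATE_FORM.fullmatch(text) is not None


def parse_date(text: str) -> date:
    """Application-level step: convert to a native date, assuming Gregorian.

    datetime.date raises ValueError for impossible dates such as February 30.
    """
    if not has_date_form(text):
        raise ValueError(f"not in YYYY-MM-DD form: {text!r}")
    year, month, day = (int(part) for part in text.split("-"))
    return date(year, month, day)


if __name__ == "__main__":
    for sample in ["1999-12-31", "01/02/03", "1999-02-30"]:
        print(sample, has_date_form(sample))
```

Note that "1999-02-30" passes the form check and only fails in parse_date, so the two steps catch different classes of error.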