
Re: What are the characteristics of a good type system for XML


characteristics of a good library
From: "Amelia A Lewis" <amyzing@t...>
> >- A type system should be based on a small number of primitive types (much
> >smaller than those in XML Schema Datatypes), and all other types should be
> >defined in terms of these.
>
> Err.  I have said that in the past.  I've reconsidered, though.  I would say
> that the type system must define the rules for creating and publishing
> primitive types.  Then let the authors and users and implementors of XML
> decide which of those are interesting and useful.  This also means that
> private agreements can adopt less "universal" types that happen to be well
> suited to their particular domain.

Yes, this is necessary. But I'm not sure it is sufficient.

> >-A type system should be extensible, but more than that, there should be
> >ways to introduce new type systems (for simple types).
>
> Agreed.
>
> >- The ways that types can be extended should not be limited to a few
> >predefined parameters, e.g., the facets in XML Schema. A type system should
> >be able to define its own parameters.
>
> I think we're on the same page.  Apart from defining how to define new
> primitive types, the system ought to also define how to define derivation
> and composition algorithms.
>
> >- A type system should provide, for each type it produces, a function to
> >answer true or false if a given string is valid,
>
> Yes.
>
> > a function to translate a
> >string into an instance of the type defined in terms of the primitive types
> >mentioned above,
>
> err, no.  I think that's outside the scope of XML.

It's outside the scope of XML parsing and validation, but inside the scope
of XML transform and query languages. Atomic types cannot be entirely opaque
to such languages. At an absolute minimum, users expect to be able to do
arithmetic using values of numeric types as operands.

While I don't agree with their choice or the hard-wired nature of it, I can
certainly see why designers of a query language would jump on a type system
that gave them the numeric types their users would demand. The date types
are scary, but _some_ representation of dates is obviously an important
application requirement. If you want to look at another ugly type system,
check out any SQL dialect. Same reason. A small number of types are
ubiquitous in business and scientific applications; the rest creep in as
implementation details or frozen mistakes.

> It might possibly be
> useful, but you can't really predict, sitting inside the XML world, what the
> type system you're mapping onto looks like, or how the transform is going to
> work.  Supposing that a library defines a "date" type, how can it reasonably
> define the transformation to an instance of that type in Java, C++, Python,
> Haskell, and Perl as a single function?

That's a good question, but I think it can be answered. (Unfortunately, not
concisely. Next time, maybe. ;-)

It is only necessary to be as universal as the users of a type demand. The
RNG datatype api is defined in terms of Java. Languages that participate in
the CLI can use an existing datatype library directly; in the worst case,
the library must be hand-translated to another language. Even then, the api
is trivial to translate, as it is defined in terms of strings and boolean
tests; all the complexity is in the types. That seems about right.

The RNG api is carefully designed to make no assumptions about types not
necessary to perform validation. Thus, it is reasonable to point to it as an
exemplar of good api design, but it is not reasonable to assert that it
meets the needs of every other application. As noted elsewhere in this
thread, not all types can be collated, but validation only requires an
equality comparison, so the issue is avoided. Query/transform languages,
however, require sorting, so the issue cannot be ducked. Validation does not
need to do arithmetic, either (or at least RNG validation doesn't) but
query/transform languages do.

Extending an api for sorting is trivial. One simply needs a boolean test
whether the type is sortable, and another along the lines you suggest to do
the comparison. Extending an api to allow conversion between string and one
of int, float, double or boolean is also trivial. These are in the
intersection of every language one would bother with, as are arrays of any
of these.
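
To make the two extensions concrete, here is a minimal Python sketch. Every
name in it (Datatype, is_sortable, compare, to_primitive, IntegerType) is
hypothetical and chosen for illustration; none of it is taken from the RNG
datatype api or any existing library. It assumes a validation core built on
strings and boolean tests, then adds a sortability test, a three-way
comparison, and a conversion to a primitive value.

```python
import re
from abc import ABC, abstractmethod
from functools import cmp_to_key
from typing import Union

# The primitives assumed to exist in every language one would bother with.
Primitive = Union[int, float, bool]


class Datatype(ABC):
    """Hypothetical datatype interface (illustrative only)."""

    @abstractmethod
    def is_valid(self, lexical: str) -> bool:
        """Return True if the string is a valid representation of the type."""

    def is_sortable(self) -> bool:
        # Not every type can be collated, so the default answer is "no".
        return False

    def compare(self, a: str, b: str) -> int:
        # Only meaningful when is_sortable() returns True.
        raise TypeError("this type does not define an ordering")

    def to_primitive(self, lexical: str) -> Primitive:
        # Only meaningful for types that map onto int, float or bool.
        raise TypeError("this type has no primitive conversion")


class IntegerType(Datatype):
    """A simple integer type: optional sign followed by ASCII digits."""

    _PATTERN = re.compile(r"[+-]?[0-9]+")

    def is_valid(self, lexical: str) -> bool:
        return self._PATTERN.fullmatch(lexical.strip()) is not None

    def is_sortable(self) -> bool:
        return True

    def to_primitive(self, lexical: str) -> int:
        if not self.is_valid(lexical):
            raise ValueError(f"not a valid integer: {lexical!r}")
        return int(lexical.strip())

    def compare(self, a: str, b: str) -> int:
        # Returns -1, 0 or 1, comparing values rather than raw strings.
        x, y = self.to_primitive(a), self.to_primitive(b)
        return (x > y) - (x < y)


integer = IntegerType()
if integer.is_sortable():
    # Sorts by numeric value, so "10" lands after "9".
    ordered = sorted(["10", "9", "-1"], key=cmp_to_key(integer.compare))
```

A caller that needs sorting asks is_sortable() first and falls back to
equality-only handling when the answer is no, which is all that validation
requires anyway.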

Beyond that, an api must and should be opaque. There are two ways to
approach it. One can declare that there is _no_ way to convert between
string and instance, other than read the definition and write the code. This
is the approach used today. The api for dates begins, "First, write an
ISO8601 parser..." The other is to provide an opaque interface that provides
no more than a means of converting between "instance" and string
representation, together with a way to determine at runtime whether the
interface is available for a given type. Having such interfaces would be
more useful than not having any, even if they are not always available for a
given type and even if availability varies from language to language. It
would open the door just wide enough to allow implementations to slip in;
universality would be determined by availability and demand.
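
The second approach might look something like the following Python sketch.
The names (Converter, ConverterRegistry, is_available, to_instance,
to_string) are invented for illustration and do not describe any existing
api. The "date" converter here leans on Python's datetime.date, which is an
assumption of convenience: it is not a full ISO 8601 parser and does not
match the XML Schema date type exactly. The point is only the shape of the
interface: string in, opaque instance out, and a runtime test for whether the
conversion is available at all.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Converter:
    """A pair of functions between the string form and an opaque instance."""
    parse: Callable[[str], Any]
    unparse: Callable[[Any], str]


class ConverterRegistry:
    """Hypothetical registry; availability may differ per language or install."""

    def __init__(self) -> None:
        self._converters: Dict[str, Converter] = {}

    def register(self, type_name: str, converter: Converter) -> None:
        self._converters[type_name] = converter

    def is_available(self, type_name: str) -> bool:
        # The runtime check: is there a conversion interface for this type?
        return type_name in self._converters

    def to_instance(self, type_name: str, lexical: str) -> Any:
        if not self.is_available(type_name):
            raise LookupError(f"no converter available for type {type_name!r}")
        return self._converters[type_name].parse(lexical)

    def to_string(self, type_name: str, instance: Any) -> str:
        if not self.is_available(type_name):
            raise LookupError(f"no converter available for type {type_name!r}")
        return self._converters[type_name].unparse(instance)


registry = ConverterRegistry()
# Assumption: plain YYYY-MM-DD dates only, with no time zone.
registry.register("date", Converter(date.fromisoformat, date.isoformat))

if registry.is_available("date"):
    when = registry.to_instance("date", "2003-02-14")
    text = registry.to_string("date", when)
```

What the caller does with the instance afterwards is the business of whatever
library accompanies the type; the registry itself knows nothing about dates.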

Most non-trivial instance types must carry along with them some sort of
library that provides a means of introspecting and manipulating them. For an
object-oriented language, the instance would be an object and the library a
set of classes necessary to use the object; for a non-object language like
C, the instance would be a struct and the library a set of functions; and so
on. Such a library can be outside the domain of XML for a given type as long
as people think it should be, but it is well inside the domain of
application languages that must use a datatype in a non-trivial way, e.g.,
to format it for localized presentation, to do whatever arithmetic or
algebra the type permits.

The "date" types in XQuery/XPath 2.0 are good examples precisely because
there is no programming language in use today that provides exactly those
types. I would much prefer to have the opaque api and accompanying library,
that I might apply to any language, than to have it hard-wired into each
implementation of XQuery/XSLT.

Bob

> > and a function to translate an instance of the type to a
> >string.
>
> Err.  Well, it *is* a string.  In XML.
>
> Also, you've left out sorting.
>
> I would say, so far as functions go:
>
> For the type gronk:
>
> Given a string, the gronk type specification allows you to determine if this
> is a representation of a valid instance of type gronk.
>
> boolean gronk(xmlstring);
>
> Given two strings known to be of type gronk (see preceding function), return
> -1 0 1 to indicate whether the first is smaller than, equal to, or larger
> than the second (an equality function, plus a bit).
>
> [-1,0,1] gronkSort(xmlstring, xmlstring);
>
> Does that help?
>
> Amy!

