Re: Fun With Schemas

To: "Schuldt, Ron L" <ron.l.schuldt@l...>
Subject: Re: Fun With Schemas
From: john c hardin <john@c...>
Date: Wed, 16 Mar 2005 09:58:21 -0500
Cc: David Lyon <david.lyon@c...>, xml-dev@l..., chris.harding@o...
In-reply-to: <E10C8EF47F42494EB6942801E1F3B75E09F36F90@e...>
Organization: ..:: CrossConnections.ws ::..
References: <E10C8EF47F42494EB6942801E1F3B75E09F36F90@e...>
Reply-to: john@c...
User-agent: Mozilla Thunderbird 1.0 (Windows/20041206)

Thanks Ron - I got in too much of a hurry.

~~~~~~~~~
john c hardin
CIO - crossconnections.ws
313.930.5323 cell
mailto:john@c...

"The new electronic interdependence recreates the world in the image of a global 
village."

     Marshall McLuhan, "Gutenberg Galaxy", 1962



Schuldt, Ron L wrote:
> All,
> 
> One minor correction --- the udefid d.t.2_8 is equivalent to the UDEF name purchase.order.document_identifier 
> 
> In most situations the term "number" maps to the UDEF property term "identifier", but in some situations it maps to the UDEF property term "quantity". For example, number of cases or number of cartons would use the property term "quantity" rather than "identifier".
> 
> Although somewhat out-of-date (e.g., many new extensions have been added to the XML version), the UDEF tree structures can be viewed at http://www.udef.org/specdoc/UDEFv1pt03-July-2003.htm
> 
> Ron Schuldt
> Senior Staff Systems Architect
> Lockheed Martin Enterprise Information Systems
> 11757 W. Ken Caryl Ave.
> #F521 Mail Point DC5694
> Littleton, CO 80127
> 303-977-1414
> ron.l.schuldt@l...
> 
> 
> -----Original Message-----
> From: john c hardin [mailto:john@c...]
> Sent: Tuesday, March 15, 2005 7:24 PM
> To: David Lyon
> Cc: xml-dev@l...; chris.harding@o...; Schuldt, Ron L
> Subject: Re:  Fun With Schemas
> 
> 
> Hey guys, take a look at http://www.udef.org
> and http://www.opengroup.org/projects/udef/
> 
> It is built to cross domains in a format-independent way, meaning that it 
> can be included in any format as an attribute, for example:
> 
> <poid udefid="d.t.2_8">  is the same data element as
> 
> <buyerordernumber udefid="d.t.2_8"> which is the same as
> 
> <purordnum udefid="d.t.2_8">
> 
> 
> d.t.2_8    is literally translated    purchase.order.document_number
> 
> There are 16 object words, and 18 property words, with trees of qualifiers under 
> each word, making it possible to string together nearly any combination, to 
> define any data element concept (semantics and context). It is infinitely 
> extensible.
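
A minimal sketch of the idea above, assuming the udefid values are quoted so the fragments parse as XML. The three element names and the id d.t.2_8 come from the message; the wrapper elements, the sample value A-100 and the udef_index helper are hypothetical and not part of UDEF.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Three vocabularies naming the same concept; wrappers and values are made up.
DOCS = [
    '<order><poid udefid="d.t.2_8">A-100</poid></order>',
    '<po><buyerordernumber udefid="d.t.2_8">A-100</buyerordernumber></po>',
    '<msg><purordnum udefid="d.t.2_8">A-100</purordnum></msg>',
]

def udef_index(xml_text):
    """Map each udefid in the document to the (tag, text) pairs carrying it."""
    index = defaultdict(list)
    for elem in ET.fromstring(xml_text).iter():
        udefid = elem.get("udefid")
        if udefid is not None:
            index[udefid].append((elem.tag, elem.text))
    return dict(index)

indexes = [udef_index(doc) for doc in DOCS]

# The tag differs per vocabulary, but the lookup key is the same everywhere.
tags = [idx["d.t.2_8"][0][0] for idx in indexes]
values = {idx["d.t.2_8"][0][1] for idx in indexes}
```

A consumer that keys on udefid rather than on the tag name can read all three documents without a per-vocabulary mapping, which seems to be the point of carrying the id as an attribute.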
> 
> 
> 
> 
> ~~~~~~~~~
> john c hardin
> CIO - crossconnections.ws
> 313.930.5323 cell
> mailto:john@c...
> 
> "The new electronic interdependence recreates the world in the image of a global 
> village."
> 
>      Marshall McLuhan, "Gutenberg Galaxy", 1962
> 
> 
> 
> David Lyon wrote:
> 
>>On Tue, 15 Mar 2005 4:13 pm, Rick Marshall wrote:
>>
>>
>>>i must be missing something here. every day i do battle with
>>>translations from one vocabulary to another. flat files to csv to edi to
>>>xml to printer codes to postscript etc. actually i'm a bit over it all
>>>at the moment.
>>
>>
>>Yes me too. I thought you were one of our respected leaders and
>>teachers for the W3C in Australia. Helping and inspiring us mere mortals 
>>that aren't on the W3C to go forward and do useful and interesting things.
>>
>>It doesn't sound very inspiring or rewarding... nothing much in there 
>>for any Uni graduate learning xml on the list and wanting to hit the work 
>>force and do something useful.
>>
>>No wonder the job market for xml is so bad... if this is your idea
>>of W3C XML fun then I think I am going to be sick....
>>
>>
>>
>>>to do what len has suggested you need a dictionary - (not a data
>>>dictionary, but a dictionary) that says an attribute, element, whatever
>>>in one vocabulary is <something /> in another. possibly rdf is a good
>>>way to express this. then you need a translator that can read an output
>>>schema (and produce valid output). then it needs a schema to describe
>>>the input stream.
>>>
>>>putting it all together:
>>>
>>>1. decide on and maintain internal schema representation for data
>>>2. maintain translator tables from internal schema to output schema
>>>vocabulary (rdf?)
>>>3. maintain output schema
>>>
>>>all 3 should be able to be maintained with some independence (actually
>>>changing 1 or 3 only requires a change to 2 - which is the point)
>>>
>>>then to use it:
>>>
>>>1. convert arbitrary input stream to internal xml schema
>>>2. use schema aware tool (you might have to write this) to load items 1
>>>to 3 above and translate from input to output stream.
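
The three maintained pieces and the translation step could be sketched roughly as follows. This is only an illustration under stated assumptions: the field names, the TRANSLATION table layout and the reduced OUTPUT_SCHEMA are all hypothetical, and a real tool would read an actual schema language (and possibly RDF) instead of Python dictionaries.

```python
import xml.etree.ElementTree as ET

# 1. Internal representation: a flat set of field names (hypothetical).
INTERNAL_FIELDS = {"order_id", "case_count"}

# 2. Translator table: internal name -> (output name, "attribute" | "element").
TRANSLATION = {
    "order_id": ("purordnum", "attribute"),
    "case_count": ("cases", "element"),
}

# 3. Output "schema", reduced here to the root name and the required names.
OUTPUT_SCHEMA = {"root": "order", "attribute": {"purordnum"}, "element": {"cases"}}

def translate(record):
    """Build an output element from an internal record using the table."""
    unknown = set(record) - INTERNAL_FIELDS
    if unknown:
        raise ValueError(f"fields not in internal schema: {sorted(unknown)}")
    root = ET.Element(OUTPUT_SCHEMA["root"])
    for name, value in record.items():
        out_name, kind = TRANSLATION[name]
        if kind == "attribute":
            root.set(out_name, str(value))
        else:
            ET.SubElement(root, out_name).text = str(value)
    return root

def is_valid(root):
    """Check the output against the reduced schema above."""
    return (root.tag == OUTPUT_SCHEMA["root"]
            and set(root.attrib) == OUTPUT_SCHEMA["attribute"]
            and {child.tag for child in root} == OUTPUT_SCHEMA["element"])

order = translate({"order_id": "A-100", "case_count": 12})
xml_out = ET.tostring(order, encoding="unicode")
```

Renaming an output element, or moving a value from attribute to element, touches only TRANSLATION and OUTPUT_SCHEMA here, which mirrors the independence described above.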
>>>
>>>i haven't built such a tool yet, but i've done enough of it by hand to
>>>know that this is the correct broad direction for such things.
>>>
>>>there are some other problems in the "real world" that linguists know
>>>about only too well. syntax and semantics. let's say you can translate
>>>the vocabulary - does the output go together in the same order as the
>>>input. worse are attributes in the input still attributes in the output
>>>or are they now elements? consider translating the location of
>>>adjectives in english and french or verbs in german etc (and that's just
>>>western languages). then there's the problem of semantics - is this the
>>>correct vocabulary choice in this setting? in australia (queensland
>>>actually) xxxx (4x) is a beer, my understanding is that it's a sex aid
>>>in america....
>>>
>>>it would be really interesting to know how those auto translator things
>>>(like google translator) work because they must have tackled many of
>>>these problems.
>>>
>>>as an aside. it would be good if there was a sort of xslt that worked
>>>like this. as the xsl gets bigger, it gets harder to know if you're
>>>producing valid output and harder to change the model.
>>>
>>>rick
>>>
>>>Peter Hunsberger wrote:
>>>
>>>
>>>>On Mon, 14 Mar 2005 15:02:24 -0800, Bob Foster <bob@o...> wrote:
>>>>
>>>>
>>>>>Generating instances from schemas usually just produces one of the
>>>>>infinite number of instances restricted by certain trivial parameters. I
>>>>>don't know of an example where meaningful instances are generated.
>>>>>
>>>>>If a generated document changes automatically depending on the schema it
>>>>>finds at the time of generation yet somehow contains the same
>>>>>"information", there must be a model of the document that is independent
>>>>>of the schema, e.g., something like an ER model. Then the model must be
>>>>>populated: this concrete entity has that relationship to these other
>>>>>concrete entities, etc. Then there must be a mapping from the abstract
>>>>>document model to the elements and attributes used in the schema. When
>>>>>the schema changes, the mapping must change in concert (and there must
>>>>>be a way to prevent changes to the schema that violate the abstract
>>>>>document model, e.g., changing an unbounded relationship to a bounded
>>>>>one).
>>>>>
>>>>>After that, piece of cake. ;-}
>>>>
>>>>Instance traversal is something I didn't touch on but of course is the
>>>>real issue here: what's the data source?  I had assumed the
>>>>application would be traversing some form of relational DB or similar
>>>>and that there was already some natural key structure and
>>>>relationship metadata/data around.  Not necessarily a good
>>>>assumption...
>>>>
>>>>If not, you need some source of control over the data source or
>>>>complete metadata.  If the data is simple and you control it you can
>>>>just add id/idref pairs to it to get simple hierarchical descent
>>>>traversal.  But if the data's that easy to walk then I'm not sure why
>>>>you're doing this.
>>>>
>>>>Beyond that you can make some simplifying assumptions.  The easiest is
>>>>something like assuming every element contains an attribute with the
>>>>same name plus something like "Id" appended and that every referring
>>>>element will include an identically named attribute. That will get you
>>>>lattice like graph traversal and many to many relationships. However,
>>>>unless this is also enforced on the data population side it sounds
>>>>rather fragile...  Then again, we do know you have a Schema that can
>>>>be checked at data population time :-)
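
The naming convention described above might look like this in practice. It is a sketch under assumptions: the sample data, the element names and the build_index and references helpers are invented for illustration, and nothing here enforces the convention on the data population side.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: every element carries "<tag>Id"; referring elements
# repeat the identically named attribute of the element they point to.
DATA = """
<db>
  <customer customerId="c1"/>
  <customer customerId="c2"/>
  <product productId="p1"/>
  <order orderId="o1" customerId="c1" productId="p1"/>
  <order orderId="o2" customerId="c2" productId="p1"/>
</db>
"""

def build_index(root):
    """Index elements by (tag, value of their own '<tag>Id' attribute)."""
    index = {}
    for elem in root.iter():
        own = elem.get(elem.tag + "Id")
        if own is not None:
            index[(elem.tag, own)] = elem
    return index

def references(elem, index):
    """Follow every other '...Id' attribute to the element it names."""
    own_attr = elem.tag + "Id"
    found = []
    for attr, value in elem.attrib.items():
        if attr != own_attr and attr.endswith("Id"):
            target = index.get((attr[:-2], value))
            if target is not None:
                found.append(target)
    return found

root = ET.fromstring(DATA)
index = build_index(root)

# One order points at several elements, and one product is pointed at by
# several orders, so the traversal is a graph rather than a tree.
o1_targets = sorted(t.tag for t in references(index[("order", "o1")], index))
orders_for_p1 = sorted(
    e.get("orderId") for e in root.iter("order") if e.get("productId") == "p1"
)
```

A dangling reference is silently skipped in this sketch, which is the fragility mentioned above: without a check at population time, nothing guarantees that every "...Id" value resolves.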
>>>>
>>>><snip/>
>>
>>
> 
> 
>
