RE: Shredding XML

  • From: "Jim Tivy" <jimt@bluestream.com>
  • To: "'Fraser Goffin'" <goffinf@googlemail.com>,<xml-dev@l...>
  • Date: Mon, 2 Nov 2009 13:57:32 -0800

Fraser

I am not entirely hearing a firm commitment that you plan to establish an RDB
schema and make it the driving schema.  In other words, what this would mean
is that data elements cannot be put into the RDB unless they exist in the
RDB schema.  For example, if some new data elements show up in some external
XML to be imported then the DBA decides whether to allow them into the
appropriate RDB column or not, or drop them for the time being.
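A minimal sketch of the "RDB schema drives" option, assuming Python with its built-in sqlite3 and xml.etree.ElementTree modules. The `policy` table, its columns, and the sample message are hypothetical, invented only to illustrate the idea: elements with no matching column are dropped and reported rather than loaded.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical table and sample message, invented for illustration.
DDL = "CREATE TABLE policy (policy_no TEXT, holder TEXT)"
SAMPLE = (
    "<policy><policy_no>P-1</policy_no><holder>A. Smith</holder>"
    "<new_field>x</new_field></policy>"
)


def load_record(conn, table, xml_text):
    """Insert one flat XML record, keeping only elements that match columns.

    Returns the names of the elements that were dropped because the table
    has no column for them.
    """
    # Ask the database which columns exist: the RDB schema drives the import.
    # The table name is interpolated, so it must come from trusted code.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    values, dropped = {}, []
    for child in ET.fromstring(xml_text):
        if child.tag in columns:
            values[child.tag] = child.text
        else:
            dropped.append(child.tag)
    if values:
        names = ", ".join(values)
        marks = ", ".join("?" for _ in values)
        conn.execute(
            f"INSERT INTO {table} ({names}) VALUES ({marks})",
            list(values.values()),
        )
    return dropped


conn = sqlite3.connect(":memory:")
conn.execute(DDL)
dropped = load_record(conn, "policy", SAMPLE)
rows = conn.execute("SELECT policy_no, holder FROM policy").fetchall()
```

The returned list of dropped element names is what a DBA could review when deciding whether a new column is warranted.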

Another option (from the infinite number) would be to let the XML schema
generate the RDB schema and the mapping code.  For your application
programmers using SQL on the RDB this would likely lead to gagging and
hacking and an "out of body experience".  This is not something I would
recommend, and if this is what you want then get a database that supports
XQuery and retrain your developers.

But I think you have to choose between these two - the first being what it
sounds like you want - then work backwards from that decision.

Jim

-----Original Message-----
From: Fraser Goffin [mailto:goffinf@googlemail.com] 
Sent: Monday, November 02, 2009 12:22 PM
To: xml-dev@lists.xml.org
Subject: Re:  Shredding XML

Yes Jim, that is spot on.

Whilst there has been much discussion thus far on the technologies and
techniques of getting data out of the database (and that has been
interesting), the programming for doing so is 'bread and butter' to
our mainframe Cobol and Sapiens guys, so that's not really my problem.

Mine is the task of getting the data from a fairly complex XML content
model into an appropriately factored relational database. The design
of that database is 'green field' but (and thanks to many on this
thread who have posted related papers) this may not be as easy as it
might at first appear, what with impedance mismatches here, there and
everywhere ;-)

It's also the case that the XML data doesn't contain enough data
inherently to represent primary or foreign key values for all of the
relationships that are likely to arise. In some cases I MAY be
permitted to generate them myself (say using a UUID) as I 'walk' the
XML, in other cases I MAY be required to get the database to provide
the value(s), not sure yet. The latter may increase the complexity
somewhat (sidenote: our DBAs don't allow stored procs (don't ask), so
I'm going to be doing whole bunches of INSERTs as part of the
tree-walk I suspect)
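The tree-walk with self-generated keys could look roughly like the following sketch, assuming Python's built-in sqlite3, uuid and xml.etree.ElementTree modules. The `<order>` message, the `orders` and `order_lines` tables, and the function name are all hypothetical; only plain INSERTs are used, with no stored procedures.

```python
import sqlite3
import uuid
import xml.etree.ElementTree as ET

# Hypothetical two-level message and schema, invented for illustration.
SAMPLE = """
<order>
  <customer>ACME</customer>
  <line><sku>A1</sku><qty>2</qty></line>
  <line><sku>B7</sku><qty>1</qty></line>
</order>
"""

DDL = """
CREATE TABLE orders (order_id TEXT PRIMARY KEY, customer TEXT);
CREATE TABLE order_lines (
    line_id  TEXT PRIMARY KEY,
    order_id TEXT NOT NULL REFERENCES orders(order_id),
    sku      TEXT,
    qty      INTEGER
);
"""


def shred_order(conn, xml_text):
    """Walk one <order> and insert parent and child rows with generated keys."""
    root = ET.fromstring(xml_text)
    order_id = str(uuid.uuid4())  # surrogate key: the XML carries none
    with conn:  # one transaction per message: all rows commit or none do
        conn.execute(
            "INSERT INTO orders (order_id, customer) VALUES (?, ?)",
            (order_id, root.findtext("customer")),
        )
        for line in root.findall("line"):
            conn.execute(
                "INSERT INTO order_lines (line_id, order_id, sku, qty) "
                "VALUES (?, ?, ?, ?)",
                (
                    str(uuid.uuid4()),
                    order_id,  # foreign key carried down from the parent
                    line.findtext("sku"),
                    int(line.findtext("qty")),
                ),
            )
    return order_id


conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
order_id = shred_order(conn, SAMPLE)
lines = conn.execute(
    "SELECT sku, qty FROM order_lines WHERE order_id = ? ORDER BY sku",
    (order_id,),
).fetchall()
```

Where the database has to supply the key instead, the parent INSERT must complete before any child row can be written; with sqlite3 that would mean reading `cursor.lastrowid` after the parent INSERT rather than calling `uuid.uuid4()`.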

I'm really interested in the gotchas and best practices. Some have
already been mentioned like the fact that the XML schema may define
optional items and unrestricted length facets and such like. Others
I've seen in reading talk about the mismatch of identity approaches
(although this was talking primarily about OO/Relational mapping but
the idea is similar I suspect). This could be important, since some
messages received may 'relate' to others already loaded and, given
what I said about not having all of the data in the XML to form all of
the keys, this might be a significant problem.
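One way to relate a new message to rows already loaded is to resolve a business identifier to an existing surrogate key before generating a fresh one. The sketch below assumes Python's sqlite3 and uuid modules and assumes that each message carries some stable external reference; the `party` table, the `ext_ref` column and the sample values are hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Hypothetical table: ext_ref is whatever stable identifier the message carries.
conn.execute(
    "CREATE TABLE party (party_id TEXT PRIMARY KEY, ext_ref TEXT UNIQUE NOT NULL)"
)


def resolve_party(conn, ext_ref):
    """Return the surrogate key for ext_ref, creating the row if it is new."""
    row = conn.execute(
        "SELECT party_id FROM party WHERE ext_ref = ?", (ext_ref,)
    ).fetchone()
    if row:
        return row[0]  # already loaded by an earlier message
    party_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO party (party_id, ext_ref) VALUES (?, ?)",
        (party_id, ext_ref),
    )
    return party_id


first = resolve_party(conn, "TP-0042")
second = resolve_party(conn, "TP-0042")  # same reference, same key
other = resolve_party(conn, "TP-0099")
count = conn.execute("SELECT COUNT(*) FROM party").fetchone()[0]
```

If the XML offers no such stable reference at all, this lookup has nothing to match on, which is exactly the problem described above.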

It is my intention to look into other options (we have recently
acquired DB2 v9 which includes pureXML) but as is so often the case,
the immediate project delivery pressures won't allow it. The PM is
very nervous about using any new tech, perhaps justifiably, but my
sense of unease is more to do with the perhaps misplaced assumption
that 'tried and tested' tech like relational databases will always
provide a workable solution; imho sometimes they actually represent
the most significant constraint.

So yes, back to the actual problem. How to come up with a database
design that provides the capability of staging the shredded XML in a
reasonably efficient manner and enables it to be loaded from XML
instances received, again efficiently (ideally without 100s of tables
and joins to negotiate). As far as efficiency of storage, well that
MAY be a concern although perhaps not a huge one so long as the Db
doesn't bloat up too much if normalisation is preferred over extra
tables.

Please add your thoughts and suggestions and experiences as you are
able. Nothing is too trivial (or rude) to mention (i.e. if you want to
say don't do this if you want to keep your sanity, that's ok).

regards

Fraser.


2009/11/1 Jim Tivy <jimt@bluestream.com>:
> Interesting post, but I am not sure that "now is the time to talk of many
> things".
>
> Let me try to focus:
>
> Proper software execution comes from the choice of appropriate
> actions/technologies to match the driving requirements.  But more
> importantly, the greatest wisdom is to frame the driving requirements
> correctly before "going off half cocked" or doing something that is
> unnecessary and unwarranted.
>
> So let's start by framing the requirements again:
>
> Fraser Goffin wrote:
>
> "
> The basics are we receive XML messages from an external trading partner and
> process those messages, enriching and routing to a number of internal
> subscriber applications. One of these applications is MI and the deal here
> is that they want the data to be put into a relational database so that
> they can create a number of interfaces 'files' which are sent to still more
> applications.
> "
>
> OR
>
> "
> I am mainly interested in the process of LOADING XML data to a database
> rather than extracting (at least for the purposes of this discussion).
> "
>
> It is possible that the "mother persistent application datamodel" is
> contained in the relational database in all its normalized glory.  If so,
> then, "processing the messages" is simply a "data import" operation.  So the
> question is, how do I get XML X* to tables T*?  It would strike me that lots
> of people are doing this.  Are there common techniques and technologies for
> doing this import?
>
> Fraser, is that a proper framing of the question/requirements?
>
> Jim
>
>
> -----Original Message-----
> From: Petite Abeille [mailto:petite.abeille@gmail.com]
> Sent: Sunday, November 01, 2009 9:56 AM
> To: xml-dev@lists.xml.org
> Subject: Re:  Shredding XML
>
>
> On Oct 29, 2009, at 10:20 PM, Fraser Goffin wrote:
>
>> opinions on the subject of decomposing XML into relational databases
>
> Outside of the most trivial case, this is a major PITA of the same
> epic proportion as the object-relational one:
>
> http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
>
> Good luck.
>
>
>
> _______________________________________________________________________
>
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development. To minimize
> spam in the archives, you must subscribe before posting.
>
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
> subscribe: xml-dev-subscribe@lists.xml.org
> List archive: http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>
