
Re: Schemaless XML?

  • From: Thomas Passin <list1@tompassin.net>
  • To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
  • Date: Tue, 11 Oct 2016 09:23:09 -0400

What gives you the idea that knowing a schema allows you to understand and "process" the data contained in a document?

In every toy example you have given for these kinds of questions, you have used names (element or attribute) that suggest something to humans. I think you are fooling yourself if you believe automated machine processing would know what to do with them just because *you* think you know what, e.g., "temperature" or "type='graduation'" means.

TomP

On 10/11/2016 8:53 AM, Costello, Roger L. wrote:
Hi Folks,

Scenario: You are building an application that receives XML documents
from various sources. The kinds of data in the XML documents are varied.
The XML documents themselves are structured in various ways. Over time,
new XML documents are received, containing new, unanticipated kinds of data.

How will your application handle such diversity?

One approach is to create an XML Schema that models all the various
kinds of XML documents that will be received. When the application needs
to process new XML documents, the XSD is updated. The disadvantage of
this approach is that the processing of the new XML documents will be
delayed as the XSD is updated and as the application is updated to
handle the new data. The advantage of this approach is that the
application knows exactly what the data is and can process it efficiently.

An alternate approach is for the application to go “schemaless.” The
application performs machine learning on the data it receives. I’m not
sure what “machine learning on the data” means. I suspect that it means
that an internal schema (in some form or another) is dynamically
generated. Do you agree? If so, then the approach is not actually
schemaless; rather, there is a dynamically generated schema. Do you
agree? Is machine learning technology sufficiently advanced that it can
classify and understand the data to the same degree as a carefully
crafted schema and carefully crafted application code? Have you gone
schemaless?

/Roger
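
To make the "dynamically generated schema" idea concrete, here is a minimal sketch in Python that walks incoming documents and records which child elements and attributes appear under each element name. The function name `infer_structure` and the two sample documents are hypothetical, chosen only for illustration; this is not machine learning, just the simplest form of structural inference. It captures structure only: nothing in the result tells a program what "temperature" or "graduation" means.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_structure(documents):
    """Summarize the element/attribute structure observed across XML documents.

    Returns a dict mapping each element name to the set of child element
    names and the set of attribute names seen under it.
    """
    summary = defaultdict(lambda: {"children": set(), "attributes": set()})
    for text in documents:
        root = ET.fromstring(text)
        # iter() visits the root and every descendant element.
        for element in root.iter():
            entry = summary[element.tag]
            entry["attributes"].update(element.attrib)  # attribute names only
            entry["children"].update(child.tag for child in element)
    return dict(summary)


# Hypothetical sample documents; the names are borrowed from the thread.
docs = [
    '<reading><temperature unit="C">21.5</temperature></reading>',
    '<event type="graduation"><date>2016-10-11</date></event>',
]

structure = infer_structure(docs)
for name in sorted(structure):
    children = sorted(structure[name]["children"])
    attributes = sorted(structure[name]["attributes"])
    print(name, "children:", children, "attributes:", attributes)
```

Feeding a new, unanticipated document to `infer_structure` extends the summary without any manual XSD update, but the application code that acts on the data still has to be told what each name signifies.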





