
Re: What is the general direction you are seeing these daysto

  • From: Ihe Onwuka <ihe.onwuka@gmail.com>
  • To: "Costello, Roger L." <costello@mitre.org>
  • Date: Wed, 11 Mar 2015 12:11:37 +0000

Well, I am very grateful for Peter's comments, and intrigued and enthused enough to investigate the architectures he describes for my current project. Part of the reason relates to an additional shortcoming I found with eXist: the insistence on a strictly hierarchical collection structure. This forces me into a choice I do not want to have to make for a movie repository, namely whether a rom/com goes in the romantic collection or the comedy collection, and of course genre is not the only facet worth modelling collections on. How well the product would support a very large number of collections, with each piece of data able to belong to several, is a concern that I wouldn't have in a graph-based architecture.
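To make the rom/com problem concrete, here is a minimal sketch of the kind of many-to-many membership I have in mind, where an item can sit in any number of collections at once. It is plain Python and purely illustrative: the `FacetIndex` class, its methods, and the movie and collection names are all hypothetical, and none of it is eXist's API or any particular graph product's.

```python
from collections import defaultdict


class FacetIndex:
    """Many-to-many mapping between items and the collections they belong to.

    Hypothetical illustration only; not the API of eXist or any graph store.
    """

    def __init__(self):
        self._members = defaultdict(set)      # collection name -> item ids
        self._collections = defaultdict(set)  # item id -> collection names

    def add(self, item, *collections):
        """Place one item in every named collection (no single parent)."""
        for name in collections:
            self._members[name].add(item)
            self._collections[item].add(name)

    def members(self, name):
        """Items in one collection (copy, so callers cannot mutate the index)."""
        return set(self._members.get(name, ()))

    def collections_of(self, item):
        """Every collection a given item belongs to."""
        return set(self._collections.get(item, ()))

    def in_all(self, *names):
        """Items that belong to every one of the named collections."""
        sets = [self._members.get(n, set()) for n in names]
        return set.intersection(*sets) if sets else set()


index = FacetIndex()
# A rom/com goes in both genre collections, plus any other facet we care about.
index.add("movie-1", "genre/romance", "genre/comedy", "decade/1990s")
index.add("movie-2", "genre/comedy")

rom_coms = index.in_all("genre/romance", "genre/comedy")
```

With a strict hierarchy, `movie-1` would need one parent; here the question of which collection it "really" belongs to never arises, and a facet such as decade is just another set of names.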

I find

> Peter made a very interesting assertion:
>
> The analytics should run directly on the data, not on some extract.

less persuasive. First up, it depends on where you get your data from. If it originated as HTML, then marshalling it into anything but XML usually entails the (almost certainly) premature imposition of some sort of schema.

Secondly, this somewhat overlooks the significant data management effort latent in most Big Data projects. At the very least, that amount of data will usually have to go through a significant cleansing process. A very vocal section of the analytics community seems to think this is yet another thing they can do with an R library. Rarely is a dissenting voice heard, yet the no. 1 lament of the very same people is the time and effort spent on dealing with unclean data.

We (software development people) acquire a data management capability (Oracle etc.) and build BI and/or analytics tools on top of that. They choose their analytics capability and then find they have to bolt a data management function on top of it. Well, I think they've got it arse about face. Doing data management with your analytics tool is just as big a sin as (if not a bigger sin than) doing analytics with your data management tool.

