
Re: XML to graph

  • From: Peter Hunsberger <peter.hunsberger@gmail.com>
  • To: ihe.onwuka@gmail.com
  • Date: Wed, 1 Jul 2015 14:23:54 -0500

That won't be a problem for something like Neo or Titan: load up the data as anonymous nodes, then do a second pass to build the relationships based on the fields you wish to correlate. Individual nodes don't have to share the same fields to be correlated. The issues will be size related. If you have billions of entries to load, use Titan; otherwise Neo will have a faster implementation path and will likely scale to tens of millions. At those sizes things will be slow (hours or days) until the relationships are built, after which pulling out related data in under a second should be possible. The learning curve will be the query languages, Gremlin (Titan) or Cypher (Neo), though you could also write Java plugins for Neo if need be.
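
A minimal in-memory sketch of that two-pass idea, in Python rather than against Neo or Titan. The sample records, their values, and the helper names `load_nodes` and `build_relationships` are all hypothetical; a real load would go through the database's own import tooling.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records from different sources; none carries a unique id,
# and the field values are made up for illustration.
records = [
    {"source": "credits", "title": "Under the Seas", "year": "1907"},
    {"source": "reviews", "title": "under the seas ", "year": "1907", "review": "n/a"},
    {"source": "credits", "title": "Batman", "year": "1989"},
    {"source": "box_office", "title": "Batman", "year": "1966"},
]


def load_nodes(records):
    """Pass 1: store every record as an anonymous node under an arbitrary integer id."""
    return {node_id: dict(rec) for node_id, rec in enumerate(records)}


def build_relationships(nodes, fields):
    """Pass 2: link every pair of nodes that agree on all of `fields`.

    Nodes lacking one of the fields are simply left out of this correlation,
    so nodes do not need to share the same set of properties.
    """
    buckets = defaultdict(list)
    for node_id, props in nodes.items():
        if all(f in props for f in fields):
            key = tuple(props[f].strip().lower() for f in fields)
            buckets[key].append(node_id)
    edges = set()
    for ids in buckets.values():
        edges.update(combinations(ids, 2))
    return edges


nodes = load_nodes(records)
edges = build_relationships(nodes, ("title", "year"))
```

Bucketing on the correlated fields avoids comparing every node with every other node, which is where the second pass would otherwise get expensive.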

Peter Hunsberger

On Wed, Jul 1, 2015 at 12:04 PM, Ihe Onwuka <ihe.onwuka@gmail.com> wrote:
You will note that the data doesn't have a unique id. Title certainly isn't unique, if you consider how many movies there have been called Batman or Treasure Island.

Now I may encounter data about this movie from another source that covers different facets, for example its box office takings or movie reviews.

So it's a classic semantic web application. I want to amalgamate disparate data about the same fact in one entity. As I said, I have a transformation that does this, but it doesn't scale very well because I have to search the entire movie base to find the best match. To overcome this I have to adopt a MapReduce-ish approach to solve the problem.

The thinking is that a graph representation would eliminate that problem, because a graph gives me a persistent data structure already indexed for retrieval via several different axes, whereas indexes constructed in the XSLT transformation for the same purpose are ephemeral and would need to be reconstructed every time you ran the transformation.

On Wed, Jul 1, 2015 at 12:46 PM, Peter Hunsberger <peter.hunsberger@g...> wrote:
Should be pretty straightforward to import that into Neo4j or Titan. Neo might be simplest, in particular via conversion of the data into JSON. However, Titan might give you other capabilities, such as using Hadoop-style processing either for import or for subsequent analytics. Without knowing more about the business requirements I can't really give you much more than that...
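
As a rough sketch of the XML-to-JSON step, assuming Python's standard library and a trimmed copy of the posted snippet: the function name `movie_to_dict` and the output layout (one list per grouping element) are assumptions, not a format that Neo4j or Titan requires.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed copy of the <movie> element from the original post.
SNIPPET = """<movie title="20000 lieues sous les mers">
  <directors><person name="Méliès, Georges"/></directors>
  <genres><tag name="adventure"/><tag name="fantasy"/></genres>
</movie>"""


def movie_to_dict(movie):
    """Flatten one <movie> element: each grouping element becomes a list
    holding the attribute dictionaries of its children."""
    record = {"title": movie.get("title")}
    for group in movie:
        record[group.tag] = [dict(child.attrib) for child in group]
    return record


movie_json = json.dumps(
    movie_to_dict(ET.fromstring(SNIPPET)), ensure_ascii=False, indent=2
)
```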

Peter Hunsberger

On Wed, Jul 1, 2015 at 11:32 AM, Ihe Onwuka <ihe.onwuka@gmail.com> wrote:
I would like to convert the XML snippet below to a multi-relational graph representation.
One way is to transform it to RDF and load that into a triple store. Another, which I am less familiar with, is to transform it to GraphML and then import that into some graph database tool.

The graph representation is desirable for processing rather than visualization reasons. Chiefly, I have a matching algorithm implemented in XSLT which works fine but doesn't scale well, a problem that I think can be solved with a graph representation.

I am keen to hear from my elders and betters on the subject.

<movie title="20000 lieues sous les mers">
<actors>
<person name="Méliès, Georges"/>
</actors>
<alias>
<title title="20,000 Leagues Under the Sea " year="1907"/>
<title title="Amid the Workings of the Deep " year="1907"/>
<title title="Deux cent mille lieues sous les mers " year="1907"/>
<title title="Le cauchemar d'un pêcheur " year="1907"/>
<title title="Under the Seas " year="1907"/>
</alias>
<directors>
<person name="Méliès, Georges"/>
</directors>
<genres>
<tag name="adventure"/>
<tag name="fantasy"/>
<tag name="sci-fi"/>
<tag name="short"/>
</genres>
<keywords>
<tag name="based-on-novel"/>
<tag name="dream"/>
<tag name="fish"/>
<tag name="number-in-title"/>
<tag name="submarine"/>
<tag name="undersea-monster"/>
<tag name="underwater"/>
</keywords>
<producers>
<person name="Méliès, Georges"/>
</producers>
</movie>
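
One possible reading of "multi-relational" for this snippet, sketched in Python under stated assumptions: each grouping element becomes an edge label from the movie to the person, tag, or alias beneath it. The labels in `EDGE_LABELS` and the function `movie_to_edges` are hypothetical, the snippet is trimmed, and the `year` attribute on aliases is dropped for brevity.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <movie> element above.
SNIPPET = """<movie title="20000 lieues sous les mers">
<actors><person name="Méliès, Georges"/></actors>
<alias><title title="Under the Seas " year="1907"/></alias>
<directors><person name="Méliès, Georges"/></directors>
<genres><tag name="sci-fi"/><tag name="short"/></genres>
<producers><person name="Méliès, Georges"/></producers>
</movie>"""

# Hypothetical edge labels, one per grouping element.
EDGE_LABELS = {
    "actors": "HAS_ACTOR",
    "alias": "HAS_ALIAS",
    "directors": "HAS_DIRECTOR",
    "genres": "HAS_GENRE",
    "keywords": "HAS_KEYWORD",
    "producers": "HAS_PRODUCER",
}


def movie_to_edges(movie):
    """Return (subject, label, object) triples for one <movie> element."""
    subject = ("movie", movie.get("title"))
    edges = []
    for group in movie:
        label = EDGE_LABELS[group.tag]
        for child in group:
            # <title> children keep their value in @title, the others in @name.
            value = (child.get("name") or child.get("title")).strip()
            edges.append((subject, label, (child.tag, value)))
    return edges


edges = movie_to_edges(ET.fromstring(SNIPPET))
```

Because Méliès appears as actor, director, and producer, the same pair of nodes ends up joined by three differently labelled edges, which is the property a single-relation graph cannot express.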




