
Re: Differentiating two xml files of size each around 500MB.


Santosh,

In reply to your query, you will find some details of how DeltaXML 
can deal with large files at 
http://www.deltaxml.com/comparing-large-files.html The file sizes 
tested there are up to 60 MB. You should be able to process larger 
files on a larger system, but you would need to evaluate this on your 
own system.

If the DeltaXML output is not in the form you need, you can easily 
convert it with XSL. Some XSL processors will deal with large files, 
but the delta file should be small anyway, because DeltaXML does not 
include unchanged data in the delta file unless you specifically ask 
for it.

We are not aware of any other system that will deal with such large 
files, though the following reference may also be useful to you:

"Detecting Changes in XML documents" - academic paper:
http://www-rocq.inria.fr/~cobena/Publications/xmldiff_ICDE2002final.pdf

This is for presentation at ICDE 2002: Feb 26 - Mar 01, San Jose, 
California - http://www.research.telcordia.com/society/icde2002/

This is research work, though, so it will not solve your problem 
today. It also needs mapping files to relate the delta file to the 
originals, so you would not get the output you need.

Best regards,
Robin

At 10:40 am +0530 23/3/02, Santoshwt wrote:
>Hi,
>
>We need to diff two XML files by their contents.
>The structure of both files will be something like this:
>
><segment1>
>	<element1>content</element1>
>	<element2>content</element2>
>	<element3>content</element3>
></segment1>
><segment2>
>	<element1>content</element1>
>	<element2>content</element2>
>	<element3>content</element3>
></segment2>
>
>We want the result to list the segments that are changed,
>added in the new file, or deleted from the old file.

DeltaXML will give you exactly what you need here.
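
For readers who want to see what a segment-level comparison involves, here is a minimal sketch in Python. It is not DeltaXML's algorithm. It assumes, hypothetically, that all segments are direct children of a single root element and that each segment's tag name (segment1, segment2, ...) is unique and can serve as its key. The function names `segment_digests` and `diff_segments` are made up for this illustration. Each file is streamed once and only one hash per segment is kept, so memory use grows with the number of segments rather than with the file size.

```python
import hashlib
import io
import xml.etree.ElementTree as ET


def segment_digests(source):
    """Stream an XML source and map each top-level segment tag to a content hash.

    Assumption: segments are direct children of the root and have unique tags.
    """
    digests = {}
    depth = 0
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if depth == 0:
                root = elem  # remember the root so processed children can be dropped
            depth += 1
        else:
            depth -= 1
            if depth == 1:  # a direct child of the root has just closed
                elem.tail = None  # ignore whitespace between segments
                # Whitespace inside a segment is still hashed, so pure
                # formatting changes would be reported as changes.
                digests[elem.tag] = hashlib.sha256(ET.tostring(elem)).hexdigest()
                root.clear()  # release segments already processed
    return digests


def diff_segments(old_source, new_source):
    """Return the tags of segments that were added, deleted, or changed."""
    old = segment_digests(old_source)
    new = segment_digests(new_source)
    return {
        "added": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the two large files.
    old_xml = io.BytesIO(
        b"<root><segment1><element1>a</element1></segment1>"
        b"<segment2><element1>b</element1></segment2></root>"
    )
    new_xml = io.BytesIO(
        b"<root><segment1><element1>a</element1></segment1>"
        b"<segment2><element1>c</element1></segment2>"
        b"<segment3><element1>d</element1></segment3></root>"
    )
    print(diff_segments(old_xml, new_xml))
```

A sketch like this only reports which segments differ. It does not show what changed inside a segment, and it relies on each segment having a stable key, which real data may not provide.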

>
>The file sizes are going to be very large, say around 500MB or so per file.
>
>If anyone could shed some light on how to go about it, whether any 
>tools are available,
>what the memory requirements would be, or any other information,
>kindly reply to my Id please.
>
>I've seen the deltaxml tool, but our file sizes are massive, and also 
>the output of deltaxml is in its own format.
>
>It's highly urgent.
>
>Thanks.
>
>Regards,
>Santosh
>
>-----------------------------------------------------------------
>The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>initiative of OASIS <http://www.oasis-open.org>
>
>The list archives are at http://lists.xml.org/archives/xml-dev/
>
>To subscribe or unsubscribe from this list use the subscription
>manager: <http://lists.xml.org/ob/adm.pl>


-- 
-- -----------------------------------------------------------------
Robin La Fontaine, Director, Monsell EDM Ltd
DeltaXML: "Change control for XML in XML"
Tel: +44 1684 592 144 Fax: +44 1684 594 504
Email: robin.lafontaine@d...      http://www.deltaxml.com
