Subject: splitting xml files by certain amount
Author: tyler horath
Date: 07 Jun 2008 05:50 PM
OK, so here is my problem and how I am handling it now. I have an XML file with a couple hundred thousand products in it, and an XSL file that I use to transform it into the format I need to import into my store. What I have been doing is using this code in my for-each statement:

<xsl:for-each select="catalog/product[position() &lt; 1000]">

Then I debug it, save the output, and change it to:

<xsl:for-each select="catalog/product[position() &gt; 1000 and position() &lt; 2000]">

and save that as another file. I keep repeating this until I have split the whole file into chunks of 1000. As you can probably tell, this is very time-consuming.

Is there something I can do to automate this process and automatically split the file into chunks of 1000? I can't just split the file, because it's not in the right format; I also need to apply the XSLT stylesheet.

Author: James Durning
Date: 09 Jun 2008 11:11 AM
Processing time must be a killer too. If you have just a hundred thousand products, you're checking each of those nodes a hundred times!! Meaning 100*100,000 = 10,000,000 checks!
Seriously, I would recommend splitting it up with a scripting language such as Perl or PHP. What do you mean it's not in the right format? Can you not add a <catalog> at the beginning and a </catalog> at the end of each segment?

At least with running the same transformation on multiple files, you can use a CompiledTransform, and speed up your processing time there too.
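
As a rough illustration of the "split it with a script" idea, here is a minimal sketch in Python (rather than the Perl or PHP mentioned above). It assumes the input has a <catalog> root with <product> children, as in the for-each expressions earlier in the thread; the function name split_catalog and the tiny sample are made up for the example. Each segment gets its own <catalog> wrapper, so the existing stylesheet could then be run on every segment unchanged.

```python
import xml.etree.ElementTree as ET

def split_catalog(xml_text, chunk_size=1000):
    """Return a list of XML strings, each a root element holding up to chunk_size products."""
    root = ET.fromstring(xml_text)
    products = root.findall("product")
    chunks = []
    for start in range(0, len(products), chunk_size):
        piece = ET.Element(root.tag)  # fresh <catalog> wrapper for this segment
        piece.extend(products[start:start + chunk_size])
        chunks.append(ET.tostring(piece, encoding="unicode"))
    return chunks

# Tiny demonstration: 5 products split into segments of 2.
sample = "<catalog>" + "".join(
    f"<product><sku>{i}</sku></product>" for i in range(1, 6)
) + "</catalog>"
parts = split_catalog(sample, chunk_size=2)
print(len(parts))  # 3
```

This loads the whole document into memory, which may or may not be acceptable for a file with a couple hundred thousand products; a streaming parser would be the usual alternative.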
-----
If you choose not to do it this way, I still recommend using a single stylesheet. Instead, use two stylesheet parameters:
<xsl:param name="low" select="0"/>
<xsl:param name="high" select="1000"/>
Pass in the appropriate values to the processor on each run.
The only problem is that you have to know how many nodes there are in your main file, unless you check whether your output is empty, through file size or other means.
--
Further, have you noticed that you're missing products 1000, 2000, 3000, 4000, and so on?
You need an equals sign in the comparison:
<xsl:for-each select="catalog/product[position() &gt; $low and position() &lt;= $high]">
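
To make the boundary problem concrete, the following Python sketch models the two predicates on plain 1-based positions and lists which positions no run ever selects. The helper names and the total of 3500 products are invented for the illustration; only the comparisons mirror the XPath above.

```python
def selected(total, low, high, inclusive_high):
    """Positions p (1-based) with p > low and p < high, or p <= high if inclusive_high."""
    return [p for p in range(1, total + 1)
            if p > low and (p <= high if inclusive_high else p < high)]

def missing(total, size, inclusive_high):
    """Positions that no (low, low + size) run selects."""
    seen = set()
    for low in range(0, total, size):  # low = 0, size, 2*size, ...
        seen.update(selected(total, low, low + size, inclusive_high))
    return [p for p in range(1, total + 1) if p not in seen]

print(missing(3500, 1000, inclusive_high=False))  # [1000, 2000, 3000]
print(missing(3500, 1000, inclusive_high=True))   # []
```

The range(0, total, size) loop also shows why the node count matters: it determines how many (low, high) pairs have to be passed in.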

Author: Tony Lavinio
Date: 09 Jun 2008 11:46 AM
The basic idea is that you have two loops: the outer loop
goes by pages, and the inner outputs the items within the
page.

<xsl:for-each select="xxx[position() mod 1000 = 1]">
  <xsl:result-document ...>
    <xsl:for-each select=". | following-sibling::xxx[position() &lt; 1000]">
      do whatever
    </xsl:for-each>
  </xsl:result-document>
</xsl:for-each>
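
For readers who find the two-loop selection easier to follow outside XPath, here is a small Python model of it on a plain list. It is only an analogy under the assumption that all items are siblings in document order; paginate is a hypothetical name, and the page size of 3 is chosen to keep the output short.

```python
def paginate(items, page_size=1000):
    """Group items the way the outer/inner for-each pair does."""
    pages = []
    for index, item in enumerate(items):
        position = index + 1                # XPath positions are 1-based
        if position % page_size == 1:       # outer loop: xxx[position() mod N = 1]
            following = items[index + 1:]   # following-sibling::xxx
            # inner loop: . | following-sibling::xxx[position() < N]
            pages.append([item] + following[:page_size - 1])
    return pages

print(paginate([1, 2, 3, 4, 5, 6, 7], page_size=3))  # [[1, 2, 3], [4, 5, 6], [7]]
```

Every item lands in exactly one page, and the last page simply comes out shorter.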

Author: tyler horath
Date: 09 Jun 2008 11:17 PM
Originally Posted: 09 Jun 2008 10:38 PM
Here is my code. When it generates the files, they do not contain anything.

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml" indent="yes" encoding="UTF-8" />
  <xsl:template match="/">

    <AspDotNetStorefrontImportFile>
      <xsl:for-each select="catalog/product[position() mod 1000 = 1]">
        <xsl:variable name="filename" select="concat('file:///','AA','_', position(),'.xml')"/>
        <xsl:result-document href="{$filename}">
          <xsl:for-each select=". | following-sibling::catalog/product[position() &lt; 1000]">
            <Product>
              <Name><xsl:value-of select="name"/></Name>
              <ProductTypeRef>Generic Product</ProductTypeRef>
              <ManufacturerRef><xsl:value-of select="manufacturer"/></ManufacturerRef>
              <DistributorRef><xsl:value-of select="programname"/></DistributorRef>
              <CategoryRef/>
              <Summary/>
              <Description><xsl:value-of select="description"/></Description>
              <SEKeywords><xsl:value-of select="keywords"/></SEKeywords>
              <SEDescription><xsl:value-of select="concat(substring(description, 1, 100), '...')"/></SEDescription>
              <SETitle><xsl:value-of select="name"/></SETitle>
              <SKU><xsl:value-of select="sku"/></SKU>
              <ManufacturerPartNumber/>
              <XmlPackage>product.affiliate.xml.config</XmlPackage>
              <ColWidth>4</ColWidth>
              <SalesPromptID>1</SalesPromptID>
              <Published>1</Published>
              <RequiresRegistration>0</RequiresRegistration>
              <MiscText><xsl:value-of select="buyurl"/></MiscText>
              <TrackInventoryBySizeAndColor>0</TrackInventoryBySizeAndColor>
              <ImageFilenameOverride><xsl:value-of select="impressionurl"/></ImageFilenameOverride>
              <ExtensionData><xsl:value-of select="imageurl"/></ExtensionData>
              <IsAKit/>
              <IsAPack/>
              <PackSize/>
              <ProductVariant>
                <Name><xsl:value-of select="name"/></Name>
                <IsDefault>1</IsDefault>
                <SKUSuffix/>
                <ManufacturerPartNumber/>
                <Description><xsl:value-of select="description"/></Description>
                <SEKeywords><xsl:value-of select="keywords"/></SEKeywords>
                <SEDescription><xsl:value-of select="concat(substring(description, 1, 50), '...')"/></SEDescription>
                <SETitle><xsl:value-of select="name"/></SETitle>
                <Price><xsl:value-of select="price"/></Price>
                <SalePrice><xsl:value-of select="saleprice"/></SalePrice>
                <MSRP><xsl:value-of select="retailprice"/></MSRP>
                <Cost/>
                <Weight/>
                <Dimensions/>
                <Inventory>1000000</Inventory>
                <DisplayOrder>1</DisplayOrder>
                <Colors/>
                <ColorSKUModifiers/>
                <Sizes/>
                <SizeSKUModifiers/>
                <IsTaxable>0</IsTaxable>
                <IsShipSeparately>0</IsShipSeparately>
                <IsDownload>1</IsDownload>
                <DownloadLocation>1</DownloadLocation>
                <Published>1</Published>
                <ImageFilenameOverride><xsl:value-of select="impressionurl"/></ImageFilenameOverride>
                <ExtensionData><xsl:value-of select="imageurl"/></ExtensionData>
              </ProductVariant>
            </Product>
          </xsl:for-each>
        </xsl:result-document>
      </xsl:for-each>
    </AspDotNetStorefrontImportFile>
  </xsl:template>
</xsl:stylesheet>

Author: Tony Lavinio
Date: 09 Jun 2008 11:36 PM
It's hard to answer without seeing your input XML, as there
may be another problem, but this:

<xsl:for-each select=". | following-sibling::catalog/product[position() &lt; 1000]">

won't work. Your context at this point is <product>, so you're
looking for catalog siblings of product with product as children.
Try:

<xsl:for-each select=". | following-sibling::product[position() &lt; 1000]">

If that doesn't work, can you zip together a small input file and
the XSLT and attach them?
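
The difference between the two paths can be imitated in Python on a toy document. This is only a hand-written model of the following-sibling axis (ElementTree has no such axis), and the three-product catalog is invented for the example.

```python
import xml.etree.ElementTree as ET

catalog = ET.fromstring(
    "<catalog><product id='1'/><product id='2'/><product id='3'/></catalog>"
)
children = list(catalog)
first = children[0]                                # the context node: a <product>
following = children[children.index(first) + 1:]  # following-sibling::*

# following-sibling::catalog/product: siblings named catalog, then their product children
wrong = [p for sib in following if sib.tag == "catalog" for p in sib.findall("product")]

# following-sibling::product: siblings named product
right = [sib for sib in following if sib.tag == "product"]

print(len(wrong), len(right))  # 0 2
```

A <product> has no <catalog> siblings, so the first path has nothing to step into.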

Author: tyler horath
Date: 10 Jun 2008 12:09 AM
Here is a sample from the XML:

<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<product>
<programname>AutoAnything</programname>
<programurl>http://www.AutoAnything.com</programurl>
<catalogname>Winter 468x60</catalogname>
<lastupdated>06/04/2008 03:19:35 PM</lastupdated>
<name>Zymol Concours Glaze 120, Zymol - Car Care Products - Waxes and Glazes</name>
<keywords>Zymol Show Car Wax,Zymol Concours Glaze</keywords>
<description>Zymol Show Car Wax. The Zymol Concours Glaze will prove that you don&#39;t have to own a show car to give your vehicle that &#34;Show Car Shine.&#34; With its unique formula of nutritive oils and rare Brazilian Carnauba, the Concours is designed to protect your vehicle while producing an ultra-brilliant, durable sheen. The secret is in its special carnauba formula. The delicate balance of white and yellow Carnauba provides a visible increase in depth and clarity over other waxes in its class. What&#39;s more, this unique 47% carnauba formula protects your vehicle from any outdoor contaminants and reduces paint corrosion. Therefore, not only does the Concours give your vehicle an enviously glossy finish, it keeps your car, truck or SUV better protected and makes sure your paint remains in tip top condition. You know you are in safe hands when the founders of Zymol designed the wax for their own personal vehicles. Go with Zymol Concours Glaze and discover the and Glazes.</description>
<sku>1063317</sku>
<manufacturer>Zymol</manufacturer>
<manufacturerid>120-1063317</manufacturerid>
<currency>USD</currency>
<saleprice>175.95</saleprice>
<price>175.95</price>
<retailprice>0.00</retailprice>
<fromprice>no</fromprice>
<buyurl>http://www.dpbolvw.net/click-2953976-10356982?url=http%3A%2F%2Flink.mercent.com%2Fredirect.ashx%3Fmr%3AmerchantID%3DAutoAnything%26mr%3AtrackingCode%3D80577EB8-6C32-DD11-873B-0019B9C043EB%26mr%3AtargetUrl%3Dhttp%3A%2F%2Fwww.autoanything.com%2Fproduct_redirect.aspx%253fproduct_id%253d1524</buyurl>
<impressionurl>http://www.tqlkg.com/image-2953976-10356982</impressionurl>
<imageurl>http://images.autoanything.com/images/products/med/car_care_products/zymol_concours_glaze.jpg</imageurl>
<advertisercategory>Auto &amp; Vehicles difference between the ordinary and the extraordinary! Zymol Show Car Wax. Car Care Products, Waxes;Auto Accessories&gt;Car Care Products&gt;Waxes and Glazes</advertisercategory>
<special>no</special>
<gift>no</gift>
<promotionaltext>Free Shipping on 99% of all products!</promotionaltext>
<offline>no</offline>
<online>yes</online>
<instock>yes</instock>
<standardshippingcost>0.0</standardshippingcost>
</product>
</catalog>

Author: tyler horath
Date: 10 Jun 2008 12:22 AM
Originally Posted: 10 Jun 2008 12:12 AM
By the way, it still creates 2 files with nothing in them. (There are about 1500 products in this XML file.)

Author: Tony Lavinio
Date: 10 Jun 2008 08:06 AM
I made the change I suggested, and it ran perfectly here.
So that leads to the following questions:

1. What version of Stylus Studio are you using? The build number
should be something like 1050g or 1147c (the latest, just out).
2. Are the files actually 0 bytes long?
3. Your output has no root element, so what you are getting in each
file is a series of elements. Is it possible the program you are using
to view the output can't handle non-well-formed XML?
4. Is the input XML file you sent representative of the actual input?
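
On question 3, a quick way to see how a strict parser treats a rootless series of elements is the Python check below. It is a sketch using the standard library parser, not the viewer in question; the wrapper element name is borrowed from the stylesheet posted earlier, and is_well_formed is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """True if the text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

fragment = "<Product><Name>A</Name></Product><Product><Name>B</Name></Product>"
wrapped = "<AspDotNetStorefrontImportFile>" + fragment + "</AspDotNetStorefrontImportFile>"

print(is_well_formed(fragment))  # False
print(is_well_formed(wrapped))   # True
```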

I've attached the tweaked XSLT (I really only touched that one thing,
apart from re-indenting it) and the XML file I used for testing
(yours, with nine more products added).


Attachment: chop.zip

 