Subject: NOOB Help Author: R Morgan Date: 30 Mar 2010 07:25 PM
Hi, please bear with me as I am a noob.
I have a task that I need to perform. The task is as follows:
1. Pull down 900+ HTML product detail pages (each a very small pagelet) one at a time.
1.5 Change the product id for each page downloaded by incrementing it (product ids range from 51000 to 51994).
2. Remove a link node for the style sheet as it is ill-formed (has no closing tag)
3. Convert to XML
4. Extract product details using xpaths.
5. Append the extracted data to a target XML file for later import into a database.
6. Delay a while so as not to load server.
7. Repeat for next product page.
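For illustration only, the steps above can be sketched outside Stylus Studio in plain Python. This is a rough outline under several assumptions, not a tested solution for the real site: the URL pattern in BASE_URL, the element ids "name" and "price", and the output element names are all made up, and the sketch assumes each page is well-formed XML once the unclosed link tag is removed (real pages may need a proper HTML tidy step instead).

```python
import re
import time
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical URL pattern; the real supplier URL is not given in the post.
BASE_URL = "http://supplier.example/product?id={pid}"

# Matches a <link ...> tag that has no closing tag (step 2).
LINK_TAG = re.compile(r"<link\b[^>]*>", re.IGNORECASE)


def product_ids(start=51000, end=51994):
    """Steps 1.5 and 7: every product id in the inclusive range."""
    return range(start, end + 1)


def strip_link_tags(html):
    """Step 2: drop the ill-formed stylesheet link so the page can parse."""
    return LINK_TAG.sub("", html)


def extract_product(html, pid):
    """Steps 3-4: parse the cleaned page as XML and pull fields out by path.

    The ids 'name' and 'price' are assumptions about the page layout.
    """
    root = ET.fromstring(strip_link_tags(html))
    product = ET.Element("product", id=str(pid))
    for field in ("name", "price"):
        node = root.find(f".//*[@id='{field}']")
        text = (node.text or "").strip() if node is not None else ""
        ET.SubElement(product, field).text = text
    return product


def scrape(target_path, start=51000, end=51994, delay_seconds=2.0):
    """Steps 1, 5 and 6: download each page, collect the result, then wait."""
    catalog = ET.Element("products")
    for pid in product_ids(start, end):
        with urllib.request.urlopen(BASE_URL.format(pid=pid)) as response:
            html = response.read().decode("utf-8", errors="replace")
        catalog.append(extract_product(html, pid))
        time.sleep(delay_seconds)  # be polite to the server
    ET.ElementTree(catalog).write(
        target_path, encoding="utf-8", xml_declaration=True
    )
```

Calling scrape("products.xml") would run the whole loop; the extraction step can be tried on a saved page first by passing its text to extract_product.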
What I am doing is building a product database from a supplier's website. The process will need to be rerun later as new products are added, so I will either need to edit the start and end part numbers each time, or add a check for products that already exist in the target XML file.
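The "check for products that already exist" idea could look something like the following Python sketch. It assumes a hypothetical target file layout in which each stored product is a product element carrying an id attribute; the function names are invented for this example.

```python
import os
import xml.etree.ElementTree as ET


def existing_ids(target_path):
    """Return the set of product ids already stored in the target file."""
    if not os.path.exists(target_path):
        return set()  # first run: nothing has been saved yet
    root = ET.parse(target_path).getroot()
    # Assumed layout: <products><product id="51000">...</product></products>
    return {int(p.get("id")) for p in root.iter("product") if p.get("id")}


def pending_ids(target_path, start, end):
    """Ids in the inclusive range start..end that still need downloading."""
    done = existing_ids(target_path)
    return [pid for pid in range(start, end + 1) if pid not in done]
```

With this in place, a later run only needs a wider end number, and anything already in the file is skipped.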
Can this, or any part of it, be automated using Stylus Studio? If so, please point me in the right direction to get started.