Subject:Converting HTML to XML Author:georgia pavlou Date:16 Jun 2005 07:48 PM Originally Posted: 16 Jun 2005 07:43 PM
Hello,
I just started learning XSLT and I'm trying to convert an HTML document into XML using XSLT. I found many tutorials showing how to achieve the opposite (converting XML to HTML)!
I actually know that I have to create regular expressions and match them with XSLT, but I'm really having a hard time understanding how to do that!
Can anybody give me a hint or recommend some material to study?
Thanks,
Georgia
Subject:Converting HTML to XML Author:Minollo I. Date:16 Jun 2005 08:17 PM
XSLT is a language that allows you to transform one or more XML documents into a different document type, be that text, HTML or just different XML.
If you want to transform HTML into XML, you can use the HTML-to-XML document wizard in Stylus Studio (File | Document Wizards... > HTML to XML); once you have done that, you could think about applying an XSLT to the XML document (originally the HTML) and extracting values or fragments by navigating the tree with pattern matching and XPath.
Or you can also think about using the HTML-to-XSLT wizard (File | Document Wizards... > XSLT Editor tab > HTML to XSLT), which will create an XSLT document that, when run, will generate HTML (almost) identical to the original; you can use that as a template to make some areas dynamic by specifying data coming from yet another XML source.
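To make "extract values or fragments by navigating the tree" a little more concrete, here is a minimal sketch in Python rather than XSLT. It assumes the HTML has already been converted to well-formed XML. The sample document, the `catalog`/`item` output vocabulary and the `extract_items` helper are made up for illustration, and the standard library's ElementTree only supports a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical result of an HTML-to-XML conversion (sample data only)
converted = """<html>
  <body>
    <h1>Price list</h1>
    <table>
      <tr><td class="name">Widget</td><td class="price">4.50</td></tr>
      <tr><td class="name">Gadget</td><td class="price">12.00</td></tr>
    </table>
  </body>
</html>"""

def extract_items(xml_text):
    """Walk the converted tree and build a new, data-oriented XML tree."""
    root = ET.fromstring(xml_text)
    catalog = ET.Element("catalog")
    # Select every table row, wherever the table sits in the document
    for row in root.findall(".//table/tr"):
        item = ET.SubElement(catalog, "item")
        # Pick cells by their class attribute, much like an XPath predicate in XSLT
        ET.SubElement(item, "name").text = row.findtext("td[@class='name']")
        ET.SubElement(item, "price").text = row.findtext("td[@class='price']")
    return catalog

print(ET.tostring(extract_items(converted), encoding="unicode"))
```

An XSLT stylesheet would express the same idea with a template matching the rows and `xsl:value-of` selecting the cells; the path expressions carry over almost unchanged.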
Subject:Converting HTML to XML Author:Tony Lavinio Date:21 Jun 2005 08:54 AM
Stylus Studio also has Tidy built in as an adapter so that you can
transform HTML to XHTML (which, being a flavor of XML, can be used with
XSLT).
Try File|Open, choose your .html file, and then check the box for
"Convert to XML using Adapter" on the File|Open dialog. You will get
another dialog asking which adapter to use; try the HTML one. You will
then see the XML equivalent of your HTML file.
You can use this same mechanism when selecting the input file for your
XSLT from the scenario dialog.
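For readers curious what such a conversion involves, below is a rough sketch in Python using only the standard library. It is not Tidy and not the Stylus Studio adapter; the `TreeBuilder` class and `html_to_xml` function are names invented for this example. It closes elements left open and treats void elements such as `<br>` as empty, but it does not apply HTML's implicit-closing rules (an unclosed `<p>` followed by another `<p>` ends up nested), which a real tool like Tidy handles.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML elements that never take a closing tag
VOID = {"br", "hr", "img", "input", "link", "meta"}

class TreeBuilder(HTMLParser):
    """Builds an ElementTree from loosely written HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("html")
        self.stack = [self.root]  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        if tag == "html":
            return  # the root element already exists
        # Valueless attributes (e.g. "checked") get their own name as value
        attrib = {k: (v if v is not None else k) for k, v in attrs}
        elem = ET.SubElement(self.stack[-1], tag, attrib)
        if tag not in VOID:
            self.stack.append(elem)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name (and anything left
        # open inside it); stray end tags are ignored
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):  # text following a child belongs in that child's tail
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data

def html_to_xml(html_text):
    builder = TreeBuilder()
    builder.feed(html_text)
    builder.close()
    return ET.tostring(builder.root, encoding="unicode")

# The <br> has no end tag and <b> is never closed; the output is still well-formed
print(html_to_xml("<html><body><p>Hello<br>world <b>bold</p></body></html>"))
```

The result can be parsed by any XML parser and is therefore usable as XSLT input, which is the point of running the HTML through an adapter first.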
Subject:Converting HTML to XML Author:Joachim Koester Date:07 Jul 2005 03:51 AM
Hello,
While browsing these pages I found some information about converting HTML files into XML. I read that the conversion process converts HTML into standard XML. What is standard XML? Does this mean valid XML? I am using a customized DTD. Will the converted XML file fit my DTD?
Are more steps necessary to fit the converted XML files to my DTD?
Background: at the moment I am migrating all our files into an XML-based database.
My WINWORD files are already converted to XML. Now I have to convert all the existing HTML files.
I think 'copy and paste' is not a very satisfying way.
Thanks in advance.
Subject:Converting HTML to XML Author:Tony Lavinio Date:07 Jul 2005 10:19 AM
As the previous posts stated, Stylus Studio includes several
mechanisms to convert HTML to XML - either through a wizard
interface, or through an HTML-to-XML adapter. Either should
solve your problem nicely.
Subject:Converting HTML to XML Author:Tony Lavinio Date:08 Jul 2005 11:52 AM
The next step after converting the HTML to XML would be to use XSLT
to transform it to match your DTD. You can use the mapping view
to load the XML from HTML on the left and your DTD on the right,
and then just draw lines to map. You can then use the XSLT that is
generated to transform your other HTML files.
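As a rough illustration of what such a mapping amounts to, here is a small Python sketch that renames elements of the converted XML according to a lookup table. The target names (`document`, `section`, `title`, `para`, `emphasis`) and the `remap` helper are hypothetical stand-ins for whatever your own DTD defines. A generated XSLT can also restructure the tree, which this sketch does not attempt, and checking the result against the DTD still needs a validating parser, which the standard library's ElementTree is not.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from XHTML element names to a custom DTD's names
NAME_MAP = {
    "html": "document",
    "body": "section",
    "h1": "title",
    "p": "para",
    "b": "emphasis",
}

def remap(root, name_map):
    """Rename elements in place; return the names that had no mapping."""
    unmapped = set()
    for elem in root.iter():
        if elem.tag in name_map:
            elem.tag = name_map[elem.tag]
        else:
            unmapped.add(elem.tag)  # left as-is so it can be reviewed by hand
    return unmapped

source = "<html><body><h1>Report</h1><p>Some <b>bold</b> and <i>italic</i> text.</p></body></html>"
tree = ET.fromstring(source)
leftover = remap(tree, NAME_MAP)

print(ET.tostring(tree, encoding="unicode"))
print("Elements still needing a rule:", sorted(leftover))
```

Collecting the unmapped names is a cheap way to find out which parts of the old HTML have no counterpart in the DTD before running the whole batch of files.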