
Changing a from unstructured HTML to XML

Subject: Changing a from unstructured HTML to XML
From: Evan Leibovitch <evanleibovitch@xxxxxxxxx>
Date: Tue, 21 Sep 2010 09:16:49 -0400
Hi all,

First of all, I want to thank the people here for their help in
getting me on my feet, with special thanks to Gerrit. I've been
learning to solve some of my own problems (such as how to get rid of
xmlns="") but there's one -- for a completely different project --
that's stumping me on a conceptual level.

I am working with an HTML input file, and I'd like to group things
better by sections (ultimately, with the intent of using
xsl:result-document to create a new file for each section).

What I have is not uncommon:

<h1 class="section">Section Name</h1>
<h1 class="headline">Headline name</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 2</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 3</h1>
[... assorted HTML marked up text ...]
<h1 class="section">Section 2</h1>
<h1 class="headline">Headline 4</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 5</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 6</h1>
[... assorted HTML marked up text ...]

and so on.

What I'd like to end up with, if possible, is:

<section id="Section Name">
  <headline id="Headline name">
    [...marked up text...]
  </headline>
  <headline id="Headline 2">
    [...marked up text...]
  </headline>
  <headline id="Headline 3">
    [...marked up text...]
  </headline>
</section>


Maybe this is "XSL 101", and it must be common in HTML-to-XML
transformations. I would imagine that there must be some techniques to
form a proper tree such that a conventional HTML page is turned into

<body>
  <h1>
    <h2>
      <h3>
        <p>Marked up text</p>
      </h3>
    </h2>
  </h1>
</body>


but I'm not having much luck finding the techniques to do this.
Certainly the "tidy" implementations are no help. :-)

Any pointers are appreciated. Thanks!

- Evan
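
For readers of the archive: flat-to-nested problems like the one above are commonly handled in XSLT 2.0 with xsl:for-each-group and its group-starting-with attribute. What follows is only a minimal sketch, not a tested solution for this particular input. It assumes an XSLT 2.0 processor, well-formed input in no namespace (namespaced XHTML would need xpath-default-namespace or prefixed names), and that the h1 elements and the marked-up text are all direct children of body. The <sections> wrapper element is a hypothetical name added only so the result has a single root.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Skip head and everything else; only body is processed. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//body"/>
  </xsl:template>

  <xsl:template match="body">
    <!-- "sections" is a hypothetical wrapper, not part of the original post. -->
    <sections>
      <!-- Outer pass: every h1 class="section" starts a new group.
           select="*" ignores bare text directly under body;
           use node() instead if that text must be kept. -->
      <xsl:for-each-group select="*"
          group-starting-with="h1[@class = 'section']">
        <xsl:choose>
          <xsl:when test="self::h1[@class = 'section']">
            <section id="{normalize-space(.)}">
              <!-- Inner pass over the rest of the section:
                   every h1 class="headline" starts a new group. -->
              <xsl:for-each-group select="current-group()[position() gt 1]"
                  group-starting-with="h1[@class = 'headline']">
                <xsl:choose>
                  <xsl:when test="self::h1[@class = 'headline']">
                    <headline id="{normalize-space(.)}">
                      <!-- Copy everything after the headline's own h1. -->
                      <xsl:copy-of select="current-group()[position() gt 1]"/>
                    </headline>
                  </xsl:when>
                  <xsl:otherwise>
                    <!-- Content between the section h1 and its first headline. -->
                    <xsl:copy-of select="current-group()"/>
                  </xsl:otherwise>
                </xsl:choose>
              </xsl:for-each-group>
            </section>
          </xsl:when>
          <xsl:otherwise>
            <!-- Content that precedes the first section heading. -->
            <xsl:copy-of select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </sections>
  </xsl:template>

</xsl:stylesheet>
```

The same two-level pattern extends to deeper h1/h2/h3 nesting by adding one grouping pass per level. To write one file per section, the <section> element could be wrapped in xsl:result-document with an href built from the section heading.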
