RE: HTML to XML

Subject: RE: HTML to XML
From: "Knight, Michel" <michel.knight@xxxxxxxx>
Date: Thu, 28 May 2009 11:29:43 -0400
Hi, and thanks for all of the good info.

With your guidance I was able to run it and make it quicker. Here is
what I did:
1. Changed the DOCTYPE in menu.htm to point at a local copy of the DTD:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"../DTD/xhtml1-strict.dtd">
<!--!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"-->
so that the parser no longer goes out to the network to fetch
xhtml1-strict.dtd.
2. Created a directory DTD at the root of the XSL and HTML files
and copied the following files there:
DTD\
	xhtml1-strict.dtd
	xhtml-lat1.ent
	xhtml-lat1_ent
	xhtml-special.ent
	xhtml-symbol.ent

I ran the scripts and it was much faster. Is there anything else I could do?

When I run this script, I get this in the output:
<title xmlns:xhtml="http://www.w3.org/1999/xhtml">
<meta xmlns:xhtml="http://www.w3.org/1999/xhtml">

How do I remove this namespace declaration?
xmlns:xhtml="http://www.w3.org/1999/xhtml"
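
One likely fix, given here as a sketch and not as a definitive answer: the declaration appears because test.xsl declares xmlns:xhtml on xsl:stylesheet, so that namespace is copied onto every literal result element. The standard XSLT attribute exclude-result-prefixes suppresses the copy when no output element or attribute actually uses the namespace. Apart from that one attribute and the comment, the stylesheet below is the same as test.xsl.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml">
  <!-- The xhtml prefix stays usable in match and select expressions,
       but is no longer written onto literal result elements. -->
  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <xsl:template match="xhtml:html">
    <title>
      <xsl:apply-templates select="xhtml:head//xhtml:title"/>
    </title>
    <xsl:apply-templates select="//xhtml:meta"/>
  </xsl:template>

  <xsl:template match="xhtml:meta">
    <meta>
      <name>
        <xsl:value-of select="@name"/>
      </name>
      <content>
        <xsl:value-of select="@content"/>
      </content>
    </meta>
  </xsl:template>

  <xsl:template match="xhtml:html//xhtml:head//xhtml:title">
    <content>
      <xsl:value-of select="."/>
    </content>
  </xsl:template>
</xsl:stylesheet>
```

The match pattern xhtml:meta is equivalent to //xhtml:meta in a match attribute, so the leading // was dropped. The output still has several top-level elements (one title and several meta), so wrapping them in a single root element may be worth considering if the result has to be a well-formed XML document.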

Thx again, appreciated.

Michel

For reference, here are the files:
=============== test.xsl
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<xsl:output method="xml" indent="yes" encoding="UTF-8"></xsl:output>
<xsl:template match="xhtml:html">
	<title>
	 <xsl:apply-templates select="xhtml:head//xhtml:title"/>
	</title>
	 <xsl:apply-templates select="//xhtml:meta"/>

</xsl:template>

<xsl:template match="//xhtml:meta">
<meta>
	<name>
	  <xsl:value-of select="@name" />
	</name>
	<content>
	  <xsl:value-of select="@content" />
	</content>
</meta>
</xsl:template>

<xsl:template match="xhtml:html//xhtml:head//xhtml:title">
	<content>
	  <xsl:value-of select="." />
	</content>
</xsl:template>
</xsl:stylesheet>
=================menu.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"../DTD/xhtml1-strict.dtd">
<!--!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"-->
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>

<title>Corporate Services - Organizational Chart - What We Do -
Organization - Transport Canada</title>
<!-- METADATA BEGINS | DEBUT DES METADONNEES -->
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"
/>
<meta name="dcterms.issued" scheme="W3CDTF" content="2002-08-21" />
<meta name="dcterms.modified" scheme="W3CDTF" content="2008-10-21" />
<meta name="dc.subject" scheme="gccore" content="Corporate Services" />
<meta name="dc.language" scheme="ISO639-2/T" content="eng" />
<!-- PROGRESSIVE ENHANCEMENT ENDS | FIN DE L'AMELIORATION PROGRESSIVE
-->
</head>
<body>

<div class="page">
Some content
</div>


</body>
</html>
================================================================

Hi David:
Regarding this, I'm using another e-mail now, so this shouldn't be a
problem anymore:
<https://sera-mtl.cgi.com/exchweb/bin/,DanaInfo=.awfdpenrGm6msm0utPv5ATx
25.3H+redir.asp?URL=http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd>
">
Regards




-----Original Message-----
From: David Carlisle [mailto:davidc@xxxxxxxxx]
Sent: May 27, 2009 6:24 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Cc: Knight, Michel
Subject: Re:  HTML to XML




<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd

unless your xml parser is set up with some kind of catalogue or caching
entity resolver, this could be fetching the entire xhtml dtd for every
file that you parse. That most likely accounts for the slow response
times.
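
To illustrate the catalogue approach mentioned above, a minimal OASIS XML Catalog could look like the following. This is a sketch under assumptions: the file name catalog.xml and the DTD/ paths are hypothetical and simply mirror the local copies described earlier in this thread. How a catalog is registered depends on the XML parser and XSLT processor in use, which the thread does not state.

```xml
<?xml version="1.0"?>
<!-- catalog.xml (hypothetical name): maps the XHTML 1.0 Strict identifiers
     to local files so the parser does not fetch them from www.w3.org.
     The uri values are resolved relative to this catalog file. -->
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//W3C//DTD XHTML 1.0 Strict//EN"
          uri="DTD/xhtml1-strict.dtd"/>
  <system systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
          uri="DTD/xhtml1-strict.dtd"/>
  <!-- Entity sets referenced from inside the DTD -->
  <public publicId="-//W3C//ENTITIES Latin 1 for XHTML//EN"
          uri="DTD/xhtml-lat1.ent"/>
  <public publicId="-//W3C//ENTITIES Symbols for XHTML//EN"
          uri="DTD/xhtml-symbol.ent"/>
  <public publicId="-//W3C//ENTITIES Special for XHTML//EN"
          uri="DTD/xhtml-special.ent"/>
</catalog>
```

With a catalog in place, the input files can keep the original W3C system identifier in their DOCTYPE, so they do not need to be edited one by one.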


<https://sera-mtl.cgi.com/exchweb/bin/,DanaInfo=.awfdpenrGm6msm0utPv5ATx
25.3H+redir.asp?URL=http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd>
">

is this an artifact added by some mail program? If it is really there in
your input it means your input is not well formed.



David

