
RE: HTML to XML

Subject: RE: HTML to XML
From: "Knight, Michel" <michel.knight@xxxxxxx>
Date: Fri, 12 Jun 2009 13:54:56 -0400
Oops, I forgot the subject.
-----Original Message-----
From: Knight, Michel
Sent: Friday, June 12, 2009 1:52 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Cc: Knight, Michel
Subject:

Hi,

This sounds simple, but I'm having a problem making it work.
The processor is complaining about xsl:key, and from there I'm not sure where
I should plug in those lines. I plan to use the same strategy for several
other validations, if I can only get this one working.
I've included:
File 1 -> the HTML once it has been cleaned
File 2 -> the references; I use this to check whether the value exists
File 3 -> my XSLT file that should do all of this magic.

<xsl:variable name="file2root" select="doc('file2')/root"/>
<xsl:key name="k" match="references" use="."/>

then

<xsl:template match="meta">
  <xsl:if test="empty(key('k', @content))">
    <xsl:message>error</xsl:message>
  </xsl:if>
</xsl:template>
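
For reference, here is a minimal sketch of how those lines might be wired together in XSLT 2.0. It is an illustration under stated assumptions, not a confirmed fix. It assumes that ref.xml sits next to the stylesheet, that only the dc.subject value needs checking, and that the input is in the XHTML namespace (so the match pattern needs the xhtml: prefix). Both xsl:variable and xsl:key are placed as direct children of xsl:stylesheet, since xsl:key is only allowed at the top level. The two-argument key() searches the document containing the context node, which here is the HTML file, so the sketch uses the three-argument form to search the reference file instead.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml">

  <!-- Top-level declarations: xsl:key cannot appear inside a template. -->
  <!-- Assumption: ref.xml is in the same folder as this stylesheet. -->
  <xsl:variable name="file2root" select="doc('ref.xml')/root"/>
  <xsl:key name="k" match="references" use="."/>

  <!-- Assumption: only dc.subject is validated against the reference list. -->
  <xsl:template match="xhtml:meta[@name = 'dc.subject']">
    <!-- The third argument restricts the lookup to the subtree under
         $file2root, i.e. the reference file rather than the HTML source. -->
    <xsl:if test="empty(key('k', @content, $file2root))">
      <xsl:message>No reference found for: <xsl:value-of select="@content"/></xsl:message>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The comparison is an exact, case-sensitive string match, so "Corporate Services" would be found but "corporate services" would not.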




File 1: Source File (contact.htm)
==========================================
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<meta name="generator" content="HTML Tidy for Windows (vers 25 January 2008),
see www.w3.org" /><!-- CLF 2.0 TEMPLATE VERSION 1.04 | VERSION 1.04 DU GABARIT
NSI 2.0 -->
<!-- HEADER BEGINS | DEBUT DE L'EN-TETE -->
<!-- TITLE BEGINS | DEBUT DU TITRE -->
<title>Contact Transact - Transport Canada</title>
<!-- TITLE ENDS | FIN DU TITRE -->
<!-- METADATA BEGINS | DEBUT DES METADONNEES -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="schema.dc" href="http://purl.org/dc/elements/1.1/" />
<link rel="schema.dcterms" href="http://purl.org/dc/terms/" />
<meta name="dc.description" content="Contact Transact" />
<meta name="description" content="Corporate Services (CS) is part of the
Department's Administration Business Line, along with Communications and
General Counsel. CS's role is to provide efficient and effective support
services and functional expertise - in finance, administration, technology and
information management, human resources and executive services - that respond
to departmental needs." />
<meta name="keywords" content="corporate, services, corporate services, audit
reports, audit, CS, departement, support services, functional expertise,
finance and administration, technology and information management, human
resources, executive services, acts and regulations, Canada's transportation
system, departmental audit reports, evaluation services,departmental business,
departmental business, reporting tools required by Parliament, Transport
Canada library, regional offices, Corporate Services programs, policies and
standards" />
<meta name="dc.creator" content="Government of Canada - Transport Canada -
Transact" />
<meta name="dc.title" content="Contact Transact" />
<meta name="dcterms.issued" scheme="W3CDTF" content="2009-01-06" />
<meta name="dcterms.modified" scheme="W3CDTF" content="2009-01-06" />
<meta name="dc.subject" scheme="gccore" content="Contact Transact" />
<meta name="dc.language" scheme="ISO639-2/T" content="eng" /><!-- NAVIGATION
METADATA BEGINS | DEBUT DES METADONNEES DE NAVIGATION -->
<!-- #include
virtual="/CLF-NSI/includes-final/en/ia/organization/what_we_do/org_chart/brea
dcrumb.inc" -->
<meta name="gc.tc.navigation.echelon.4" content="Corporate Services;
URL=/corporate-services/menu.htm" />
<meta name="gc.tc.navigation.echelon.5" content="Contact Transact; URL=[NONE]"
/><!-- NAVIGATION METADATA ENDS | FIN DES METADONNEES DE NAVIGATION -->
<!-- METADATA ENDS | FIN DES METADONNEES -->
<!-- ALTERNATIVE LANGUAGE LINK | LIEN ALTERNATIVE -->
<link rel="alternate" type="text/html" hreflang="fr"
href="/services-generaux/transact/contactez.htm" title="Contactez les Services
généraux" /><!--#include virtual="/CLF-NSI/v2-1_04/includes/2css.inc" -->
<!-- PROGRESSIVE ENHANCEMENT BEGINS | DEBUT DE L'AMELIORATION PROGRESSIVE -->

<script src="/CLF-NSI/v2-1_04/scripts/pe-ap.js" type="text/javascript">
</script>
<script type="text/javascript">
                /* <![CDATA[ */
                var params = {
                        lng:"eng",
                        pngfix:"/CLF-NSI/v2-1_04/images/inv.gif"
                };
                PE.progress(params);
                /* ]]> */
</script><!-- PROGRESSIVE ENHANCEMENT ENDS | FIN DE L'AMELIORATION PROGRESSIVE
-->
</head>
<body>
<!--#include virtual="/CLF-NSI/v2-1_04/includes/header-eng.inc" -->
<div class="colLayout"><!-- THREE COLUMN LAYOUT BEGINS | DEBUT DE LA MISE EN
PAGE DE TROIS COLONNES -->
<!-- LEFT SIDE MENU BEGINS | DEBUT DU MENU LATERAL GAUCHE -->
<div class="left"><!-- SIDE MENU TITLE BEGINS | DEBUT DU TITRE DU MENU LATERAL
-->
<h1 class="navaid"><a name="il" id="il">Institutional links</a></h1>
<!-- SIDE MENU TITLE ENDS | FIN DU TITRE DU MENU LATERAL -->
<!-- #include
virtual="/CLF-NSI/includes-final/en/ia/organization/what_we_do/nav.inc"
--></div>
<!-- LEFT SIDE MENU ENDS | FIN DU MENU LATERAL GAUCHE -->
<!-- CONTENT BEGINS | DEBUT DU CONTENU -->
<div class="center"><!-- Optional Navigation Menu -->
<!-- #include virtual="/corporate-services/includes/cs-right.inc" -->
<!-- End of Optional Navigation Menu -->
<h1 class="flexible"><a name="cont" id="cont"><!-- CONTENT TITLE BEGINS |
DEBUT DU TITRE DU CONTENU -->
 Contact Transact <!-- CONTENT TITLE ENDS | FIN DU TITRE DU CONTENU
--></a></h1>
<p>Your comments are important to us and we will address them as quickly as
possible.</p>
<p>If you cannot find the answer to your question on any of the pages referred
to above, please fill in the following <a href="#form">form</a> or contact us
at:</p>
<ul>
<li>Email: <strong>webfeedback@xxxxxxxx</strong></li>
<li>Phone: <strong>1-866-949-2262</strong></li>
<li>TTY:<strong>1-888-675-6863</strong></li>
<li>Fax: <strong>613-954-4731 / 613-998-8620</strong></li>
<li>Mailing Address:<br />
<strong>Transport Canada<br />
330 Sparks Street<br />
Ottawa, ON<br />
K1A 0N5</strong></li>
</ul>
<p>When contacting us by phone, please have the following information ready so
that a Transport Canada representative can assist you more efficiently:</p>
<ul>
<li>user name;</li>
<li>customer account number; and,</li>
<li>brief description of query.</li>
</ul>
<p>When you are commenting on a specific page, please include the URL (Web
address).</p>
<!-- adds the privacy disclaimer and personal information bank number -->
<!-- #include virtual="/includes/en/pib_privacy_PPU-079_e.inc" -->
<p><a name="form" id="form"></a></p>
<form method="post" action="/CLF-NSI/v2-1_04/feedback/feedback.asp">
<div><input type="hidden" value="e" name="x_lang" /> <input type="hidden"
value="webfeedback@xxxxxxxx" name="x_mailto" /> <input type="hidden"
value="Transact" name="x_subject" /> <input type="hidden"
value="/corporate-services/transact/confirm/menu.htm" name="x_acknowledge"
/></div>
<div class="fc-tbx"><label for="Comments">Comments and Questions:</label><br
/>
<textarea name="Comments" id="Comments" rows="11" cols="50">
</textarea></div>
<div class="fc-tbx"><label for="Name">Name:</label> (Optional)<br />
<input type="text" size="41" maxlength="400" name="x_name" id="Name" /></div>
<div class="fc-tbx"><label for="Title">Title:</label> (Optional)<br />
<input type="text" size="41" maxlength="400" name="x_title" id="Title"
/></div>
<div class="fc-tbx"><label for="Organization">Organization:</label>
(Optional)<br />
<input type="text" size="41" maxlength="400" name="Organization"
id="Organization" /></div>
<div class="fc-tbx"><label for="email">E-mail address:</label> (Optional)<br
/>
<input type="text" size="41" maxlength="400" name="x_email" id="email"
/></div>
<div class="fc-tbx"><label for="Telephone">Telephone:</label> (Optional)<br
/>
<input type="text" size="41" maxlength="400" name="Telephone" id="Telephone"
/></div>
<div class="fc-tbx"><label for="prov"><strong>You live in:
(Optional)</strong></label><br />
<select name="Province or territory" id="prov">
<optgroup label="Province or territory">
<option label="Alberta" value="Alberta">Alberta</option>
<option label="British Columbia" value="British Columbia">British
Columbia</option>
<option label="Manitoba" value="Manitoba ">Manitoba</option>
<option label="New Brunswick" value="New Brunswick">New Brunswick</option>
<option label="Newfoundland and Labrador" value="Newfoundland and Labrador
">Newfoundland and Labrador</option>
<option label="Northwest Territories" value="Northwest Territories">Northwest
Territories</option>
<option label="Nova Scotia" value="Nova Scotia">Nova Scotia</option>
<option label="Nunavut" value="Nunavut">Nunavut</option>
<option label="Ontario" value="Ontario">Ontario</option>
<option label="Prince Edward Island" value="Prince Edward Island">Prince
Edward Island</option>
<option label="Quebec" value="Quebec">Quebec</option>
<option label="Saskatchewan" value="Saskatchewan">Saskatchewan</option>
<option label="Yukon Territory" value="Yukon Territory">Yukon
Territory</option>
</optgroup>
</select><br />
<br />
<div class="fc-tbx"><input type="submit" value="Submit" /> <input type="reset"
value="Clear" /></div>
</div>
</form>
</div>
<!-- CONTENT ENDS | FIN DU CONTENU -->
<!-- THREE COLUMN LAYOUT ENDS | FIN DE LA MISE EN PAGE DE TROIS COLONNES -->
<!-- FOOTER BEGINS | DEBUT DU PIED DE LA PAGE -->
<!-- #include virtual="/CLF-NSI/v2-1_04/includes/en/footerbottom.inc" -->
<!-- FOOTER ENDS | FIN DU PIED DE LA PAGE --></div>
</body>
</html>
==================================================
File 2: Validation file (ref.xml)
 <root>
 	<references>Corporate Services</references>
 	<references>airport Services</references>
 	<references>train Services</references>
 	<references>Naval Services</references>
</root>
=================================================
File 3: My XSLT file (myXslt.xsl)
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xhtml="http://www.w3.org/1999/xhtml" exclude-result-prefixes="xhtml" >
<xsl:output method="xml" indent="yes" encoding="windows-1252"/>
<xsl:strip-space elements="*"/>

<xsl:template match="xhtml:html">

<root>
<filename>
	<xsl:value-of select="base-uri()"></xsl:value-of>
</filename>
	<matchingfile>
		<xsl:apply-templates select="xhtml:head/xhtml:link[3]/@href"/>
	</matchingfile>
<title>
	 <xsl:apply-templates select="xhtml:head//xhtml:title"/>
</title>
	<metadata>
		<xsl:apply-templates select="//xhtml:meta"/>
	</metadata>
</root>
</xsl:template>
<!--*****************************************-->
<xsl:template match="xhtml:head/xhtml:link[3]/@href">
		<xsl:value-of select="."/>
</xsl:template>
<!--****************** LINK ***********************-->
<xsl:template match="xhtml:head//xhtml:link">
	<xsl:copy-of select="."/>
</xsl:template>
<!--***************** META ************************-->
 <xsl:template match="//xhtml:meta">
<!--meta name="dc.subject" scheme="gccore" content="Corporate Services" /-->
<xsl:if test ="@name">
<meta>
	<name>
	  <xsl:value-of select="@name" />
	</name>

    <xsl:if test="@scheme">
		<scheme>
			<xsl:if test="@name = 'dc.subject'" >
				<xsl:if test="@scheme = 'gccore'">
					Validate gccore look at validation_gccore.xml
				</xsl:if>
				<xsl:if test="@scheme='gctct'">
					Validate gctct look at validation_gctct.xml
				</xsl:if>
			</xsl:if>
			<xsl:value-of select="@scheme"/>
		</scheme>
    </xsl:if>

	<content>
<xsl:if test="lower-case(@name) != 'dc.description' and lower-case(@name) !=
'keywords' and @name != 'dc.date' ">
		<xsl:value-of select="@content" />
			</xsl:if>
		<!-- dc.description rules -->
		<!--2.dc.description   limit to 250 at word (ie if text is 255 characters,
and 250 falls on a letter, not eol, chop back to the last space)-->
			<xsl:if test="lower-case(@name) = 'dc.description' ">
				<xsl:choose>
					<xsl:when test="string-length(@content) &gt;= 250 ">
			<xsl:value-of select="substring(@content,1,250)"/>
					</xsl:when>
					<xsl:otherwise><xsl:value-of select="@content" /> </xsl:otherwise>
				</xsl:choose>
			</xsl:if>

	<!-- keywords rules -->
    <!--3.meta name="keywords" content=Insérer les mots-clés en français |
Insert the French keywords   Remove it -->
			<xsl:if test="lower-case(@name) = 'keywords' ">
				<xsl:variable name="theString">
							<xsl:text>Insert the French keywords</xsl:text>
				</xsl:variable>
				<xsl:variable name="stringKeyword">
						<xsl:value-of select="normalize-space(@content)"></xsl:value-of>
				</xsl:variable>
		<!-- compare to see if it match, if so remove(render empty value) -->
				<xsl:choose>
					<xsl:when test="ends-with($stringKeyword,$theString)">
						<xsl:text></xsl:text>
					</xsl:when>
					<xsl:when test="starts-with($stringKeyword,$theString)">
						<xsl:text></xsl:text>
					</xsl:when>
		<!--Not a match so render the attribute content -->
					<xsl:otherwise><xsl:value-of select="@content" /> </xsl:otherwise>
				</xsl:choose>
			</xsl:if>
		<!-- end of keywords validation -->

		<!-- dc.date rules -->
		<!-- Validation of the Date -->
			<xsl:if test="lower-case(@name)='dc.date'">
				<xsl:variable name="theDate">
						<xsl:value-of select="replace(@content,' ','')"></xsl:value-of>
				</xsl:variable>
				<xsl:choose>
					<xsl:when
test="matches($theDate,'^[1-2][0-9]+-[0-9][0-9]+-[0-9][0-9]+$')">
						<xsl:value-of select="$theDate"></xsl:value-of>
					</xsl:when>
				<xsl:otherwise></xsl:otherwise><!-- Bad Date -->
				</xsl:choose>
			</xsl:if>

	</content>
</meta>
</xsl:if>
</xsl:template>
  <!--*****************************************-->
  <xsl:template match="xhtml:html//xhtml:head//xhtml:title">
	  <xsl:value-of select="." />
  </xsl:template>
<!--*****************************************-->
</xsl:stylesheet>
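
The dc.description rule in the comments above (limit to 250 characters, chopping back to the last space) is not yet what the stylesheet does, since substring(@content,1,250) cuts mid-word. A possible sketch of the word-boundary version follows. The function name my:truncate-at-word and the namespace urn:example:local are made up for illustration, and the function expects a non-empty content attribute.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:local"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical helper: returns $text cut to at most $limit characters,
       backing up to the last whitespace when the cut would split a word. -->
  <xsl:function name="my:truncate-at-word" as="xs:string">
    <xsl:param name="text" as="xs:string"/>
    <xsl:param name="limit" as="xs:integer"/>
    <xsl:variable name="head" select="substring($text, 1, $limit)"/>
    <xsl:sequence select="
      if (string-length($text) le $limit) then $text
      (: the character right after the cut is a space: the cut is already clean :)
      else if (substring($text, $limit + 1, 1) = ' ') then $head
      (: otherwise drop the last whitespace run and the partial word after it :)
      else if (contains($head, ' ')) then replace($head, '\s+\S*$', '')
      (: no space at all: fall back to a hard cut :)
      else $head"/>
  </xsl:function>

</xsl:stylesheet>
```

With the function declared at the top level of the stylesheet, the xsl:choose in the dc.description branch could shrink to a single instruction such as `<xsl:value-of select="my:truncate-at-word(@content, 250)"/>`.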


======================================
Michel Knight
CGI
275 Slater Street
16th Floor
Ottawa, Ontario K1P 5H9
www.cgi.com
