RE: Converting delimited text WITH <br> to string

Subject: RE: Converting delimited text WITH <br> to string
From: <Jarno.Elovirta@xxxxxxxxx>
Date: Tue, 22 Jun 2004 08:06:05 +0300
Hi,

> Friends,
> I guess I missed the answer to this one. I have read a lot of FAQs,
> but I have not found my particular answer.
> 
> All I want to do is to compare an XML file with a text file.
> 
> My desire is to convert the text file into a string then compare the
> data in it to the XML nodes. However, the text file always gets
> parsing errors.

Because you're trying to parse something that's not XML with an XML parser.
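
To illustrate the point, here is a minimal Python sketch (not part of the original exchange) showing that an XML parser rejects text containing an unclosed <br>. The sample strings and the helper name is_well_formed are made up for illustration; the real ftscript.data content is not reproduced here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples standing in for one exported database field.
sloppy = "<field>Line one<br>Line two</field>"   # <br> is never closed
tidy = "<field>Line one<br/>Line two</field>"    # empty-element syntax

print(is_well_formed(sloppy))  # False
print(is_well_formed(tidy))    # True
```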

> The text file is exported from an OLD database, but the fields do
> have <br> and other sloppy html in them.

Then you have to clean it first by removing the HTML tags, or by converting the "document" into XML (XMLized HTML or XHTML).
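
One possible way to do the first option (removing the tags) outside XSLT is sketched below in Python, using the standard library's lenient HTML parser. This is only a sketch under assumptions: the class TagStripper, the helpers strip_tags and normalize, the sample field, and the <title> element are all hypothetical, and treating <br> as a line break is a choice, not something the source file dictates.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect character data and drop tags; <br> becomes a newline."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased, so <BR> and <br> are both caught.
        if tag == "br":
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(text: str) -> str:
    parser = TagStripper()
    parser.feed(text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


def normalize(text: str) -> str:
    """Collapse runs of whitespace so line breaks do not affect comparison."""
    return " ".join(text.split())


# Hypothetical field from the text export and a hypothetical XML node.
raw_field = "First line<BR>second <b>bold</b> &amp; done"
xml_node = ET.fromstring("<title>First line second bold &amp; done</title>")

cleaned = strip_tags(raw_field)
print(normalize(cleaned) == normalize(xml_node.text))
```

The cleaned string can then be compared with the text of whichever XML nodes are of interest; how the delimited fields map to nodes depends on data not shown in this thread.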
 
> I would edit them, but there are over 300 of them all in different
> folders (lucky for me they are on the same server).
> 
> Here is the URL.
> 
> http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data

Exactly what do you want to compare, and what does the XML you want to compare it with look like?

> <?xml version="1.0" encoding="utf-8"?>
> <!DOCTYPE xsl:stylesheet [ 
> <!ENTITY lll SYSTEM
> "http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data">
> <!ENTITY nbsp "&#x20;">
> <!ELEMENT br (EMPTY)>
> <!ELEMENT BR (EMPTY)>

Declaring the elements will not help you with the parsing errors: declarations affect validation, not well-formedness, and the file is not well-formed XML.
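
As a rough check of this, the Python sketch below (an illustration added here, with made-up element names and content) parses two small documents that both declare br as EMPTY in an internal DTD subset. The one with an unclosed <br> is still rejected; only writing <br/> makes it parse.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal subset; `record` is an invented element name.
DTD = """<!DOCTYPE record [
  <!ELEMENT record (#PCDATA | br)*>
  <!ELEMENT br EMPTY>
]>
"""

unclosed = DTD + "<record>Line one<br>Line two</record>"
closed = DTD + "<record>Line one<br/>Line two</record>"


def parses(doc: str) -> bool:
    """Return True if the parser accepts `doc` as well-formed XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


print(parses(unclosed))  # False: the declaration does not excuse the unclosed tag
print(parses(closed))    # True
```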

> ]>
> 
> <xsl:stylesheet 
> 	version="1.0" 
> 	xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
> 	xmlns:xs="http://www.w3.org/2001/XMLSchema" 
> 	xmlns:html="http://www.w3.org/1999/xhtml" 
> 	exclude-result-prefixes="html xs" 
>   xmlns:saxon="http://icl.com/saxon"
>   extension-element-prefixes="saxon"
> >
> <xsl:output 
> 	version="1.0" 
> 	method="html" 
> 	indent="yes" 
> 	encoding="utf-8" 
> 	omit-xml-declaration="no" 
> 	standalone="no" 
> 	media-type="text" 
> 	cdata-section-elements="br"
> />
> 
> <xsl:template match="/">
> <X>
> <xsl:copy-of
> select="document('http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data')//*"/>
> <xsl:apply-templates
> select="document('http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data')//text()
> | *"/>
> <xsl:copy-of
> select="document('http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data')//text()"/>

What should the above do? Or rather, what do you want the above to do?

> <xsl:text>&lll;</xsl:text>
> </X>
> </xsl:template>
> 
> </xsl:stylesheet>

Cheers,

Jarno - Cubanate: Transit
