RE: Converting delimited text WITH <br> to string
Hi,
> Friends,
> I guess I missed the answer to this one. I have read a lot of FAQs,
> but I have not found my particular answer.
>
> All I want to do is to compare an XML file with a text file.
>
> My desire is to convert the text file into a string then compare the
> data in it to the XML nodes. However, the text file always gets
> parsing errors.
Because you're trying to parse something that's not XML with an XML parser.
> The text file has is exported from a OLD database, but the fields do
> have <br> and other sloppy html in them.
Then you have to clean it first by removing the HTML tags, or by converting the "document" into XML (XMLized HTML or XHTML).
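A minimal sketch of the first option (removing the tags), written in Python because XSLT 1.0 cannot read the file until it is well-formed. The name `strip_tags` and the sample record are made up for illustration; the real layout of ftscript.data is not shown in this thread, so treat this as a starting point only.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects character data and drops every tag."""

    def __init__(self):
        # convert_charrefs=True turns references such as &nbsp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, so <br> and <BR> both land here.
        # Keep a line break so the words on either side do not run together.
        if tag == "br":
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(raw):
    """Return the text of `raw` with all HTML tags removed (hypothetical helper)."""
    parser = _TextOnly()
    parser.feed(raw)
    parser.close()  # flushes any text still buffered
    return "".join(parser.parts)


# Invented sample field, standing in for one record of the exported file.
sample = "First line<br>Second&nbsp;line<BR>Third <b>bold</b> line"
cleaned = strip_tags(sample)
```

The cleaned string can then be compared with the text of the corresponding XML node, or wrapped in a root element and handed to the stylesheet as a well-formed document.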
> I would edit them, but there are over 300 of them all in different
> folders (lucky for me they are on the same server).
>
> Here is the URL.
>
> http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data
Exactly what do you want to compare, and what does the XML you want to compare it with look like?
> <?xml version="1.0" encoding="utf-8"?>
> <!DOCTYPE xsl:stylesheet [
> <!ENTITY lll SYSTEM
> "http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data">
> <!ENTITY nbsp " ">
> <!ELEMENT br (EMPTY)>
> <!ELEMENT BR (EMPTY)>
Declaring the elements will not help you with the parsing errors, because the file is not XML.
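To illustrate the point, here is a small Python sketch, assuming an invented one-line document rather than the real data file. Even with br declared as EMPTY in the internal subset, an unclosed <br> is a well-formedness error, and the parser stops there.

```python
import xml.etree.ElementTree as ET

# Invented sample: the DTD declares br as EMPTY, but the tag is never closed.
doc = """<!DOCTYPE X [
  <!ELEMENT X (#PCDATA | br)*>
  <!ELEMENT br EMPTY>
]>
<X>first line<br>second line</X>"""

try:
    ET.fromstring(doc)
    well_formed = True
except ET.ParseError:
    # The declaration describes what br may contain; it does not
    # excuse the missing end tag.
    well_formed = False

# Closing the tag is what makes the input parseable.
fixed = doc.replace("<br>", "<br/>")
root = ET.fromstring(fixed)
```

The same applies to the external entity: its replacement text has to be well-formed on its own before it can be included.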
> ]>
>
> <xsl:stylesheet
> version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> xmlns:xs="http://www.w3.org/2001/XMLSchema"
> xmlns:html="http://www.w3.org/1999/xhtml"
> exclude-result-prefixes="html xs"
> xmlns:saxon="http://icl.com/saxon"
> extension-element-prefixes="saxon"
> >
> <xsl:output
> version="1.0"
> method="html"
> indent="yes"
> encoding="utf-8"
> omit-xml-declaration="no"
> standalone="no"
> media-type="text"
> cdata-section-elements="br"
> />
>
> <xsl:template match="/">
> <X>
> <xsl:copy-of
> select="document('http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data')//*"/>
> <xsl:apply-templates
> select="document('http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data')//text()
> | *"/>
> <xsl:copy-of
> select="document('http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data')//text()"/>
What should the above do? Or rather, what do you want the above to do?
> <xsl:text>&lll;</xsl:text>
> </X>
> </xsl:template>
>
> </xsl:stylesheet>
Cheers,
Jarno - Cubanate: Transit