Re: Getting values
Hi Sergio,

> Thanks. In fact, I would like to parse the following HTML file:
>
> <html>
> <head>
> <meta name ="Embassy of the US in Caracas, Venezuela"/>
> <meta name = "DC.Title" content = "Polycyclic aromatic hydrocarbon
> contamination"/>
> <meta name = "dc.Creator" content = "Gogh, Vincent van"/>
> <meta name = "DC.Source" content = "http://a.b.org/manon/"/>
> </head>
> </html>
>
> And I would like the output was:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns:dc="http://purl.org/dc/elements/1.0/"
> xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">
>
> <rdf:Description rdf:about="http://a.b.org/manon/">
> <------------------------------------------------ I'M GETTING PROBLEMS TO
> GET THIS VALUE
> <dc:Title>Polycyclic aromatic hydrocarbon contamination</dc:Title>
> <dc:Creator>Gogh, Vincent van</dc:Creator>

One of your problems -- the reason your XSLT looks so messy -- is that
you're trying to generate the tags in the output that you're after by
hand, as serialised text. You will find it a lot easier to write XSLT
if you think in terms of processing and creating elements rather than
in terms of creating tags.

First, you need your stylesheet to declare the namespaces that you
want to use in your output. If you declare the namespaces in the
<xsl:stylesheet> element then you can use them throughout the
stylesheet:

  <xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns:dc="http://purl.org/dc/elements/1.0/"
                  xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">
    ...
  </xsl:stylesheet>

Second, you need your stylesheet to produce an rdf:RDF element.
You can do this in a template matching the root node if you want:

  <xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns:dc="http://purl.org/dc/elements/1.0/"
                  xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">

    <xsl:template match="/">
      <rdf:RDF>
        ...
      </rdf:RDF>
    </xsl:template>

  </xsl:stylesheet>

Notice how you don't have to add the namespace declaration attributes
to the <rdf:RDF> element; that will be done automatically for you.

Within the <rdf:RDF> element you want to create an <rdf:Description>
element. The <rdf:Description> element has an rdf:about attribute whose
value is the value of the content attribute of the <meta> element in
your source document whose name attribute has the value 'DC.Source'.

You can get a node set of all the <meta> elements in your document
with:

  /html/head/meta

You can then filter that node set with a predicate, held in []s, that
is only true for the <meta> element whose name attribute has the value
'DC.Source':

  /html/head/meta[@name = 'DC.Source']

From that <meta> element, you can step on to the content attribute to
get the value that you're after:

  /html/head/meta[@name = 'DC.Source']/@content

This is the path that you need to use to get the value of the
rdf:about attribute. You can create that value using an attribute
value template -- when you create an attribute literally, anything
inside {}s is evaluated and the result inserted into the attribute
value. So use:

  <xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns:dc="http://purl.org/dc/elements/1.0/"
                  xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">

    <xsl:template match="/">
      <rdf:RDF>
        <rdf:Description
            rdf:about="{/html/head/meta[@name = 'DC.Source']/@content}">
          ...
        </rdf:Description>
      </rdf:RDF>
    </xsl:template>

  </xsl:stylesheet>

To get the content of the <rdf:Description> element, you want to apply
templates to the <meta> elements:

  <xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns:dc="http://purl.org/dc/elements/1.0/"
                  xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">

    <xsl:template match="/">
      <rdf:RDF>
        <rdf:Description
            rdf:about="{/html/head/meta[@name = 'DC.Source']/@content}">
          <xsl:apply-templates select="/html/head/meta" />
        </rdf:Description>
      </rdf:RDF>
    </xsl:template>

    ...

  </xsl:stylesheet>

You then need some templates to match the <meta> elements. What I
suggest you do is say that you want to ignore most <meta> elements, do
something else with those whose name attribute starts with 'DC.' or
'dc.', and ignore the <meta> element whose name is 'DC.Source' (its
value has already been used for the rdf:about attribute). You can do
this with three templates with different priorities. If more than one
matches a particular <meta> element, the highest priority template
wins:

  <xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns:dc="http://purl.org/dc/elements/1.0/"
                  xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">

    <xsl:template match="/">
      <rdf:RDF>
        <rdf:Description
            rdf:about="{/html/head/meta[@name = 'DC.Source']/@content}">
          <xsl:apply-templates select="/html/head/meta" />
        </rdf:Description>
      </rdf:RDF>
    </xsl:template>

    <xsl:template match="meta" priority="0" />

    <xsl:template match="meta[starts-with(@name, 'dc.') or
                              starts-with(@name, 'DC.')]"
                  priority="1">
      ...
    </xsl:template>

    <xsl:template match="meta[@name = 'DC.Source']" priority="2" />

  </xsl:stylesheet>

Within the second of these templates, you want to create elements
whose name is the same as the value of the name attribute of the
matched <meta> element, after the first '.'.
You can do this with an <xsl:element> instruction, whose name
attribute is an attribute value template. Putting the dc: prefix in
front of the computed name places the new element in the namespace
declared for dc on the <xsl:stylesheet> element, which is what gives
you <dc:Title> rather than a <Title> element in no namespace:

  <xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns:dc="http://purl.org/dc/elements/1.0/"
                  xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">

    <xsl:template match="/">
      <rdf:RDF>
        <rdf:Description
            rdf:about="{/html/head/meta[@name = 'DC.Source']/@content}">
          <xsl:apply-templates select="/html/head/meta" />
        </rdf:Description>
      </rdf:RDF>
    </xsl:template>

    <xsl:template match="meta" priority="0" />

    <xsl:template match="meta[starts-with(@name, 'dc.') or
                              starts-with(@name, 'DC.')]"
                  priority="1">
      <xsl:element name="dc:{substring-after(@name, '.')}">
        <xsl:value-of select="@content" />
      </xsl:element>
    </xsl:template>

    <xsl:template match="meta[@name = 'DC.Source']" priority="2" />

  </xsl:stylesheet>

And that's your completed stylesheet. Let us know if you have any
questions.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
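
A possible variation on the finished stylesheet, offered only as a
sketch: instead of testing for 'DC.' and 'dc.' separately, the match
pattern can fold the letters D and C to lower case with translate()
before the starts-with() test, so that 'Dc.' and 'dC.' are picked up
as well. The <xsl:output> settings and the dc: prefix on the generated
element name are assumptions made here, chosen so that the result has
the <dc:Title> shape shown in the quoted question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                xmlns:dc="http://purl.org/dc/elements/1.0/"
                xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">

  <!-- Assumption: indented UTF-8 XML output is wanted. -->
  <xsl:output method="xml" encoding="UTF-8" indent="yes" />

  <!-- Build the RDF wrapper; rdf:about comes from the DC.Source meta. -->
  <xsl:template match="/">
    <rdf:RDF>
      <rdf:Description
          rdf:about="{/html/head/meta[@name = 'DC.Source']/@content}">
        <xsl:apply-templates select="/html/head/meta" />
      </rdf:Description>
    </rdf:RDF>
  </xsl:template>

  <!-- Lowest priority: ignore any meta element not handled below. -->
  <xsl:template match="meta" priority="0" />

  <!-- translate() maps D to d and C to c for the test only, so any
       capitalisation of the 'dc.' prefix matches; the element name is
       still taken from the untouched @name value. -->
  <xsl:template match="meta[starts-with(translate(@name, 'DC', 'dc'), 'dc.')]"
                priority="1">
    <xsl:element name="dc:{substring-after(@name, '.')}">
      <xsl:value-of select="@content" />
    </xsl:element>
  </xsl:template>

  <!-- Highest priority: DC.Source was already used for rdf:about. -->
  <xsl:template match="meta[@name = 'DC.Source']" priority="2" />

</xsl:stylesheet>
```

Run against the HTML quoted in the question, this should give an
rdf:Description whose rdf:about is http://a.b.org/manon/ and which
contains one dc:Title and one dc:Creator element; the exact whitespace
depends on the processor.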