UNSUBSCRIBE UNSUBSCRIBE UNSUBSCRIBE Re: Question for the XPath
UNSUBSCRIBE

Uche Ogbuji wrote:
> > Given the following XML in a DOM document
> >
> > <foo>
> > bar
> > <![CDATA[
> > baz
> > ]]>
> > quux
> > </foo>
> >
> > and the following XPath
> >
> > //text()
> >
> > what should be the resulting DOM nodes and why? I can think of two
> > answers but they both have problems.
> >
> > PS: Why is http://www.w3.org/TR/2002/WD-DOM-Level-3-XPath-20020712/
> > returning a 404 when it is linked from http://www.w3.org/DOM/ ?
>
> XPath is defined against a certain model of an XML document. The section that
> answers your question is 5.7:
>
> "Character data is grouped into text nodes. As much character data as possible
> is grouped into each text node: a text node never has an immediately following
> or preceding sibling that is a text node. The string-value of a text node is
> the character data. A text node always has at least one character of data.
>
> "Each character within a CDATA section is treated as character data. Thus,
> <![CDATA[<]]> in the source document will treated the same as &lt;. Both will
> result in a single < character in a text node in the tree. Thus, a CDATA
> section is treated as if the <![CDATA[ and ]]> were removed and every
> occurrence of < and & were replaced by &lt; and &amp; respectively."
>
> Therefore, to a conforming XPath processor,
>
> <foo>
> bar
> <![CDATA[
> baz
> ]]>
> quux
> </foo>
>
> is precisely the same as
>
> <foo>
> bar
> baz
> quux
> </foo>
>
> i.e. one element node with one text node child.
>
> There is actually an open bug against 4XPath right now: it leaks a bit in
> this respect. For example, in some cases it can return a text node child of
> an attribute when operating on a DOM (attributes have text node children in
> DOM but not in XPath). Your post is a handy reminder for me to fix this bug.
>
> As an illustration, here's what a session with 4XPath does (interactive
> Python prompt):
>
> >>> DOC = """<foo>
> ... bar
> ... <![CDATA[
> ... baz
> ... ]]>
> ... quux
> ... </foo>"""
> >>> from Ft.Xml.Domlette import NonvalidatingReader
> >>> doc = NonvalidatingReader.parseString(DOC, "http://dummybaseuri.com")
> >>> from Ft.Xml.XPath import Evaluate
> >>> result = Evaluate("//text()", contextNode=doc)
> >>> print result
> [<cText at 0x81ae434>]
> >>> print result[0].data
>
> bar
>
> baz
>
> quux
>
> >>>
>
> --
> Uche Ogbuji Fourthought, Inc.
> http://uche.ogbuji.net http://4Suite.org http://fourthought.com
> Track chair, XML/Web Services One Boston: http://www.xmlconference.com/
> The many heads of XML modeling - http://adtmag.com/article.asp?id=6393
> Will XML live up to its promise? - http://www-106.ibm.com/developerworks/xml/library/x-think11.html
>
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>

--
Edward R Gloor
QWEST Communications
W - (303) 244-1348
P - (303) 852-8644
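
The gap between the DOM view and the XPath data model described in the quoted reply can also be seen without 4Suite. What follows is only a rough sketch using the Python standard library, not the 4XPath session above: xml.dom.minidom stands in for "a DOM document", and joining the character data of the children of <foo> is an assumed stand-in for the merge that section 5.7 requires. The variable names (dom_kinds, xpath_style_text, etree_text) are made up for the illustration.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

DOC = """<foo>
bar
<![CDATA[
baz
]]>
quux
</foo>"""

# DOM view: minidom keeps the CDATA section as a node of its own,
# so <foo> has more than one character-data child.
dom = xml.dom.minidom.parseString(DOC)
children = dom.documentElement.childNodes
dom_kinds = [type(node).__name__ for node in children]

# XPath 1.0 data model view (section 5.7): adjacent character data,
# including CDATA content, forms a single text node. Joining the data
# of the DOM children emulates that merge (an illustration only, not
# an XPath implementation).
xpath_style_text = "".join(node.data for node in children)

# ElementTree does not expose CDATA sections at all, so with no child
# elements all of the character data ends up in the element's .text.
etree_text = ET.fromstring(DOC).text

print(dom_kinds)
print(repr(xpath_style_text))
print(etree_text == xpath_style_text)
```

The point of the comparison is that the DOM tree and the XPath tree are different models of the same document: a processor evaluating //text() over a DOM has to present the separate DOM nodes as if they were one text node.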