True. I was just reacting to the idea that a protocol necessarily
implied high overhead. I wonder what HT does to avoid crossing thread
boundaries? Re-invent co-routines?

Bob Foster

Michael Kay wrote:
> I think HT said that there's always an overhead if you have to cross a
> thread boundary, and that they try to avoid it whenever possible. Crossing a
> process or machine boundary would be far worse.
>
> Michael Kay
>
>
>>-----Original Message-----
>>From: Bob Foster [mailto:bob@o...]
>>Sent: 13 July 2005 22:18
>>To: Michael Kay
>>Cc: xml-dev@l...
>>Subject: Re: Another XML parsing idea? Was: Re:
>> XML Hangover)
>>
>>I don't know the internals (maybe someone can comment) but I believe
>>Markup Technology has a protocol for passing PSVI around. It seems
>>pretty darn fast.
>>
>>Bob Foster
>>
>>Michael Kay wrote:
>> > A protocol implies sending and receiving messages typically across a
>> > process boundary or even a machine boundary. This would raise the
>> > cost of XML parsing by a couple of orders of magnitude.
>> >
>> > Michael Kay
>> >
>> >
>> >>-----Original Message-----
>> >>From: Mukul Gandhi [mailto:mukul_gandhi@y...]
>> >>Sent: 13 July 2005 20:01
>> >>To: Michael Kay; 'Pete Cordell'; xml-dev@l...
>> >>Subject: Another XML parsing idea? Was: Re:
>> >> XML Hangover)
>> >>
>> >>Today, we have a paradigm in XML parsing of using APIs
>> >>like SAX or DOM. I was thinking of another approach to
>> >>parse XML documents.
>> >>
>> >>Can we have a protocol (instead of API) that will talk
>> >>between a application and the XML parser? This shall
>> >>make using a XML parser interoperable to the calling
>> >>application.. We could achieve this "we could have a
>> >>Microsoft XML parser serving Java program's XML
>> >>parsing request.."
>> >>
>> >>Just now we have APIs like SAX and DOM and proprietary
>> >>Microsoft APIs.. Had we had some protocol similar to
>> >>HTTP, that talked between a application and parser, it
>> >>may help interoperability..
>> >>
>> >>Is this sensible thinking? Is this idea conceptually
>> >>similar to StAX or .NET XmlReader parsing approach?
>> >>
>> >>Regards,
>> >>Mukul
>> >>
>> >>--- Michael Kay <mike@s...> wrote:
>> >>
>> >>>The URL got truncated
>> >>>
>> >>>>http://www.idealliance.org/proceedings/xml04/papers/111/mhk-paper.html
>> >>>
>> >>>with ".html" at the end.
>> >>>
>> >>>Michael Kay
>> >>>
>> >>>
>> >>>>-----Original Message-----
>> >>>>From: Mukul Gandhi [mailto:mukul_gandhi@y...]
>> >>>>Sent: 13 July 2005 10:02
>> >>>>To: Michael Kay; 'Pete Cordell'; xml-dev@l...
>> >>>>Subject: RE: XSL for non-XML input (Was: Re:
>> >>>> XML Hangover)
>> >>>>
>> >>>>Hi Mike,
>> >>>> I get error
>> >>>>HTTP 404 - File not found
>> >>>>
>> >>>>--- Michael Kay <mike@s...> wrote:
>> >>>>
>> >>>>http://www.idealliance.org/proceedings/xml04/papers/111/mhk-paper.htm
>> >>>>
>> >>>>Regards,
>> >>>>Mukul
>> >>>>
>> >>>>
>> >>>>>
>> >>>>>Michael Kay
>> >>>>>
>> >>>>>
>> >>>>>Going further, observing the idea of using out of band data
>> >>>>>(e.g. schema) to provide extra information to complete
>> >>>>>'binary XML', could XSL (with suitable front ends) work on
>> >>>>>say an ASN.1 encoded X.509 certificate (and ASN.1 message
>> >>>>>definition) and produce, say, a PDF output?
>> >>>>>
>> >>>>>Not that I have a need to do that right now! I'm just
>> >>>>>interested to know whether XSL can be used as a kind of
>> >>>>>universal data translator.
>> >>>>>
>> >>>>>Thanks,
>> >>>>>
>> >>>>>Pete.
>> >>>>>--
>> >>>>>=============================================
>> >>>>>Pete Cordell
>> >>>>>Tech-Know-Ware Ltd
>> >>>>>-----------------------------------------------------------------
>> >>>>> for XML to C++ data binding visit
>> >>>>> http://www.tech-know-ware.com/lmx
>> >>>>> (or http://www.xml2cpp.com)
>> >>>>>=============================================
>> >>>>>
>> >>>>>
>> >>>>>----- Original Message -----
>> >>>>>From: Michael Kay <mailto:mike@s...>
>> >>>>>To: 'Joe Schaffner' <mailto:schaffner.joe@g...> ;
>> >>>>>xml-dev@l...
>> >>>>>Sent: Monday, July 11, 2005 9:00 PM
>> >>>>>Subject: RE: XML Hangover
>> >>>>>
>> >>>>>
>> >>>>>I've been reading the XML literature. It's great.
>> >>>>>Just a few comments:
>> >>>>>
>> >>>>>Welcome on board. It's refreshing to get thoughtful
>> >>>>>comments from someone who's new to the game.
>> >>>>>
>> >>>>>XSL - XML Stylesheets is divided into two parts,
>> >>>>>XSL-T and XSL-FO.
>> >>>>>
>> >>>>>The T part deals with templates and translation. Since HTML
>> >>>>>is valid XML, I guess I can parse my HTML using XSL-T to
>> >>>>>produce XML and vice versa. I don't understand why XSL-T
>> >>>>>refers to "nodes in an output tree". This suggests some kind
>> >>>>>of internal representation, but XML is perfectly good
>> >>>>>representation language. Don't <templates> merely write XML
>> >>>>>text to stdout?
>> >>>>>
>> >>>>>No, the result tree is completely abstract, there is no
>> >>>>>suggestion of an internal representation. In fact, for many
>> >>>>>XSLT processors, the "result tree" is represented internally
>> >>>>>as a stream of events, not as a linked collection of objects
>> >>>>>in memory.
>> >>>>>This concept of writing a tree, rather than writing text,
>> >>>>>however is extremely important. Firstly, it defines a
>> >>>>>separation of the information content of an XML document
>> >>>>>from the accidental aspects of its lexical representation -
>> >>>>>something that is sadly missing from the XML spec itself. In
>> >>>>>turn, this gives you a basis for defining a concise set of
>> >>>>>operators that are in some sense complete, composable and
>> >>>>>exhibit closure. In practical terms, it gives you the
>> >>>>>ability to write a series of transformations - a pipeline -
>> >>>>>in which the expensive steps of serializing and parsing
>> >>>>>intermediate results can be eliminated.
>> >>>>>
>> >>>>>Roughly, the process seems to work like this: the T
>> >>>>>processor does a recursive descent of the source XML. At
>> >>>>>each node it evaluates the set of templates. Those templates
>> >>>>>which match the name of the "current" tag are processed, in
>> >>>>>some order. The template writes text, that's why it's called
>> >>>>>a "template". The recursive descent is continued with an
>> >>>>><apply-templates> tag inside the template. This allows you
>> >>>>>to balance output.
>> >>>>>
>> >>>>>It doesn't have to do a recursive descent of the source XML:
>> >>>>>that's up to the application, though a recursive descent is
>> >>>>>the most common design pattern. And it definitely doesn't
>> >>>>>write text: people who create a mental model of writing text
>> >>>>>eventually get a rude awakening, usually when they first try
>> >>>>>to tackle grouping problems.
>> >>>>>
>> >>>>>If no matches are found, the T processor continues the
>> >>>>>descent.
>> >>>>>
>> >>>>>There is a <template> tag (I forget what) which will select
>> >>>>>arbitrary paths in the source tree, and there are tags which
>> >>>>>iterate through the result.
>> >>>>>
>> >>>>>Again, it's best to think of the stylesheet as containing
>> >>>>>nodes (representing instructions) rather than tags. Consider
>> >>>>>
>> >>>>><xsl:element name="x"><xsl:value-of select="."/></xsl:element>
>> >>>>>
>> >>>>>There are three tags there, but four nodes, and only two
>> >>>>>instructions. The semantics of the language are described in
>> >>>>>terms of the two instructions, not the three tags.
>> >>>>>
>> >>>>> This will allow me to build up a result "tree" which is not
>> >>>>>a mirror image of the source, something I need to do if I'm
>> >>>>>rearranging sections of the input document. Rather than
>> >>>>>buffering intermediate structures, the T processor does
>> >>>>>multiple passes based on these tags, and creates the output
>> >>>>>on-the-fly. Cool.
>> >>>>>
>> >>>>> ... .
>> >>>>>
>> >>>>>I assume there is nothing stopping me from using XSL-T to
>> >>>>>transform my HTML to PDF, but it seems best to output XSL-FO
>> >>>>>then create a PDF using some kind of tool. What is that
>> >>>>>tool?
>> >>>>>
>> >>>>>It's an XSL-FO processor. Examples are FOP, RenderX,
>> >>>>>Antenna House.
>> >>>>>
>> >>>>>Are there FO plug-ins available for my browsers?
>> >>>>>
>> >>>>>No, people are by-and-large using (X)HTML/CSS for the
>> >>>>>browser, XSL-FO/PDF for the printed page.
>> >>>>>
>> >>>>>Does this technology work?
>> >>>>>
>> >>>>>Absolutely yes.
>> >>>>>
>> >>>>>Michael Kay
>> >>>>>http://www.saxonica.com/
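
For anyone wanting to check the "three tags, four nodes, two instructions" count on the xsl:element fragment quoted in this thread, here is a rough Python sketch. It rests on assumptions that are not in the original message: that the four nodes are the two element nodes plus the two attribute nodes, and that the xsl: prefix is bound to the usual XSLT namespace URI (the xmlns:xsl declaration is added only so a namespace-aware parser accepts the fragment). The helper name count_nodes is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Standard XSLT namespace URI; the quoted fragment does not show a declaration,
# so binding the prefix here is an assumption made for this sketch.
XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# The fragment from the message, with an xmlns:xsl declaration added.
FRAGMENT = (
    f'<xsl:element xmlns:xsl="{XSL_NS}" name="x">'
    '<xsl:value-of select="."/>'
    '</xsl:element>'
)


def count_nodes(xml_text):
    """Return (element_count, attribute_count) for a parsed fragment.

    ElementTree does not expose namespace declarations as attributes,
    so only 'name' and 'select' are counted as attribute nodes.
    """
    root = ET.fromstring(xml_text)
    elements = list(root.iter())
    attributes = sum(len(e.attrib) for e in elements)
    return len(elements), attributes


elements, attributes = count_nodes(FRAGMENT)
# Under the reading assumed above: each XSLT element is one instruction,
# and the nodes are the elements plus their attributes.
print("instructions (element nodes):", elements)
print("attribute nodes:", attributes)
print("total nodes:", elements + attributes)
```

The point of the exercise is only that the parser hands back nodes, not tags: the start tag and end tag of xsl:element collapse into a single element node, which is the unit the language semantics are defined on.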