XML-DEV Mailing List Archive
Re: Fast text output from SAX?
Bob Wyman wrote:

> ...
>
> Just like you, I groaned when I saw the suggestion that you
> could take "wire-protocol" and then just stuff it into memory. This
>

Not wire-protocol; wire-format = 'the same format as on the wire, in a file, etc.', the data payload in other words. I am asserting that it is possible to construct a data format that is efficient for the desired operations and also compact in memory, so that it can be input and output as-is without transformation. The hard part is making in-place modifications efficient without much, or any, space overhead. Everything else is done or could easily be done with other formats.

If a data format is self-describing in the XML sense, I can write a library that lets me traverse its structure and retrieve data in an XPath style. ASN.1 and similar IDL systems usually compile into data-specific code, but even for these formats I could devise metadata that a general-purpose library could use to traverse the resulting structures in an XPath style to retrieve and convert values.

> might work with text, but it sure as heck doesn't work with binary
> formats or anything that contains an address or offset. The
>

I can think of several ways to represent an offset that are independent of a particular architecture, and I'm sure you can too.

> distinctions between wire-protocol, in-memory-format, and
> on-disk-format, are fundamental. Every proposal that I've ever seen
>

Why would the wire format (not wire protocol) and the on-disk format be different? I'm not talking about the wire protocol; to the application, the transport just takes a stream of bytes, possibly in chunks, and returns the same.

> for a "common" format for use in two or more of these contexts has
> ended up failing for one reason or another. As far as stuffing
> wire-protocol into memory goes: Let me just say that *NOBODY* is ever
> going to write to *MY* address space without a great deal of checking
>

What are you thinking here?
Who would be writing into your address space? DMA from the network directly to application memory? (This does have some use in high-end computing situations, but that's not what I'm talking about.) My proposal consists of loading a block (or string of blocks) of data into a buffer, traversing and reading or modifying that data with a library, and later possibly writing the resulting buffer out. What strikes you as dangerous about that? When you load a buffer of data and feed it to a gzip library to decompress it, isn't that the same situation on a bulk scale?

> going on... Also, if this problem was as simple as just replacing
> direct addresses with relative addresses, don't people realize that we
>

That's not what I am doing; my recent example was a proof of concept and proof of existence of a solution that met the specific requirement being discussed: avoidance of parsing and serialization as a separate step. That doesn't mean that a solution with a relative reference would be bad, but my main methods are not relative addresses. Please read about my approach at http://esxml.org and do point out my errors.

> probably would have figured this out a few decades ago? As an
> industry, we're not so stupid that we would missed something so
> obvious... Some times, the obvious solution is *SO* obvious that it
> must be flawed.
>

Better famous last words have seldom been spoken. :-) I see advances every day that cause me to ponder the same question. I've been programming a fairly long time and, besides horsepower, there are a lot of things the royal we should have thought of 20 years ago. I think I was even independently first on several very popular ideas, but I didn't act publicly on those. I can't guarantee that the best future example of my approach will be super efficient and an obvious choice, but I have aggregated enough solutions in my current design that I have convinced myself that it is possible.
I would rather release code than talk about it once I have some design decisions, this last week notwithstanding. ;-)

Later.

> .....
>
> bob wyman
>

sdw

--
swilliams@h... http://www.hpti.com
Per: sdw@l... http://sdw.st
Stephen D. Williams 703-724-0118W 703-995-0407Fax 20147-4622 AIM: sdw
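
The load / traverse / modify / write-out cycle described in the message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ESXML format, whose layout is not given in this thread: the record layout (a field count, a table of offsets, then length-prefixed fields) and the names `build`, `get_field`, and `set_field_in_place` are hypothetical. Offsets are stored as fixed little-endian 32-bit integers relative to the start of the buffer, which is one way to keep them independent of the host architecture, and every access is bounds-checked before any data is touched.

```python
import struct

# Fixed little-endian unsigned 32-bit: the same bytes on every host CPU.
U32 = struct.Struct("<I")


def build(fields):
    """Pack byte strings into one buffer: count, offset table, length-prefixed fields."""
    header_size = U32.size * (1 + len(fields))
    buf = bytearray(header_size)
    U32.pack_into(buf, 0, len(fields))
    for i, data in enumerate(fields):
        # Offsets are relative to the start of the buffer, never machine addresses.
        U32.pack_into(buf, U32.size * (1 + i), len(buf))
        buf += U32.pack(len(data)) + data
    return buf


def field_count(buf):
    """Read the number of fields from the header."""
    return U32.unpack_from(buf, 0)[0]


def get_field(buf, index):
    """Return a zero-copy view of field `index`, checking bounds first."""
    if not 0 <= index < field_count(buf):
        raise IndexError(index)
    (offset,) = U32.unpack_from(buf, U32.size * (1 + index))
    (length,) = U32.unpack_from(buf, offset)
    start = offset + U32.size
    if start + length > len(buf):
        raise ValueError("field extends past end of buffer")
    return memoryview(buf)[start:start + length]


def set_field_in_place(buf, index, data):
    """Overwrite a field without moving any bytes (same-length values only)."""
    view = get_field(buf, index)
    if len(data) != len(view):
        raise ValueError("this sketch only supports same-length replacement")
    view[:] = data
```

A buffer received from a file or socket can be wrapped in a `bytearray` and passed straight to `get_field`, and `bytes(buf)` is what goes back out, with no separate parse or serialize step. The sketch deliberately sidesteps the hard part named in the message: replacing a field with a value of a different length would require shifting data or leaving slack space.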








