XML-DEV Mailing List Archive

Re: Parsing efficiency? - why not 'compile'????

On Wednesday 26 February 2003 14:19, David Megginson wrote:
> Alaric B. Snell writes:
>
> > If your system is sitting idling waiting for data over the network, then
> > a more compact representation would be a winner!
>
> We'll look forward to your test results.

If your system spends lots of its time waiting for networking, do you disagree that reducing the bandwidth utilisation would reduce the service round trip time and increase the maximum throughput?

Note that, of course, smaller packets won't reduce the latency of the link due to speed of light limitations, but they will reduce the latency caused by bandwidth limitations. And the maximum number of transfers your network can handle in a second is directly related to the message size; if you can halve the size of the message, you can fit twice as many through your pipe in unit time.

So to go back to empirical test results... The ASN.1/XML interop people found that, for data-oriented XML, savings of 80% are common; i.e., messages being one fifth the size. Per-packet overheads aside, that would imply that you can fit about five times as many ASN.1/PER encoded messages down a given network connection in a second as you can XML messages.

Let's take an example of a stereotypical poster-child Web service... some kind of online store. It has a message you can send it to request a stock search, given keywords and a price range, returning a list of stock descriptions. And it has a message you can send it to place an order, containing delivery addresses and one or more invoice lines, returning a basic success / failure code.

The latter operation will happen less often than the former, and will probably involve more time-consuming operations such as checking availability of all the items, checking the credit account, filing the order in the database, and making a printer in a warehouse start printing out a packing slip / manifest so that dispatch can commence, so let's focus on the former.
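
The bandwidth argument above (messages per second scale inversely with message size) can be put as a rough sketch. The 2 Mbit/s link and the 500-byte XML message are made-up figures for illustration only; the one-fifth ratio is the 80% saving quoted above, and per-packet overhead is ignored, as in the text.

```python
def messages_per_second(link_bits_per_second: float, message_bytes: int) -> float:
    """Upper bound on messages per second over a saturated link.

    Ignores per-packet overhead and latency; it only models the
    bandwidth limit discussed in the text.
    """
    return link_bits_per_second / (message_bytes * 8)


LINK_BPS = 2_000_000      # hypothetical 2 Mbit/s link (assumption)
XML_MESSAGE_BYTES = 500   # hypothetical XML message size (assumption)
PER_MESSAGE_BYTES = XML_MESSAGE_BYTES // 5  # "one fifth the size"

xml_rate = messages_per_second(LINK_BPS, XML_MESSAGE_BYTES)
per_rate = messages_per_second(LINK_BPS, PER_MESSAGE_BYTES)

print(f"XML: {xml_rate:.0f} msg/s, PER: {per_rate:.0f} msg/s, "
      f"ratio: {per_rate / xml_rate:.1f}x")
```

Whatever link speed you plug in, the ratio stays at five, which is the whole point: only the relative message size matters for this bound.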

The request message only needs to contain a keyword string and two prices; in XML that might be:

  <search>
    <keywords>pink floyd</keywords>
    <prices min="5.99" max="20.00" />
  </search>

Total size = 76 bytes plus the highly variable length of the keyword string.

In PER, that would probably be a byte or two for the length of the keywords (going up to two bytes, from memory, if it's more than 128 characters due to variable length integer storage? Something like that), then the currency values would actually be stored as numbers of pence in the same format - probably two or three bytes each.

Total size = 6 bytes plus the highly variable length of the keyword string.

But the response would look like this in XML:

  <search-response>
    <result sku="GH234" price="6.50">Dark Side of the Moon</result>
    <result sku="KK234" price="7.50">Wish You Were Here</result>
  </search-response>

Size: 37 bytes + 43 bytes per result + description text length.

I think in PER that would be another variable-length integer for the number of results returned (call it one byte if we want fewer than a hundred results), then, for each result, five bytes for the fixed-length SKU, two bytes of price, and one or two bytes of description length, followed by the description itself.

Size: 1 byte + 9 bytes per result + description text length.

In the PER case, the resulting encoding will be almost entirely the description texts, while in my XML example the description text was smaller than the XML surrounding it. If we say the descriptions are likely to be 20 bytes long, then we save 36 bytes of fixed overhead (probably negligible in the long run) but also cut the mean per-result size from roughly 60 bytes to roughly 30 bytes, a halving. So we could be servicing twice as many customers at once from a given Internet link, until the database can't handle all the keyword searches any more.

Looking at it another way, consider Google's XML interface.
If they got a similar 50% reduction in size from using PER (considering that most of the search results consist of URLs and descriptions as opposed to the structure of the listing, ranking scores, etc) then, if their XML interface became predominantly used, they could halve their bandwidth costs. I'm sure their search algorithm is more resource-intensive than parsing and producing XML, but their bandwidth usage must be *astronomical*!

> David

ABS

-- 
A city is like a large, complex, rabbit - ARP
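
For anyone who wants to rerun the arithmetic from the search-response example, here is a minimal sketch. The fixed and per-result byte counts are the estimates given in the message (37 + 43 per result for XML, 1 + 9 per result for PER); they are not measured from a real PER encoder, and the function name and the 20-byte default description length are just illustrative choices.

```python
# Byte estimates taken from the search-response example in the message.
XML_FIXED, XML_PER_RESULT = 37, 43   # wrapper element, per-<result> markup
PER_FIXED, PER_PER_RESULT = 1, 9     # result count, SKU + price + length

def response_size(fixed: int, per_result: int,
                  n_results: int, description_len: int = 20) -> int:
    """Estimated response size in bytes for n_results search hits.

    Each hit costs its structural overhead plus the description text,
    which is the same length in either encoding.
    """
    return fixed + n_results * (per_result + description_len)


for n in (1, 10, 100):
    xml_bytes = response_size(XML_FIXED, XML_PER_RESULT, n)
    per_bytes = response_size(PER_FIXED, PER_PER_RESULT, n)
    print(f"{n:>3} results: XML {xml_bytes} bytes, PER {per_bytes} bytes, "
          f"PER/XML = {per_bytes / xml_bytes:.2f}")
```

With 20-byte descriptions the per-result cost works out to 63 bytes for XML against 29 for PER, which is where the "roughly halved" figure comes from; longer descriptions push the ratio back towards 1, since the text itself is not compressed by either encoding.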