Re: Speed in Languages and Browser Architectures

From: noah_mendelsohn@u...
To: Elliotte Harold <elharo@m...>
Date: Fri, 2 Mar 2007 16:42:24 -0500


Elliotte Rusty Harold writes:

> If performance were really a concern, and String proved to be the real 
> bottleneck, it's entirely possible someone could write an XML API based 
> on bytes rather than strings. So far I don't think anyone's really had 
> the motivation to do so. Either it hasn't been shown that strings are 
> the problem, or they're not a big enough problem that anyone wants to 
> take the time to fix them.

Yes, but as I wrote earlier, that doesn't necessarily get you out of the 
woods performance-wise.  Bytes give you low-level access, but there is 
still the issue of bounds checks, and also whether things like scanning 
and pattern matching get translated into the best-performing instruction 
sequences on the particular hardware.  I have seen some parsers written 
at the byte level in Java that compete pretty well with optimized C 
parsers, but in the end the C is usually at least as fast, and usually 
faster.  Note that, as always in Java, a lot can depend on which built-in 
methods have been optimized on a particular VM for a particular platform.
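
To make the bounds-check point concrete, here is a minimal sketch of a byte-level name scan in Java. The class and method names (ByteScan, scanName) are hypothetical and not taken from any real parser; the input is assumed to be ASCII-compatible (for example UTF-8). Each buf[i] access is logically bounds-checked, and whether the JIT removes that check depends on the VM.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not code from any parser discussed here.
final class ByteScan {
    // Returns the index just past the element name that begins at 'start'.
    // The name ends at whitespace, '/', '>' or the end of the buffer.
    static int scanName(byte[] buf, int start) {
        int i = start;
        while (i < buf.length) {
            byte b = buf[i]; // array access: bounds-checked unless the JIT proves it safe
            if (b == '>' || b == '/' || b == ' ' || b == '\t' || b == '\n' || b == '\r') {
                break;
            }
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        byte[] doc = "<myElement attr='1'>text</myElement>"
                .getBytes(StandardCharsets.UTF_8);
        int end = scanName(doc, 1); // skip the leading '<'
        // Decode only for display; a fast parser would avoid creating the String.
        System.out.println(new String(doc, 1, end - 1, StandardCharsets.UTF_8));
    }
}
```

A C version of the same loop would carry no implicit checks, and the compiler might vectorize it; that is the kind of gap being described, not a measured result.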

Also, regarding APIs:  while it's true that byte-oriented APIs can often 
be faster for XML than string-oriented ones, we've usually found in our 
work that the fastest ones work with some sort of string pool (not 
necessarily based on java.lang.Strings) and some sort of handle to 
represent the strings.  So, to indicate that I found the tag <myElement>, 
I don't pass you either a string or a byte array, I pass you some sort of 
handle (typically an integer) that we've agreed refers to the string 
"myElement".  If you do that in the native encoding of the input, and 
carry the convention all the way up to the deserializer, the consuming 
application, etc., things get faster.  This is a case where having some 
sort of schema can help, as it usually gives you a hint in advance as to 
at least some of what needs to be in the string pool.  Sometimes you can 
use compile-time constants for well-known strings.  Without a schema or 
DTD, all you can do is build up the pool at runtime, or if you have reason 
to believe that multiple instances will use similar vocabularies, share 
the pool from one run to the next.
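
The handle idea above can be sketched roughly as follows. This is an assumption-laden illustration, not the implementation referred to in the message: NamePool, intern and nameOf are hypothetical names, the pool is keyed on the raw bytes of the input encoding, and handles are plain ints. Names known in advance (say, from a schema) are interned first so that their handles can be treated as agreed constants.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a name pool that hands out integer handles.
final class NamePool {
    // Wraps a byte[] so it can be used as a value-based map key.
    private static final class Key {
        final byte[] bytes;
        final int hash;
        Key(byte[] bytes) { this.bytes = bytes; this.hash = Arrays.hashCode(bytes); }
        @Override public boolean equals(Object o) {
            return o instanceof Key && Arrays.equals(bytes, ((Key) o).bytes);
        }
        @Override public int hashCode() { return hash; }
    }

    private final Map<Key, Integer> handles = new HashMap<>();
    private final List<byte[]> names = new ArrayList<>();

    // Returns the handle for the name at buf[off, off+len), adding it if new.
    // For brevity this copies the slice on every call; a tuned version
    // would hash and compare in place to avoid the allocation.
    int intern(byte[] buf, int off, int len) {
        Key k = new Key(Arrays.copyOfRange(buf, off, off + len));
        Integer h = handles.get(k);
        if (h != null) return h;
        int handle = names.size();
        names.add(k.bytes);
        handles.put(k, handle);
        return handle;
    }

    // Convenience for preloading well-known names, e.g. ones taken from a schema.
    int intern(String name) {
        byte[] b = name.getBytes(StandardCharsets.UTF_8);
        return intern(b, 0, b.length);
    }

    // Decodes a handle back to text; needed only at the edges (logging, errors).
    String nameOf(int handle) {
        return new String(names.get(handle), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        NamePool pool = new NamePool();
        int myElement = pool.intern("myElement"); // preloaded, as if from a schema

        byte[] doc = "<myElement>".getBytes(StandardCharsets.UTF_8);
        int seen = pool.intern(doc, 1, 9); // bytes of the tag name in the input

        // The consumer compares two ints instead of two strings.
        System.out.println(seen == myElement);
    }
}
```

Without a schema, the same pool simply fills up as names are first encountered, and it could be kept between runs if the vocabularies are expected to overlap.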

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

References:
- Re: Speed in Languages and Browser Architectures
  - From: Elliotte Harold <elharo@m...>
