[XML-DEV Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message] Re: Speed in Languages and Browser Architectures
Rick Marshall writes: > If I write a Java VM in C why can I write a program in Java that can run > faster that a functionally equivalent program in C? (or assembler for > that matter). > > Is there a permathread for this? > > The implication is that Java can create a sequence of instructions that > C can't. Great question (though I'm still waiting for someone to say that this discussion has strayed too far from XML and so should go elsewhere. In the meantime, this stuff happens to interest me, so): To understand why there's no contradiction in Java sometimes being faster than C, even if the runtime is written in C, you need to explore some aspects of how optimizers work. I can think of at least two language characteristics that sometimes (not always) work in favor of Java for optimization: The Java language is in some ways more constrained --------------------------------------------------- First of all, optimizers need to prove theorems about your program, >> and that's often easier to do if your language is less expressive, not more<<. One classic barrier to the optimization of C programs is the relatively unconstrained semantics and of pointers. So, consider (my C is a bit rusty, so please excuse minor syntax errors): float *p; // claimed to point to a float int a; int b; ...random code here, including some that might set p... a=1; somefunc(p); b = a+a; You'd like to believe that a good optimizer would notice that a=1, and so be able to determine at compile time that b will always be 2. The problem is that in the absence of other information, we really don't know what p points to. It's just possible that in that random code, someone did: p = &a; // p points to a and that in somefunc the pointer was used to overwrite a: *((int *)p = 6; Even though this is highly unlikely, the optimizer can't prove it isn't happening. Ask someone who writes optimizers for C and they'll tell you that one of the toughest problems is keeping a seemingly innocent reference to a pointer (in this case in the call to somefunc) from defeating optimizations all over the program. Java is strongly typed, and uses references instead of pointers. The equivalent Java might be: java.lang.Float floatRef; // definitely references a Float int a; int b; ...random code here, including some that might set floatRef... a = 1; somefunc(floatRef); b = a+a; Note that the C and Java are very, very similar programs in terms of the data structures and code you'd want to generate, but in Java, an optimizer can >prove< that somefunc can't affect the value of a. Bingo, you can compute the value of both a and b at compile time. These same features may cause C to be much faster in other cases. If you're decoding a network packet, and the same bytes may contain either an int or a float, that's a trival cast in C. It's at best tricky in Java. Part of what you gain is that the Java optimizer can prove some things that the C optimizer can't. So, it's a tradeoff, but you can show why the Java can be faster. Committing to optimizations early --------------------------------- This is not inherent in the languages, but it's inherent in how they're typically deployed. C compilers are typically run in advance of program execution. They must therefore commit to optimizations in advance, and typically without knowledge of program execution statistics (which branches are taken often, which loops run long or short, which virtual function calls mostly resolve to the same method repeatedly.) Because Java JITs are executed at runtime, they can gamble on high risk optimizations, and can watch the program's execution. One common trick is to do a very aggressive optimization that only works if your gamble is correct, but to guard it with a check. So, rather than generating code that will run fast for any value of variable "i", for example, you can generate code that won't work at all if i<0, but guard it with an: if (i<0) reoptimize(); check. A C optimizer would have to statically generate code that work work for all values of a variable. Now, when all is said and done, C is often faster than Java. Type safety can cost you. Code that makes numerous array accesses implies a very large number of bounds checks, and you are relying on the optimizer to find ways to skip most of them. Bringing this discussion back to XML: that's one of the reasons it can be hard to write a fast XML parser in Java. The inner loops tend to be working their way through arrays of input bytes, and depending on how exactly you write the code, the optimizer may or may not be smart enough to avoid adding an array bounds check to each one. Furthermore, code that optimizes well in one VM may not do well in another. In C, you lose type safety, but you put the bounds checks exactly where you need them yourself (and if you're lucky, the optimizer will find a way to skip optimize even some of the ones you put in.) So, yes: Java runtimes and compilers can be written in C, and in certain cases they will execute a given Java program faster than its C equivalent. Noah -------------------------------------- Noah Mendelsohn IBM Corporation One Rogers Street Cambridge, MA 02142 1-617-693-4036 --------------------------------------
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] |
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|