Java performance changes

Overview

What was true for JDK 1.0 is not always true for JDK 6. However, older documentation does not reflect this.
It's worth noting that most simple benchmarks do not test the performance of code in a multi-threaded environment. The results in the examples I have provided are for running four threads.
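As a rough illustration of what "running four threads" can look like, here is a minimal sketch of a timing harness. It is not the harness used for the results on this page; the class name FourThreadBench and the methods timeOnThreads and kOpsPerSecond are hypothetical names chosen for this example. Each thread gets its own task so the threads do not share mutable state.

```java
import java.util.concurrent.CountDownLatch;
import java.util.function.Supplier;

// Hypothetical harness: not the code used to produce the figures on this page.
final class FourThreadBench {

    /**
     * Runs `iterations` calls of a task on each of `threads` threads and
     * returns the elapsed wall-clock time in nanoseconds. Each thread gets
     * its own task from the factory, so threads do not share mutable state.
     */
    static long timeOnThreads(int threads, final long iterations,
                              final Supplier<Runnable> taskFactory) {
        final CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final Runnable task = taskFactory.get();
            workers[t] = new Thread(() -> {
                try {
                    start.await(); // line all threads up before timing begins
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (long i = 0; i < iterations; i++) {
                    task.run();
                }
            });
            workers[t].start();
        }
        long begin = System.nanoTime();
        start.countDown(); // release all workers at once
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return System.nanoTime() - begin;
    }

    /** Converts an operation count and elapsed time into K operations/second. */
    static double kOpsPerSecond(long totalOps, long elapsedNanos) {
        return (totalOps / 1000.0) / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        int threads = 4;
        long iterations = 1000 * 1000;
        long nanos = timeOnThreads(threads, iterations, () -> {
            int[] num = {0};        // per-thread state, as in the loop examples
            return () -> num[0]++;
        });
        System.out.printf("%,.0f K/second%n", kOpsPerSecond(threads * iterations, nanos));
    }
}
```

A single short run like this includes thread start-up and JIT warm-up, so treat the number it prints as indicative only.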

Notes on performance.

"Optimised out" means the JVM realised that the operation does nothing, optimised the code away to nothing, and reported over 1 trillion operations/second.
Results over 10,000,000 K/second (over 10 billion operations/second) are probably partially optimised out, and may measure how quickly the JVM optimised out the test rather than the actual speed of the operation.
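To make "optimised out" more concrete, here is a small sketch of two loops: one discards its result, so the JIT is free to remove the work, and one feeds the result into a returned value. The class and method names are hypothetical, and whether a particular JVM actually removes the first loop is an assumption, not a guarantee. A smart JIT may also simplify the second loop.

```java
// Hypothetical demo: class and method names are made up for this sketch.
final class DeadCodeDemo {

    // The result of Math.max is discarded, so the JIT may remove the call
    // (and possibly the whole loop) because it has no observable effect.
    static void discarded(int iterations) {
        for (int i = 0; i < iterations; i++) {
            Math.max(1, i);
        }
    }

    // The result feeds the returned sum, so the work cannot simply be
    // dropped without changing the method's output.
    static long consumed(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += Math.max(1, i);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 100 * 1000 * 1000;
        long t0 = System.nanoTime();
        discarded(n);
        long t1 = System.nanoTime();
        long sum = consumed(n);
        long t2 = System.nanoTime();
        System.out.println("discarded: " + (t1 - t0) + " ns");
        System.out.println("consumed:  " + (t2 - t1) + " ns, sum=" + sum);
    }
}
```

If the first timing is implausibly small compared with the second, the test is probably measuring the optimiser rather than the operation.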

Myth: Looping down is faster than looping up.

In Java 5 and Java 6 there is no difference between looping down or up. Even looping up or down an empty loop is much the same.

| Empty looping | Speed |
| --- | --- |
| Looping down using multiple threads | 2,894,000 K/second |
| Looping up using multiple threads | 2,095,000 K/second |

The difference here is less than 0.5 of a clock cycle. For any loop which does real work this difference is trivial.
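The arithmetic behind the clock-cycle figure can be sketched as follows. The page does not say which machine produced the empty-loop numbers, so the clock rates used here (2.4 GHz and 3 GHz, taken from the two machines described on this page) are an assumption; the class and method names are hypothetical. With either clock rate the difference works out to roughly 0.3 to 0.4 of a cycle per iteration.

```java
// Hypothetical helper for the arithmetic; clock rates are assumed, not measured.
final class LoopCycleMath {

    /** Nanoseconds per operation for a rate given in K operations/second. */
    static double nanosPerOp(double kOpsPerSecond) {
        return 1e9 / (kOpsPerSecond * 1000.0);
    }

    /** Difference in clock cycles per operation between a slow and a fast rate. */
    static double cyclesDifference(double fastK, double slowK, double clockGHz) {
        // 1 GHz is one cycle per nanosecond, so ns * GHz = cycles.
        return (nanosPerOp(slowK) - nanosPerOp(fastK)) * clockGHz;
    }

    public static void main(String[] args) {
        double down = 2894000; // looping down, K/second
        double up = 2095000;   // looping up, K/second
        System.out.println("at 2.4 GHz: " + cyclesDifference(down, up, 2.4) + " cycles");
        System.out.println("at 3.0 GHz: " + cyclesDifference(down, up, 3.0) + " cycles");
    }
}
```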

For a trivial non-empty loop, the performance is effectively the same.

Counting down:
int[] num = { 0 };
for (int i = 1000 * 1000; i > 0; i--)
   num[0]++;
Counting up:
int[] num = { 0 };
for (int i = 0; i < 1000 * 1000; i++)
   num[0]++;

| Trivial loop | Linux JDK 6u5 | Linux JDK 5u11 | PC JDK 6u5 |
| --- | --- | --- | --- |
| Looping down using multiple threads | 18,892,205 K/second | 20,222,136 K/second | 1,438,000 K/second |
| Looping up using multiple threads | 19,229,078 K/second | 20,037,658 K/second | 1,453,000 K/second |

Legend: Calling Math.max(a,b) is 7 times slower than (a > b) ? a : b. This is the cost of a method call.

The Linux server is a blade with two dual-core AMD 64 2.4 GHz processors. Tests were performed in 2008.
The PC is a Windows XP workstation with two 3 GHz Xeon processors. Tests were performed in 2008.
Small methods can be inlined. This reduces or removes the impact of a method call.

| Getting the maximum value | Linux JDK 6u5 | Linux JDK 5u11 | PC JDK 6u5 |
| --- | --- | --- | --- |
| Calling Math.max(1, i) | optimised out | optimised out | 1,609,900 K/second |
| Using (1 >= i) ? 1 : i | 50,494,586 K/second | 30,304,184 K/second | 1,610,400 K/second |

Note: because JDK 5 and 6 on Linux optimised out Math.max(), Math.max() appeared to be around 90x faster than the "? :" ternary operator.
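For reference, here is a small sketch of the two forms being compared. For int arguments they return the same value, so the choice between them can be made on readability. The class MaxDemo and the method maxTernary are hypothetical names for this example.

```java
// Hypothetical demo: shows the two forms compared in the table are interchangeable for ints.
final class MaxDemo {

    // Hand-written conditional form.
    static int maxTernary(int a, int b) {
        return (a >= b) ? a : b;
    }

    // Library call; a small method like this is a candidate for inlining.
    static int maxLibrary(int a, int b) {
        return Math.max(a, b);
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -1, 0, 1, 42, Integer.MAX_VALUE};
        for (int a : samples) {
            for (int b : samples) {
                if (maxTernary(a, b) != maxLibrary(a, b)) {
                    throw new AssertionError("mismatch for " + a + ", " + b);
                }
            }
        }
        System.out.println("both forms agree on all sample pairs");
    }
}
```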

Legend: Other slow operations.

The Sparc 20 with JDK 1.1.4 was running Solaris. These tests were performed in 1998.
Obviously the hardware is significantly faster but the JVM is also smarter.
Timings are in K operations/second.

| Linux JDK 6u5 | Linux JDK 5u11 | PC JDK 6u5 | Sparc 20 JDK 1.1.4 | code | operation |
| --- | --- | --- | --- | --- | --- |
| optimised out | optimised out | 1,276,609 | 147,058 | b = (i & 0x100) != 0 | get element of int bits |
| optimised out | 1,474,653 | 112,965 | 314 | b = bitSet.get(3); | get element of Bitset |
| optimised out | optimised out | 2,127,880 | 20,000 | obj = objs[1]; | get element of Array |
| optimised out | optimised out | 956,285 | 5,263 | str.charAt(5); | get element of String |
| 71,680,816 | 169,723 | 334,761 | 361 | buf.charAt(5); | get element of StringBuffer |
| optimised out | optimised out | 979,260 | n/a | buf.charAt(5); | get element of StringBuilder |
| 527,364 | 197,577 | 282,910 | 337 | objs2.get(1); | get element of Vector |
| optimised out | optimised out | 291,080 | n/a | objs2.get(1); | get element of ArrayList |
| 80,075 | 82,585 | 56,850 | 241 | hash.get("a"); | get element of Hashtable |
| 217,602 | 214,757 | 85,371 | n/a | hash.get("a"); | get element of LinkedHashMap |
| 1,093,388 | 1,510,255 | 71,339 | 336 | bitset.set(3); | set element of Bitset |
| 22,626,994 | 36,676,586 | 356,209 | 5,555 | objs[1] = obj; | set element of Array |
| 67,216,056 | 181,385 | 326,831 | 355 | buf.setCharAt(5,' '); | set element of StringBuffer |
| 75,454,991 | 50,275,535 | 982,204 | n/a | buf.setCharAt(5,' '); | set element of StringBuilder |
| 826,486 | 189,073 | 178,017 | 308 | objs2.set(1, "hi"); | set element of Vector |
| 50,438,000 | 50,339,000 | 107,019 | n/a | objs2.set(1, "hi"); | set element of ArrayList |
| 98,863 | 78,711 | 38,065 | 237 | hash.put("a", obj); | put element of Hashtable |
| 298,666 | 160,212 | 40,210 | n/a | hash.put("a", obj); | put element of LinkedHashMap |

When profiling or performance tuning it is best to test a real application with real usage on your target system. An application can perform very differently on different systems, even for the same version of Java.

Legend: Creating objects and arrays is very slow.

Timings are in K operations/second.

| Linux JDK 6u5 | Linux JDK 5u11 | PC JDK 6u5 | code | operation |
| --- | --- | --- | --- | --- |
| 218,942 | 216,935 | 106,851 | new Object(); | Create a simple object |
| 55,989 | 53,555 | 14,472 | new int[10]; | Create an array |
| 1,931 | 1,193 | 863 | new Exception(); | Create an Exception |
| 3,126 | 3,296 | 631 | new LinkedHashMap(map); | Create a map with 10 String keys |
| 63,439 | 59,795 | 17,875 | new TenFields( ... ); | Create an object with ten fields using a constructor |
| 63,529 | 60,606 | 18,205 | new TenFields(); setX() x 10 | Create an object with ten fields and ten setters |

Creating exceptions is still relatively slow. However, if you are creating around 1 million exceptions per second, perhaps you could make them more exceptional.
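One way to make exceptions "more exceptional" is to check for the expected failure case before calling code that throws. The sketch below is only an illustration under simplifying assumptions: the class and method names are hypothetical, and the check-first variant deliberately accepts only unsigned decimal strings of up to nine digits, so it is not a drop-in replacement for Integer.parseInt.

```java
// Hypothetical example: names and the simplified input rules are assumptions.
final class ExceptionalOnly {

    // Exception-driven: every bad input constructs a NumberFormatException.
    static int parseWithException(String s, int fallback) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Check-first: bad inputs are rejected without creating an exception.
    // Simplification: only unsigned decimal input of 1 to 9 digits is accepted,
    // which always fits in an int, so parseInt cannot throw here.
    static int parseWithCheck(String s, int fallback) {
        if (s == null || s.isEmpty() || s.length() > 9) {
            return fallback;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return fallback;
            }
        }
        return Integer.parseInt(s);
    }

    public static void main(String[] args) {
        System.out.println(parseWithException("abc", -1));
        System.out.println(parseWithCheck("abc", -1));
        System.out.println(parseWithCheck("123", -1));
    }
}
```

Whether the extra check pays off depends on how often the bad case really occurs, so measure it in the real application.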

Legend: Use StringBuffer instead of + String concatenation.

Even in early versions of the JDK, these were the same thing, because the compiler translated + into StringBuffer calls. However, + was clearer and therefore better.
From Java 5, String + uses StringBuilder, which is more efficient than StringBuffer because it is not synchronized.
In fact, the compiler will inline and simplify constants so that string concatenation can be removed at compile time.
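A short sketch of both points: + and an explicit StringBuilder produce the same string, and a concatenation of constants is folded into a single literal at compile time. The class and method names are hypothetical. The == comparison is used here only to show that the folded constant is the same interned literal; it is not how strings should normally be compared.

```java
// Hypothetical demo: names are made up for this sketch.
final class ConcatDemo {

    // Clear and concise; the compiler generates the concatenation code.
    static String withPlus(String name, int count) {
        return "name=" + name + ", count=" + count;
    }

    // Hand-written equivalent using StringBuilder.
    static String withBuilder(String name, int count) {
        return new StringBuilder()
                .append("name=").append(name)
                .append(", count=").append(count)
                .toString();
    }

    // "Hello, " + "world" is a compile-time constant expression, so it is
    // folded to the literal "Hello, world" and shares its interned instance.
    static boolean constantsFolded() {
        String folded = "Hello, " + "world";
        return folded == "Hello, world";
    }

    public static void main(String[] args) {
        System.out.println(withPlus("a", 1).equals(withBuilder("a", 1)));
        System.out.println(constantsFolded());
    }
}
```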

Solution.

From Java 5, replace StringBuffer with StringBuilder, unless it is a field. BTW: StringBuffer is unlikely to be a good choice for a field shared between threads.

Legend: Synchronized methods are 50 times slower than non-synchronized methods.

With Java 6, when a lock is typically acquired by the same thread, a synchronized method is about 1.06 to 3 times slower.
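The case described here, where the same thread acquires the lock every time, can be sketched like this. The class names are hypothetical and this is not the test that produced the figures above; the ratio you see will depend on the JVM version and hardware.

```java
// Hypothetical demo of uncontended synchronization; not the original benchmark.
final class SyncCost {

    static final class PlainCounter {
        private long n;
        void inc() { n++; }
        long get() { return n; }
    }

    static final class SyncCounter {
        private long n;
        synchronized void inc() { n++; }
        synchronized long get() { return n; }
    }

    // Increments a non-synchronized counter from the calling thread.
    static long countPlain(int iterations) {
        PlainCounter c = new PlainCounter();
        for (int i = 0; i < iterations; i++) {
            c.inc();
        }
        return c.get();
    }

    // Same work, but every call takes a lock that only this thread ever holds.
    static long countSync(int iterations) {
        SyncCounter c = new SyncCounter();
        for (int i = 0; i < iterations; i++) {
            c.inc();
        }
        return c.get();
    }

    public static void main(String[] args) {
        int n = 50 * 1000 * 1000;
        long t0 = System.nanoTime();
        long plain = countPlain(n);
        long t1 = System.nanoTime();
        long sync = countSync(n);
        long t2 = System.nanoTime();
        System.out.println("plain: " + (t1 - t0) + " ns, count=" + plain);
        System.out.println("sync:  " + (t2 - t1) + " ns, count=" + sync);
    }
}
```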

Legend: Reusing object improves performance.

In Java 5 and 6, object pools can confuse the GC. For this reason, it can be faster and simpler to remove the object pool. Note: the GC tries to recycle objects for you in any case.
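To show what "remove the object pool" means in practice, here is a sketch that does the same work twice: once through a small hand-written pool and once by simply allocating. The Point and PointPool classes are hypothetical and only stand in for whatever is being pooled. Both versions give the same result; the second has less code and leaves recycling to the GC.

```java
import java.util.ArrayDeque;

// Hypothetical example: Point and PointPool are stand-ins, not a real library.
final class PoolVsNew {

    static final class Point {
        int x;
        int y;
    }

    // A minimal single-threaded pool of the kind this section suggests removing.
    static final class PointPool {
        private final ArrayDeque<Point> free = new ArrayDeque<Point>();

        Point acquire() {
            Point p = free.poll();
            return (p != null) ? p : new Point();
        }

        void release(Point p) {
            p.x = 0; // reset state so stale values do not leak to the next user
            p.y = 0;
            free.push(p);
        }
    }

    static long sumWithPool(int n) {
        PointPool pool = new PointPool();
        long sum = 0;
        for (int i = 0; i < n; i++) {
            Point p = pool.acquire();
            p.x = i;
            p.y = 2 * i;
            sum += p.x + p.y;
            pool.release(p);
        }
        return sum;
    }

    static long sumWithNew(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(); // short-lived object, left to the GC
            p.x = i;
            p.y = 2 * i;
            sum += p.x + p.y;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumWithPool(10) == sumWithNew(10));
    }
}
```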
