Java Concurrency and the Executor Framework

Abstract

On the Java forums I have seen quite a few questions that ExecutorService and ScheduledExecutorService would make much simpler to solve, if used the right way. The executor services are the recommended standard Java API for background and periodic processing.

When can I parallelise my code?

Provided you have a section of code which runs for a reasonable amount of time, e.g. 10 microseconds or longer, and is largely independent of the rest of the application, e.g. it waits on a socket connection or performs an expensive CPU-bound calculation, then you have a good candidate for parallelising.

ExecutorService and ScheduledExecutorService, what are they for?

These services allow you to create background tasks and manage them in a simple manner. They can also be used to make the best use of all the cores in your machine, manage a single thread, create a pool of recycled threads and schedule recurring tasks.

A simple example.

A simple service
public static final ScheduledExecutorService SERVICE = Executors.newSingleThreadScheduledExecutor();

A single-threaded thread pool is the simplest to use.

This is a singleton, so you should be aware of how it could impact your application. It is included here for simplicity.
Perform a task once in the background
SERVICE.submit(new Runnable() {
    public void run() {
        doTask();
    }
});

// complete all tasks and finish.
SERVICE.shutdown();
Groovy: Perform 10 tasks
ScheduledExecutorService SERVICE = Executors.newSingleThreadScheduledExecutor();

for (int i = 0; i < 10; i++) {
  final int i2 = i;
  SERVICE.submit({ System.out.println("Hello world " + i2) });
}

SERVICE.shutdown();
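
Note that shutdown() returns immediately: it stops new submissions but does not wait for queued tasks to finish. If the caller needs to know that the work is done, awaitTermination can be used. The following is a minimal sketch; the class name ShutdownExample, the method runAndWait and the 10 second limit are assumptions made for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ShutdownExample {
    // Hypothetical helper: submits `tasks` trivial tasks, shuts the pool down
    // and waits for them. Returns how many tasks ran to completion.
    static int runAndWait(int tasks) throws InterruptedException {
        ExecutorService service = Executors.newSingleThreadExecutor();
        final AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            service.submit(new Runnable() {
                public void run() {
                    completed.incrementAndGet();
                }
            });
        }
        // No new tasks are accepted, but tasks already submitted still run.
        service.shutdown();
        // shutdown() does not block, so wait (up to 10 seconds here) for the queue to drain.
        if (!service.awaitTermination(10, TimeUnit.SECONDS)) {
            // Still busy after the limit: interrupt running tasks and drop queued ones.
            service.shutdownNow();
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed " + runAndWait(10) + " tasks");
    }
}
```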

Getting a result from a background task

You can perform a task and retrieve the result later. You have to wait for the result, which reduces the benefit of performing the task in the background.

This is useful if you have a pool of threads, or you want to put a timeout on the task being performed.

I suggest the task should take all its inputs and return a single result, i.e. it is functional and without side effects. This simplifies multi-threading issues by making it clear which values are used and which result is given.
Perform a task and wait for the result.
final Input inputs = ...
Future<String> future = SERVICE.submit(new Callable<String>() {
    public String call() throws Exception {
        return doTask(inputs);
    }
});
// later
try {
    String result = future.get();
    System.out.println("result= "+result);
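} catch (InterruptedException e) {
    // restore the interrupt flag so code further up the stack can see it
    Thread.currentThread().interrupt();
    LOG.log(Level.SEVERE, "Unhandled exception", e);
} catch (ExecutionException e) {
    // the task itself threw; the original exception is e.getCause()
    LOG.log(Level.SEVERE, "Unhandled exception", e);
}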
The exception thrown by the task is captured by the Future and not printed. Unless you examine the future, there is no way to know an exception was thrown.
Groovy: Simplified Callable
Future<String> future = SERVICE.submit({ return doTask(i2)} as Callable);
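
The timeout mentioned earlier in this section can be sketched with the timed version of Future.get. In this example the task is simulated with Thread.sleep; the class name TimeoutExample, the method runWithTimeout and the returned strings are assumptions made for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutExample {
    // Hypothetical helper: runs a task that sleeps for `taskMillis` and waits at most
    // `timeoutMillis` for it. Returns the task's result, or "timed out".
    static String runWithTimeout(final long taskMillis, long timeoutMillis) throws Exception {
        ExecutorService service = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = service.submit(new Callable<String>() {
                public String call() throws Exception {
                    Thread.sleep(taskMillis); // stands in for real work
                    return "done";
                }
            });
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Give up on the task; true asks for the worker thread to be interrupted.
                future.cancel(true);
                return "timed out";
            }
        } finally {
            service.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runWithTimeout(10, 2000));  // finishes well within the timeout
        System.out.println(runWithTimeout(5000, 100)); // abandoned after 100 ms
    }
}
```

Cancelling with true only interrupts the task; a task which ignores interruption will keep running until it finishes by itself.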

Performing a repeating task.

Perform a repeating task every 500 ms.
ScheduledFuture<?> repeating = SERVICE.scheduleAtFixedRate(new Runnable() {
    public void run() {
        try {
            doRepeatingTask(inputs);
        } catch(Exception e) {
            e.printStackTrace();
        }
    }
}, 0, 500, TimeUnit.MILLISECONDS);
// later
repeating.cancel(false);

The repeating task should handle its own exceptions and decide whether the exception should be re-thrown (stopping the repeating task) or handled.

I have wrapped the task in a try/catch block. If you don't do this, the exception will be placed in the future and most likely discarded. Worse, your repeating task stops, possibly without any warning.
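
To see this behaviour for yourself, the sketch below schedules a task which deliberately throws on its third run and omits the try/catch. The class name RepeatingFailureExample, the method runsBeforeFailure and the 10 ms period are assumptions made for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class RepeatingFailureExample {
    // Hypothetical helper: schedules a task every 10 ms that throws on its third run.
    // Returns the number of runs started before the schedule stopped.
    static int runsBeforeFailure() throws InterruptedException {
        ScheduledExecutorService service = Executors.newSingleThreadScheduledExecutor();
        final AtomicInteger runs = new AtomicInteger();
        ScheduledFuture<?> repeating = service.scheduleAtFixedRate(new Runnable() {
            public void run() {
                if (runs.incrementAndGet() == 3) {
                    throw new IllegalStateException("failed on run 3");
                }
            }
        }, 0, 10, TimeUnit.MILLISECONDS);
        try {
            // For a repeating task get() never returns normally;
            // it throws once the task fails or is cancelled.
            repeating.get();
        } catch (ExecutionException e) {
            System.out.println("repeating task stopped: " + e.getCause());
        } finally {
            service.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("runs = " + runsBeforeFailure()); // prints "runs = 3"
    }
}
```

Without the call to get(), nothing would report the failure; the task would simply never run a fourth time.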

A task which passes on its result.

Perform a task and handle the result without waiting for it.
final JLabel label = new JLabel();
SERVICE.submit(new Runnable() {
    public void run() {
        String result;
        try {
            result = doTask(inputs);
        } catch (Exception e) {
            result = e.toString();
        }
        final String result2 = result;
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                label.setText(result2);
            }
        });
    }
});
Groovy: perform a background task which updates Swing
JLabel label = new JLabel();
SERVICE.submit({
  String result
  try {
    result = doTask(inputs);
  } catch (Exception e) {
    result = e.toString();
  }
  SwingUtilities.invokeLater({ label.setText(result) })
});

Thread pools tuned for specific purposes.

A thread pool which has an unlimited number of threads.

This pool creates threads as needed and retires threads if they haven't been used for a minute.

ExecutorService UNLIMITED_THREAD_POOL = Executors.newCachedThreadPool();

Creating a thread pool which will use all cores.

If you have a CPU intensive process, the optimal number of threads may be the number of cores on your system, so that each core is performing one task at a time.

A thread pool which matches the number of cores
int proc = Runtime.getRuntime().availableProcessors();
ScheduledExecutorService service = Executors.newScheduledThreadPool(proc);
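
As a rough illustration of using such a pool, the sketch below splits a CPU-bound sum into one chunk per core and combines the partial results with invokeAll. It uses newFixedThreadPool rather than a scheduled pool because no scheduling is needed; the class name ParallelSumExample and the method parallelSum are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSumExample {
    // Hypothetical helper: sums the integers 1..n using one chunk of the range per core.
    static long parallelSum(final long n) throws Exception {
        int proc = Runtime.getRuntime().availableProcessors();
        ExecutorService service = Executors.newFixedThreadPool(proc);
        try {
            List<Callable<Long>> tasks = new ArrayList<Callable<Long>>();
            long chunk = (n + proc - 1) / proc; // ceiling division
            for (long start = 1; start <= n; start += chunk) {
                final long from = start;
                final long to = Math.min(n, start + chunk - 1);
                // Each task takes its inputs and returns a single result, with no shared state.
                tasks.add(new Callable<Long>() {
                    public Long call() {
                        long sum = 0;
                        for (long i = from; i <= to; i++) {
                            sum += i;
                        }
                        return sum;
                    }
                });
            }
            long total = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Long> part : service.invokeAll(tasks)) {
                total += part.get();
            }
            return total;
        } finally {
            service.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parallelSum(1000000L)); // prints 500000500000
    }
}
```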

Gotchas

There are some common mistakes in multi-threaded code. These stem from practices I consider to be suspect even in single threaded code.

Gotcha: Creating a lazy singleton without proper synchronization.

It is widely considered that the use of singletons should be limited. However, you may not be able to avoid them entirely.

A poorly implemented lazy singleton can result in multiple such singletons.

Poor implementation of a Singleton
public static Singleton getInstance() {
  if (instance == null)
    instance = new Singleton();
  return instance;
}

The problem is that multiple threads can see instance as null at the same time and thus create multiple Singleton objects. This happens rarely, which makes it a hard bug to trace.

Improved lazy Singleton
public static synchronized Singleton getInstance() {
  if (instance == null)
    instance = new Singleton();
  return instance;
}

This uses locking, which you might be concerned about as a performance hit (see Myths below). Instead you can avoid the explicit lock by using an inner class.

Lazy Singleton with an implicit lock
public class Singleton {
  // not initialized until getInstance() is first called
  private static class SingletonHolder {
    static final Singleton INSTANCE = new Singleton();
  }

  public static Singleton getInstance() {
    return SingletonHolder.INSTANCE;
  }
}

The JVM guarantees that each class is initialized only once, in a thread-safe manner, the first time it is used. SingletonHolder is therefore not initialized until getInstance() is first called.
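
One way to convince yourself of this is to count constructor calls while many threads request the instance at once. The sketch below nests the Singleton in a test class so that it is self-contained; the names HolderSingletonExample, CREATED and instancesCreated are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class HolderSingletonExample {
    // Counts constructor calls so we can check that only one instance is ever created.
    static final AtomicInteger CREATED = new AtomicInteger();

    static class Singleton {
        private Singleton() {
            CREATED.incrementAndGet();
        }

        private static class SingletonHolder {
            static final Singleton INSTANCE = new Singleton();
        }

        static Singleton getInstance() {
            return SingletonHolder.INSTANCE;
        }
    }

    // Hypothetical helper: calls getInstance() from `threads` threads and
    // returns the total number of Singleton objects constructed.
    static int instancesCreated(int threads) throws Exception {
        ExecutorService service = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Singleton>> tasks = new ArrayList<Callable<Singleton>>();
            for (int i = 0; i < threads; i++) {
                tasks.add(new Callable<Singleton>() {
                    public Singleton call() {
                        return Singleton.getInstance();
                    }
                });
            }
            Singleton first = null;
            for (Future<Singleton> result : service.invokeAll(tasks)) {
                Singleton seen = result.get();
                if (first == null) {
                    first = seen;
                } else if (seen != first) {
                    throw new IllegalStateException("saw a second instance");
                }
            }
            return CREATED.get();
        } finally {
            service.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("instances created: " + instancesCreated(8));
    }
}
```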

Summary: Don't use a singleton if you can avoid it.

Gotcha: SimpleDateFormat is not thread safe.

SimpleDateFormat uses fields to perform its calculations, so sharing one instance between threads is unsafe.

private static final SimpleDateFormat DATE_FORMAT = new SimpleDateFormat();

// can produce corrupt results
System.out.println(DATE_FORMAT.format(date));

Solution: Use a thread local SimpleDateFormat or Joda time's DateTimeFormat. The latter uses local variables instead of fields to perform calculations.
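
The thread local option could look something like the sketch below, where each thread lazily creates and then reuses its own formatter. The class name ThreadLocalDateFormatExample, the pattern, the UTC time zone and the US locale are assumptions chosen to make the output predictable.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class ThreadLocalDateFormatExample {
    // Each thread gets its own SimpleDateFormat on first use, so no instance is ever shared.
    private static final ThreadLocal<SimpleDateFormat> DATE_FORMAT =
        new ThreadLocal<SimpleDateFormat>() {
            @Override
            protected SimpleDateFormat initialValue() {
                SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US);
                format.setTimeZone(TimeZone.getTimeZone("UTC"));
                return format;
            }
        };

    // Safe to call from any thread.
    static String format(Date date) {
        return DATE_FORMAT.get().format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date(0L))); // prints 1970-01-01 00:00:00
    }
}
```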

Summary: Don't use a field when you can use a local variable.

Multi-threading myths.

Multi-threading is not a way to make everything go faster. It is a way of increasing CPU utilisation, which can make your application faster but can also make it slower. Using multiple threads can make your application easier to manage, but can make your code more complex in the process.

Myth: Can multiple threads make my hard drive go faster?

Generally, no. Hard drives are optimised for high-throughput sequential access. Attempt to access one from more than one thread at a time and you will see performance suffer. The optimal number of threads is one per physical drive.

Solid State Drives might not follow this rule as they don't have moving parts.

Myth: The more threads, the greater the improvement.

Up to a point, this is correct. However, this point is very low. The number of cores is often the optimal number of threads, e.g. 2 or 4 in most machines. If a core has to run more than one thread, each additional thread adds overhead, reducing the processing throughput.

Myth: A general purpose parallelising compiler is just around the corner.

Compilers are getting smarter. However, people have been looking at this issue for decades, and while I expect to see greater support for parallelising code, it won't happen automatically. Why not? Because for real-world programs, it's so easy to get it wrong and end up with an application which is slower and unreliable.

Until a compiler can statically determine whether a section of code is thread-safe (something much easier in functional languages) and determine how to break up a program so it is faster and not just wasting CPU resources, you won't have a compiler/JVM which will do this automatically.

Myth: Using locks makes my program so much slower!

Locks and synchronization are an overhead, but they very rarely impact the performance of your application. As with all performance issues, you should write code for clarity first, then tune performance using realistic data. A good profiler will help, and many are free. You are unlikely to find that locking is a bottleneck in your application.

Myth: Multi-threading has to be a black art.

Multi-threaded programs do run into issues that single-threaded programs don't. This doesn't mean they have to be complicated. If you use thread pools and self-contained tasks, you will get very good use of a second core in your system, if not all the cores in your system, without much complexity.
