How to Initialize a Java HashMap with Reasonable Values?

BY MARKUS SPRUNCK

This article describes the need to initialize a HashMap with fitting default values. It is important to know something about the internal implementation of a HashMap to select the best default values for initialization.

How is a Java HashMap implemented?

Code like:

HashMap<String, Integer> mapJdK = new HashMap<String, Integer>();

is in principle valid Java code, but the hash map is not initialized with reasonable values.

In this case the Java implementation has to resize the number of buckets as the number of entries in the hash map grows. In Java the default load factor is 0.75 and the default initial capacity of a hash map is 16. This means that the threshold for the next resize is calculated as follows:

DEFAULT_INITIAL_CAPACITY = 16;

DEFAULT_LOAD_FACTOR = 0.75;

THRESHOLD = DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR;

When the number of entries in the hash map exceeds the THRESHOLD, the hash map automatically rehashes its contents into a new array with a larger capacity. Each resize doubles the capacity, so the successive thresholds are 16*0.75 = 12, 32*0.75 = 24, 64*0.75 = 48, etc. The following diagram depicts this:

Figure 1: Growth of HashMap
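
The doubling described above can be traced with a few lines of Java. This is only a sketch of the arithmetic, not the JDK's actual code: the class name ResizeThresholds and the method thresholds are hypothetical, and the constants simply mirror the default values quoted in this article.

```java
import java.util.ArrayList;
import java.util.List;

public class ResizeThresholds {

    // Default values as quoted in the article (assumed here, not read from the JDK).
    static final int DEFAULT_INITIAL_CAPACITY = 16;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /** Returns the resize thresholds for the first 'steps' capacities, doubling the capacity each time. */
    static List<Integer> thresholds(int steps) {
        List<Integer> result = new ArrayList<>();
        int capacity = DEFAULT_INITIAL_CAPACITY;
        for (int i = 0; i < steps; i++) {
            result.add((int) (capacity * DEFAULT_LOAD_FACTOR));
            capacity *= 2; // each resize doubles the number of buckets
        }
        return result;
    }

    public static void main(String[] args) {
        // Thresholds for capacities 16, 32, 64 and 128
        System.out.println(thresholds(4)); // prints [12, 24, 48, 96]
    }
}
```

A map that ends up holding 1000 entries and starts with the defaults would therefore pass through several of these resize steps before it is large enough.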

Simply setting the initial capacity to a very high value is no solution, because iterating over the collection views of a hash map requires time proportional to its capacity (the number of buckets) plus its size. It is important not to set the initial capacity too high if you iterate over the values.

How to initialize a Java HashMap with reasonable values?

The minimal initial capacity would be (expected number of entries)/0.75+1.

int capacity = (int) ((expected_maximal_number_of_data)/0.75+1);

HashMap<String, Integer> mapJdK = new HashMap<String, Integer>(capacity);
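
The formula above could be wrapped in a small helper so that it is not repeated at every call site. The following is a minimal sketch: the class name InitialCapacity and the method capacityFor are hypothetical, and the load factor of 0.75 is assumed to be the default one. Note that the OpenJDK implementation rounds the requested capacity up to the next power of two internally, so the real number of buckets may be larger than the value passed in.

```java
import java.util.HashMap;
import java.util.Map;

public class InitialCapacity {

    // Assumes the default load factor; adjust if a custom one is passed to the HashMap.
    static final double LOAD_FACTOR = 0.75;

    /** Hypothetical helper: initial capacity for the expected number of entries, using the formula from the article. */
    static int capacityFor(int expectedEntries) {
        return (int) (expectedEntries / LOAD_FACTOR + 1);
    }

    public static void main(String[] args) {
        int expectedEntries = 1000;

        // Sized up front, so filling the map should not trigger a resize.
        Map<String, Integer> map = new HashMap<>(capacityFor(expectedEntries));
        for (int i = 0; i < expectedEntries; i++) {
            map.put("key" + i, i);
        }

        System.out.println(capacityFor(expectedEntries)); // prints 1334
        System.out.println(map.size());                   // prints 1000
    }
}
```

If a different load factor is passed to the HashMap constructor, the same factor would have to be used in the helper.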

For small hash maps there is no big difference, but for really large hash maps (e.g. 100 MB in size) relying on the default values can seriously increase the memory needed, and the repeated rehashing slows down the application.