Before proceeding with concurrent hash map let’s recap the
synchronization in map.
Java API provide three different implementation of synchronized
MAP.
1.     
Hashtable
2.     
Collections.synchronizedMap(Map)
3.     
ConcurrentHashMap
HastTable is synchronized data structure, but it have only one
lock per table, so if a thread put a lock on hashtable than whole table will be
locked and no other thread will access it.
To synchronize a HashMap use synchronizeMap utility method of
Collections class. This again put lock on complete HasMap.
ConcurrentHasMap provide higher concurrency level as compare to
synchronized HashMap and HasTable. By default ConcurrentHashMap is divide in 16
segment. So at a time 16 threads can work on ConcurrentHashMap while in
HashTable and Collections.synchronizedMap(Map) at a time only one thread can
put a lock. So we can say that performance may be marginally poorer when only a
single thread accesses the Map at a time, but significantly better when
multiple threads access the map concurrently. In ConcurrentHasMap threads will
lock only portion of the data which are being updated while other portion of
data can be accessed by other threads.
ConcurrentHashMap doesn’t allow null keys or null values.
The main reason that nulls aren't allowed in ConcurrentMaps
(ConcurrentHashMaps, ConcurrentSkipListMaps) is that ambiguities because  if map.get(key) returns null, you can't
detect whether the key explicitly maps to null or the key isn't mapped. See
below code
if (map.containsKey(k)) {
   return map.get(k);
}
else
{
   throw new KeyNotPresentException();
}
It might be possible that key k might be deleted in between the
get(k) and containsKey(k) calls. As a result, the code will return null as
opposed to KeyNotPresentException (Expected Result if key is not present).
Code Snippet – 
import java.util.concurrent.ConcurrentHashMap;
/**
 * Simple example of ConcurrentHasMap
 * @author Manoj
 */
public class ConcurrentHashMapTest {
  public static void main(String[] args) {
   ConcurrentHashMap<Integer, String> chm = new ConcurrentHashMap<Integer, String>();
            chm.put(1, "One");
            chm.put(2, "Two");
            chm.forEach((K,V) -> System.out.println(K+ "|" +V));
   }
}
ConcurrentHashMap internal implementaion – 
ConcurrenHashMap
provide constructor to provide custom values of initial capacity, load factor
and concurrency level.
ConcurrentHashMap m = new
ConcurrentHashMap(initialCapacity , loadFactor, concurrencyLevel);
ConcurrentHashMap m = new
ConcurrentHashMap(100 , 0.50f, 5);
Initial
capacity
is 100, it means CHM make sure it has space for adding 100 key-value pairs
after creation.
Default
capacity of Hashmap is 2^4 = 16 buckets.
Load
factor
is 0. 50f, it means when average number of elements per map exceeds 50f (intital
capacity * load factor = 100 * 0. 50f = 50f) at that time map size will be
increased and existing items in map are rehashed to put in new larger size map.
Load Factor is a measure, which
decides when exactly to increase the hashmap capacity(buckets) to maintain get
and put operation complexity of O(1).
Default
load factor of Hashmap is 0.75f (i.e 75% of current map size).
Concurrency
level
is 5, it means at any given point of time Segment array size will be 5 or
greater than 5, so that 5 threads can able to parallely write to a map.
The actual number of segments a
ConcurrentHashMap instance will maintain is equal to the next power of two of
the specified concurrencyLevel means in above details concurrency level is 5
than no of segment will be 8. Suppose if concurrency level is 90 than no of
segment will be 128
ConcurrentHashMap creates segments
lazily as necessary. Only one segment is created at construction time so wthin
the constructor, a single Segment, s0, is created.
however, that segments is an
immutable (final) instance field, which already guarantees transient visibility
of s0 after the final field freeze 
How concurrency works?
In above customize
ConcurrentHashMap there are 8 segment. In ConcurrentHashMap it is defined as
static final inner class segment which extends Reentrant lock and implements
Serilizable interface. Each segment use single lock to consistently update its
element flushing all the changes to main memory. Each segment manage its
internal  hashtable of size 2x >= (capacity/no
of segment)
How to store value? How put ()
method works?
First concurrentHashMap calculate
the hashvalue of key and search it if it is found than simply update its value.
If it’s a new entry than first it will find the segment into which the entry
should be written. Segment could be existing or newly created than segment.put
method will be invoked.
The key cannot be NULL, it is used
to locate the segment index and then the actual bucket within the segment 
The upper bits will be used to
locate the segment index and the lower bits to locate the table index within
the segment. This is why re-hashing function is different from that of HashMap
as the hash bits need to be spread better to regularize both segment and index
locations.
In Segment.put() method – with the
help of hashing of key index is determined in the Segment.table than we make an
entry of static final class HashEntry<K,V>
 Code snippet of HashEntry<K, V> class is
– 
static final class
HashEntry<K,V> {
            final K key;
            final int hash;
            volatile V value;
            final HashEntry<K,V> next;
            HashEntry(K key, int hash, HashEntry<K,V>
next, V value) {
            this.key = key;
            this.hash = hash;
            this.next = next;
            this.value
= value;
            }
Before making an entry thread get
lock on the segment If the lock cannot be acquired, The scanAndLockForPut()
method uses a while loop to optimistically acquire the segment lock. Each time
it is unable to acquire the lock, the applicable linked list is traversed, 1
element per lock acquisition failure. If the current element of traversal has
the same key as the one being added, the traversal ends
Alternatively, if the key is not
found in the linked list (after a complete traversal), a new HashEntry is speculatively
created on behalf of put.
We don’t need lock for get()
method(read operation) because in HashEntry class value field is volatile so we
will always get the updated value lock will not be required.
 
 
Nice
ReplyDeleteGreat collection, very usefull thanks
ReplyDelete