Thursday 19 June 2014

JAVA MEMORY MANAGEMENT

JAVA MEMORY MANAGEMENT

Low Latency -  One fast answer
High Throughput – All answers as fast as possible
JVM – Write once, run anywhere




Minor GC – moves the live objects from Eden space to Survivor spaces. If the objects are old enough then they are promoted to Old Generation space.




TenuringThreshold -Number of times the object has been copied between Survivor spaces. Default 7





Avoid large object allocations – which may directly go in Old Gen pace


The Objects will be placed in Finalizer queue and the queue will not necessary run everytime a GC cycle runs.





CMS – is good and runs currently along with application threads
No Compaction of data– we have to use free list and has fragmentation of heap




When Minor GC runs, it will free-up the space in Eden and Survivor1 and copies the live copies to Survivor2.



It stores as logically continuous area then physically


When Minor GC, happens



After Minor GC completion





The Eden space and survivor is empty now, the remaining live objects are copied to other survivor space region and the live objects which are promoted are copied to Old Gen region.





JVM parameters in Java   - JVM option it can be divided into two parts:

1)    JVM Options that begin with -X are non-standard (thy are not guaranteed to be supported on all JVM implementations), and are subject to change without notice in subsequent releases of the JDK.
2)    JVM Options or parameters which are specified with -XX are not stable and are not recommended for casual use. These options are subject to change without notice also.


Stack size -> -Xss
Heap size -> -Xms[starting size] –Xmx[maximum size]



Multithreading



What is the difference between processes and threads?
A process is an execution of a program but a thread is a single execution sequence within the process.
A process can contain multiple threads. A thread is sometimes called a lightweight process.

Threads can be created by either
·                     Extending the Thread class.
·                     Implementing the Runnable interface.
·                     Using the Executor framework (this creates a thread pool) 
 
The state chart diagram below describes the thread states.
https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiykSC2W9TNQ6ZKtBU3foJXBBF-icJBEAq_UuR7FunpAbktcBpi-XWIwJui3FKGC8hW3IR-b7OaSioENw8Fs6khjkBKXoaqhhw3bekvz2siW_jZpvdBqvV_fypbwLeNSlGh45DsPvrFnDYu/s1600/thread_states.JPG
Sleeping – a.k.a. Timed-waiting state

Java Thread

The wait( ), notify (), and notifyAll( ) methods are used to provide an efficient way for threads to communicate with each other. These are called only in synchronized block.
·         Calling wait causes the current thread to wait to acquire the lock of the Object, and
·         calling notify/notifyAll relinquishes the lock and notify the threads waiting for that lock.
·         Thread.yeild() - Causes the currently executing thread object to temporarily pause and allow other threads to execute.
·         t.join( ) allows the current thread to wait indefinitely until thread “t” is finished
t.join (5000) allows the current thread to wait for thread “t” to finish but does not wait longer than 5 seconds.
sleep() – Keep the current thread into sleep hence this is in Thread class, whereas wait( ), notify (), and notifyAll( ) are in Object class because it has to communicate between the threads.

sleep():
               It is a static method on Thread class. It makes the current thread into the
               "Not Runnable" state for specified amount of time. During this time, the thread
               keeps the lock (monitors) it has acquired.
              
wait():
               It is a method on Object class. It makes the current thread into the "Not Runnable"
               state. Wait is called on a object, not a thread. Before calling wait() method, the
               object should be synchronized, means the object should be inside synchronized block.
               The call to wait() releases the acquired lock.


Q: Why thread communication methods wait(), notify() and notifyAll() are in Object class?
 In Java every Object has a monitor and wait, notify methods are used to wait for the Object monitor or to notify other threads that Object monitor is free now. There is no monitor on threads in java and synchronization can be used with any Object, that’s why it’s part of Object class so that every class in java has these essential methods for inter thread communication.

Q: Why Thread sleep() and yield() methods are static?
Thread sleep() and yield() methods work on the currently executing thread. So there is no point in invoking these methods on some other threads that are in wait state. That’s why these methods are made static so that when this method is called statically, it works on the current executing thread and avoid confusion to the programmers who might think that they can invoke these methods on some non-running threads.

Q:How can we achieve thread safety in Java?
There are several ways to achieve thread safety in java – synchronization, atomic concurrent classes, implementing concurrent Lock interface, using volatile keyword, using immutable classes and Thread safe classes. Learn more at thread safety tutorial.

Q:What is Java Thread Dump, How can we get Java Thread dump of a Program?
Thread dump is list of all the threads active in the JVM, thread dumps are very helpful in analyzing bottlenecks in the application and analyzing deadlock situations. There are many ways using which we can generate Thread dump – Using Profiler, Kill -3 command, jstack tool etc.
Read this post to know more about generating thread dump in java.

Q:What is BlockingQueue? How can we implement Producer-Consumer problem using Blocking Queue?
java.util.concurrent.BlockingQueue is a Queue that supports operations that wait for the queue to become non-empty when retrieving and removing an element, and wait for space to become available in the queue when adding an element.
BlockingQueue doesn’t accept null values and throw NullPointerException if you try to store null value in the queue.
BlockingQueue implementations are thread-safe. All queuing methods are atomic in nature and use internal locks or other forms of concurrency control.

ConcurrentHashMap is the class that is similar to HashMap but works fine when you try to modify your map at runtime.


Callable is similar to Runnable interface but it can return any Object and able to throw Exception.
Using Future we can find out the status of the Callable task and get the returned Object. It provides get() method that can wait for the Callable to finish and then return the result.
Check this post for Callable Future Example.

Disadvantage of Synchronization:
·         can cause deadlocks when two threads are waiting on each other to do something
·         synchronized code has the overhead of acquiring lock, which can adversely affect the performance.

volatile variable is one whose value is always written to and read from "main memory". That means that different threads can access the variable.
As of Java 5, accessing a volatile variable creates a memory barrier: it effectively synchronizes all cached copies of variables with main memory, just as entering or exiting a synchronized block that synchronizes on a given object.

Thread Scheduler is the Operating System service that allocates the CPU time to the available runnable threads.
Time Slicing is the process to divide the available CPU time to the available runnable threads.
Context Switching is the process of storing and restoring of CPU state so that Thread execution can be resumed from the same point at a later point of time. 



Thread safety is the process to make our program safe to use in multithreaded environment, there are different ways through which we can make our program thread safe.

  • Synchronization is the easiest and most widely used tool for thread safety in java.
  • Use of Atomic Wrapper classes from java.util.concurrent.atomic package. For example AtomicInteger
  • Use of locks from java.util.concurrent.locks package.
  • Using thread safe collection classes, check this post for usage of ConcurrentHashMap for thread safety.
  • Using volatile keyword with variables to make every thread read the data from memory, not read from thread cache.

JVM parameters in Java   - JVM option it can be divided into two parts:

1)    JVM Options that begin with -X are non-standard (thy are not guaranteed to be supported on all JVM implementations), and are subject to change without notice in subsequent releases of the JDK.
2)    JVM Options or parameters which are specified with -XX are not stable and are not recommended for casual use. These options are subject to change without notice also.

Frequently used JVM parameters for heap, GC and debugging

Stack size -> -Xss
Heap size -> -Xms[starting size] –Xmx[maximum size]

Q. What is a ThreadLocal class?  
A. ThreadLocal is a handy class for simplifying development of 
thread-safe concurrent programs by making the object stored in this class not sharable between threads.
ThreadLocal class encapsulates non-thread-safe classes to be safely used in a multi-threaded environment and also allows you to create per-thread-singleton. 

References:


LOW LATENCY

Low Latency
Low latency allows human-unnoticeable delays between an input being processed and the corresponding output providing real time characteristics. 
Low latency is a function of many things, the two most important ones being:
·         network latency - i.e. the time taken on the network to transmit/receive messages.
·         processing latency - i.e. the time taken by your application to act on a message/event.

Classloading is a sequential process that involves IO to disk. Make sure all the classes for your main transaction flows are loaded upfront and that they never get evicted from the perm generation.
Model your business domain and ensure all your algorithms are O(1) or at least O(log n). This is probably the biggest cause of performance issues in my experience. Make sure you have performance tests to cover the main cases.
Cache misses are your biggest cost to performance. Use algorithms that are cache friendly and set affinity to processor cores either with taskset or numactl for a JVM or JNI for individual threads.
Understand Object Allocation and Garbage Collection. This is a massive topic, but basically you want to avoid GC pauses (often caused by the Stop the World nature of various GC collections). Specialist GC collectors like the Azul collector can in many cases solve this problem for you out of the box, but for most people they need to understand how to tune the Sun/Oracle GCs (CMS, G1, etc).
Use in-memory computing. Memory is cheap, you can have tera byte of data in memory.
Generally you want lock free algorithms and I/O. Even the most well designed concurrent application (that uses locks) is at risk of blocking, blocking in low latency is generally bad :-).
Be aware of the impact of logging. logging binary or using coded output that you can parse off line is probably a good idea.
Use mechanical sympathy - Refer lmax disruptor, excellent framework.
Reduce the number of methods to the stack chain. Try to implement the code only in a single method…
Here's my short-notes about Low Latency which I have got exposure to.
I used to consider some of the below factors while building the Low Latency application:
  • Clear understanding of Java Memory Management will help to tune Garbage Collection.
  • Always ensure most of the time there's a 'Cache Hit' rather than a 'Cache Miss' for better performance.
  • Applied flyweight design pattern to create a pool of shared objects, which helps minimizes the number of objects created.
  • Optimize I/O operations
  • Use threads and ensure to have multiple core processor’s as well
  • Use StringBuffer/StringBuilder classes instead of immutable String objects
  • Use ArrayLists, HashMap as opposed to Vector, Hashtable where possible; as the methods in ArrayList, HashMap are not synchronized









JAVA COLLECTIONS

I have captured the important notes on Java Collectionsthrough various reference across the web which may be useful to others to have quick refresh.

The key interfaces used by the collections framework are List, Set and Map.
The List and Set extends the Collection interface.
  • Map is an interface for a key-value map
  • HashMap is a Map that uses a hash table for its implementation
  • Hashtable is a synchronized version of HashMap
  • TreeMap uses a tree to implement a map
  • ConcurrentHashMap allows for multiple threads to access it at the same time safely
  • LinkedHashMap preserves the iteration order that things were inserted in (others don’t provide a fixed iteration order)

Vector/Hashtable
ArrayList /HashMap
It’s synchronized by default.
It’s not synchronized
 Hashtable doesn’t allow nulls.
HashMap allows null values as key and value.
Enumerations returned by Vector/Hashtable isn't fail-fast.
Iterator in the ArrayList/HashMap is fail-fast
Hashtable can’t retain the order.
HashMap can retain the order of objects in the Collection with key-value pair by swaping to LinkedHashMap.
temporarily synchronize collections as needed:
Map myMap = Collections.synchronizedMap (myMap);
List myList = Collections.synchronizedList (myList);

Array
List/Stack
It’s faster, if you know the size upfront
Can grow/Shrink from their initial size as needed.

These are more abstract than arrays and access is restricted.
For example, a stack allows access to only last item inserted.

Initial Capacity default is 10
It increase by  factor of 1.5 

Inserting n elements at the back of an ArrayList is guaranteed to take total O(n) time. 


Array
ArrayList
Array is a fixed length data structure
ArrayList is a variable length Collection class
You can not change length of Array once created in Java
ArrayList re-size itself when gets full depending upon capacity and load factor
Array can contain both primitives and Objects in Java
you can not store primitives in ArrayList, it can only contain Objects

Though automatic resize of ArrayList may slow down insertion a bit 


Set (HashSet , TreeSet)

List (ArrayList, LinkedList, Vector)
A Set is a collection with unique elements and prevents
duplication within the collection.
Elements are unordered. Need to implement SortedSet.
A List is a collection with an ordered sequence of elements
and may contain duplicates.
HashSet and TreeSet are implementations of a Set interface.
ArrayList, LinkedList and Vector are implementations of a List interface. (i.e. an index based)
A TreeSet implements the SortedSet


ArrayList use RandomAccess interface , its access elements randomly And Linked list use Queue interface , it adding and deleting elements from both side so insertion and deletion elements are fast

ArrayList
LinkedList
It's implemented using resizable array
It's implemented using doublyLinkedList. 
Array provides O(1) performance for get(index) method but remove is costly in ArrayList as you need to rearrange all elements

LinkedList doesn't provide Random or index based access and you need to iterate over linked list to retrieve any element which is of order O(n).

Adding and Removing is very slow  as it has to rearrange all elements.
Adding and Remove are easy and fast in LinkedList. adding is O(1) operation in LinkedList.

Linked lists enable you to insert objects at the head or in the middle of the list without shuffling all subsequent elements along.
Fast for RandomAccess
Slow

TreeSet
LinkedHashSet
HashSet
TreeSet maintains sorting order.
For Insertion order
HashSet does not maintain any order
TreeSet is a SortedSet implementation which allows it to keep elements in the sorted order defined by either Comparable or Comparator interface.


TreeSet is bit slower because of sorting operation it needs to perform on each insertion.
LinkedHashSet is second on performance or almost similar to HashSet.
HashSet is fastest.
TreeSet provides guaranteed O(log(n)) time for common operations like add, remove and contains
 LinkedHashSet offer constant time performance e.g. O(1) for add, contains and remove given hash function uniformly distribute elements in bucket.
 HashSet offer constant time performance e.g. O(1) for add, contains and remove given hash function uniformly distribute elements in bucket.
TreeSet is backed up by NavigableMap in Java and by default it uses TreeMap
LinkedHashSet is implemented using HashSet and LinkedList
HashSet is backed by an HashMap instance
TreeSet doesn't allow null
LinkedHashSet allows null
HashSet allows null
TreeSet uses compareTo() method for maintaining ordering.
LinkedHashSet uses equals() method in Java for comparison
HashSet uses equals() method in Java for comparison

Map
A map can contain duplicate values, but the keys in a map must be distinct.
HashMap, WeakHashMap, TreeMap and Hashtable are implementations of a Map
LinkedHashMap extends HashMap and implements Map
A TreeMap is an ordered HashMap, which implements the SortedMap interface.

Property
HashMap
TreeMap
LinkedHashMap
Order
No guarantee will remain constant over time
Sorted according to the natural ordering
Insertion-order
Get/Put/remove/ containskey
O(1)
O(log(n))

Because the TreeMap stores the keys based on their natural ordering, the hashCode method is not used at all.

Each element put into the TreeMap rebalances the tree, so that searching, deletion, and
further insertion can always be performed as efficiently as possible: O(log n) .
O(1) +  expense of maintaining linkedlist.
Interfaces
Map
NavigableMap
SortedMap
Map
Map
Null
Values/Keys
Allowed
Only values
Allowed
Implementation
Buckets
Red-Black tree
Double-linked buckets

HashMap is implemented as a hash table, and there is no ordering on keys or values.
TreeMap is implemented based on red-black tree structure, and it is ordered by the key.
LinkedHashMap preserves the insertion order


  
Iterator
ListIterator
Enables you to traverse through a collection in the forward direction only, for obtaining or removing elements.
Allows bidirectional traversal of list and also allows the modification of elements.

Collection ordering
SortedSet and SortedMap interfaces maintain sorted order.
The classes, which implement the Comparable interface, impose natural order.
By implementing Comparable, sorting an array of objects or a collection (List etc) is as simple as:
Arrays.sort(myArray);
Collections.sort(myCollection);


Comparable interface
Comparator interface
The “Comparable” allows itself to compare with another
similar object (i.e. A class that implements Comparable
becomes an object to be compared with).
The Comparator is used to compare two different objects.
The method compareTo() is specified in the interface.
int compare(Object o1, Object o2)
Not possible to sort by multiple fields.
Can have more control by having own Comparator class.  Here we can sort by multiple fields [first by id and then by name…]




 References:
Hashing