All you need to know about JVM
What is the JVM? What does it do? Why do we need it? How does it work? When does it come into the picture?
Let’s clear up the myth behind the JVM. When you compile a .java file with ‘javac filename.java’ on the command line, a .class file gets created. Up to this point, Java is platform independent. Now, when we run the Java code with ‘java filename’ on the command line, the JVM comes into the picture by creating its own instance, which is platform dependent. The JVM does the following 4 things —
- Loading the .class file
- Verifying the code
- Executing the byte code
- Providing the run-time environment.
But in between these steps it performs many tasks, which I am going to explain now. Here is its basic architecture, to give you an idea of the JVM —
Notice the component named ‘Class Loader’, which loads your .class file and all the other .class files from the JRE that your program needs. We will discuss it in detail below. Now let’s look at a more detailed picture —
So now, let’s talk about everything one by one — Class Loader Subsystem -
The class loader works in three phases: loading, linking, and initialization.
So how does the class loader work? Loading follows these steps —
- The class loader follows the delegation hierarchy principle.
- Whenever the JVM comes across a particular class, it first checks whether the corresponding class is already loaded or not.
- If it is already loaded in the method area, the JVM uses that loaded class.
- If it is not already loaded, the JVM requests the class loader subsystem to load that particular class, and the class loader subsystem hands the request over to the Application Class Loader.
- The Application Class Loader delegates the request to the Extension Class Loader, and the Extension Class Loader in turn delegates it to the Bootstrap Class Loader.
- The Bootstrap Class Loader searches the bootstrap class path (jdk/jre/lib). If the specified class is available, it loads it; otherwise the Bootstrap Class Loader delegates back to the Extension Class Loader.
- The Extension Class Loader searches the extension class path (jdk/jre/lib/ext). If the specified class is available, it loads it; otherwise it delegates the request back to the Application Class Loader.
- The Application Class Loader searches the application class path. If the specified class is available, it loads it; otherwise the JVM throws a runtime exception: ClassNotFoundException.
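You can observe this hierarchy from a running program. A minimal sketch (the class name `LoaderDemo` is just for illustration): core classes such as `String` report a `null` loader because the bootstrap loader is not represented as a Java object, while your own classes come from the application loader, and `getParent()` walks the delegation chain upward.

```java
public class LoaderDemo {
    public static void main(String[] args) {
        // Core classes are loaded by the bootstrap loader,
        // which appears as null from Java code.
        System.out.println(String.class.getClassLoader());

        // Our own class is loaded from the application class path.
        ClassLoader app = LoaderDemo.class.getClassLoader();
        System.out.println(app);

        // Walking getParent() follows the delegation chain upward.
        for (ClassLoader cl = app; cl != null; cl = cl.getParent()) {
            System.out.println(cl.getClass().getName());
        }
    }
}
```

Note that the exact parent loaders you see depend on the Java version: on Java 8 the parent of the application loader is the extension loader, while on Java 9 and later it is the platform loader.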
The second phase, linking, is at the heart of class loading and contains three stages.
a) Bytecode verification — The class loader performs a number of checks on the bytecode of the class to ensure that it is well formed and well behaved.
b) Class preparation — This stage prepares the data structures that represent the fields, methods, and implemented interfaces defined within each class, and sets static fields to their default values.
c) Resolving — In this stage, the class loader loads all the other classes referenced by a particular class. Classes can be referenced in a number of ways: superclasses, interfaces, fields, method signatures, local variables used in methods, etc.
The third phase is initialization. During this phase, any static initializers contained within a class are executed. At the end of this phase, static fields hold their declared initial values, not just the defaults set during preparation. (That’s it, really.)
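A quick sketch of that ordering (the class name `InitDemo` is just for illustration): preparation gives the static field its default value 0, and only when the class is initialized does the static block run and assign the declared value.

```java
public class InitDemo {
    static int counter;

    static {
        // Runs once, during class initialization (first active use),
        // after preparation has already set 'counter' to its default, 0.
        counter = 42;
    }

    public static void main(String[] args) {
        System.out.println(counter); // 42, not the default 0
    }
}
```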
Now Let’s come to run time data areas —
The basic rule of storing a Java program in memory is —
- Objects and instance variables are stored in the heap memory area.
- Local variables live on the stack.
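A minimal sketch of that rule (the class name `StorageDemo` is just for illustration): the object and its instance field live on the heap, while the primitive local and the reference variable live in the current frame on the stack.

```java
public class StorageDemo {
    int instanceField = 10;  // stored inside the object, on the heap

    public static void main(String[] args) {
        int local = 5;  // primitive local: lives in the current frame, on the stack

        // 'ref' (the reference) is on the stack;
        // the object it points to is on the heap.
        StorageDemo ref = new StorageDemo();

        System.out.println(local + ref.instanceField); // 15
    }
}
```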
Exploring these further, the run-time data areas in the Java virtual machine are —
- Program counter register ( PC Register )
- Java virtual machine stacks ( JVM Stack )
- Heap
- Method area
- Run-time constant pool
- Native method stack
Some of these are created on Java virtual machine start-up, while others are per-thread: created when a thread is created and destroyed when the thread completes. Let’s discuss them one by one.
Program counter register :
We know that the JVM can support many threads of execution at once. Each JVM thread has its own PC register. At any point in time, while executing the code of a single method (referred to as the current method), if that method is not a native method, the PC register holds the address of the JVM instruction currently being executed. For a native method, the value of the PC register is undefined.
Java virtual machine stacks :
Each JVM thread has a private JVM stack, created at the same time as the thread. The JVM stack holds data as frames: when a new method is invoked, a frame is pushed onto the JVM stack, and it is popped when the method’s execution completes. No data other than frames can be pushed onto or popped from the JVM stack.
- Frames : Frames are used to store partial results and return values from methods, perform dynamic linking, and dispatch exceptions. The lifetime of a frame equals the method’s execution time: a frame is pushed onto the JVM stack when a method is invoked and destroyed when the method’s execution completes. Only one frame per thread can be in the active state at a given time: the frame for the method currently being executed. Each frame has its own array of local variables, operand stack, and reference to the run-time constant pool.
- Array of local variables : All local variables of a method are stored in this array. A single local variable slot can hold a value of type byte, char, short, int, float, reference, returnAddress, or boolean, while a pair of slots holds a long or double. Local variables are also used to pass parameters when a method is invoked: for an instance method, local variable 0 always holds the reference to the object on which the method was invoked (this), and the remaining parameters are stored from slot 1 onwards.
- Operand stack : The operand stack holds the operands used by instructions to perform operations. Each entry on the operand stack can hold a value of any Java virtual machine type. JVM instructions take operands from the operand stack, operate on them, and push the result back onto the operand stack. The operand stack is also used to prepare parameters to be passed to methods and to receive method results. For example, the iadd instruction adds two integer values: it pops the top two integer values from the operand stack, adds them, and pushes the result back onto the operand stack.
- Reference to the run-time constant pool (dynamic linking) : In the class file structure, the methods to be invoked and the variables to be accessed are stored as symbolic references. Dynamic linking translates these symbolic references into concrete method references, loading classes as required. The reference to the run-time constant pool is stored in the frame so that this dynamic linking can be performed at run time.
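You can see the local-variable slots and operand stack in action by disassembling a simple method with `javap -c`. The class below is just an illustrative sketch; the bytecode comments show roughly what the compiler emits for `a + b` (for a static method, slot 0 holds the first parameter instead of `this`).

```java
public class AddDemo {
    static int add(int a, int b) {
        // The compiled bytecode looks roughly like this:
        //   iload_0   // push local variable slot 0 (a) onto the operand stack
        //   iload_1   // push local variable slot 1 (b)
        //   iadd      // pop both values, push their sum
        //   ireturn   // pop the sum and return it to the caller
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // 5
    }
}
```

Running `javap -c AddDemo` after compiling lets you compare the real instruction stream with the sketch above.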
Heap :
The heap is a run-time data area in memory where objects, instance variables, and arrays are stored. The heap may be of fixed size or may be expanded as required by the computation, and its memory does not need to be contiguous. The heap is created at virtual machine start-up, and objects that are no longer referenced are deallocated by the automatic garbage collector; objects are never explicitly deallocated. We will have more to say about the garbage collector in a future update.
Method area :
The method area is created on virtual machine start-up, is shared among all Java virtual machine threads, and is logically part of the heap area. It stores per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors. It can be of fixed size or can be expanded as the computation requires.
Run-time constant pool :
The run-time constant pool is a per-class/per-interface structure. It is the run-time representation of the constant_pool table generated at compile time and stored in the class file. The constant pool contains several kinds of literal values and symbolic references that are resolved at run time. The run-time constant pool for a class or interface is allocated from the method area and constructed when the class or interface is created by the JVM.
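One visible effect of the constant pool is string literal pooling: two identical literals resolve to the same pooled instance, while `new String(...)` explicitly allocates a fresh heap object. A small sketch (the class name `PoolDemo` is just for illustration):

```java
public class PoolDemo {
    static boolean sameLiteral() {
        String a = "jvm";  // resolved from the run-time constant pool
        String b = "jvm";  // same pool entry, so same reference
        return a == b;     // true
    }

    static boolean sameAsNew() {
        String a = "jvm";
        String c = new String("jvm"); // explicit heap allocation, not pooled
        return a == c;                // false: different object, same characters
    }

    public static void main(String[] args) {
        System.out.println(sameLiteral()); // true
        System.out.println(sameAsNew());   // false
    }
}
```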
Native Method Stack:
A native method stack stores similar data elements to a JVM stack, and it is used to support the execution of native (non-Java) methods. To use a native method stack, we need to integrate some native program code into the Java application. When a thread invokes a native method, it enters a new world in which the structures and security restrictions of the Java virtual machine no longer hamper its freedom. A native method can likely access the run-time data areas of the virtual machine (depending on the native method interface), but it can also do anything else it wants: use registers inside the native processor, allocate memory on any number of native heaps, or use any kind of stack.
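On the Java side, a native method is just a declaration; the body lives in a native library loaded via `System.loadLibrary`. A sketch (the method `fastSum` is hypothetical, invented here for illustration, and no library is provided): calling it without the matching library raises `UnsatisfiedLinkError`, which shows that linking to native code happens at run time.

```java
public class NativeDemo {
    // Hypothetical native method: its body would be implemented in C/C++
    // and linked through JNI. When invoked, the thread switches to the
    // native method stack.
    public static native long fastSum(long a, long b);

    public static void main(String[] args) {
        try {
            fastSum(1, 2);
        } catch (UnsatisfiedLinkError e) {
            // No native library was loaded, so the JVM cannot link the method.
            System.out.println("native library not loaded");
        }
    }
}
```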
Error conditions in the JVM run-time data areas : There are two errors associated with the JVM run-time data areas —
- OutOfMemoryError applies to all of the areas listed above except the PC register. It is thrown if an expansion of a memory area is attempted but not enough memory is available to satisfy it.
- StackOverflowError applies to the native method stack and the Java virtual machine stack. It is thrown if the computation in a thread requires a larger stack than is permitted.
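The StackOverflowError case is easy to provoke with unbounded recursion, since each call pushes a new frame onto the thread’s JVM stack. A sketch (the class name `ErrorDemo` is just for illustration; don’t do this in real code):

```java
public class ErrorDemo {
    static int depth(int n) {
        // Unbounded recursion: every call pushes another frame
        // until the thread's JVM stack is exhausted.
        return depth(n + 1);
    }

    public static void main(String[] args) {
        try {
            depth(0);
        } catch (StackOverflowError e) {
            System.out.println("stack overflow caught");
        }
        // Similarly, an allocation larger than the heap allows, e.g.
        //   new long[Integer.MAX_VALUE];
        // would raise OutOfMemoryError instead.
    }
}
```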
Now let’s talk about Execution Engine —
The abstract execution engine runs by executing bytecodes one instruction at a time. This process takes place for each thread (execution engine instance) of the application running in the Java virtual machine. An execution engine fetches an opcode and, if that opcode has operands, fetches the operands. It executes the action requested by the opcode and its operands, then fetches another opcode. Execution of bytecodes continues until a thread completes either by returning from its starting method or by not catching a thrown exception.
From time to time, the execution engine may encounter an instruction that requests a native method invocation. On such occasions, the execution engine will dutifully attempt to invoke that native method. When the native method returns (if it completes normally, not by throwing an exception), the execution engine will continue executing the next instruction in the bytecode stream. Part of the job of executing an instruction is determining the next instruction to execute. An execution engine determines the next opcode to fetch in one of three ways. For many instructions, the next opcode to execute directly follows the current opcode and its operands, if any, in the bytecode stream.
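The fetch-decode-execute loop described above can be sketched as a toy interpreter. This is not real JVM bytecode, just a made-up three-instruction set (`PUSH`, `ADD`, `HALT`) to show the shape of the loop: fetch an opcode, fetch its operands if any, execute, and advance the program counter.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class MiniEngine {
    static final int HALT = 0, PUSH = 1, ADD = 2;

    static int run(int[] code) {
        Deque<Integer> operands = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next opcode to fetch
        while (true) {
            int opcode = code[pc++];                 // fetch
            switch (opcode) {                        // decode + execute
                case PUSH:
                    operands.push(code[pc++]);       // operand follows the opcode
                    break;
                case ADD:
                    operands.push(operands.pop() + operands.pop());
                    break;
                case HALT:
                    return operands.pop();           // result is on top of the stack
            }
        }
    }

    public static void main(String[] args) {
        // PUSH 2, PUSH 3, ADD, HALT -> 5
        System.out.println(run(new int[]{PUSH, 2, PUSH, 3, ADD, HALT}));
    }
}
```

A real execution engine is of course far richer (method frames, exceptions, branches), but the core control flow is this same loop.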
Interpreter — It interprets the current bytecode instruction and executes it. Native methods are executed through the Native Method Interface (JNI), which calls into native libraries such as .so or .dll files.
JIT compiler (plays a very important role in Java program execution) —
If a piece of bytecode is being interpreted again and again, the JIT compiler compiles it to machine code and keeps that machine code ready, so the instruction does not have to be interpreted every time (a kind of cache, I think).
Profiler (hot-spot detector) — Keeps an eye on the running bytecode and analyzes it, feeding the frequently executed bytecode to the JIT compiler.
Garbage collector — Cleans up unused classes, objects, and memory areas. I will write more about it in the next update.
Various execution techniques that may be used by an implementation — interpreting, just-in-time compiling, adaptive optimization. One of the most interesting — and speedy — execution techniques is adaptive optimization. The adaptive optimization technique, which is used by several existing Java virtual machine implementations, including Sun’s Hotspot virtual machine, borrows from techniques used by earlier virtual machine implementations. The original JVMs interpreted bytecodes one at a time. Second-generation JVMs added a JIT compiler, which compiles each method to native code upon first execution, then executes the native code. Thereafter, whenever the method is called, the native code is executed. Adaptive optimizers, taking advantage of information available only at run-time, attempt to combine bytecode interpretation and compilation to native in the way that will yield optimum performance.
Native Method Interface
Java virtual machine implementations aren’t required to support any particular native method interface. Some implementations may support no native method interfaces at all. Others may support several, each geared towards a different purpose.
Sun’s Java Native Interface, or JNI, is geared towards portability. JNI is designed so it can be supported by any implementation of the Java virtual machine, no matter what garbage collection technique or object representation the implementation uses. This in turn enables developers to link the same (JNI compatible) native method binaries to any JNI-supporting virtual machine implementation on a particular host platform.
Implementation designers can choose to create proprietary native method interfaces in addition to, or instead of, JNI. To achieve its portability, the JNI uses a lot of indirection through pointers to pointers and pointers to functions. To obtain the ultimate in performance, designers of an implementation may decide to offer their own low-level native method interface that is tied closely to the structure of their particular implementation. Designers could also decide to offer a higher-level native method interface than JNI, such as one that brings Java objects into a component software model.
To do useful work, a native method must be able to interact to some degree with the internal state of the Java virtual machine instance. For example, a native method interface may allow native methods to do some or all of the following:
- Pass and return data
- Access instance variables or invoke methods in objects on the garbage-collected heap
- Access class variables or invoke class methods
- Access arrays
- Lock an object on the heap for exclusive use by the current thread
- Create new objects on the garbage-collected heap
- Load new classes
- Throw new exceptions
- Catch exceptions thrown by Java methods that the native method invoked
- Catch asynchronous exceptions thrown by the virtual machine
- Indicate to the garbage collector that it no longer needs to use a particular object
Designing a native method interface that offers these services can be complicated. The design needs to ensure that the garbage collector doesn’t free any objects that are being used by native methods. If an implementation’s garbage collector moves objects to keep heap fragmentation at a minimum, the native method interface design must make sure that either:
- an object can be moved after its reference has been passed to a native method, or
- any objects whose references have been passed to a native method are pinned until the native method returns or otherwise indicates it is done with the objects.
As you can see, native method interfaces are very intertwined with the inner workings of a Java virtual machine.
Native method libraries : These are collections of native libraries (written in other languages) required by the execution engine. If an implementation’s native method interface uses a C-linkage model, then the native method stacks are C stacks. When a C program invokes a C function, the stack operates in a certain way: the arguments to the function are pushed onto the stack in a certain order, and the return value is passed back to the invoking function in a certain way. This would be the behaviour of the native method stacks in that implementation.
Here is one last image, which sums up everything about the JVM —
That’s all guys. Thanks for reading.