Final Keyword and JVM Memory Impact
For most developers, the final keyword is a familiar concept when writing a Java class. In general, it forbids overriding methods and extending classes, and it prevents a variable from being changed once it has been initialized. However, this is only one part (and I think the smaller part) of what final means. The other part is its connection to JVM memory, and that's actually what this post is about!
I'd like to discuss this topic further by providing some examples, but if you really want to dive deeper, I recommend Chapter 17 of the Java Language Specification. Let's dive right in!
Why Should I Bother Adding the Final Keyword?
I know that I'm not going to change the initialized value in my very small class, so why should I care about this keyword? Isn't it just bloating my concise Java source file?
NO! The final keyword gives you other cool stuff:
- Guarantees visibility in a multi-threaded application
- Safe initialization for objects, arrays, and collections
And please, don't argue that you don't use threads in your application — it is very likely that you use a framework that actually does, just under the hood.
How I Actually Reason With Object Initialization
Let's step back a bit from the final keyword and demonstrate what object initialization might look like. It helps us understand where racing between threads may appear.
The picture above shows a small Java class and a description of how the initialization of that class could be done in reality. There is no concept of classes in native machine code, so the program has to be expressed as a sequence of instructions, one after another.
In this case, we can see that the object itself was created in the first line, and then, we populated the fields and published the reference to make it available and accessible. Everything looks fine, but there are several important considerations:
- To make your program faster, the JVM and the CPU can reorder instructions, for example by delaying a store to some variable for as long as possible. A very simple explanation would be: the JVM and the CPU can do anything with your code as long as PROGRAM ORDER is respected. This means that the JVM can move store and read instructions around, but the behavior of your program must remain unchanged (we can go deeper into that in a different blog post).
So far, so good, but why am I supposed to be scared when the behavior of my program cannot be changed?
- PROGRAM ORDER is followed only within a single thread. This means that if you run your application in a multi-threaded environment, you have to tell the JVM where its optimizations could otherwise introduce a race condition.
- In a nutshell, your second thread can see a partially initialized object. For example, the step "publish reference" (the last step from the picture) can be reordered with the field assignment "temp.y", and then the second thread will see a reference to an object whose field still holds its default value (null for a reference, 0 for a primitive such as int) instead of the value "2".
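The steps described above can be sketched in code roughly as follows. This is only an illustration: the class names Point and UnsafePublication and the field name shared are hypothetical, and the field y is assumed to be an int, so its default value is 0.

```java
// Hypothetical class with two plain (non-final) fields.
class Point {
    int x;
    int y;

    Point() {
        x = 1;
        y = 2;
    }
}

class UnsafePublication {
    // Plain static field: no volatile, no lock, so access to it is a data race.
    static Point shared;

    // Writer thread. Conceptually this single line is executed as:
    //   temp = <allocate Point>   (fields hold their defaults: 0)
    //   temp.x = 1
    //   temp.y = 2
    //   shared = temp             (publish reference)
    // Without synchronization, another thread may observe the last store
    // before it observes the field stores.
    static void publish() {
        shared = new Point();
    }

    // Reader thread. When racing with publish(), it may legally observe
    // no object at all, y == 0 (partially initialized) or y == 2.
    // Returns -1 when nothing has been published yet.
    static int readY() {
        Point p = shared;          // read the reference exactly once
        return (p == null) ? -1 : p.y;
    }
}
```

Within a single thread, publish() followed by readY() always yields 2; the surprising outcomes are only possible when the two methods run on different threads.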
Let's Introduce the Term: Safe Publication
The question is: How can we publish the shared object (a global variable) and be sure that all threads in our program will see the properly initialized object?
- Initialize an object reference from a static initializer
  - This is the typical singleton holder pattern
  - This is safe because the JVM performs class initialization on a single thread, under an internal lock, and ensures that the initialization happens before everything that uses that class (no other synchronization needed)
- Use volatile or java.util.concurrent.atomic.AtomicReference
- Use locks to guard the global variable
- Or ... use a final field and never leak this from the constructor of the given class!
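As a minimal sketch, the options listed above might look like this in code. All class and field names (Config, ConfigHolder, VolatilePublisher, LockedPublisher, ImmutableConfig) are hypothetical and exist only for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical mutable class that we want to share between threads.
class Config {
    int timeout;                           // deliberately non-final
    Config(int timeout) { this.timeout = timeout; }
}

// 1) Static initializer (holder pattern): Holder is initialized lazily,
//    and class initialization is synchronized by the JVM.
class ConfigHolder {
    private static class Holder {
        static final Config INSTANCE = new Config(30);
    }
    static Config get() { return Holder.INSTANCE; }
}

// 2) volatile field or AtomicReference: the write happens-before
//    every subsequent read of the same variable.
class VolatilePublisher {
    private static volatile Config config;
    private static final AtomicReference<Config> REF = new AtomicReference<>();

    static void publish(Config c) { config = c; REF.set(c); }
    static Config viaVolatile() { return config; }
    static Config viaAtomic() { return REF.get(); }
}

// 3) Lock: both the write and the read are guarded by the same monitor.
class LockedPublisher {
    private static final Object LOCK = new Object();
    private static Config config;

    static void publish(Config c) { synchronized (LOCK) { config = c; } }
    static Config get() { synchronized (LOCK) { return config; } }
}

// 4) Final field, with no leak of 'this' from the constructor.
class ImmutableConfig {
    private final int timeout;
    ImmutableConfig(int timeout) { this.timeout = timeout; }
    int timeout() { return timeout; }
}
```

Only the fourth variant stays safe when the reference to the object is itself handed over through a data race; the first three work by making the hand-over itself synchronized.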
How Specification Ensures Safe Publication Using a Final Field
We already mentioned that the JVM or the CPU is able to reorder almost anything to make your program faster. This means that we need some mechanism to tell the entire environment: please, don't optimize this part.
- JVM adds a synthetic freeze action (yes, this is an official name) before a constructor completes
- Freeze action is added at runtime, and there is no bytecode instruction for this
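- Freeze ensures that the second thread sees either null or a reference to an object whose final fields are fully initialized
- At the CPU level, HotSpot implements the freeze as a StoreStore barrier at the end of the constructor. On x86, whose memory model already keeps stores in order, this needs no fence instruction and only restricts compiler reordering; weaker architectures such as ARM need a real barrier instruction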
- Other operations after the freeze are racing between two threads and must be properly synchronized
The JVM ensures that the second thread sees the state of the final field corresponding to the freeze action — no more, no less.
First, let's agree on how I describe the steps of the execution:
- Variable temp is the given object that we want to publish.
- Using nfv = temp, we publish a reference to our object.
- <freeze value> means the point of the execution where the JVM inserts the freeze action.
In the picture above, we can see a very simple example of what can go wrong. We've already mentioned exactly this case, which fails because the line temp.value = 1 is reordered after the object publication nfv = temp.
In contrast, the example on the right side shows our object fully and properly exposed to all threads. The freeze ensures that the field write cannot be reordered past it, which means that the publication of our object can be executed only once our field has been pushed to memory where it can be shared with other threads (main memory, or shared via cache coherence).
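In source code, the two cases might look roughly like this. It is a sketch rather than the original listing: the field names nfv and value come from the text above, while NonFinalValue, FinalValue, ValuePublisher and fv are assumed names.

```java
// Left-hand case: plain field, so no freeze action is involved.
class NonFinalValue {
    int value;

    NonFinalValue() {
        value = 1;          // temp.value = 1 (may be reordered after publication)
    }
}

// Right-hand case: final field, frozen when the constructor exits.
class FinalValue {
    final int value;

    FinalValue() {
        value = 1;          // temp.value = 1
    }                       // <freeze value>
}

class ValuePublisher {
    static NonFinalValue nfv;   // published through a data race
    static FinalValue fv;       // also racy, but the final field is protected

    // Writer thread.
    static void writer() {
        nfv = new NonFinalValue();   // nfv = temp
        fv = new FinalValue();       // fv = temp, ordered after the freeze
    }

    // Reader thread: when racing with writer(), may see -1 (no object yet),
    // 0 (default value) or 1.
    static int readNonFinal() {
        NonFinalValue local = nfv;
        return (local == null) ? -1 : local.value;
    }

    // Reader thread: when racing with writer(), may see -1 or 1, never 0.
    static int readFinal() {
        FinalValue local = fv;
        return (local == null) ? -1 : local.value;
    }
}
```

Note that both readers copy the shared reference into a local variable first, so the null check and the field read refer to the same object.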
These examples show that a final field can help us correctly expose even collections and arrays:
- Collections and arrays are properly exposed, together with all the values inside.
- The values inside the collection are properly exposed even if they don't have final fields.
- BUT if we add a new value after the construction of our object, that item is not guaranteed to be exposed properly, and Thread 2 can see the collection in either state.
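A small sketch of these three points, assuming a hypothetical class FinalCollection that holds an ArrayList in a final field:

```java
import java.util.ArrayList;
import java.util.List;

class FinalCollection {
    // Final field that refers to a mutable, non-thread-safe list.
    private final List<String> names;

    FinalCollection() {
        List<String> temp = new ArrayList<>();
        temp.add("a");
        temp.add("b");
        // Everything written before the constructor exits is covered by the
        // freeze: the list itself and the two elements stored in it.
        names = temp;
    }

    // Called after construction: this write is NOT covered by the freeze.
    // A thread that obtained the object through a data race may or may not
    // see the new element, so this needs proper synchronization.
    void addLater(String name) {
        names.add(name);
    }

    int size() {
        return names.size();
    }
}
```

The guarantee covers what was reachable through the final field at the moment the constructor finished; it does not turn ArrayList into a thread-safe collection.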
The Composition class shows a simple fact: even if an object with non-final fields is exposed using a final field, the inner object itself becomes properly exposed.
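A class of that shape could look like the sketch below. Only the name Composition comes from the text; the Inner class and its fields are assumptions made for illustration.

```java
// Hypothetical inner object with non-final fields only.
class Inner {
    int number;
    String text;

    Inner(int number, String text) {
        this.number = number;
        this.text = text;
    }
}

class Composition {
    // The final field is what provides the guarantee.
    private final Inner inner;

    Composition() {
        // Inner is fully built before Composition's constructor exits,
        // so its fields are covered by the freeze of 'inner'.
        inner = new Inner(42, "text");
    }

    // Readers that go through the final field see number == 42 and
    // text == "text", even if the Composition reference was obtained
    // through a data race.
    int number() { return inner.number; }
    String text() { return inner.text; }
}
```

The guarantee applies to the state of Inner as it was when the Composition constructor finished; later writes to Inner's fields need their own synchronization.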
This example is very tricky. Don't count on the behavior of an implementation; always code against the specification!
OpenJDK HotSpot behaves in such a way that even if only one field is marked as final, the freeze action is added "at the end of our constructor," and the barrier emitted there makes the writes to all other fields visible to other threads as well. It works; you can check out other examples on my GitHub (link below) with jcstress and give it a shot.
ConcurrentHashMap ensures that all values added to our Map are visible immediately to threads (internally synchronized).
However, it does not mean that the field holding the map, which is not marked as final, is visible to other threads after ConcurrentUtil construction. In this case, we can see the object with a null value instead of the initialized ConcurrentHashMap.
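The difference can be sketched as follows. The name ConcurrentUtil comes from the text; its body, and the SaferConcurrentUtil variant next to it, are assumptions for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentUtil {
    // Non-final field: the map is thread-safe, but the reference to it is
    // an ordinary field with no final-field guarantee.
    private Map<String, Integer> map;

    ConcurrentUtil() {
        map = new ConcurrentHashMap<>();
        map.put("one", 1);
    }

    // If the ConcurrentUtil reference was obtained through a data race,
    // this read of 'map' may still observe null.
    Integer get(String key) {
        Map<String, Integer> local = map;
        return (local == null) ? null : local.get(key);
    }
}

class SaferConcurrentUtil {
    // Final field: readers always see the initialized map.
    private final Map<String, Integer> map = new ConcurrentHashMap<>();

    SaferConcurrentUtil() {
        map.put("one", 1);
    }

    Integer get(String key) {
        return map.get(key);
    }
}
```

With the final field in place, entries added later are still handled by ConcurrentHashMap's own synchronization, so the two mechanisms complement each other.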
What If I Cannot Use Final Fields...
We sometimes reach the situation where we are not able to modify a class and change all of its fields to final. So, is there any other way to expose the class properly with all the non-final fields?
We can use approaches that ensure we get safe publication for free:
- Static Initializer (Singleton Holder pattern)
- Thread-confinement
  - Run a block of code (task, job, HTTP request, ...) only on one dedicated thread
  - Frameworks with a mature and well-documented thread model, e.g. Event Loops
- Stack-confinement
  - An object is reachable only through local variables
- ThreadLocal
  - Maintains a per-thread value
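The confinement approaches from the list above can be sketched like this. Counter and Confinement are hypothetical names, and the single-thread executor merely stands in for whatever dedicated thread or event loop an application actually uses.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical mutable class without any final fields.
class Counter {
    int count;
    void increment() { count++; }
}

class Confinement {
    // Thread confinement: CONFINED is touched only by tasks on WORKER.
    // The worker is a daemon thread so it never blocks JVM shutdown.
    private static final ExecutorService WORKER =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "confined-worker");
                t.setDaemon(true);
                return t;
            });
    private static final Counter CONFINED = new Counter();

    static int incrementOnWorker() {
        try {
            return WORKER.submit(() -> {
                CONFINED.increment();
                return CONFINED.count;
            }).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stack confinement: 'local' never escapes this method.
    static int countLocally(int n) {
        Counter local = new Counter();
        for (int i = 0; i < n; i++) {
            local.increment();
        }
        return local.count;
    }

    // ThreadLocal: every thread gets its own Counter instance.
    private static final ThreadLocal<Counter> PER_THREAD =
            ThreadLocal.withInitial(Counter::new);

    static int incrementPerThread() {
        Counter c = PER_THREAD.get();
        c.increment();
        return c.count;
    }
}
```

In all three cases the mutable object is never read by a second thread, so the question of safe publication does not arise for it.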
Or, we can use some synchronization tools:
- Synchronization block (implicit lock)
- Explicit locks, volatile
The java.util.concurrent package does not make a non-final field itself properly visible, but its classes make sure that the objects passed through them are properly exposed and visible to other threads. The classes below pass an object from one thread to another via some internal memory barrier (e.g. Unsafe, or a volatile field). That means that an object which is created and written to any structure below by Thread A, and then read by Thread B, is fully constructed even if the object contains non-final fields.
- CAS Operations
  - Atomic*, *Adder, Atomic*FieldUpdater
- java.util.concurrent package and the synchronized wrappers from java.util.Collections
  - SynchronizedMap, ConcurrentHashMap
  - CopyOnWriteArray(List|Set)
  - Synchronized(List|Set)
  - BlockingQueue, ConcurrentLinkedQueue
- VarHandles? Unsafe?
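As an illustration of such a hand-over, the sketch below passes an object with non-final fields from one thread to another through a BlockingQueue. Message and Handoff are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical message with non-final fields only.
class Message {
    String text;
    int priority;
}

class Handoff {
    // Thread A builds the message and puts it into the queue; the calling
    // thread (Thread B) takes it out. Actions before put() happen-before
    // actions after the matching take(), so B sees both fields fully set.
    static String transfer() {
        BlockingQueue<Message> queue = new ArrayBlockingQueue<>(1);

        Thread producer = new Thread(() -> {
            Message m = new Message();
            m.text = "hello";
            m.priority = 5;
            try {
                queue.put(m);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.setDaemon(true);
        producer.start();

        try {
            Message received = queue.take();   // blocks until A has put the message
            return received.text + ":" + received.priority;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Calling Handoff.transfer() returns "hello:5". The guarantee covers writes made before put(); if Thread A kept modifying the message afterwards, those later writes would again be racy.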
Thank you for reading my article and please leave comments below. If you would like to be notified about new posts, then start following me on Twitter: @p_bouda.