Validation in Java Applications
Often, I have seen projects that didn't appear to have any conscious strategy for data validation. Their teams worked under the pressure of deadlines and unclear requirements, and just didn't have enough time to implement validation in a proper and consistent way. So, data validation code could be found everywhere: in JavaScript snippets, Java screen controllers, business logic beans, domain model entities, database constraints, and triggers. This code was full of if-else statements that threw different unchecked exceptions, which made it hard to find the place where a particular piece of data was validated. So, after a while, when the project had grown large enough, it became quite hard and expensive to keep these validations consistent and in line with requirements, which, as I've said, are often fuzzy.
Is there a way to validate data that is elegant, standard, and concise? A way that doesn't fall into unreadability, helps us keep most of the data validation logic together, and has most of the code already written for us by the developers of popular Java frameworks?
Yes, there is.
For us, developers of the CUBA Platform, it is very important to let our users follow the best practices. We believe that the validation code should be:
- Reusable and following the DRY principle;
- Expressed in a clear and natural way;
- Placed where developers expect to find it;
- Able to check data from different data sources: user input, SOAP or REST calls, etc.;
- Aware of concurrency;
- Called implicitly by the application, without the need to call the checks manually;
- Showing clear, localized messages to a user in concisely designed dialogs;
- Following standards.
In this article, I'll be using an application based on the CUBA Platform for all examples. However, since CUBA is based on Spring and EclipseLink, most examples will work for any other Java framework that supports JPA and the bean validation standard.
DB Constraints Validations
Perhaps the most common and straightforward way of validating data is to use DB-level constraints, such as a required flag ('not null' fields), string length, unique indexes, and so on. This way is very natural for enterprise applications, as this class of software is usually heavily data-centric. However, even here, developers make mistakes by defining constraints separately for each tier of an application. This problem is often caused by splitting responsibilities between developers.
Let's take an example that most of you have probably faced. If a spec says that the passport field should have 10 digits in its number, most likely it will be checked everywhere: by the DB architect in DDL, by the backend developer in the corresponding Entity and REST services, and finally, by the UI developer in client source code. Later on, this requirement changes and the size of the field grows to 15 digits. Tech Support changes the DB constraint, but for a user this changes nothing, since the client-side check still rejects the longer number.
Everybody knows the way to avoid this problem: validations must be centralized! In CUBA, the central point for this kind of validation is JPA annotations on entities. Based on this meta information, CUBA Studio generates the correct DDL scripts and applies corresponding validators on the client side.
If JPA annotations get changed, CUBA updates DDL scripts and generates migration scripts, so next time you deploy your project, new JPA-based limitations will be applied to your application's UI and DB.
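The idea of a single source of truth can be sketched without any framework. The snippet below is only an illustration, not CUBA's implementation: @MaxLength is a hypothetical stand-in for a JPA length attribute such as @Column(length = ...), and PassportHolder, ddlFor, and fits are names invented for the example. Both the DDL fragment and the client-side check read the same annotation, so changing the limit in one place updates both.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Locale;

// Hypothetical stand-in for a JPA length attribute, so the sketch runs on a plain JDK.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface MaxLength {
    int value();
}

// Hypothetical entity: the limit is declared once, on the field.
class PassportHolder {
    @MaxLength(15)
    String passportNumber;
}

class SingleSourceOfTruth {
    // Reads the declared limit via reflection (assumes the field carries @MaxLength).
    static int maxLength(Class<?> type, String fieldName) {
        try {
            return type.getDeclaredField(fieldName).getAnnotation(MaxLength.class).value();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No such field: " + fieldName, e);
        }
    }

    // Derives a DDL column definition from the annotation.
    static String ddlFor(Class<?> type, String fieldName) {
        return fieldName.toUpperCase(Locale.ROOT) + " VARCHAR(" + maxLength(type, fieldName) + ")";
    }

    // Derives the client-side length check from the same annotation.
    static boolean fits(Class<?> type, String fieldName, String value) {
        return value == null || value.length() <= maxLength(type, fieldName);
    }
}
```

With this arrangement, the 10-to-15-digit change from the passport example would be a one-line edit to the annotation value.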
JPA annotations are simple, and their effect reaches all the way down to the DB level, which makes them very robust. However, they are limited to the simplest cases that can be expressed in standard DDL without involving DB-specific triggers or stored procedures. So, JPA-based constraints can ensure that an entity field is unique or mandatory, or can define a maximum length for a varchar column. You can also define a unique constraint on a combination of columns with the @UniqueConstraint annotation. But this is pretty much it.
However, in the cases that require more complex validation logic, like checking for maximum and minimum values of a field, validating with an expression, or performing a custom check that is specific to your application, we need to utilize the well-known approach called Bean validation.
Bean Validation
As we all know, it is good practice to follow standards, which normally have a long lifecycle and are battle-proven on thousands of projects. Java Bean validation is an approach that is set in stone in JSR 380, 349, and 303 and their implementations: Hibernate Validator and Apache BVal.
Although this approach is familiar to many developers, its benefits are often underestimated. It is an easy way to add data validation even to legacy projects, and it allows you to express your validations in a clear, straightforward, and reliable way that is as close to your business logic as possible.
Using the Bean validation approach brings a lot of benefits to your project:
- Validation logic is concentrated near your domain model: value, method, and bean constraints are defined in a natural way that takes the OOP approach to the next level.
- The Bean validation standard provides many validation annotations out of the box, like @NotNull, @Size, @Min, @Max, @Pattern, @Email, and @Past, and less standard ones like @URL, @Length, the mighty @ScriptAssert, and many others.
- You are not limited by predefined constraints and can define your own constraint annotations. You can make a new annotation by combining others or by making a brand new one and defining a Java class that will serve as a validator.
- For example, looking at our previous example, we can define a class-level annotation @ValidPassportNumber to check that the passport number follows the right format, which depends on the country field value.
- You can put constraints not just on fields and classes but also on methods and method parameters. This is called "validation by contract" and is the topic of a later section.
The CUBA Platform, as well as some other frameworks, calls these Bean validations automatically when a user submits the data, so the user would get the error message instantly if validation fails, and you don't need to worry about running these Bean validators manually.
Let's take a look at the passport number example once again, but this time, we'd like to add a couple of additional constraints on the entity:
- Person name should have a length of two or more and be a well-formed name. The regexp is quite complex, but Charles Ogier de Batz de Castelmore Comte d'Artagnan passes the check and R2D2 does not;
- Person height should be in the interval 0 < height <= 300 centimeters;
- Email string should be a properly formatted email address.
So, with all these checks, the Person class looks like this:
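As a rough, framework-free sketch of the three rules themselves (in the entity they would be declared with annotations such as @Pattern, @DecimalMin/@DecimalMax, and @Email rather than coded by hand), the checks might look like the following. PersonRules and both regular expressions are assumptions made for illustration; the email pattern in particular is deliberately loose.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class PersonRules {
    // Hypothetical name pattern: capitalized words, short lowercase particles ("de"),
    // and elided particles such as "d'Artagnan".
    static final Pattern NAME = Pattern.compile(
            "^[A-Z][a-z]*(\\s(([a-z]{1,3})|(([a-z]+')?[A-Z][a-z]*)))*$");

    // Deliberately loose, hypothetical email shape: something@something.something
    static final Pattern EMAIL = Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");

    static final BigDecimal MAX_HEIGHT_CM = new BigDecimal("300");

    // Returns the names of the fields that break a rule; an empty list means the data is valid.
    static List<String> violations(String name, BigDecimal heightCm, String email) {
        List<String> errors = new ArrayList<>();
        if (name == null || name.length() < 2 || !NAME.matcher(name).matches()) {
            errors.add("name");
        }
        // 0 < height <= 300
        if (heightCm == null || heightCm.signum() <= 0 || heightCm.compareTo(MAX_HEIGHT_CM) > 0) {
            errors.add("height");
        }
        if (email == null || !EMAIL.matcher(email).matches()) {
            errors.add("email");
        }
        return errors;
    }
}
```

Collecting all violations instead of failing on the first one mirrors how a validator reports a set of constraint violations, so the UI can highlight every offending field at once.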
I think that the usage of standard annotations, like @NotNull, @DecimalMin, @Length, @Pattern, and others is quite clear and doesn't need a lot of comments. Let's see how the custom @ValidPassportNumber annotation is implemented.
Our brand new @ValidPassportNumber checks that Person#passportNumber matches the regexp pattern specific to each country defined by Person#country.
First, following the documentation (CUBA or Hibernate docs are good references), we need to mark our entity class with this new annotation and pass the groups parameter to it, where UiCrossFieldChecks.class says that the check should be called after checking all individual fields, at the cross-field check stage, and Default.class keeps the constraint in the default validation group.
The annotation definition looks like this:
@Target(ElementType.TYPE) defines that the target of this runtime annotation is a class, and @Constraint(validatedBy = ... ) states that the annotation implementation is in the ValidPassportNumberValidator class, which implements the ConstraintValidator<...> interface and has the validation code in the isValid(...) method, in which the code does the actual check in a straightforward way:
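A minimal sketch of the kind of check such an isValid(...) body might delegate to is shown below, written as plain Java so it runs without the validation API. PassportNumberCheck, the country codes, and the formats are illustrative assumptions, not real passport rules and not the validator from the project.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

class PassportNumberCheck {
    // Illustrative formats only; real ones would come from the project's requirements.
    private static final Map<String, Pattern> FORMATS = new HashMap<>();
    static {
        FORMATS.put("AU", Pattern.compile("[A-Z]\\d{7}"));
        FORMATS.put("RU", Pattern.compile("\\d{10}"));
    }

    // Cross-field rule: the pattern applied to the number depends on the country.
    static boolean isValid(String country, String passportNumber) {
        if (country == null || passportNumber == null) {
            return true; // missing values are left to @NotNull-style checks
        }
        Pattern format = FORMATS.get(country);
        // Unknown countries fail; matches() requires the whole string to fit the pattern.
        return format != null && format.matcher(passportNumber).matches();
    }
}
```

Treating null as valid follows the common Bean validation convention of letting a separate @NotNull constraint decide whether a value is required.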
That's it. With the CUBA platform, we don't need to write a line of code more than that to get our custom validation working and giving messages to a user if they made a mistake. Nothing too complex, right?
Now, let's check how all this stuff works. CUBA has a few extra goodies — it not only shows error messages to a user but also highlights form fields that haven't passed single-field Bean validations with nice red lines:
Isn't this a neat thing? You have nice error UI feedback in the user's browser just after adding a couple of Java annotations to your domain model entities.
Concluding this section, let's briefly list once again the advantages of Bean validation for entities:
- It is clear and readable;
- It allows us to define value constraints right in the domain classes;
- It is extendable and customizable;
- It is integrated with many popular ORMs, and the checks are called automatically before changes are saved to a database;
- Some frameworks also run Bean validation automatically when the user submits data in the UI (but if not, it's not hard to call the Validator interface manually);
- Bean validation is a well-known standard, so there is a lot of documentation on the Internet about it.
But what shall we do if we need to set a constraint on a method, a constructor, or some REST endpoint to validate data coming from an external system? Or what if we want to check method parameter values in a declarative way, without writing boring code full of if-else statements in every method that needs such a check?
The answer is simple: Bean validation can be applied to methods as well!
Validation by Contract
Sometimes, we need to take another step and go beyond validating the state of the application data model. Many methods might benefit from automatic validation of parameters and return values. This might be required not just when we need to check data coming to a REST or SOAP endpoint but also when we want to express preconditions and postconditions for method calls: to be sure that the input data has been checked before the method body is executed, that the return values are in the expected range, or simply to express parameter boundaries declaratively for better readability.
With Bean validation, constraints can be applied to the parameters and return values of methods or constructors of any Java type to check the preconditions and postconditions of their calls. This approach has several advantages over traditional ways of checking the correctness of parameters and return values:
- The checks don't need to be performed manually in an imperative way (e.g. by throwing IllegalArgumentException or similar exceptions). We rather specify constraints declaratively, so we have more readable and expressive code;
- Constraints are reusable, configurable, and customizable: we don't need to write validation code every time we need to do the checks. Less code equals fewer bugs;
- If a class, a method return value, or a method parameter is marked with the @Validated annotation, the constraint checks are done automatically by the framework on every method call;
- If the constraint annotations are themselves marked with the @Documented annotation, an executable's pre- and postconditions are included in the generated JavaDoc.
As a result, with the 'validation by contract' approach, we get clearer, more concise code that is easier to support and understand.
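How a framework can enforce such a contract on every call can be sketched with a JDK dynamic proxy. This is only an illustration of the mechanism, not how CUBA, Spring, or Hibernate Validator implement it: @Required is a hypothetical stand-in for a constraint such as @NotNull, and GreetingService and ContractProxy are invented names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Parameter;
import java.lang.reflect.Proxy;

// Hypothetical stand-in for a parameter constraint such as @NotNull.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
@interface Required {
}

// Hypothetical service contract: the constraint is declared on the interface.
interface GreetingService {
    String greet(@Required String name);
}

class ContractProxy {
    // Wraps target so that every call made through the interface is checked first.
    @SuppressWarnings("unchecked")
    static <T> T checked(Class<T> contract, T target) {
        return (T) Proxy.newProxyInstance(
                contract.getClassLoader(),
                new Class<?>[] {contract},
                (proxy, method, args) -> {
                    // Precondition check: no @Required argument may be null.
                    Parameter[] params = method.getParameters();
                    for (int i = 0; i < params.length; i++) {
                        if (params[i].isAnnotationPresent(Required.class) && args[i] == null) {
                            throw new IllegalArgumentException(
                                    method.getName() + ": argument " + i + " must not be null");
                        }
                    }
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow what the real method threw
                    }
                });
    }
}
```

Because the check reads the annotations from the interface method, any implementation passed to checked(...) is covered by the same contract, and the method bodies stay free of if-else argument checks.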
Let's look at what it looks like for a REST controller interface in the CUBA app. The PersonApiService interface allows us to get a list of persons from the DB with the getPersons() method and to add a new person to the DB using the addNewPerson(...) call. And remember: Bean validation is inheritable! In other words, if you annotate some class or field or method with a constraint, all descendants that extend or implement this class or interface would be affected by the same constraint check.
Does this code snippet look pretty clear and readable to you? (With the exception of the @RequiredView("_local") annotation, which is specific to the CUBA platform and checks that the returned Person object has all fields loaded from the PASSPORTNUMBER_PERSON table.) The @Valid annotation specifies that every object in the collection returned by the getPersons() method needs to be validated against the Person class constraints as well.
CUBA makes these methods available at the following endpoints:
- /app/rest/v2/services/passportnumber_PersonApiService/getPersons
- /app/rest/v2/services/passportnumber_PersonApiService/addNewPerson
Let's open the Postman app and ensure that validation is working as expected:
You might have noticed that the example above doesn't validate the passport number. This is because it requires a cross-parameter validation of the addNewPerson method, since the passportNumber validation regexp pattern depends on the country value. Such cross-parameter checks are a direct equivalent of the class-level constraints for entities!
Cross-parameter validation is supported by JSR 349 and 380; you can consult Hibernate documentation on how to implement custom cross-parameter validators for the class/interface methods.
Beyond Bean Validation
Nothing is perfect, and Bean validation has some limitations, as well:
- Sometimes, you just want to validate a complex object graph state before saving changes to the database. For example, you might need to ensure that all items from an order made by a customer of your e-commerce system fit into the shipping boxes. This is quite a heavy operation, and doing such checks every time users add new items to their orders isn't the best idea. Hence, such a check might need to be called just once, before the order object and its OrderItem objects are saved to the database.
- Some checks have to be made inside the transaction. For example, the e-commerce system should check if there are enough items in stock to fulfill the order before committing it to the database. Such a check could be done only from inside the transaction because the system is concurrent and quantities in stock could be changed at any time.
The CUBA platform offers two mechanisms to validate data before commit, which are called entity listeners and transaction listeners. Let's look at them a bit more closely.
Entity Listeners
Entity listeners in CUBA are quite similar to the PreInsertEvent, PreUpdateEvent, and PreDeleteEvent listeners that JPA implementations such as Hibernate offer to a developer. Both mechanisms allow us to check entity objects before or after they get persisted to a database.
It's not hard to define and wire up an entity listener in CUBA; we need to do two things:
- Create a managed bean that implements one of the entity listener interfaces. For validation purposes, three of these interfaces are important:
  - BeforeDeleteEntityListener
  - BeforeInsertEntityListener
  - BeforeUpdateEntityListener
- Annotate the entity class that you plan to track with the @Listeners annotation.
That's it!
In comparison with the JPA standard (JSR 338, chapter 3.5), the CUBA platform's listener interfaces are typed, so you don't need to cast an Object argument to start working with the entity. The CUBA platform also makes it possible to load and change entities associated with the current one, or to call the EntityManager to load and change any other entities. All such changes would invoke the appropriate entity listener calls as well.
Also, the CUBA platform supports soft deletion, a feature where entities in the DB are marked as deleted without deleting their records from the DB. So, for soft deletion, the CUBA platform would call the BeforeDeleteEntityListener/AfterDeleteEntityListener listeners, while standard implementations would call the PreUpdate/PostUpdate listeners.
Let's look at the example. An entity listener bean connects to an entity class with just one line of code: the @Listeners annotation, which accepts the name of the entity listener class:
And, the entity listener implementation may look like this:
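The shape of such a listener can be sketched in plain Java. The snippet below is a simplified illustration under stated assumptions, not the CUBA API: BeforeInsertListener is a hypothetical stand-in for a typed listener interface, and the in-memory stock map stands in for data that a real listener would read from the DB inside the same transaction.

```java
import java.util.Map;

// Hypothetical, simplified stand-in for a typed "before insert" listener interface.
interface BeforeInsertListener<T> {
    void onBeforeInsert(T entity);
}

// Hypothetical entity with just the fields the check needs.
class OrderItem {
    final String sku;
    final int quantity;

    OrderItem(String sku, int quantity) {
        this.sku = sku;
        this.quantity = quantity;
    }
}

class StockCheckListener implements BeforeInsertListener<OrderItem> {
    // Stands in for stock levels read within the same transaction.
    private final Map<String, Integer> stock;

    StockCheckListener(Map<String, Integer> stock) {
        this.stock = stock;
    }

    // The typed parameter means no cast is needed before working with the entity.
    @Override
    public void onBeforeInsert(OrderItem item) {
        int available = stock.getOrDefault(item.sku, 0);
        if (item.quantity > available) {
            // Throwing here is what would abort the surrounding transaction.
            throw new IllegalStateException("Not enough stock for " + item.sku
                    + ": requested " + item.quantity + ", available " + available);
        }
    }
}
```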
Entity listeners are a great choice when you:
- Need to make a data check inside a transaction before the entity object gets persisted to a DB;
- Need to check data in the DB during the validation process, for example, to check that we have enough goods in stock to accept the order;
- Need to traverse not just a given entity object, like Order, but also visit objects that are in an association or composition with the entity, like OrderItem objects for the Order entity;
- Want to track insert/update/delete operations for just some of your entity classes. For example, you want to track such events only for the Order and OrderItem entities and don't need to validate changes in other entity classes during the transaction.
Transaction Listeners
CUBA transaction listeners work in a transactional context, but in comparison with entity listeners, they get called for every database transaction.
This gives them the ultimate power — nothing can pass their attention, but the same power also gives them their biggest weaknesses:
- They are harder to write;
- They can degrade performance significantly if they perform too many unneeded checks;
- They need to be written much more carefully: a bug in a transaction listener might even prevent an application from bootstrapping.
So, transaction listeners are a good solution when you need to inspect many different types of entities with the same algorithm, like feeding data to a custom fraud detector that serves all your business objects.
Let's look at an example that checks whether an entity is annotated with the @FraudDetectionFlag annotation. If it is, the listener runs the fraud detector to validate it. Once again, please note that this method is called before every DB transaction in the system gets committed, so the code has to check as few objects as possible, as fast as it can.
To become a transaction listener, a managed bean should just implement the BeforeCommitTransactionListener interface and its beforeCommit method. Transaction listeners are wired up automatically when the application starts. CUBA registers all classes that implement BeforeCommitTransactionListener or AfterCompleteTransactionListener as transaction listeners.
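The filtering logic described above can be sketched in plain Java. This is a simplified illustration, not the CUBA interface: the real beforeCommit callback receives CUBA-specific arguments that are omitted here, the definition of @FraudDetectionFlag is assumed, and Payment, AuditNote, and the Predicate-based fraud detector are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Collection;
import java.util.function.Predicate;

// The annotation is named in the text; its definition here is an assumption.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface FraudDetectionFlag {
}

// Hypothetical entity that opts in to fraud checks.
@FraudDetectionFlag
class Payment {
    final long amountCents;

    Payment(long amountCents) {
        this.amountCents = amountCents;
    }
}

// Hypothetical entity that is never inspected.
class AuditNote {
}

class FraudCheckingListener {
    private final Predicate<Object> looksFraudulent; // hypothetical fraud detector

    FraudCheckingListener(Predicate<Object> looksFraudulent) {
        this.looksFraudulent = looksFraudulent;
    }

    // Simplified counterpart of a beforeCommit callback; returns how many entities were inspected.
    int beforeCommit(Collection<?> managedEntities) {
        int inspected = 0;
        for (Object entity : managedEntities) {
            // Cheap annotation test first, so unflagged entities cost almost nothing.
            if (!entity.getClass().isAnnotationPresent(FraudDetectionFlag.class)) {
                continue;
            }
            inspected++;
            if (looksFraudulent.test(entity)) {
                throw new IllegalStateException(
                        "Fraud suspected: " + entity.getClass().getSimpleName());
            }
        }
        return inspected;
    }
}
```

Skipping unflagged entities before doing any real work is the design choice that matters here, since the callback runs for every transaction.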
Conclusion
Bean validation (JSR 303, 349, and 380) is an approach that could serve as a solid foundation for 95% of the data validation cases that happen in an enterprise project. The big advantage of such an approach is that most of your validation logic is concentrated right in your domain model classes. So, it is easy to find, easy to read, and easy to support. Spring, CUBA, and many other libraries are aware of these standards and call the validation checks automatically during UI input, validated method calls, or the ORM persistence process, so validation works like a charm from a developer's perspective.
Some software engineers see validation that impacts an application's domain models as being somewhat invasive and complex; they say that making data checks at the UI level is a good enough strategy. However, I believe that having multiple validation points in UI controls and controllers can be a problematic approach. In addition, the validation methods that we discussed here are not perceived as invasive when they are used with a framework that is aware of Bean validators and listeners and integrates them into the client level automatically.
In the end, let's formulate a rule of thumb to choose the best validation method:
- JPA validation has limited functionality, but it is a great choice for the simplest constraints on entity classes if such constraints can be mapped to DDL.
- Bean Validation is a flexible, concise, declarative, reusable, and readable way to cover most of the checks that you could have in your domain model classes. This is the best choice in most cases, as long as you don't need to run validations inside a transaction.
- Validation by Contract is Bean validation for method calls. You can use it when you need to check input and output parameters of a method, for example, in a REST call handler.
- Entity listeners: although they are not as declarative as the Bean validation annotations, they are a great place to check large object graphs or make a check that needs to be done inside a database transaction, for example, when you need to read some data from the DB to make a decision. Hibernate has analogs of such listeners.
- Transaction listeners are a dangerous yet ultimate weapon that works inside the transactional context. Use them when you need to decide at runtime what objects have to be validated or when you need to check different types of your entities against the same validation algorithm.
I hope that this article taught you more about the different validation methods available in Java enterprise applications and gave you a couple of ideas on how to improve the architecture of the projects that you are working on.