Understanding Aggregates in Domain-Driven Design
Key Takeaways
- Aggregates should be based on domain invariants.
- Aggregates should be modified with their invariants completely consistent within a single transaction.
- Qualify associations by adding constraints to reduce technical complexity.
- Aggregates represent domain concepts, not just generic collections of domain objects.
- Aim for smaller aggregates to reduce transactional locking and consistency complexity.
Introduction
In domain-driven design, a domain model’s structure is composed of entities and value objects that represent concepts in the problem domain. However, handling associations between domain objects is the main source of complexity and confusion. If you have ever worked on a large application, you have probably seen many complex, heavily interconnected domain objects.
If your design has no clear simplifying techniques, these associations can grow out of control. When the object model contains a large network of associations, loading one object may pull large clusters of associated objects into memory. With modeling like this, there is no limit to which areas of the domain model a change might affect. Even though all of these things really do interrelate in the real world, at the highest level of your system, we need to be able to separate them to keep the complexity of the system in check.
In this case, simplifying complex object graphs is mandatory, because they complicate implementation and maintenance. The main purpose of the domain model is to support invariants and use cases rather than user interfaces, and domain models are not the same as data models. Their associations should therefore be more communicative, more relevant to the domain, and constrained as consistently as possible. The main goal is to keep the relationships between domain objects simple and aligned with domain invariants.
This is why designing relationships between domain objects is as important as designing the domain objects themselves. Even when every association in a model is justified, a large model still poses technical challenges: it is difficult to choose transactional and consistency boundaries that both reflect the problem domain and perform well. It is important to guarantee the consistency of changes in a model with complex associations.
The main issue is how to represent the relationships in our object model, and whether every conceivable one belongs there. Where do we draw the line on whether or not to create a reference? If there is a reference between two entities, how should persistence be handled? Do updates cascade?
What Is an Aggregate in Domain-Driven Design
An aggregate consists of one or more entities and value objects that change together, grouped according to domain model invariants. We need to treat it as a unit for data changes, and we need to consider the consistency of the entire aggregate before we apply changes. Every aggregate must have an aggregate root that is the parent of all other members of the aggregate. An aggregate can also consist of a single object; in that case, that object is still the aggregate root.
An aggregate root is an entity that has been chosen as the gateway into the aggregate. An aggregate root coordinates all changes to the aggregate, ensuring that clients cannot put the aggregate into an inconsistent state. It upholds all invariants of the aggregate by delegating to other entities and value objects in the aggregate cluster.
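The role of the root can be illustrated with a minimal Python sketch. The names Order and OrderLine and the maximum-total rule are hypothetical, chosen only for illustration; the point is that every change passes through the root, which checks the invariant before applying it.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class OrderLine:
    """Value object inside the aggregate (hypothetical example)."""
    sku: str
    quantity: int
    unit_price: Decimal

    @property
    def subtotal(self) -> Decimal:
        return self.unit_price * self.quantity


@dataclass
class Order:
    """Aggregate root: every change to the order lines goes through it."""
    order_id: str
    max_total: Decimal  # assumed invariant: the total never exceeds this
    _lines: list = field(default_factory=list, init=False)

    @property
    def total(self) -> Decimal:
        return sum((line.subtotal for line in self._lines), Decimal("0"))

    @property
    def lines(self) -> tuple:
        # Hand out a read-only copy so clients cannot bypass the root.
        return tuple(self._lines)

    def add_line(self, sku: str, quantity: int, unit_price: Decimal) -> None:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        line = OrderLine(sku, quantity, unit_price)
        # Check the invariant before mutating, so the aggregate is never
        # left in an inconsistent state.
        if self.total + line.subtotal > self.max_total:
            raise ValueError("order total would exceed the allowed maximum")
        self._lines.append(line)
```

A client can only call add_line on the root; a rejected change raises an error and leaves the existing lines untouched.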
Aggregates also help us simplify the domain model by gathering multiple domain objects under a single abstraction built around domain invariants, which acts as a consistency and concurrency boundary. This concept has several implications. First of all, an aggregate is a conceptual whole, meaning that it represents a cohesive notion of the domain model. Every aggregate also has a set of invariants that it maintains during its lifetime. This means that at any given time, an aggregate should be in a valid state.
Data changes in an aggregate should follow ACID: they should be atomic, consistent, isolated, and durable. When considering whether a particular object should be treated as an aggregate root, think about whether deleting it should cascade, in other words, whether you would also need to delete other objects in the aggregate hierarchy. If so, the object should likely be considered an aggregate root. Another way to decide whether an object makes sense as an aggregate root is to ask: does it make sense to have just this object, detached from its parents?
Just as with entities and value objects, there are no objective traits that determine the boundaries of an aggregate; they depend entirely on the domain model you are working in. The most important rule for defining the boundary of an aggregate cluster is that the boundary should be based on domain invariants. Domain invariants are business rules that must always be consistent. The consistency boundary logically asserts that everything inside it adheres to a specific set of business invariant rules, no matter what operation is performed. The consistency of everything outside the boundary is irrelevant to the aggregate.
Entities inside the same aggregate should be highly cohesive, whereas entities in different aggregates should remain loosely coupled. It is good practice to ask yourself the following question: according to your invariants, does an entity make sense without some other entities? If it does, it should probably be the root of its own aggregate; otherwise, it should be part of some other existing aggregate.
Since entities are encapsulated within aggregates, they make up the design and implementation of an aggregate's behaviors. However, I would rather have most of the behavior tied to value objects than to entities. One of the things I’d encourage is to keep entities free of behavior where possible, since identity is already a big burden to bear, and to have behavior expressed in the value objects. So as more behavior needs to be added, I would try to model it as behavior on new or existing value objects where possible.
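As a rough sketch of this preference, assuming a hypothetical Wallet entity and Money value object, the rules about combining amounts can live in the value object while the entity carries little more than identity:

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    """Value object: immutable, compared by value, and home of the behavior."""
    amount: Decimal
    currency: str

    def add(self, other: "Money") -> "Money":
        if other.currency != self.currency:
            raise ValueError("cannot add amounts in different currencies")
        # Return a new value instead of mutating this one.
        return Money(self.amount + other.amount, self.currency)


class Wallet:
    """Entity (hypothetical): identity plus thin delegation to Money."""

    def __init__(self, wallet_id: str, balance: Money) -> None:
        self.wallet_id = wallet_id
        self.balance = balance

    def deposit(self, amount: Money) -> None:
        # The currency rule lives in Money, not in the entity.
        self.balance = self.balance.add(amount)
```

Because Money has no identity, it can be tested and reused in isolation, and the entity stays small as more behavior is added.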
As development proceeds, you will learn more about the domain. If you think the boundaries you selected initially do not fit the problem you are solving, do not hesitate to change them. Domain modeling is an iterative process, so do not expect to find proper boundaries for all of your aggregates right away.
Your aggregates should not be influenced by your data model. Associations between domain objects are not the same as database relationships. Data models need to represent each has-a relationship to support referential integrity and build reports for business intelligence. An aggregate represents a concept in your domain and is not a container for items. When including domain objects in an aggregate, don’t simply focus on the has-a relationship; justify each grouping and ensure that each object is required to define the behavior of the aggregate instead of just being related to the aggregate.
For example, in an online retailer system, a customer must have at least one address, which is needed for shipment. The customer entity is the aggregate root, and the address entity is part of the customer aggregate. In an online webinar system, by contrast, a user has an address only if they ask for a physical invoice, so the user is an aggregate with a single entity. In both scenarios there is a has-a relationship, but the use cases and business invariants are different. That is why the second aggregate has a single entity.
Consistency is mandatory in aggregates. There are two types of consistency: transactional consistency and eventual consistency. Transactional consistency is considered immediate and atomic. Thus, aggregates are synonymous with transactional consistency. A properly designed aggregate is one that can be modified in any way required by the business with its invariants completely consistent within a single transaction. The limit of modifying only one aggregate instance per transaction may seem very strict. However, it is a rule of thumb and should be followed in most cases.
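The two scenarios might be sketched as follows. All class and field names are hypothetical, and the sketch assumes that the only address-related invariant is the retailer's "at least one address" rule.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """Entity inside the retailer's customer aggregate (hypothetical)."""
    address_id: str
    street: str
    city: str


class RetailCustomer:
    """Aggregate root that owns its addresses because an invariant needs them."""

    def __init__(self, customer_id: str, first_address: Address) -> None:
        self.customer_id = customer_id
        self._addresses = {first_address.address_id: first_address}

    @property
    def addresses(self) -> tuple:
        return tuple(self._addresses.values())

    def add_address(self, address: Address) -> None:
        self._addresses[address.address_id] = address

    def remove_address(self, address_id: str) -> None:
        if address_id not in self._addresses:
            raise KeyError(address_id)
        if len(self._addresses) == 1:
            raise ValueError("a customer needs at least one shipping address")
        del self._addresses[address_id]


class WebinarUser:
    """Single-entity aggregate: no invariant involves an address."""

    def __init__(self, user_id: str, email: str) -> None:
        self.user_id = user_id
        self.email = email
```

The retailer's rule is what pulls Address inside the customer aggregate; without such a rule, the webinar user stays a single-entity aggregate.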
The more aggregates are modified in a single transaction, the greater the chance of a concurrency failure. Beware also of creating an aggregate that is too large. It might be tempting to include more entities in an aggregate, and a single big aggregate might seem simpler than two separate aggregates, but it does not work well. The bigger an aggregate is, the harder it is to maintain its consistency and to handle conflicts when several transactions try to update parts of the same aggregate at once.
Finding proper boundaries is tricky and is basically a trade-off between the simplicity of the model and its performance characteristics. As a heuristic, in the projects I have been a member of, most aggregates contained one or two entities, and I do not remember an aggregate with more than three entities. A single-entity aggregate is perfectly fine, so do not try to group entities artificially. Note that this heuristic does not count value objects; you can have as many value objects as you want.
Associations that can be traversed in more than one direction also increase complexity. ORMs do not help here, because most of the time they make it easy to create a bidirectional association. According to the ubiquitous language, your associations must be constrained to be unidirectional. That is fundamental for reducing complexity in your domain model. Unnecessary associations are the true killer of the performance and consistency of your domain model.
Even if the associations in your domain model are unidirectional, that does not mean all of the remaining associations are useful. Sometimes you can avoid extra work by qualifying the relationship between domain objects. For example, if the domain only ever needs the approved comments of a post, qualify the association so that it returns only approved comments instead of all of them. For reporting purposes your requirements might differ, and you might need all comments or multi-directional relations. That has nothing to do with your domain model and is a different story.
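One common way such conflicts surface is optimistic concurrency, where each aggregate carries a version number that is checked on save. The text above does not name this technique, so the following is only an illustrative sketch with a hypothetical Cart aggregate and an in-memory repository standing in for a database.

```python
import copy


class ConcurrencyError(Exception):
    """Raised when an aggregate was changed by someone else since it was loaded."""


class Cart:
    """Hypothetical aggregate root carrying a version for conflict detection."""

    def __init__(self, cart_id: str) -> None:
        self.cart_id = cart_id
        self.items = []
        self.version = 0


class InMemoryCartRepository:
    """Stand-in for a database; loads and saves whole aggregates."""

    def __init__(self) -> None:
        self._store = {}

    def add(self, cart: Cart) -> None:
        self._store[cart.cart_id] = copy.deepcopy(cart)

    def get(self, cart_id: str) -> Cart:
        # Each caller gets its own copy, like a separate transaction would.
        return copy.deepcopy(self._store[cart_id])

    def save(self, cart: Cart) -> None:
        stored = self._store[cart.cart_id]
        if stored.version != cart.version:
            raise ConcurrencyError(f"cart {cart.cart_id} was modified concurrently")
        updated = copy.deepcopy(cart)
        updated.version += 1
        self._store[cart.cart_id] = updated
```

With this scheme, any two writers that load the same aggregate collide on save, even if they touched unrelated parts of it. The larger the aggregate, the more often unrelated changes end up competing for the same version.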
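A unidirectional, qualified association could look like the minimal sketch below. The Post and Comment classes and the approved flag are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    """Holds no reference back to its post: the association is one-way."""
    author: str
    text: str
    approved: bool = False


@dataclass
class Post:
    post_id: str
    _comments: list = field(default_factory=list, init=False)

    def add_comment(self, comment: Comment) -> None:
        self._comments.append(comment)

    def approved_comments(self) -> tuple:
        # Qualified association: the domain only ever sees approved comments.
        return tuple(c for c in self._comments if c.approved)
```

Reporting code that needs every comment would query the data store directly rather than widen this association.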