Aggregate - DDD in Ruby on Rails

Paweł Strzałkowski

Chief Technology Officer

In everyday life, the term "aggregate" has multiple meanings, which often confuses newcomers to Domain-Driven Design. The aggregate is an extremely important modeling concept, which is too often forgotten in the realm of Ruby on Rails. Let's see how it may be useful to you when implementing RoR applications.

"An AGGREGATE is a cluster of associated objects that we treat as a unit for the purpose of data changes. Each AGGREGATE has a root and a boundary. The boundary defines what is inside the AGGREGATE. The root is a single, specific ENTITY contained in the AGGREGATE. The root is the only member of the AGGREGATE that outside objects are allowed to hold references to."

— Eric Evans, Domain-Driven Design: Tackling Complexity in the Heart of Software

Group of objects

As you know from the article about Entities, we shouldn't treat objects as bags for data. They should encapsulate behaviour and protect their inner implementation details. An aggregate takes that to the next level.

Aggregate root and boundary

An aggregate is a group of objects (entities and associated value objects) which guards business rules. In the DDD nomenclature, these rules are called invariants. An aggregate is a conceptual and organizational construct. You may go through an entire codebase and see no explicit mention of the “aggregate” term. However, you will see entities which are called by only one other entity. You will see others making sure that rules are followed. As in the example above, Rate is used only by the Customer. You will also see namespaces and code organization patterns that support this approach.

Invariants

Every time an aggregate receives a command, it checks whether the requested change leads to a valid state. For the state to be valid, it has to be allowed by invariants. For example:

a client has fewer than 3 cars rented
a room is booked by up to one client
an order is approved only if the total price of items is lower than $300
up to one of the assigned processes is in processing state at a time

With each change request, one of two things happens:

the change is allowed by invariants and therefore the aggregate is modified,
the change breaks some of the invariants and is rejected as a whole.
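
The two outcomes can be sketched in plain Ruby, using the car rental rule from the list of invariants. This is only an illustration: the RentalAccount class, the MAX_RENTED_CARS limit and RentalLimitExceededError are hypothetical names, and persistence is left out on purpose.

```ruby
# Hypothetical aggregate root guarding the invariant
# "a client has fewer than 3 cars rented".
class RentalAccount
  RentalLimitExceededError = Class.new(StandardError)

  MAX_RENTED_CARS = 2

  def initialize
    @rented_car_ids = []
  end

  # Expose a frozen copy, so the state can only change through behaviour
  def rented_car_ids
    @rented_car_ids.dup.freeze
  end

  def rent_car(car_id)
    # Invariant guard: reject the whole change before touching any state
    raise RentalLimitExceededError if @rented_car_ids.size >= MAX_RENTED_CARS

    @rented_car_ids << car_id
  end
end

account = RentalAccount.new
account.rent_car(10)
account.rent_car(11)

begin
  account.rent_car(12)
rescue RentalAccount::RentalLimitExceededError
  # The change was rejected as a whole: the account still holds two rentals
end
```

The guard runs before any state is modified, so a rejected command leaves the aggregate exactly as it was.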

In RoR, we are used to putting validations at (ActiveRecord) model level. We put form object validations on top of that and database level validations below. Such a sandwich is supposed to keep our data consistent. However, it's not what DDD is about. Here, we aim towards making the state and data consistent by applying business logic.

Business processes are not invariants

Consider the following scenario: A client chooses a new laptop. The current promo says that a discount coupon for the next order will be granted after the purchase. The order is finalized and paid for.

When the payment is complete, we have to:

save the fact that the payment has been made
progress the order's state
create a shipment package
generate a discount coupon
assign the coupon to the client

Often, these changes are handled in a single request and persisted in one transaction. Effectively, they are modeled as a single, vast aggregate. Just imagine how much data has to be gathered and handled by the database within such a transaction. What's also important, the transaction effectively locks all handled objects for writing for concurrent processes (see the next chapters for details).

Keep the aggregates small

How small? The smallest aggregate is a lone entity, which automatically becomes its root and the boundary. It is usually not possible to model a real domain using such simple building blocks.

A change, to be consistent, has to be performed in a transaction. It has to be atomic. Otherwise, a processing or an infrastructure malfunction would leave the application in an inconsistent state. The proper size of an aggregate is the size of the consistency boundary it needs.

Transactions are natural for Ruby on Rails developers and well handled by the framework. There is a lot to say about database transactions, which goes far beyond the scope of this article. But one thing is certain: the smaller and shorter a transaction is, the better. It's ok to atomically update an order and its items. But it's not ok to lock and update tens or even hundreds of objects at a time.

References to identities of other aggregate roots are ok

The fact that an aggregate contains a reference to another aggregate doesn't automatically merge them into one. An aggregate root or entities inside the boundary may reference another aggregate. However, Ruby on Rails encourages programmers to tightly couple entities using associations like belongs_to or has_many. These associations make it easy to load entities from inside other entities, which works against thoughtful modeling.

The coupling is far lower when entities use identity references to other aggregates. In RoR, it means that an entity may hold an another_entity_id attribute to reference another aggregate. However, it should not use belongs_to :another_entity when that other entity is outside of its aggregate boundary. Holding only the identity prevents loading and updating elements of other aggregates.
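
The idea of an identity reference can be sketched in plain Ruby, without ActiveRecord. All names here (Client, Order, ClientRepository) are hypothetical, and the repository is only an in-memory stand-in for the persistence layer.

```ruby
# Hypothetical plain-Ruby sketch: an Order aggregate keeps only the identity
# of the Client aggregate, never the Client object itself.
Client = Struct.new(:id, :name)

class Order
  attr_reader :id, :client_id

  def initialize(id:, client_id:)
    @id = id
    @client_id = client_id # identity reference, not an object reference
  end
end

# Hypothetical in-memory repository standing in for the persistence layer
class ClientRepository
  def initialize(clients)
    @clients_by_id = clients.to_h { |client| [client.id, client] }
  end

  def find(id)
    @clients_by_id.fetch(id)
  end
end

clients = ClientRepository.new([Client.new(7, 'Alice')])
order = Order.new(id: 1, client_id: 7)

# Code that needs the client loads it explicitly, through its own aggregate
client = clients.find(order.client_id)
```

Since Order exposes no client method, nothing can reach into the Client aggregate through an order and modify it by accident.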

Eventual consistency

Let's go back to our client and the discount coupon. It may be the case that:

a payment has to be completed to progress the order state from "ordered" to "paid"
an order may be in the "paid" state only if there is no payment due

We have to update both at the same time to keep the state consistent.

But it's almost certain that discount coupon generation is not bound to this process. A client doesn't need to receive the discount in the very millisecond the order is paid. The payment should not be rejected because of a malfunction of the discount coupon module. This part may be performed asynchronously, either by means of a scheduled job (e.g. with Sidekiq) or a Pub/Sub mechanism.
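
Here is a minimal sketch of that split in plain Ruby. It does not use Sidekiq or any real broker: EventQueue is a hypothetical in-memory stand-in, and the Order class, the :order_paid event name and the coupon format are assumptions made for illustration.

```ruby
# Hypothetical in-memory stand-in for a job queue or a Pub/Sub broker
class EventQueue
  def initialize
    @events = []
    @handlers = Hash.new { |hash, name| hash[name] = [] }
  end

  def subscribe(event_name, &handler)
    @handlers[event_name] << handler
  end

  def publish(event_name, payload)
    @events << [event_name, payload] # only recorded here, handled later
  end

  # Runs later, outside of the payment transaction.
  # Returns the events whose handlers failed, so they can be retried.
  def drain
    failed = []
    until @events.empty?
      event_name, payload = @events.shift
      begin
        @handlers[event_name].each { |handler| handler.call(payload) }
      rescue StandardError
        failed << [event_name, payload]
      end
    end
    failed
  end
end

class Order
  attr_reader :id, :status, :amount_due

  def initialize(id:, amount_due:)
    @id = id
    @amount_due = amount_due
    @status = 'ordered'
  end

  # Payment and order state change together: this is the consistency boundary
  def register_payment(amount, queue)
    raise ArgumentError, 'payment does not cover the amount due' if amount < amount_due

    @amount_due = 0
    @status = 'paid'
    queue.publish(:order_paid, { order_id: id })
  end
end

queue = EventQueue.new
coupons = []
queue.subscribe(:order_paid) { |payload| coupons << "COUPON-#{payload[:order_id]}" }

order = Order.new(id: 42, amount_due: 100)
order.register_payment(100, queue) # the order is paid, no coupon exists yet
queue.drain                        # the coupon is generated afterwards
```

A failing coupon handler only ends up in the list returned by drain. The order stays paid, and the coupon can be generated on a retry.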

Concurrent changes

Aggregates are all about data integrity and state consistency. In order to be consistent, an aggregate cannot be modified by two concurrent processes at the same time.

Let's analyze the simplest of aggregates - a client can have up to two addresses. A client object has a behaviour of add_address(address) defined. Inside, there is an invariant check:


def add_address(address)
  raise TooManyAddressesError if addresses.size > 1

  addresses.push(address)
end

Let us imagine that two concurrent processes load the same Client object. Let's assume that one address has been added so far. Each process sees a single address and allows adding a new one. When both save the state, three addresses are persisted.

To avoid this situation, take a look at the idea of Optimistic Locking. It rejects the save attempted by the second of the two concurrent processes: that save raises ActiveRecord::StaleObjectError and the second transaction is rejected. It's very easy to introduce optimistic locking to a RoR application and the benefit is tremendous.
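
The mechanism itself can be simulated in a few lines of plain Ruby. The sketch below is not how ActiveRecord implements it. ClientStore, ClientSnapshot and StaleClientError are hypothetical names, and the version counter only plays the role that the lock_version column plays in ActiveRecord.

```ruby
# Hypothetical in-memory simulation of optimistic locking; the error class
# mimics the role of ActiveRecord::StaleObjectError.
class StaleClientError < StandardError; end

ClientSnapshot = Struct.new(:id, :addresses, :lock_version)

class ClientStore
  def initialize
    @rows = {}
  end

  def insert(id, addresses)
    @rows[id] = { addresses: addresses.dup, lock_version: 0 }
  end

  # Each process gets its own copy together with the version it has read
  def find(id)
    row = @rows.fetch(id)
    ClientSnapshot.new(id, row[:addresses].dup, row[:lock_version])
  end

  # The write succeeds only if nobody saved a newer version in the meantime
  def save(snapshot)
    row = @rows.fetch(snapshot.id)
    raise StaleClientError if row[:lock_version] != snapshot.lock_version

    row[:addresses] = snapshot.addresses.dup
    row[:lock_version] += 1
    snapshot.lock_version = row[:lock_version]
  end
end

store = ClientStore.new
store.insert(1, ['Main St 1'])

first = store.find(1)
second = store.find(1)

first.addresses.push('Oak Ave 2')
second.addresses.push('Elm Rd 3')

store.save(first) # accepted, the version moves from 0 to 1

begin
  store.save(second) # based on version 0, therefore rejected
rescue StaleClientError
  # The second process has to reload the client and re-check the invariant
end
```

In ActiveRecord, the version check is enabled by adding an integer lock_version column to the table. Keep in mind that the check protects the whole aggregate only if the root record is updated whenever one of its inner entities changes.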

Aggregate root

An aggregate is a cluster of objects. However, for the rest of the system it is visible as a single being. The entry point to an aggregate is a single entity, which provides all the behaviour of the group. Any outside element may only hold a reference (i.e. the database identity) to the root of an aggregate. Entities inside an aggregate's boundary may hold references to outside aggregate roots, but they themselves are hidden and never directly operated on by the outside world.

In the example of a client with two addresses, there would never be a service operating directly on the addresses. Such a service may supply a new address to the client entity via the add_address method. However, it would never load and change an address directly.

Ruby on Rails aggregate example

From the code perspective, an aggregate root is just an entity. It is the usage and context which promote it to an aggregate root. Check out this simple example of an Invoice class. It uses InvoiceItem objects to create items within the boundary of the Invoice aggregate.


# Invoice aggregate

class Invoice < ApplicationRecord
  OperationNotAllowedError = Class.new(StandardError)

  # References to other aggregates
  validates :client_id, :order_id, presence: true

  validates :status, inclusion: { in: %w(draft booked voided) }

  # A collection of items (entities) within the boundary
  has_many :invoice_items

  def add_item(description, quantity, price, tax_definition)
    # Invariant guard
    raise OperationNotAllowedError if status != 'draft'

    # Calculation could also be done using another domain object (e.g. a service)
    tax = calculate_tax(price, quantity, tax_definition.rate)

    # Add a new entity to items' collection
    invoice_items.build(
      description: description,
      quantity: quantity,
      price: price,
      tax: tax,
      tax_label: tax_definition.label
    )

    # Recalculate the state of the aggregate root
    build_totals
  end

  def void
    raise OperationNotAllowedError if status == 'booked'

    self.status = 'voided'
  end

  private

  def build_totals
    # ...
  end

  def calculate_tax(price, quantity, tax_rate)
    # ...
  end

  # ...
end

class InvoiceItem < ApplicationRecord
  validates :invoice_id, presence: true

  # belongs_to is NOT needed here. We never fetch the aggregate root from an item
end

Summary

An aggregate is a transaction boundary. It's a guardian of consistency and validity of both data and business rules. It gives you the ability to model safe and reliable applications. It is not trivial to learn how to use this tool properly. Please take some time to familiarize yourself with it, even if you have no plans of using the full toolkit of Domain-Driven Design. It will make your applications more mature, more secure and less error-prone.

To learn even more, be sure to check out a recording of our webinar about Aggregate Pattern in Ruby on Rails.