Symphony’s Engineering Blog

Walking the models jungle

In the previous post of our Engineering series, we shared Symphony’s perspective on the value of setting up an event mesh. The next step is to translate the concepts promoted by data mesh into a clearly defined set of practices and tools, used consistently throughout our development process.

We initially focused on a number of questions: What are the data items that development teams need to consider as products? How can we help them advertise their event streams? How can we introduce governance in a lightweight way? How can we ensure a good developer experience? How do we minimize data friction?

The Four Models problem

Software engineers are well versed in the Domain Object Model (DOM). Data mapping is the process of converting the DOM to a data model for persistence in the storage technology of choice (RDBMS, NoSQL, object storage, file systems, etc.), and back again. Whether or not they follow Contract-First Development, engineers also master API contracts (OpenAPI being the standard) and the correspondence between the DOM and the Data Transfer Object (DTO) model, which is versioned for compatibility management.

An event mesh adds yet another interface on top of the DOM: the event model. It differs from the other models in that it represents actions applied to data, along with contextual information, with the results distributed to multiple consumers. Contextual information, such as the event type, timestamp, or actor, needs to be conveyed because it can alter how downstream systems behave when they consume an event. The event model is an application programming interface in its own right, so the same formalism can be applied to manage its evolution, for instance through contract specification and versioning.
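To make the distinction concrete, here is a minimal sketch, using hypothetical types, that contrasts three of these shapes for a single entity: the internal domain object, the versioned DTO exposed through the HTTP API, and the event carrying the action and its context out to consumers.

```java
import java.time.Instant;
import java.util.UUID;

// Domain object: the rich internal representation used by business logic.
record ChatRoom(UUID id, String name, boolean archived) { }

// DTO: the versioned shape exposed through the HTTP API contract.
record ChatRoomDtoV1(String id, String name) { }

// Event: the action applied to the data, plus the contextual information
// (event type, timestamp, actor) that downstream consumers may act upon.
record ChatRoomUpdatedEvent(
        String eventType,    // e.g. "chatroom.archived"
        Instant occurredAt,
        String actorId,
        ChatRoomDtoV1 payload) { }
```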

At Symphony, we deliver software as a service on top of a multi-tenant microservice architecture. We therefore complement the model with metadata, conveyed through headers, to help with the identification and processing of events: a tenant ID for segregation, a trace ID for tracing flows across services, and an event type for filtering.
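Assuming Kafka as the transport, the sketch below shows how such metadata can travel as record headers rather than inside the payload; the header names are illustrative, not our actual conventions.

```java
import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.internals.RecordHeader;

public class EventHeaders {
    // Attach tenant, trace, and event-type metadata as headers so that
    // interceptors and consumers can identify and filter events without
    // deserializing the payload.
    public static ProducerRecord<String, byte[]> withMetadata(
            String topic, String key, byte[] payload,
            String tenantId, String traceId, String eventType) {
        ProducerRecord<String, byte[]> record = new ProducerRecord<>(topic, key, payload);
        record.headers()
              .add(new RecordHeader("x-tenant-id", tenantId.getBytes(StandardCharsets.UTF_8)))
              .add(new RecordHeader("x-trace-id", traceId.getBytes(StandardCharsets.UTF_8)))
              .add(new RecordHeader("x-event-type", eventType.getBytes(StandardCharsets.UTF_8)));
        return record;
    }
}
```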

A tour of our event model management processes

Applying product guidelines to our event models

Symphony’s engineering team is globally distributed, and we want each member, regardless of location, to feel empowered to build the “next cool features.” Our first intent has been to apply a product strategy to our event models, consolidating them into an accessible catalog of data sets on top of which all teams can build.

Our developers already publish OpenAPI specs to document their HTTP APIs, which we use for our public-facing documentation. We naturally decided to extend this practice, asking teams to publish schemas for all their events as well. The most commonly used schema types are JSON Schema, Protobuf, and Avro. We decided not to enforce a particular schema type, letting teams choose what is most achievable for their use cases. As a result, consumer services must be able to find out how incoming events are encoded. However, we do ask teams to favor formats with compact binary serialization, such as Protobuf or Avro, to be cost effective. When using JSON, we only pair it with compression in scenarios where latency is not critical and volume does not force us to optimize CPU consumption. We also provide shared schemas for the common fields expected in most events, to support standardization.
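As an illustration only (the field names are hypothetical, not our actual shared schema), here is what such a set of common fields could look like when assembled with Avro’s SchemaBuilder:

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class CommonEventSchema {
    // Build an Avro record schema carrying the common fields expected in most
    // events, alongside a domain-specific payload.
    public static Schema build() {
        return SchemaBuilder.record("ChatRoomUpdated").namespace("com.example.events")
                .fields()
                .requiredString("eventType")   // e.g. "chatroom.archived"
                .requiredLong("occurredAt")    // epoch milliseconds
                .requiredString("actorId")
                .optionalString("tenantId")
                .name("payload").type().stringType().noDefault() // domain-specific body
                .endRecord();
    }
}
```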

Along with the event schemas, we have devised a standard documentation format to share other practical aspects of the events. For instance, this can include which services are known to produce or consume the events, whether consumers cache data in some way, which topic the event is published on, what the objectives are in terms of latency, throughput, availability, and durability, what the encoding scheme is, and whether events are compressed or encrypted. This documentation is used by teams looking to consume existing data and, when an incident occurs, to quickly assess the impacted areas and guide recovery actions.

GitHub as a catalog of event models

To establish a source of truth from which model evolutions can eventually be published, we needed a system that stores the event models independently of the various deployment phases of our software lifecycle.

At Symphony, we already manage our existing models, as well as other operational assets, as code, following the GitOps pattern. We opted for the same approach for our event schemas, relying on our source control system, GitHub. It comes with the ability to browse through organized file directories and to search file contents, helping teams identify existing events of interest. Relying on CODEOWNERS, we can partition the directory into areas of responsibility, clearly identifying the teams or individuals that own product domains. Through the pull request process, counterpart teams can be involved to review and approve event model schema additions or updates.

Another advantage is that GitHub helps implement governance through Continuous Integration (CI) stages in the build pipeline. Automated governance allows us to keep the engineering-wide governance body lean. We ensure that schemas are syntactically valid and evolve in a backward-compatible way, and that documentation is present and valid. We also use code generators to create, for each application’s language, the libraries embedded into the applications to manipulate the events in code.
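The backward-compatibility stage boils down to a check along these lines. This is a sketch using Avro’s own compatibility API (Confluent’s schema registry client offers equivalent checks), and the file paths are placeholders:

```java
import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.SchemaCompatibility;
import org.apache.avro.SchemaCompatibility.SchemaCompatibilityType;

public class CompatibilityCheck {
    public static void main(String[] args) throws Exception {
        // The schema currently in the catalog and the evolution proposed in the pull request.
        Schema previous = new Schema.Parser().parse(new File("registry/chatroom-updated-v1.avsc"));
        Schema proposed = new Schema.Parser().parse(new File("registry/chatroom-updated-v2.avsc"));

        // Backward compatible: data written with the previous schema can still be
        // read by consumers upgraded to the proposed schema.
        SchemaCompatibilityType result = SchemaCompatibility
                .checkReaderWriterCompatibility(proposed, previous)
                .getType();

        if (result != SchemaCompatibilityType.COMPATIBLE) {
            throw new IllegalStateException("Schema evolution is not backward compatible: " + result);
        }
    }
}
```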

Once qualified, we promote the event models into schema registries. This component is responsible for ensuring that events published on the platform adhere to the registered schemas. Our software lifecycle takes our code through various environments, from development to quality assurance to pre-production to production, and we run a schema registry in each of them. Among the reasons we selected Confluent Cloud’s managed Kafka service to power our event streaming platform are its schema registry, which supports the schema types commonly encountered in the industry, its serializers and deserializers, which enforce that published events follow the reference schemas, and its overall stream governance product vision and tooling.
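On the application side, wiring a producer to the registry is mostly configuration. The sketch below assumes Confluent’s Java Avro serializer; the URLs are placeholders, and auto-registration is disabled because, in our setup, schemas are promoted through CI rather than registered ad hoc by producers.

```java
import java.util.Properties;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;

public class RegistryAwareProducer {
    public static KafkaProducer<String, GenericRecord> create() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka.example.com:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // The Avro serializer validates each event against the schema known to the registry.
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "https://schema-registry.example.com");
        // Schemas are promoted through CI, not registered on the fly by applications.
        props.put("auto.register.schemas", false);
        return new KafkaProducer<>(props);
    }
}
```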

Did we meet our targets?

Event models and their associated event streams are a form of API that development teams manage. Through the use of industry-standard schemas, we ensure events are explorable and understandable. Thanks to standardization, publication in a catalog, and complementary documentation, we ensure events are discoverable, addressable, and easy to consume. With GitHub features, we clearly identify event ownership and provide a framework for discussion and a handshake between parties on evolutions. Code generation helps development teams build services from the source of truth in a repeatable way. Finally, the addition of CI checks and continuous delivery into an enforcing schema registry builds trustworthiness into the events available on the platform, with governance applied at various stages.

All of this is achieved while empowering our development teams throughout the process, with the same tools and practices they rely on to manage their code, existing models, and operations.
