As 2023 comes to a close, we are returning to a clean-tech topic that we have been following for quite some time now with Amenity’s ESG earnings call and news monitoring analytics. Our news data reveals that geothermal power has seen an uptick in recent months from key players in energy and industrials, which comes on the heels of some interesting proposed U.S. legislation.
Carrying over history to the mesh
In our journey building an event mesh, we face the challenge of reconciling with Symphony’s engineering history. Specifically, we are transitioning from a single-tenant, monolithic architecture to a multi-tenant, microservice architecture. And critically, due to the nature of our business, this shift must be both progressive and smooth.
Ultimately, we want to enable our microservices to access the data events originated by our existing primary backend service, without being dependent on the evolution of the backend service. Additionally, the pre-existing events have evolved in isolation over the years, encompassing content for fairly orthogonal concerns, with greater expectations on producers to provide context that is not strictly required. Our objective was to rework the schemes for our existing events within the framework established in our newly formed events catalog. So we evaluated the different patterns we could adopt to carry that baggage over and reduce its weight.
The below diagram represents the basics of Symphony’s asynchronous communications architecture. Our backend service is deployed over AWS and GCP public clouds. Depending on the cloud provider, it relies on either SNS/SQS or Pub/Sub for asynchronous communications through a custom-built abstraction.
We introduce our event mesh on top of Kafka, running in Confluent Cloud managed clusters, and want to rely on schema registry to enforce adherence to mutually agreeable event structures.
Option #1: Using Confluent-supported connectors
One of the reasons we chose to build our event mesh on top of the Confluent Kafka platform is that it provides support for interconnection with a variety of services through supported Kafka connectors, available through their hub.
This approach provides an easy-to-follow pattern when there is a limited number of SQS queues or PubSub subscriptions from which to ingest events. Otherwise, the operations of hundreds of connector instances on top of Kafka Connect would come with its own set of challenges relating to observability and connector error resolution. It also fits well if there are small changes to apply on the events structure, which can be handled by relying on Simple Message Transforms.
Our objective is to reshape the events wholly, and with this method the transformations are too complex to fit this model. For instance, existing event streams require decrypting events with per-tenant encryption keys before applying an advanced mapping logic. Additionally, while we could set up a single SQS queue to subscribe to and aggregate many SNS topics, Pub/Sub subscriptions are per topic, leading to a potentially high cardinality of Pub/Sub connector instances. Therefore, this pattern did not work for us.
Option #2: Updating producers to maintain dual event streams
An alternative we considered was to update producers to maintain dual streams: the existing one, and a new one that would feed the event mesh. This would have the benefit of preparing the evolution of the producing service, particularly considering that the legacy event stream will be dropped in the not-so-distant future.
However, this approach had two major drawbacks. First, it falls into the dual-write problem: since there are two independent actions to be triggered when an event occurs, it is possible that the second one will fail while the first one succeeds without a way to roll back that first action. This would introduce inconsistencies and complexity in order to compensate for events that failed. Moving entirely to a transactional pattern would also be costly in development efforts and diminish the system’s stability and scalability. Second, as Symphony has hundreds of servers running single-tenant publishers, this method would exceed limitations in terms of parallel connections to our Kafka clusters.
Option #3: Building our own connector service
The final approach we considered, and ultimately decided upon, was to build our own connector service. We traditionally use Java for backend development, so could therefore benefit from all the libraries and tools that come with Spring, including Spring Cloud, and in particular, SQS support in Spring Cloud for AWS, and Pub/Sub support in Spring Cloud for GCP. These libraries make it easy to spin up a microservice that consumes from both SQS and Pub/Sub. Such an independent microservice would be stateless, and could scale based on the need as a standard kubernetes pod.
For SQS, AWS already supports subscriptions to multiple SNS topics without any documented restriction, so we would be able to aggregate all the events from all the tenants into a single queue for mapping and forwarding to the event mesh. For Pub/Sub, the support of a reactive stream subscriber makes it possible to pipe the consumption from multiple subscriptions on top of a single bounded thread pool, minimizing the compute resources required.
For the mapping logic, we looked into MapStruct as an existing mapping library. This mapping logic was too complex to fit nicely in our planned formalism, and likely too complex for other similar libraries. Therefore, we proceeded with building dedicated conversion logic. We ultimately relied on classes generated from schemas agreed upon to represent the events, and serialize them with the KafkaProtobufSerializer to ensure there is no deviation from the currently registered schema. The following diagram recaps the intended architecture.
In implementing this pattern, we solve for our original target to bring events originating from our existing deployments into our event mesh without compromising on our goal to improve on our data stream. A direct benefit of this pattern is that it allows us to build and evolve microservices without being dependent on the evolution of a legacy backend undergoing significant transformation.
You may also like
As the landscape of communication continues to evolve in our increasingly digital world, the financial services industry is feeling the pinch of regulatory scrutiny. In recent years, financial firms have faced nearly $2 billion in penalties from the SEC and CFTC due to unregulated “off-channel” messaging. This surge in “off-channel” communication, including usage of platforms such as WhatsApp, WeChat, SMS, LINE, and mobile calls, is largely fueled by the rise of hybrid work. Yet, regulators maintain that all business communications must be monitored, auditable, and occur only within official channels. In the words of SEC Chair Gary Gensler, “As technology changes, it’s even more important that registrants appropriately conduct their communications about business matters within only official channels. And they must maintain and preserve those communications.”
We live in a fast-evolving age of information, where Artificial Intelligence (AI) tools are starting to be used in many areas like financial decision-making and