In my previous post, we discussed the schema registry as a tool for message validation. When designing an asynchronous, event-driven or message-driven architecture on the cloud, a schema registry is worth considering as the way to check the validity of messages. Unfortunately, none of the Azure messaging services, including Queue Storage, Service Bus, Event Hubs and Event Grid, currently supports a schema registry feature. Therefore, we have to implement it ourselves.
Throughout this post, I’m going to build a schema registry using Azure Blob Storage and register schemas there, with sample code.
Sample Code and NuGet Libraries
Publisher/Subscriber Architecture Pattern
We’re going to implement the following three parts:
- Azure Blob Storage: This works as a schema registry.
- Azure Logic Apps: This is used for both publisher and subscriber. It’ll be further discussed in the next post.
- Azure Functions: This is used for message validation. It’ll be further discussed in the next post.
Implementing Schema Registry
By declaring a container on Azure Blob Storage, we can use it as a schema registry. If high availability matters, provision a second Blob Storage instance and store the schemas in both. For convenience, however, we’re going to create two containers in a single storage account, one for the main registry and one called backups, to emulate two separate Azure Storage accounts.
From a resource management perspective, we need three resources to use Blob Storage as the schema registry:
- Storage Account instance
- Blob Service
- Blob Container
Here’s an over-simplified version of the ARM template for Blob Storage. If you’re interested in the whole template structure, check out this GitHub page.
As you can see above, all ARM templates are written in YAML. If you want to know more about YAML-based ARM templates, please have a look at my previous post.
After completing the ARM template, deploy it through Azure CLI to create the instance.
This is the result of the schema registry implementation.
Let’s write a console app to register schemas.
In a nutshell, registering schemas is just uploading them to Azure Blob Storage. Therefore, we can simply use the Azure REST API or one of the SDKs available for different languages. However, Azure Blob Storage is not necessarily the only schema registry; it could be anything else, such as an AWS S3 bucket. To allow for this, the library borrows the concept of the sink, and each sink works as a DSL. For our use case, therefore, declare BlobStorageSchemaSink and upload schemas through it.
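To make the sink idea concrete, here is a minimal, language-neutral sketch written in Python rather than the library's actual C# API. The names SchemaSink, InMemorySchemaSink, DirectorySchemaSink and register are hypothetical, and a local directory stands in for a blob container. The only point is that the calling code stays the same whichever backend sits behind the sink.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class SchemaSink(ABC):
    """Hypothetical sink abstraction: one storage backend for schemas."""

    @abstractmethod
    def set_schema(self, schema: str, path: str) -> bool:
        """Store the schema under the given path; return True on success."""

    @abstractmethod
    def get_schema(self, path: str) -> str:
        """Return the schema stored under the given path."""


class InMemorySchemaSink(SchemaSink):
    """Keeps schemas in a dictionary; handy for tests."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def set_schema(self, schema: str, path: str) -> bool:
        self._store[path] = schema
        return True

    def get_schema(self, path: str) -> str:
        return self._store[path]


class DirectorySchemaSink(SchemaSink):
    """Stand-in for a blob container: writes schemas below a root directory."""

    def __init__(self, root: Path) -> None:
        self._root = root

    def set_schema(self, schema: str, path: str) -> bool:
        target = self._root / path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(schema, encoding="utf-8")
        return True

    def get_schema(self, path: str) -> str:
        return (self._root / path).read_text(encoding="utf-8")


def register(sink: SchemaSink, schema: str, path: str) -> bool:
    """Caller code depends only on the SchemaSink contract, not the backend."""
    return sink.set_schema(schema, path)
```

Swapping Blob Storage for another store then means writing one more SchemaSink subclass; register() and everything above it stay untouched.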
The entire sample code for this schema registration console app is here.
Sink Declaration for Schema Registry
Within the console app, declare two sinks, one per container: one for the main registry and the other for the backup.
As the code shows, the BlobStorageSchemaSink library exposes a fluent API and relies on method chaining through its WithXXX() methods. As a result, the code is considerably more readable.
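For readers unfamiliar with the pattern, method chaining works because every configuration method returns the object it was called on. The sketch below illustrates this in Python. SinkOptions and its with_* methods are hypothetical names made up for this example and are not the BlobStorageSchemaSink API; the account name in the URL is a placeholder too.

```python
class SinkOptions:
    """Hypothetical options object configured through chained with_* calls."""

    def __init__(self) -> None:
        self.base_location = ""
        self.container = ""
        self.file_extension = "json"

    def with_base_location(self, url: str) -> "SinkOptions":
        self.base_location = url.rstrip("/")
        return self  # returning self is what makes chaining possible

    def with_container(self, name: str) -> "SinkOptions":
        self.container = name
        return self

    def with_file_extension(self, extension: str) -> "SinkOptions":
        self.file_extension = extension.lstrip(".")
        return self

    def blob_url(self, schema_name: str) -> str:
        """Compose the URL a schema with this name would be stored at."""
        return (
            f"{self.base_location}/{self.container}/"
            f"{schema_name}.{self.file_extension}"
        )


# One expression configures the whole object, reading top to bottom.
options = (
    SinkOptions()
    .with_base_location("https://example.blob.core.windows.net/")  # placeholder account
    .with_container("backups")
    .with_file_extension(".json")
)
```

Without the returned self, the same configuration would take one statement per setting, which is the readability difference the fluent style is after.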
Schema Producer Declaration
The sample code declares the producer and registers the two sinks, each of which connects to its own schema registry.
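Conceptually, the producer only needs to hold a list of sinks and send each schema to all of them. The following Python sketch shows that fan-out under stated assumptions: DictSink, SchemaProducer, with_sink and produce are hypothetical names for illustration, not the NuGet library's types, and the sinks keep schemas in memory instead of calling Azure.

```python
class DictSink:
    """Hypothetical in-memory sink standing in for one blob container."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blobs: dict[str, str] = {}

    def set_schema(self, schema: str, path: str) -> bool:
        self.blobs[path] = schema
        return True


class SchemaProducer:
    """Hypothetical producer that sends every schema to all registered sinks."""

    def __init__(self) -> None:
        self._sinks: list[DictSink] = []

    def with_sink(self, sink: DictSink) -> "SchemaProducer":
        self._sinks.append(sink)
        return self

    def produce(self, schema: str, path: str) -> bool:
        # Build the full list first so every sink is attempted,
        # even if an earlier one reports failure.
        results = [sink.set_schema(schema, path) for sink in self._sinks]
        # Succeed only if at least one sink exists and all of them succeeded.
        return bool(results) and all(results)


main = DictSink("main")
backup = DictSink("backups")

producer = SchemaProducer().with_sink(main).with_sink(backup)
uploaded = producer.produce('{"type": "object"}', "v1/sample.json")
```

Treating a partial upload as a failure is a design choice made for this sketch: for a registry that is meant to be highly available, the main and backup copies should not silently drift apart.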
Let’s upload schemas! The sample app uploads a schema through the producer by passing the class type reference.
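Uploading by type reference implies that a JSON schema is derived from the class before it is stored. As a rough illustration of that step only, the Python sketch below builds a schema from a dataclass's field types. SampleClass, schema_for and the type mapping are assumptions made for this example; the library's actual generation logic and output format may differ.

```python
import json
from dataclasses import dataclass, fields
from typing import get_type_hints

# Minimal mapping from Python types to JSON Schema type names (assumption:
# only flat classes with these primitive field types are supported here).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class SampleClass:
    """Hypothetical message type whose schema we want to register."""

    text: str
    count: int


def schema_for(cls: type) -> str:
    """Derive a simple JSON schema document from a dataclass type."""
    hints = get_type_hints(cls)
    names = [field.name for field in fields(cls)]
    schema = {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "title": cls.__name__,
        "type": "object",
        "properties": {name: {"type": _JSON_TYPES[hints[name]]} for name in names},
        "required": names,  # every field is treated as required in this sketch
    }
    return json.dumps(schema, indent=2)


schema_json = schema_for(SampleClass)
```

The resulting string is what a producer would then hand to its sinks, under a path of the caller's choosing.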
If a JSON schema is ready, then upload it directly like below:
Once the schema is uploaded, it appears in Azure Blob Storage.
So far, we’ve created a schema registry using Azure Blob Storage and registered schemas using the NuGet libraries with a sample console app. In the next post, we’re going to deal with the next part of this implementation: how the publisher and subscriber make use of the schema registry for message validation.