Producer Level Message Ordering Guarantees

Kafka is one of the hottest technologies today. The rise of microservices architectures and event-driven systems has made it very popular.

In this article, I am going to talk about the ordering guarantees that Kafka provides. It is a confusing topic for me because of the nature of distributed systems. A lot of questions come to mind, and it is easy to get stuck on them.

Today I am going to focus on the producer-level guarantees. First, let's define a few Kafka terms.

Topic, Partitions and Offsets

In Kafka, a topic is a particular stream of data that we can send messages to. …
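
To make the terms in this section's heading a bit more concrete, here is a minimal in-memory sketch of a topic, written in plain Python rather than against a real Kafka client. The `ToyTopic` class and its `send` method are hypothetical names made up for this illustration, and the `crc32`-based partition choice is only an assumption standing in for whatever partitioner a real client uses. The sketch shows two things: a record's offset is its position inside one partition, and records sent with the same key land in the same partition in the order they were sent.

```python
import zlib


class ToyTopic:
    """In-memory stand-in for a topic (hypothetical, not a Kafka API)."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # Each partition is an append-only list; a record's index is its offset.
        self.partitions = [[] for _ in range(num_partitions)]

    def send(self, key, value):
        # A stable hash of the key picks the partition. crc32 is an assumption
        # for this sketch, not the hash a real Kafka client uses.
        partition = zlib.crc32(key.encode("utf-8")) % self.num_partitions
        log = self.partitions[partition]
        log.append((key, value))
        offset = len(log) - 1
        return partition, offset


topic = ToyTopic(num_partitions=3)
# Three records with the same key: same partition, increasing offsets.
results = [topic.send("user-1", f"event-{i}") for i in range(3)]
```

In this toy model, ordering is only meaningful within a single partition; nothing relates the offsets of two different partitions to each other.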

The new generation storage solutions

In this post, I am going to talk about the LakeHouse, the optimal storage solution for today's world. Let's start with the history.

Characteristics of an Optimal Storage Solution

When we build a system, we typically need the following features from its storage layer:

  • Transaction Support: We often need to read and write data concurrently, so support for ACID transactions is essential.
  • Scalability and Performance: The storage solution should be able to scale out to store huge amounts of data while keeping latency at an acceptable level.
  • Diverse Data Formats: Data is everywhere and can be in any format: structured, semi-structured, or unstructured. …

Create your cluster and deploy your jobs into it.

In this post, I am going to create a standalone cluster on AWS using 3 EC2 instances: 1 master and 2 worker nodes.

The steps can also be applied on your local machine, but it is best to use more than one machine, which you can set up with a virtualization tool like VirtualBox.

What is Apache Flink?

Flink is a distributed processing framework that has gained popularity in recent years. It processes data very quickly across multiple machines. It is a real-time processing framework, but it can also do batch processing.

Flink vs Spark

If you have a streaming workflow and want to analyze the streaming data…

Before getting to the practical part, let's first talk a little bit about Serverless.

It is a new paradigm in which developers don't have to manage servers anymore. We just deploy code and the magic happens. It is the cloud provider's responsibility to run the functions: under the hood, they provision a machine, create a runtime for your code, and run it. AWS Lambda is the service we can use to build serverless functions in the AWS cloud environment. There are also many other serverless services in AWS, for example S3, DynamoDB, API Gateway, etc. …
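
To give a feel for what "just deploy code" means, here is a minimal sketch of a Python function in the shape AWS Lambda expects: a handler that receives an `event` and a `context` and returns a result. The handler name `lambda_handler`, the `name` field in the event, and the response layout are assumptions chosen for this illustration; the real event shape depends on whatever triggers the function.

```python
import json


def lambda_handler(event, context):
    """Entry point the Lambda runtime calls once per incoming event."""
    # 'name' is a hypothetical field; real events depend on the trigger.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local smoke test: call the handler directly, with no context object.
response = lambda_handler({"name": "Lambda"}, None)
```

Notice that nothing in the function deals with servers, ports, or processes. Provisioning the machine and the runtime is left entirely to the cloud provider.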

