In this post, I am going to talk about the LakeHouse, the optimal storage solution for today’s world. Let’s start with the history.
When we build a system, we typically need the following features from its storage layer:
In this post, I am going to create a standalone cluster on AWS using three EC2 instances: one master and two worker nodes.
The steps can also be applied on your local machine, but it is best to use more than one machine, which you can set up with a virtualization tool like VirtualBox.
Flink is a distributed processing framework that has gained popularity in recent years. It processes data across multiple machines very quickly. It’s a real-time processing framework, but it can also do batch processing.
If you have a streaming workflow and want to analyze the streaming data…
Before getting hands-on, let’s first talk a little bit about Serverless.
It is a new paradigm in which developers don’t have to manage servers anymore. We just deploy code and the magic happens. It is the cloud provider’s responsibility to run the functions. I mean, under the hood, they provision a machine, create a runtime for your code, and run it. AWS Lambda is the technology we can use to build serverless functions in the AWS cloud environment. There are also a lot of serverless technologies built within AWS, for example S3, DynamoDB, API Gateway, etc. …
AWS Certified Solutions Architect | DataStax Certified Apache Cassandra Developer. Big fan of distributed systems and cloud.