Sachin Patil
2 min read · Apr 12, 2021

Streaming Data Architecture Selection Criteria — Kappa vs Lambda

Many business decisions today are time sensitive and depend on harnessing data from real-time sources to transform the customer experience. The Lambda and Kappa architectures are the two popular choices for modernizing traditional data pipelines. However, choosing the pattern that suits a given use case can be challenging. I have implemented multiple Lambda and Kappa architectures for various industry use cases, and I am sharing my thoughts on the parameters that affect the choice of architecture pattern.

Lambda architecture pattern:

Nathan Marz first described this combined batch and real-time architecture for distributed, fault-tolerant, and scalable data processing.

Fig 1: Lambda Architecture Pattern

The Lambda architecture consists of a batch layer and a speed layer. The batch layer is responsible for historical loads, end-of-day and end-of-month processing, reprocessing of data, and similar work, whereas real-time events are processed through the speed layer. Data from the batch and speed layers flows into the serving layer for business consumption.
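To make the division of responsibilities concrete, here is a minimal sketch in plain Python. It is only an illustration under simplifying assumptions: the page-view events, the function names (`batch_view`, `SpeedView`, `serve`) and the in-memory lists are all hypothetical stand-ins for a real batch engine, stream processor and query store.

```python
from collections import Counter

def batch_view(master_dataset):
    """Batch layer: recompute page-view counts from the full historical dataset."""
    return Counter(event["page"] for event in master_dataset)

class SpeedView:
    """Speed layer: incrementally count events that arrived after the last batch run."""

    def __init__(self):
        self.counts = Counter()

    def on_event(self, event):
        self.counts[event["page"]] += 1

def serve(batch_counts, speed_counts, page):
    """Merge step for queries: combine the batch view and the real-time view."""
    return batch_counts[page] + speed_counts[page]

# Hypothetical sample data: history handled in batch, recent events handled in real time.
historical = [{"page": "home"}, {"page": "home"}, {"page": "cart"}]
recent = [{"page": "home"}, {"page": "checkout"}]

batch_counts = batch_view(historical)
speed = SpeedView()
for event in recent:
    speed.on_event(event)

home_views = serve(batch_counts, speed.counts, "home")
```

Note that the counting logic exists twice, once in `batch_view` and once in `SpeedView.on_event`, and the two copies have to be kept consistent with each other.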

Kappa architecture pattern:

Jay Kreps described the idea of the Kappa architecture, in which he proposed unified processing of streaming and batch sources.

Fig 2: Kappa Architecture Pattern

One of the drawbacks of the Lambda architecture is the need to maintain two codebases, one for batch and one for streaming. The Kappa architecture consists of only a speed layer, which processes batch as well as real-time events. Although the codebase remains the same, the pipelines can run with different resource options, optimized for either high throughput or low latency.
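The same idea can be sketched in plain Python. This is an illustration under stated assumptions, not a reference implementation: the `event_log` list stands in for a replayable log, and `process`, `REPLAY_PROFILE` and `LIVE_PROFILE` are hypothetical names introduced only for this example.

```python
from collections import Counter
from typing import Iterable

def process(events: Iterable[dict], state: Counter) -> Counter:
    """Single processing logic, shared by historical replay and live runs."""
    for event in events:
        state[event["page"]] += 1
    return state

# Hypothetical resource profiles: same code, different deployment settings.
REPLAY_PROFILE = {"workers": 16, "batch_size": 10_000}  # tuned for throughput
LIVE_PROFILE = {"workers": 2, "batch_size": 1}          # tuned for latency

# Assumed stand-in for an append-only, replayable event log.
event_log = [{"page": "home"}, {"page": "cart"}, {"page": "home"}]

# Reprocessing: replay the whole log from offset 0 with the same function.
replayed = process(event_log, Counter())

# Live processing: consume events as they arrive, continuing from existing state.
live_state = process(event_log[:2], Counter())
live_state = process(event_log[2:], live_state)
```

Because both paths call the same `process` function, a full replay and an incremental live run reach the same state, so there is only one implementation of the business logic to maintain.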

Parameters impacting choice of architecture pattern

There is no golden rule for choosing between the two architectural patterns. The choice depends on multiple factors: data sources, latency expectations, business use cases, reprocessing requirements, and so on.

Below are a few of the key considerations, based on my experience.

Conclusion

These are a few of the guiding parameters for deciding on the right architecture for enterprise data pipelines. However, other factors will also affect the decision. In the next article in this series, I will elaborate on the tool choices and design patterns for building data pipelines based on the Kappa architecture.

References

1. http://lambda-architecture.net/

2. https://www.oreilly.com/radar/questioning-the-lambda-architecture/

Written by Sachin Patil

Cloud and Big Data Architect