https://www.oreilly.com/ideas/questioning-the-lambda-architecture
I like that the Lambda Architecture emphasizes retaining the input data unchanged. I think the discipline of modeling data transformation as a series of materialized stages from an original input has a lot of merit. This is one of the things that makes large MapReduce workflows tractable, as it enables you to debug each stage independently.
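To make the "materialized stages" idea concrete, here is a minimal sketch in Python. It is not from the article: the stage names, the `run_stage` helper, and the JSON-on-disk layout are all assumptions chosen for illustration. The point is only that the raw input is never modified and every stage writes its output somewhere it can be inspected on its own.

```python
import json
import tempfile
from pathlib import Path


def run_stage(name, func, records, out_dir):
    """Apply func to records and persist the result before returning it.

    Hypothetical helper: persisting each stage is what makes it debuggable.
    """
    result = func(records)
    (out_dir / f"{name}.json").write_text(json.dumps(result))
    return result


def parse(lines):
    # Stage 1: split "user,amount" lines into structured records.
    return [{"user": u, "amount": int(a)} for u, a in (l.split(",") for l in lines)]


def total_per_user(records):
    # Stage 2: aggregate amounts per user.
    totals = {}
    for r in records:
        totals[r["user"]] = totals.get(r["user"], 0) + r["amount"]
    return totals


# The original input is kept as an immutable tuple and never rewritten.
raw_input = ("alice,3", "bob,5", "alice,4")

out_dir = Path(tempfile.mkdtemp())
parsed = run_stage("01_parsed", parse, raw_input, out_dir)
totals = run_stage("02_totals", total_per_user, parsed, out_dir)

# Any stage can be checked independently from its materialized output.
stage1_on_disk = json.loads((out_dir / "01_parsed.json").read_text())
```

If the totals look wrong, the parsed records on disk show whether the fault lies in parsing or in aggregation, without rerunning the whole pipeline.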
The Lambda Architecture deserves a lot of credit for highlighting this (the reprocessing) problem.
"By 'reprocessing,' I mean processing input data over again to re-derive output. This is a completely obvious but often ignored requirement. Code will always change. So, if you have code that derives output data from an input stream, whenever the code changes, you will need to recompute your output to see the effect of the change."
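A small Python sketch of what this definition implies, under assumed names (`EVENTS`, `count_views_v1`, `count_views_v2` are hypothetical and not from the article): the output is a pure function of the retained input, so a change in the code means the output has to be derived again from that same input.

```python
# Retained, unchanged input: (user, page) view events.
EVENTS = (
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/docs"),
    ("alice", "/home"),
)


def count_views_v1(events):
    # Original logic: count every view per page.
    counts = {}
    for _user, page in events:
        counts[page] = counts.get(page, 0) + 1
    return counts


def count_views_v2(events):
    # Changed logic: count unique visitors per page instead of raw views.
    visitors = {}
    for user, page in events:
        visitors.setdefault(page, set()).add(user)
    return {page: len(users) for page, users in visitors.items()}


old_output = count_views_v1(EVENTS)
# The code changed, so the output is re-derived from the same input.
new_output = count_views_v2(EVENTS)
```

The old output cannot be patched into the new one; only the retained input lets the new logic be applied to past events.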
There are a number of other motivations proposed for the Lambda Architecture, but I don't think they make much sense. One is that real-time processing is inherently approximate, less powerful, and more lossy than batch processing. I actually do not think this is true. It is true that the existing set of stream processing frameworks are less mature than MapReduce, but there is no reason that a stream processing system can't give as strong a semantic guarantee as a batch system.
The problem with the Lambda Architecture is that maintaining code that needs to produce the same result in two complex distributed systems is exactly as painful as it seems like it would be. I don't think this problem is fixable.
Jay says reprocessing can be done in the streaming system itself: retain a long period of events and rerun the stream job over them.
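A rough in-memory sketch of that idea in Python. The `Log` and `StreamJob` classes are hypothetical stand-ins for a retained event log and a stream processing job, not any real framework's API. The flow follows what the article describes: start a second instance of the job from the beginning of the retained events, write to a new output table, and switch readers over once it has caught up.

```python
class Log:
    """Append-only event log that retains everything and is read by offset."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def read_from(self, offset):
        return self._events[offset:]


class StreamJob:
    """Consumes the log incrementally and maintains its own output table."""

    def __init__(self, log, transform):
        self.log = log
        self.transform = transform
        self.offset = 0  # a new job always starts at the beginning of the log
        self.table = {}

    def poll(self):
        # Process every event appended since the last poll.
        for key, value in self.log.read_from(self.offset):
            self.table[key] = self.table.get(key, 0) + self.transform(value)
            self.offset += 1


log = Log()
for event in [("a", 1), ("b", 2), ("a", 3)]:
    log.append(event)

# The current version of the job serves reads from its table.
job_v1 = StreamJob(log, transform=lambda v: v)
job_v1.poll()
serving = job_v1.table

# The code changes: start a second job from offset 0 with the new logic,
# writing to its own table while the old job keeps running.
job_v2 = StreamJob(log, transform=lambda v: v * 10)
job_v2.poll()

# New events keep arriving and both jobs consume them.
log.append(("b", 4))
job_v1.poll()
job_v2.poll()

# Once the new job has caught up, readers switch to the new table.
if job_v2.offset == job_v1.offset:
    serving = job_v2.table
```

Only one code path exists here: the same job definition handles both live events and the replay, which is the contrast with maintaining separate batch and streaming implementations.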