While designing the new availability model I created a table called availability_decisions, which felt a little strange at the time, since it simply persisted calculations made on the availability_events table. Since then I've looked for other people talking about intermediate results, and have come across:
David Reed commented to me that one of the good properties of a spreadsheet (leading to its wide acceptance) is that you usually set things up so that you can see the intermediate results of calculations. Rather than have one long formula in a cell, you use several cells, each with simpler formulas referring to some other cells. This makes testing and debugging much easier. (#)
Big, complex MapReduce workflows use files to checkpoint and share their intermediate results. Big, complex SQL processing pipelines create lots and lots of intermediate or temporary tables. This just applies the pattern with an abstraction that is suitable for data in motion, namely a log (#)
Show the process and algorithms of the automation by revealing intermediate results in a way that is comprehensible to the operators. (#)
Page 104. Immutable intermediate results. "As you can see, one of the keys to pipe diagrams is that fields are immutable once created. One obvious optimization that you can make is to discard fields as soon as they're no longer needed (preventing unnecessary serialization and network I/O)" (#)
CTEs seem a little like intermediate results that are automatically cleaned up. Unlike views (and more like temporary tables), they act as optimization fences and must be fully calculated before being used, at least in PostgreSQL before version 12; later versions can inline a CTE unless it is marked MATERIALIZED. However, that full calculation happens automatically as part of the query.
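
To make the comparison concrete, here is a minimal sketch of both approaches using Python's built-in sqlite3 module. Only the two table names come from the model described above; the columns (resource_id, delta, open_slots, available) and the "available when open_slots > 0" rule are invented for illustration, and SQLite is used only because it ships with Python, so its planner may treat the CTE differently from other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: +1 frees a slot, -1 books one.
    CREATE TABLE availability_events (
        resource_id INTEGER NOT NULL,
        delta       INTEGER NOT NULL
    );
    -- Hypothetical shape for the persisted intermediate result.
    CREATE TABLE availability_decisions (
        resource_id INTEGER PRIMARY KEY,
        open_slots  INTEGER NOT NULL,
        available   INTEGER NOT NULL
    );
    INSERT INTO availability_events VALUES
        (1, 1), (1, 1), (1, -1), (2, 1), (2, -1);
""")

# Approach 1: persist the intermediate result in its own table,
# so it can be inspected and debugged after the calculation has run.
conn.execute("""
    INSERT INTO availability_decisions (resource_id, open_slots, available)
    SELECT resource_id, SUM(delta), SUM(delta) > 0
    FROM availability_events
    GROUP BY resource_id
""")
persisted = conn.execute("""
    SELECT resource_id, open_slots, available
    FROM availability_decisions
    ORDER BY resource_id
""").fetchall()

# Approach 2: name the same intermediate result as a CTE.
# It exists only for the duration of this one query.
from_cte = conn.execute("""
    WITH decisions AS (
        SELECT resource_id, SUM(delta) AS open_slots
        FROM availability_events
        GROUP BY resource_id
    )
    SELECT resource_id, open_slots, open_slots > 0 AS available
    FROM decisions
    ORDER BY resource_id
""").fetchall()

print(persisted)
print(from_cte)
```

Both queries return the same rows. The difference is that the persisted table is still there to look at afterwards, while the CTE's intermediate rows are gone as soon as the query finishes.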