Additional common features not yet part of the Beam model
| Feature | Google Cloud Dataflow | Prism Local Runner | Apache Flink | Apache Spark (RDD/DStream based) | Apache Spark Structured Streaming (Dataset based) | Apache Nemo | Hazelcast Jet | Twister2 | Python Direct FnRunner |
|---|---|---|---|---|---|---|---|---|---|
| Drain | Partially: Dataflow has a native drain operation, but it does not work in the presence of event time timer loops. Final implementation pending model support. | No | Partially: Flink supports taking a "savepoint" of the pipeline and shutting the pipeline down after its completion. | | | | | | |
| Checkpoint | No | No | Partially: Flink has a native savepoint capability. | Partially: Spark has a native savepoint capability. | No: not implemented | | | | |
| Key-ordered delivery | Partially: Dataflow performs different shuffling algorithms for batch and streaming. Dataflow guarantees key-ordered delivery in streaming, though not in batch. | Yes: fully supported | Partially: Flink may perform different shuffling algorithms for batch and streaming. Flink guarantees key-ordered delivery in streaming, though not in batch. | Unverified | Unverified | Unverified | Unverified | Unverified | Unverified |
Last updated on 2026/05/06

