Apache Beam provides an advanced unified programming model, allowing you to implement batch and streaming data processing jobs that can run on any execution engine.
Apache Beam is:
- UNIFIED - Use a single programming model for both batch and streaming use cases.
- PORTABLE - Execute pipelines on multiple execution environments, including Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow.
- EXTENSIBLE - Write and share new SDKs, IO connectors, and transformation libraries.
To use Beam for your data processing tasks, start by reading the Beam Overview and performing the steps in the Quickstart. Then dive into the Documentation section for in-depth concepts and reference materials for the Beam model, SDKs, and runners.
Beam is an Apache Software Foundation project, available under the Apache v2 license. Beam is an open source community and contributions are greatly appreciated! If you’d like to contribute, please see the Contribute section.