Apache Beam Documentation

This section provides in-depth conceptual information and reference material for the Beam Model, SDKs, and Runners:

Concepts

Learn about the Beam Programming Model and the concepts common to all Beam SDKs and Runners.

Pipeline Fundamentals

SDKs

Find status and reference information on all of the available Beam SDKs.

Runners

A Beam Runner runs a Beam pipeline on a specific (often distributed) data processing system.

Available Runners

Choosing a Runner

Beam is designed so that pipelines are portable across different runners. However, runners differ in their capabilities, so they also differ in how completely they can implement the core concepts of the Beam model. The Capability Matrix provides a detailed comparison of runner functionality.

Once you have chosen a runner, see that runner’s page for any initial runner-specific setup and for the required and optional PipelineOptions that configure its execution. You may also want to refer back to the Quickstart for instructions on executing the sample WordCount pipeline.