Beam reads your data from a diverse set of supported sources, whether they are on-premises or in the cloud.
Data Processing
Beam executes your business logic for both batch and streaming use cases.
Data Writing
Beam writes the results of your data processing logic to the most popular data sinks in the industry.
You Choose Your Favorite Environment and Programming Language!
Choose your runner
A Beam pipeline can execute on the most popular distributed data processing systems: choose a commercial service such as Google Cloud Dataflow or Amazon Kinesis Data Analytics, or run your own Spark or Flink clusters.
Choose your language
You can write Apache Beam pipelines in your programming language of choice: Java, Python, or Go.
Apache Beam Features
Unified
A simplified, single programming model for both batch and streaming use cases for every member of your data and application teams.
Extensible
Apache Beam is extensible: projects such as TensorFlow Extended and Apache Hop are built on top of it.
Portable
Execute pipelines on multiple execution environments (runners), providing flexibility and avoiding lock-in.
Open Source
Open, community-based development and support to help evolve your application and meet the needs of your specific use cases.
Check out our social media to learn more about the community!
Try Beam Playground
Beam Playground is an interactive environment to try out Beam transforms and examples without having to install Apache Beam in your environment.
You can try the Apache Beam examples at Beam Playground (Beta).
Apache Beam Runs in These Environments
Case Studies Powered by Apache Beam
Seznam, a Czech search engine, was an early contributor to and adopter of Apache Beam, and has migrated several petabyte-scale workloads to Apache Beam pipelines.
Palo Alto Networks, Inc. is a global cybersecurity leader that uses Apache Beam to process approximately 10 million security log events per second in its real-time streaming infrastructure.
Apache Beam provides Ricardo, a leading Swiss second-hand marketplace, with a scalable and reliable data processing framework that supports fundamental business scenarios and enables real-time and ML data processing.
Apache Hop, an open-source data orchestration platform, uses Apache Beam to deliver “design once, run anywhere” and adds value for Apache Beam users by enabling visual pipeline development and lifecycle management.