Introducing Apache Beam

The Unified Apache Beam Model

The easiest way to do batch and streaming data processing. Write-once, run-anywhere data processing for mission-critical production workloads.

How Does It Work?

Data Sourcing

Beam reads your data from a diverse set of supported sources, whether it’s on-premises or in the cloud.

Data Processing

Beam executes your business logic for both batch and streaming use cases.

Data Writing

Beam writes the results of your data processing logic to the most popular data sinks in the industry.

Apache Beam Features

Unified

A simplified, single programming model for both batch and streaming use cases for every member of your data and application teams.

Extensible

Apache Beam is extensible, with projects such as TensorFlow Extended and Apache Hop built on top of Apache Beam.

Portable

Execute pipelines on multiple execution environments (runners), providing flexibility and avoiding lock-in.

Open Source

Open, community-based development and support to help evolve your application and meet the needs of your specific use cases.

Write Once, Run Anywhere

Create Multi-language Pipelines

Try Beam Playground

Beam Playground is an interactive environment where you can try Beam transforms and examples without installing Apache Beam in your environment.

Case Studies Powered by Apache Beam

Apache Beam fuels LinkedIn’s streaming infrastructure, processing 4 trillion events daily through 3K+ pipelines in near-real time. Beam enabled unified pipelines, yielding 2x cost savings and remarkable improvements for many use cases.

With Apache Beam, OCTO accelerated the migration of one of France’s largest grocery retailers to streaming processing for transactional data, achieving 5x reduced infrastructure costs and 4x improved performance.

HSBC leveraged Apache Beam as a computational platform and a risk engine that enabled 100x scaling, 2x faster performance, and simplified data distribution for assessing and managing XVA and counterparty credit risk at HSBC’s global scale.

Apache Beam supports Project Shield’s mission to protect freedom of speech and make the web a safer space by enabling ~2x streaming efficiency at >10,000 QPS and real-time visibility into attack data for their >3K customers.

Apache Beam powers the Booking.com global ad bidding for performance marketing and scans 2PB+ of data daily, accelerating processing by an eye-opening 36x and expediting time-to-market by as much as 4x.

Apache Beam has future-proofed Credit Karma’s data and ML platform for scalability and efficiency, enabling MLOps with unified pipelines, processing 5-10 TB daily at 5K events per second, and managing 20K+ ML features.

Apache Beam is a central component of Intuit’s Stream Processing Platform, which has driven 3x faster time-to-production for authoring a stream processing pipeline.

Apache Beam enabled real-time ML streaming feature generation and model execution, playing a pivotal role in optimizing Lyft’s Marketplace ML predictions and processing ~4 million events per minute to generate ~100 features.

Seznam, a Czech search engine, was an early contributor to and adopter of Apache Beam and has migrated several petabyte-scale workloads to Apache Beam pipelines.

Palo Alto Networks, Inc. is a global cybersecurity leader that uses Apache Beam to process ~10 million security log events per second for their real-time streaming infrastructure.

Apache Beam provides Ricardo, a leading Swiss second-hand marketplace, with a scalable and reliable data processing framework that supports fundamental business scenarios and enables real-time and ML data processing.

Apache Hop, an open-source data orchestration platform, uses Apache Beam to “design once, run anywhere” and creates a value-add for Apache Beam users by enabling visual pipeline development and lifecycle management.

At Yelp, Apache Beam allows teams to create custom streaming pipelines using Python, eliminating the need to switch to Scala or Java.

Accenture Baltics uses Apache Beam on Google Cloud to build a robust data processing infrastructure for a sustainable energy leader. They use Beam to democratize data access, process data in real time, and handle complex ETL tasks.

Akvelon built Beam-based solutions for Protegrity and a major North American credit reporting company, enabling tokenization with Dataflow Flex Templates and reducing infrastructure and deployment complexity.

With Apache Beam and Dataflow, Credit Karma achieved 99% uptime for critical data pipelines, a significant jump from 80%. This reliability, coupled with faster development (1 engineer vs. 3 estimated), has been crucial for enabling real-time financial insights for its more than 140 million members.

Have a story to share? Your logo could be here.

Share your story

Stay Up To Date with Beam

Apache Beam 2.66.0 (blog & release, 2025/07/01), by Vitalii Terentev

My Experience at Beam College 2025: 3rd Place Hackathon Winner (blog, 2025/06/16), by Marcio Sugar