The Apache Beam community is pleased to announce the availability of version 2.0.0. This is the first stable release of Apache Beam. With it, the community signals its intent to maintain API stability across all releases for the foreseeable future, making Beam suitable for enterprise deployment.

This first stable release is the third important milestone for the Apache Beam community. Beam joined the Apache Incubator in February 2016 and graduated as a top-level project of The Apache Software Foundation in December 2016. Through these fifteen months of concentrated effort, a slightly chaotic codebase, merged from multiple organizations, has been developed into a generalized framework for data processing that is truly engine- and environment-independent. Apache Beam has evolved and improved through three incubating and three post-incubation releases, culminating in the stable release announced today as version 2.0.0.

In the five months since graduation, Apache Beam has seen significant growth, both in adoption and in community contribution. Apache Beam is in use at Google Cloud, PayPal, and Talend, among others.

Apache Beam, version 2.0.0, improves user experience across the project, focusing on seamless portability across execution environments, including engines, operating systems, on-premises clusters, cloud providers, and data storage systems. Other highlights include:

  • API stability and future compatibility within this major version.
  • Stateful data processing paradigms that unlock efficient, data-dependent computations.
  • Support for user-extensible file systems, with built-in support for Hadoop Distributed File System, among others.
  • A metrics subsystem for deeper insight into pipeline execution.

Many contributors made this release possible by participating in different roles: contributing code, writing documentation, testing release candidates, supporting users, or helping in some other way. Since the previous release, 76 individuals contributed code to the project. The following partial list of contributors was assembled from source history:

  • Ahmet Altay
  • Eric Anderson
  • Raghu Angadi
  • Sourabh Bajaj
  • Péter Gergő Barna
  • Chen Bin
  • Davor Bonaci
  • Robert Bradshaw
  • Ben Chambers
  • Etienne Chauchot
  • Chang Chen
  • Charles Chen
  • Craig Citro
  • Lukasz Cwik
  • Márton Elek
  • Pablo Estrada
  • Josh Forman-Gornall
  • Maria García Herrero
  • Jins George
  • Damien Gouyette
  • Thomas Groh
  • Dan Halperin
  • Pei He
  • Hadar Hod
  • Chamikara Jayalath
  • Rekha Joshi
  • Uwe Jugel
  • Sung Junyoung
  • Holden Karau
  • Vikas Kedigehalli
  • Eugene Kirpichov
  • Tibor Kiss
  • Kenneth Knowles
  • Vassil Kolarov
  • Chinmay Kolhatkar
  • Aljoscha Krettek
  • Dipti Kulkarni
  • Radhika Kulkarni
  • Jason Kuster
  • Reuven Lax
  • Stas Levin
  • Julien Lhermitte
  • Jingsong Li
  • Neville Li
  • Mark Liu
  • Michael Luckey
  • Andrew Martin
  • Ismaël Mejía
  • Devon Meunier
  • Neda Mirian
  • Anil Muppalla
  • Gergely Novak
  • Jean-Baptiste Onofré
  • Melissa Pashniak
  • peay
  • David Rieber
  • Rahul Sabbineni
  • Kobi Salant
  • Amit Sela
  • Mark Shalda
  • Stephen Sisk
  • Yuya Tajima
  • Wesley Tanaka
  • JiJun Tang
  • Valentyn Tymofieiev
  • David Volquartz
  • Huafeng Wang
  • Thomas Weise
  • Rafal Wojdyla
  • Yangping Wu
  • wyp
  • James Xu
  • Mingmin Xu
  • Ted Yu
  • Borisa Zivkovic
  • Aviem Zur

Apache Beam, version 2.0.0, is making its debut at Apache: Big Data, taking place this week in Miami, FL, with four sessions featuring Apache Beam. Apache Beam will also be highlighted at numerous face-to-face meetups and conferences, including the Future of Data San Jose meetup, Strata Data Conference London, Berlin Buzzwords, and DataWorks Summit San Jose.

We’d like to invite everyone to try out Apache Beam today and consider joining our vibrant community. We welcome feedback, contribution, and participation through our mailing lists, issue tracker, pull requests, and events.