blog & release
2021/11/11
Apache Beam 2.34.0
Kyle Weaver
We are happy to present the new 2.34.0 release of Beam. This release includes both improvements and new functionality. See the download page for this release.
For more information on changes in 2.34.0, check out the detailed release notes.
Highlights
- The Beam Java API for Calcite SqlTransform is no longer experimental (BEAM-12680).
- Python’s ParDo (Map, FlatMap, etc.) transforms now support a with_exception_handling option for easily ignoring bad records and implementing the dead letter pattern.
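For readers unfamiliar with the dead letter pattern mentioned above, here is a minimal sketch of the idea in plain Python. It is not Beam code and does not show the with_exception_handling API itself; the helper name map_with_dead_letters and the shape of the failure tuples are assumptions made for illustration only. The idea is that records that raise during processing are set aside with their error instead of failing the whole job.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def map_with_dead_letters(
    fn: Callable[[T], U], records: Iterable[T]
) -> Tuple[List[U], List[Tuple[T, str]]]:
    """Apply fn to each record, routing failures to a dead letter list.

    Hypothetical helper for illustration; not part of the Beam SDK.
    """
    good: List[U] = []
    dead_letters: List[Tuple[T, str]] = []
    for record in records:
        try:
            good.append(fn(record))
        except Exception as exc:  # broad on purpose: any failure marks a bad record
            # Keep the original input next to a description of the error
            # so the bad record can be inspected or reprocessed later.
            dead_letters.append((record, repr(exc)))
    return good, dead_letters


# Parse strings as integers; the malformed record is captured, not fatal.
parsed, failed = map_with_dead_letters(int, ["1", "2", "oops", "4"])
```

In a real pipeline the two outputs would typically be separate collections, with the dead letter output written somewhere durable for later inspection.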
I/Os
- ReadFromBigQuery and ReadAllFromBigQuery now run queries with BATCH priority by default. The query_priority parameter is introduced to the same transforms to allow configuring the query priority (Python) (BEAM-12913).
- [EXPERIMENTAL] Support for BigQuery Storage Read API added to ReadFromBigQuery. The newly introduced method parameter can be set as DIRECT_READ to use the Storage Read API. The default is EXPORT which invokes a BigQuery export request. (Python) (BEAM-10917).
- [EXPERIMENTAL] Added use_native_datetime parameter to ReadFromBigQuery to configure the return type of DATETIME fields when using ReadFromBigQuery. This parameter can only be used when method = DIRECT_READ (Python) (BEAM-10917).
New Features / Improvements
- Upgrade to Calcite 1.26.0 (BEAM-9379).
- Added a new dataframe extra to the Python SDK that tracks pandas versions we’ve verified compatibility with. We now recommend installing Beam with pip install apache-beam[dataframe] when you intend to use the DataFrame API (BEAM-12906).
- Add an example of deploying Python Apache Beam job with Spark Cluster
Breaking Changes
- SQL Rows are no longer flattened (BEAM-5505).
- [Go SDK] beam.TryCrossLanguage’s signature now matches beam.CrossLanguage. Like other Try functions it returns an error instead of panicking. (BEAM-9918).
- BEAM-12925 was fixed. It used to silently pass incorrect null data read from JdbcIO. Pipelines affected by this will now start throwing failures instead of silently passing incorrect data.
Bugfixes
- Fixed error while writing multiple DeferredFrames to csv (Python) (BEAM-12701).
- Fixed error when importing the DataFrame API with pandas 1.0.x installed (BEAM-12945).
- Fixed top.SmallestPerKey implementation in the Go SDK (BEAM-12946).
Known Issues
- Large Java BigQueryIO writes with the FILE_LOADS method will fail in batch mode (specifically, when copy jobs are used).
This results in the error message: IllegalArgumentException: Attempting to access unknown side input.
Please upgrade to a newer version (> 2.34.0) or use another write method (e.g. STORAGE_WRITE_API).
List of Contributors
According to git shortlog, the following people contributed to the 2.34.0 release. Thank you to all contributors!
Ahmet Altay, Aizhamal Nurmamat kyzy, Alex Amato, Alexander Chermenin, Alexey Romanenko, AlikRodriguez, Andrew Pilloud, Andy Xu, Ankur Goenka, Aydar Farrakhov, Aydar Zainutdinov, Aydar Zaynutdinov, AydarZaynutdinov, Benjamin Gonzalez, BenWhitehead, Brachi Packter, Brian Hulette, Bu Sun Kim, Chamikara Jayalath, Chris Gray, Chuck Yang, Chun Yang, Claire McGinty, comet, Daniel Collins, Daniel Oliveira, Daniel Thevessen, daria.malkova, David Cavazos, David Huntsperger, Dmytro Kozhevin, dpcollins-google, Eduardo Sánchez López, Elias Djurfeldt, emily, Emily Ye, Enis Sert, Etienne Chauchot, Fernando Morales, Heejong Lee, Ihor Indyk, Ismaël Mejía, Israel Herraiz, Jack McCluskey, Jonathan Hourany, Judah Rand, Kenneth Knowles, KevinGG, Ke Wu, kileys, Kyle Weaver, Luke Cwik, masahitojp, MiguelAnzoWizeline, Minbo Bae, Niels Basjes, Ning Kang, Pablo Estrada, pareshsarafmdb, Paul Féraud, Piotr Szczepanik, Reuven Lax, Ritesh Ghorse, R. Miles McCain, Robert Bradshaw, Robert Burke, Rogan Morrow, Ruwan Lambrichts, rvballada, Ryan Thompson, Sam Rohde, Sam Whittle, Ștefan Istrate, Steve Niemitz, Thomas Li Fredriksen, Tomo Suzuki, tvalentyn, Udi Meiri, Vachan, Valentyn Tymofieiev, Vincent Marquez, WinsonT, Yichi Zhang, Yifan Mai, Yilei “Dolee” Yang, zhoufek

