We are happy to present the new 2.38.0 release of Beam. This release includes both improvements and new functionality. See the download page for this release.
For more information on changes in 2.38.0 check out the detailed release notes.
- Introduce projection pushdown optimizer to the Java SDK (BEAM-12976). The optimizer currently only works on the BigQuery Storage API, but more I/Os will be added in future releases. If you encounter a bug with the optimizer, please file a JIRA and disable the optimizer using pipeline option
- A new IO for Neo4j graph databases was added (BEAM-1857). It can update nodes and relationships using UNWIND statements and read data using Cypher statements with parameters.
- amazon-web-services2 has reached feature parity and is finally recommended over the earlier
kinesis modules (Java). These will be deprecated in one of the next releases (BEAM-13174).
- Long outstanding write support for Kinesis was added (BEAM-13175).
- Configuration was simplified and made consistent across all IOs, including the usage of AwsOptions (BEAM-13563, BEAM-13663, BEAM-13587).
- Additionally, there’s a long list of recent improvements and fixes to S3Filesystem (BEAM-13245, BEAM-13246, BEAM-13441, BEAM-13445, BEAM-14011), DynamoDBIO (BEAM-13209, BEAM-13209), SQSIO (BEAM-13631, BEAM-13510) and others.
New Features / Improvements
- Pipeline dependencies supplied through --requirements_file will now be staged to the runner using binary distributions (wheels) of the PyPI packages for the linux_x86_64 platform (BEAM-4032). To restore the previous behavior of using source distributions, set pipeline option --requirements_cache_only_sources. To skip staging the packages at submission time, set pipeline option
- The Flink runner now supports Flink 1.14.x (BEAM-13106).
- Interactive Beam now supports remotely executing Flink pipelines on Dataproc (Python) (BEAM-14071).
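The requirements-file change in the first item of this list can be illustrated by assembling the flags as plain command-line arguments. This is only a minimal sketch: the helper name requirements_args and the file name requirements.txt are hypothetical, and only the two flag names are taken from the notes above.

```python
from typing import List


def requirements_args(requirements_file: str, source_only: bool = False) -> List[str]:
    """Build pipeline flags for staging PyPI dependencies (hypothetical helper)."""
    args = [f"--requirements_file={requirements_file}"]
    if source_only:
        # Opt back in to staging source distributions instead of wheels.
        args.append("--requirements_cache_only_sources")
    return args


if __name__ == "__main__":
    # Default in 2.38.0: dependencies are staged as binary distributions (wheels).
    print(requirements_args("requirements.txt"))
    # Restore the earlier behavior of using source distributions.
    print(requirements_args("requirements.txt", source_only=True))
```

In a real pipeline, a list like this would typically be handed to PipelineOptions from apache_beam.options.pipeline_options, or simply passed on the command line when launching the job.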
- (Python) Previously DoFn.infer_output_types was expected to return
element_type is the PCollection element type. It is now expected to return element_type. Take care if you have overridden
DoFn (this is not common). See BEAM-13860.
- (amazon-web-services2) The types of
AwsOptions changed from String to
- Beam 2.38.0 will be the last minor release to support Flink 1.11.
- (amazon-web-services2) Client providers (withXYZClientProvider()) as well as IO-specific RetryConfigurations are deprecated; instead, use
AwsOptions to configure AWS IOs / clients. Custom implementations of client providers shall be replaced with a respective ClientBuilderFactory and configured through
- Fix S3 copy for large objects (Java) (BEAM-14011)
- Fix quadratic behavior of pipeline canonicalization (Go) (BEAM-14128)
- This caused unnecessarily long pre-processing times before job submission for large complex pipelines.
- Fix pyarrow version parsing (Python) (BEAM-14235)
- See a full list of open issues that affect this version.
List of Contributors
According to git shortlog, the following people contributed to the 2.38.0 release. Thank you to all contributors!
abhijeet-lele Ahmet Altay akustov Alexander Alexander Zhuravlev Alexey Romanenko AlikRodriguez Anand Inguva andoni-guzman andreukus Andy Ye Ankur Goenka ansh0l Artur Khanin Aydar Farrakhov Aydar Zainutdinov Benjamin Gonzalez Brian Hulette brucearctor bulat safiullin bullet03 Carl Mastrangelo Chamikara Jayalath Chun Yang Daniela Martín Daniel Oliveira Danny McCormick daria.malkova David Cavazos David Huntsperger dmitryor Dmytro Sadovnychyi dpcollins-google egalpin Elias Segundo Antonio emily Etienne Chauchot Hengfeng Li Ismaël Mejía Israel Herraiz Jack McCluskey Jakub Kukul Janek Bevendorff Jeff Klukas Johan Sternby Kamil Breguła Kenneth Knowles Ke Wu Kiley Kyle Weaver laraschmidt Lara Schmidt LE QUELLEC Olivier Luka Kalinovcic Luke Cwik Marcin Kuthan masahitojp Masato Nakamura Matt Casters Melissa Pashniak Michael Li Miguel Hernandez Moritz Mack mosche nancyxu123 Nathan J Mehl Niel Markwick Ning Kang Pablo Estrada paul-tlh Pavel Avilov Rahul Iyer Reuven Lax Ritesh Ghorse Robert Bradshaw Robert Burke Ryan Skraba Ryan Thompson Sam Whittle Seth Vargo sp029619 Steven Niemitz Thiago Nunes Udi Meiri Valentyn Tymofieiev Victor vitaly.terentyev Yichi Zhang Yi Hu yirutang Zachary Houfek Zoe