Separates the elements of a collection into multiple output collections. The partitioning function contains the logic that assigns each element of the input collection to one of the resulting output collections.
The number of partitions must be determined at graph construction time. You cannot determine the number of partitions mid-pipeline, for example from data computed after the pipeline graph has been constructed.
See more information in the Beam Programming Guide.
Example: dividing a PCollection into percentile groups
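The following is a minimal sketch of the partitioning logic in plain Python, not the Beam implementation itself. It assumes each element is an integer percentile in the range 0 to 99; the names `by_percentile` and `simulate_partition` are hypothetical and introduced only for illustration. The helper mimics the behavior described above: the number of partitions is fixed up front, and the partitioning function returns an index that selects the output collection for each element.

```python
def by_percentile(percentile, num_partitions):
    """Map a percentile (assumed to be in 0..99) to a partition index.

    The (element, num_partitions) argument order is the shape of a
    partitioning function; the percentile range is an assumption.
    """
    return percentile * num_partitions // 100


def simulate_partition(elements, partition_fn, num_partitions):
    """Hypothetical stand-in for the transform, for illustration only.

    The number of outputs is fixed before any element is processed.
    """
    outputs = [[] for _ in range(num_partitions)]
    for element in elements:
        index = partition_fn(element, num_partitions)
        # The function must return an index in 0..num_partitions - 1.
        if not 0 <= index < num_partitions:
            raise ValueError(f"partition index {index} out of range")
        outputs[index].append(element)
    return outputs


percentiles = [3, 17, 42, 58, 99]
groups = simulate_partition(percentiles, by_percentile, 10)
```

In the Beam Python SDK, a function with this shape would typically be passed to the transform as `beam.Partition(by_percentile, 10)`, which yields one output collection per partition index.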
Related transforms:
- Filter is useful if the function only decides whether or not to output an element.
- ParDo is the most general element-wise mapping operation, and includes other abilities such as multiple output collections and side inputs.
- CoGroupByKey performs a per-key equijoin.
Last updated on 2023/12/01