Element-wise

| Transform | Description |
| --- | --- |
| Filter | Given a predicate, filters out all elements that don't satisfy the predicate. |
| FlatMapElements | Applies a function that returns a collection to every element in the input and outputs all resulting elements. |
| Keys | Extracts the key from each element in a collection of key-value pairs. |
| KvSwap | Swaps the key and value of each element in a collection of key-value pairs. |
| MapElements | Applies a function to every element in the input and outputs the result. |
| ParDo | The most general mechanism for applying a user-defined `DoFn` to every element in the input collection. |
| Partition | Routes each input element to a specific output collection based on some partition function. |
| Regex | Filters input string elements based on a regex. May also transform them based on the matching groups. |
| Reify | Transforms for converting between explicit and implicit forms of various Beam values. |
| ToString | Transforms every element in an input collection to a string. |
| WithKeys | Produces a collection containing each element from the input collection converted to a key-value pair, with a key selected by applying a function to the input element. |
| WithTimestamps | Applies a function to determine a timestamp for each element and updates the implicit timestamp associated with each input. Note that it is only safe to adjust timestamps forwards. |
| Values | Extracts the value from each element in a collection of key-value pairs. |
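
The difference between several of the element-wise transforms is easiest to see side by side. The following is a minimal sketch in plain Java that models the semantics described above on in-memory lists. It does not use the Beam API: the class `ElementWiseSketch` and its method names are hypothetical and exist only for illustration, and a real pipeline would apply the corresponding Beam transforms to a `PCollection` instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.ToIntFunction;

// Hypothetical in-memory model of four element-wise transforms (not Beam code).
final class ElementWiseSketch {

    // Filter: keep only the elements that satisfy the predicate.
    static <T> List<T> filter(List<T> input, Predicate<T> keep) {
        List<T> out = new ArrayList<>();
        for (T element : input) {
            if (keep.test(element)) {
                out.add(element);
            }
        }
        return out;
    }

    // MapElements: exactly one output element per input element.
    static <T, R> List<R> mapElements(List<T> input, Function<T, R> fn) {
        List<R> out = new ArrayList<>();
        for (T element : input) {
            out.add(fn.apply(element));
        }
        return out;
    }

    // FlatMapElements: the function returns a collection; all results are emitted.
    static <T, R> List<R> flatMapElements(List<T> input, Function<T, List<R>> fn) {
        List<R> out = new ArrayList<>();
        for (T element : input) {
            out.addAll(fn.apply(element));
        }
        return out;
    }

    // Partition: route each element to one of n outputs chosen by the partition function.
    static <T> List<List<T>> partition(List<T> input, int n, ToIntFunction<T> partitionFn) {
        List<List<T>> outputs = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            outputs.add(new ArrayList<>());
        }
        for (T element : input) {
            outputs.get(partitionFn.applyAsInt(element)).add(element);
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(filter(numbers, x -> x % 2 == 0));          // [2, 4]
        System.out.println(mapElements(numbers, x -> x * 10));         // [10, 20, 30, 40, 50]
        System.out.println(flatMapElements(Arrays.asList("a b", "c"),
                line -> Arrays.asList(line.split(" "))));              // [a, b, c]
        System.out.println(partition(numbers, 2, x -> x % 2));         // [[2, 4], [1, 3, 5]]
    }
}
```

The model is deliberately sequential and ordered. Beam collections are unordered and processed in parallel, so only the per-element behaviour carries over.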

Aggregation

| Transform | Description |
| --- | --- |
| ApproximateQuantiles | Uses an approximation algorithm to estimate the data distribution within each aggregation using a specified number of quantiles. |
| ApproximateUnique | Uses an approximation algorithm to estimate the number of unique elements within each aggregation. |
| CoGroupByKey | Similar to `GroupByKey`, but takes several keyed collections and groups the values from all of them by their common key. |
| Combine | Transforms to combine elements according to a provided `CombineFn`. |
| CombineWithContext | An extended version of Combine which allows accessing side-inputs and other context. |
| Count | Counts the number of elements within each aggregation. |
| Distinct | Produces a collection containing distinct elements from the input collection. |
| GroupByKey | Takes a keyed collection of elements and produces a collection where each element consists of a key and all values associated with that key. |
| GroupIntoBatches | Batches values associated with keys into `Iterable` batches of some size. Each batch contains elements associated with a specific key. |
| HllCount | Estimates the number of distinct elements and creates re-aggregatable sketches using the HyperLogLog++ algorithm. |
| Latest | Selects the latest element within each aggregation according to the implicit timestamp. |
| Max | Outputs the maximum element within each aggregation. |
| Mean | Computes the average within each aggregation. |
| Min | Outputs the minimum element within each aggregation. |
| Sample | Randomly selects some number of elements from each aggregation. |
| Sum | Computes the sum of elements in each aggregation. |
| Top | Computes the largest element(s) in each aggregation. |
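
The three key-based grouping transforms are easy to confuse, so a small model may help. The sketch below is plain Java over in-memory lists, not Beam code: the class `GroupingSketch`, its method names, and the two-input restriction for the co-grouping case are assumptions made for illustration. It shows one grouped entry per key, values from two inputs kept in separate slots per key, and per-key batches of a bounded size.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of key-based grouping (not Beam code).
final class GroupingSketch {

    // GroupByKey: one entry per key, holding all values seen for that key.
    static <K, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> input) {
        Map<K, List<V>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> kv : input) {
            out.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        return out;
    }

    // CoGroupByKey (two inputs): per key, slot 0 holds values from `left`
    // and slot 1 holds values from `right`; a slot may be empty.
    static <K, V> Map<K, List<List<V>>> coGroupByKey(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, V>> right) {
        Map<K, List<List<V>>> out = new LinkedHashMap<>();
        addToSlot(out, left, 0);
        addToSlot(out, right, 1);
        return out;
    }

    private static <K, V> void addToSlot(
            Map<K, List<List<V>>> out, List<Map.Entry<K, V>> input, int slot) {
        for (Map.Entry<K, V> kv : input) {
            List<List<V>> slots = out.get(kv.getKey());
            if (slots == null) {
                slots = new ArrayList<>();
                slots.add(new ArrayList<>());
                slots.add(new ArrayList<>());
                out.put(kv.getKey(), slots);
            }
            slots.get(slot).add(kv.getValue());
        }
    }

    // GroupIntoBatches: per key, split the values into batches of at most batchSize.
    static <K, V> Map<K, List<List<V>>> groupIntoBatches(
            List<Map.Entry<K, V>> input, int batchSize) {
        Map<K, List<List<V>>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> kv : input) {
            List<List<V>> batches = out.computeIfAbsent(kv.getKey(), k -> new ArrayList<>());
            if (batches.isEmpty() || batches.get(batches.size() - 1).size() == batchSize) {
                batches.add(new ArrayList<>());
            }
            batches.get(batches.size() - 1).add(kv.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> orders =
                List.of(Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        List<Map.Entry<String, Integer>> refunds =
                List.of(Map.entry("a", 10), Map.entry("c", 30));

        System.out.println(groupByKey(orders));            // {a=[1, 3], b=[2]}
        System.out.println(coGroupByKey(orders, refunds)); // {a=[[1, 3], [10]], b=[[2], []], c=[[], [30]]}
        System.out.println(groupIntoBatches(orders, 1));   // {a=[[1], [3]], b=[[2]]}
    }
}
```

Requires Java 9 or later for `List.of` and `Map.entry`. The model fills batches in arrival order; it says nothing about how a runner orders elements or when it emits a partially filled batch.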

Other

| Transform | Description |
| --- | --- |
| Create | Creates a collection from an in-memory list. |
| Flatten | Given multiple input collections, produces a single output collection containing all elements from all of the input collections. |
| PAssert | A transform to assert the contents of a `PCollection`, used as part of testing a pipeline either locally or with a runner. |
| View | Operations for turning a collection into a view that may be used as a side-input to a `ParDo`. |
| Window | Logically divides up or groups the elements of a collection into finite windows according to a provided `WindowFn`. |

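
To make "finite windows" concrete, here is a minimal plain-Java sketch of one possible windowing rule: fixed-size, non-overlapping windows, where each element lands in the window whose start is its timestamp rounded down to a multiple of the window size. This is an assumption chosen for illustration, not the only kind of `WindowFn`, and the class `FixedWindowSketch` and its methods are hypothetical rather than part of the Beam API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of fixed-size window assignment (not Beam code).
final class FixedWindowSketch {

    // Start of the fixed window that contains the given timestamp.
    // floorMod keeps the result correct for negative timestamps as well.
    static long windowStart(long timestampMillis, long sizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, sizeMillis);
    }

    // Groups timestamped elements by the start of their window.
    static <T> Map<Long, List<T>> assign(List<Map.Entry<Long, T>> timestamped, long sizeMillis) {
        Map<Long, List<T>> windows = new TreeMap<>();
        for (Map.Entry<Long, T> element : timestamped) {
            long start = windowStart(element.getKey(), sizeMillis);
            windows.computeIfAbsent(start, s -> new ArrayList<>()).add(element.getValue());
        }
        return windows;
    }

    public static void main(String[] args) {
        List<Map.Entry<Long, String>> events =
                List.of(Map.entry(5L, "a"), Map.entry(59L, "b"), Map.entry(60L, "c"), Map.entry(125L, "d"));
        // With a window size of 60, the windows start at 0, 60, and 120.
        System.out.println(assign(events, 60L)); // {0=[a, b], 60=[c], 120=[d]}
    }
}
```

Each window here covers the half-open interval from its start up to, but not including, start plus size, so the element at timestamp 60 belongs to the second window.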
Last updated on 2025/01/19