Element-wise

| Transform | Description |
| --- | --- |
| Enrichment | Performs data enrichment with a remote service. |
| Filter | Given a predicate, filters out all elements that don't satisfy the predicate. |
| FlatMap | Applies a function that returns a collection to every element in the input and outputs all resulting elements. |
| Keys | Extracts the key from each element in a collection of key-value pairs. |
| KvSwap | Swaps the key and value of each element in a collection of key-value pairs. |
| Map | Applies a function to every element in the input and outputs the result. |
| MLTransform | Applies data processing transforms to the dataset. |
| ParDo | The most general mechanism for applying a user-defined DoFn to every element in the input collection. |
| Partition | Routes each input element to a specific output collection based on some partition function. |
| Regex | Filters input string elements based on a regex. May also transform them based on the matching groups. |
| Reify | Transforms for converting between explicit and implicit forms of various Beam values. |
| RunInference | Uses machine learning (ML) models to do local and remote inference. |
| ToString | Transforms every element in an input collection to a string. |
| WithTimestamps | Applies a function to determine a timestamp for each element in the output collection, and updates the implicit timestamp associated with each input. Note that it is only safe to adjust timestamps forwards. |
| Values | Extracts the value from each element in a collection of key-value pairs. |
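To make the element-wise descriptions concrete, the sketch below models a few of them (Map, Filter, FlatMap, Keys, Values, KvSwap, Partition) on ordinary Python lists. It is not the Beam API: the function names are hypothetical stand-ins chosen for illustration, and a real pipeline would apply the corresponding transforms to a PCollection rather than a list.

```python
# Plain-Python models of element-wise semantics (hypothetical helpers, not Beam).

def map_elements(fn, elements):
    # Map: exactly one output per input element.
    return [fn(x) for x in elements]

def filter_elements(predicate, elements):
    # Filter: keep only the elements that satisfy the predicate.
    return [x for x in elements if predicate(x)]

def flat_map(fn, elements):
    # FlatMap: fn returns a collection; all results are merged into one output.
    return [y for x in elements for y in fn(x)]

def keys(pairs):
    # Keys: the first component of each key-value pair.
    return [k for k, _ in pairs]

def values(pairs):
    # Values: the second component of each key-value pair.
    return [v for _, v in pairs]

def kv_swap(pairs):
    # KvSwap: exchange key and value in each pair.
    return [(v, k) for k, v in pairs]

def partition(partition_fn, num_partitions, elements):
    # Partition: partition_fn(element, num_partitions) picks the output index.
    outputs = [[] for _ in range(num_partitions)]
    for x in elements:
        outputs[partition_fn(x, num_partitions)].append(x)
    return outputs

tokens = flat_map(str.split, ["apple pie", "kiwi"])
lengths = map_elements(len, tokens)
long_tokens = filter_elements(lambda t: len(t) > 3, tokens)
swapped = kv_swap([("a", 1), ("b", 2)])
by_parity = partition(lambda x, n: x % n, 2, [1, 2, 3, 4])
```

Here `tokens` holds every word from both input strings, which is the difference between FlatMap and Map: mapping `str.split` instead would yield one list per input string.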
Aggregation

| Transform | Description |
| --- | --- |
| ApproximateQuantiles | Given a distribution, finds the approximate N-tiles. |
| ApproximateUnique | Given a PCollection, returns the estimated number of unique elements. |
| BatchElements | Batches elements for amortized processing. |
| CoGroupByKey | Takes several keyed collections of elements and produces a collection where each element consists of a key and all values associated with that key. |
| CombineGlobally | Transforms to combine elements. |
| CombinePerKey | Transforms to combine elements for each key. |
| CombineValues | Transforms to combine keyed iterables. |
| Count | Counts the number of elements within each aggregation. |
| Distinct | Produces a collection containing distinct elements from the input collection. |
| GroupByKey | Takes a keyed collection of elements and produces a collection where each element consists of a key and all values associated with that key. |
| GroupBy | Takes a collection of elements and produces a collection grouped by properties of those elements. Unlike GroupByKey, the key is dynamically created from the elements themselves. |
| GroupIntoBatches | Batches the input into batches of a desired size. |
| Latest | Gets the element with the latest timestamp. |
| Max | Gets the element with the maximum value within each aggregation. |
| Mean | Computes the average within each aggregation. |
| Min | Gets the element with the minimum value within each aggregation. |
| Sample | Randomly selects some number of elements from each aggregation. |
| Sum | Sums all the elements within each aggregation. |
| ToList | Aggregates all elements into a single list. |
| Top | Computes the largest element(s) in each aggregation. |
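The difference between GroupByKey, CombinePerKey, and CoGroupByKey is easier to see on a small input. The sketch below models them on in-memory lists of key-value tuples. The helper names are hypothetical, this is not the Beam API, and a distributed runner would not guarantee the order of grouped values the way a list does.

```python
from collections import defaultdict

# Plain-Python models of aggregation semantics (hypothetical helpers, not Beam).

def group_by_key(pairs):
    # GroupByKey: one entry per key, holding all values seen for that key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

def combine_per_key(combine_fn, pairs):
    # CombinePerKey: reduce each key's values to a single result.
    return {key: combine_fn(vals) for key, vals in group_by_key(pairs).items()}

def co_group_by_key(**tagged_collections):
    # CoGroupByKey: for every key in any input, collect its values per input tag.
    all_keys = {key for pairs in tagged_collections.values() for key, _ in pairs}
    return {
        key: {
            tag: [v for k, v in pairs if k == key]
            for tag, pairs in tagged_collections.items()
        }
        for key in all_keys
    }

def top(n, elements):
    # Top: the n largest elements, largest first.
    return sorted(elements, reverse=True)[:n]

sales = [("fruit", 3), ("veg", 1), ("fruit", 5)]
grouped = group_by_key(sales)
totals = combine_per_key(sum, sales)

emails = [("amy", "amy@example.com")]
phones = [("amy", "111-222"), ("bob", "333-444")]
joined = co_group_by_key(emails=emails, phones=phones)
```

In this model `joined["bob"]["emails"]` is an empty list: a key that appears in only one input still produces an output element, with no values for the other inputs.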
Other

| Transform | Description |
| --- | --- |
| Create | Creates a collection from an in-memory list. |
| Flatten | Given multiple input collections, produces a single output collection containing all elements from all of the input collections. |
| Reshuffle | Given an input collection, redistributes the elements between workers. This is most useful for adjusting parallelism or preventing coupled failures. |
| WindowInto | Logically divides up or groups the elements of a collection into finite windows according to a function. |
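As a rough illustration of Flatten and WindowInto, the sketch below merges several lists and assigns timestamped elements to fixed-size windows. It assumes fixed windows with no offset and integer timestamps in seconds. The helper names are hypothetical and this is not the Beam API. The model also groups elements by window only to make the assignment visible, which is a simplification of this sketch.

```python
from collections import defaultdict

# Plain-Python models of Flatten and fixed windowing (hypothetical helpers, not Beam).

def flatten(*collections):
    # Flatten: one output containing every element of every input.
    return [x for collection in collections for x in collection]

def fixed_window(timestamp, size):
    # Half-open window [start, start + size) that contains the timestamp.
    start = timestamp - timestamp % size
    return (start, start + size)

def window_into_fixed(timestamped_elements, size):
    # Assign each (value, timestamp) pair to its window and group by window.
    windows = defaultdict(list)
    for value, timestamp in timestamped_elements:
        windows[fixed_window(timestamp, size)].append(value)
    return dict(windows)

merged = flatten([1, 2], [3], [4, 5])
events = [("a", 1), ("b", 12), ("c", 14)]
windowed = window_into_fixed(events, size=10)
```

With a window size of 10, the events at timestamps 12 and 14 share the window (10, 20), while the event at timestamp 1 falls in (0, 10).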
Last updated on 2024/10/08