@DefaultAnnotation(value=edu.umd.cs.findbugs.annotations.NonNull.class)
PTransform
s for transforming data in a pipeline.See: Description
Interface | Description |
---|---|
Combine.AccumulatingCombineFn.Accumulator<InputT,AccumT,OutputT> |
The type of mutable accumulator values used by this
AccumulatingCombineFn . |
CombineFnBase.GlobalCombineFn<InputT,AccumT,OutputT> |
For internal use only; no backwards-compatibility guarantees.
|
CombineWithContext.RequiresContextInternal |
An internal interface for signaling that a
GloballyCombineFn or a PerKeyCombineFn needs to access CombineWithContext.Context . |
Contextful.Fn<InputT,OutputT> |
A function from an input to an output that may additionally access
Contextful.Fn.Context when
computing the result. |
DoFn.MultiOutputReceiver |
Receives tagged output for a multi-output function.
|
DoFn.OutputReceiver<T> |
Receives values of the given type.
|
Materialization<T> |
For internal use only; no backwards-compatibility guarantees.
|
Materializations.MultimapView<K,V> |
Represents the
PrimitiveViewT supplied to the ViewFn when it declares to use
the multimap materialization . |
Partition.PartitionFn<T> |
A function object that chooses an output partition for an element.
|
SerializableComparator<T> |
A
Comparator that is also Serializable . |
SerializableFunction<InputT,OutputT> |
A function that computes an output value of type
OutputT from an input value of type
InputT and is Serializable . |
Watch.Growth.TerminationCondition<InputT,StateT> |
A strategy for determining whether it is time to stop polling the current input regardless of
whether its output is complete or not.
|
Class | Description |
---|---|
ApproximateQuantiles |
PTransform s for getting an idea of a PCollection 's data distribution using
approximate N -tiles (e.g. |
ApproximateQuantiles.ApproximateQuantilesCombineFn<T,ComparatorT extends java.util.Comparator<T> & java.io.Serializable> |
The
ApproximateQuantilesCombineFn combiner gives an idea of the distribution of a
collection of values using approximate N -tiles. |
ApproximateUnique |
PTransform s for estimating the number of distinct elements in a PCollection , or
the number of distinct values associated with each key in a PCollection of KV s. |
ApproximateUnique.ApproximateUniqueCombineFn<T> |
CombineFn that computes an estimate of the number of distinct values that were
combined. |
ApproximateUnique.ApproximateUniqueCombineFn.LargestUnique |
A heap utility class to efficiently track the largest added elements.
|
Combine |
PTransform s for combining PCollection elements globally and per-key. |
Combine.AccumulatingCombineFn<InputT,AccumT extends Combine.AccumulatingCombineFn.Accumulator<InputT,AccumT,OutputT>,OutputT> |
A
CombineFn that uses a subclass of Combine.AccumulatingCombineFn.Accumulator as its
accumulator type. |
Combine.BinaryCombineDoubleFn |
An abstract subclass of
Combine.CombineFn for implementing combiners that are more easily and
efficiently expressed as binary operations on double s. |
Combine.BinaryCombineFn<V> |
An abstract subclass of
Combine.CombineFn for implementing combiners that are more easily
expressed as binary operations. |
Combine.BinaryCombineIntegerFn |
An abstract subclass of
Combine.CombineFn for implementing combiners that are more easily and
efficiently expressed as binary operations on int s |
Combine.BinaryCombineLongFn |
An abstract subclass of
Combine.CombineFn for implementing combiners that are more easily and
efficiently expressed as binary operations on long s. |
Combine.CombineFn<InputT,AccumT,OutputT> |
A
CombineFn<InputT, AccumT, OutputT> specifies how to combine a collection of input
values of type InputT into a single output value of type OutputT . |
Combine.Globally<InputT,OutputT> |
Combine.Globally<InputT, OutputT> takes a PCollection<InputT> and returns a
PCollection<OutputT> whose elements are the result of combining all the elements in
each window of the input PCollection , using a specified CombineFn<InputT, AccumT, OutputT> . |
Combine.GloballyAsSingletonView<InputT,OutputT> |
Combine.GloballyAsSingletonView<InputT, OutputT> takes a PCollection<InputT>
and returns a PCollectionView<OutputT> whose elements are the result of combining all
the elements in each window of the input PCollection , using a specified CombineFn<InputT, AccumT, OutputT> . |
Combine.GroupedValues<K,InputT,OutputT> |
GroupedValues<K, InputT, OutputT> takes a PCollection<KV<K, Iterable<InputT>>> ,
such as the result of GroupByKey , applies a specified CombineFn<InputT, AccumT, OutputT> to each of the input KV<K, Iterable<InputT>>
elements to produce a combined output KV<K, OutputT> element, and returns a PCollection<KV<K, OutputT>> containing all the combined output elements. |
Combine.Holder<V> |
Holds a single value value of type
V which may or may not be present. |
Combine.IterableCombineFn<V> | |
Combine.PerKey<K,InputT,OutputT> |
PerKey<K, InputT, OutputT> takes a PCollection<KV<K, InputT>> , groups it by
key, applies a combining function to the InputT values associated with each key to
produce a combined OutputT value, and returns a PCollection<KV<K, OutputT>>
representing a map from each distinct key of the input PCollection to the corresponding
combined value. |
Combine.PerKeyWithHotKeyFanout<K,InputT,OutputT> |
Like
Combine.PerKey , but sharding the combining of hot keys. |
Combine.SimpleCombineFn<V> | Deprecated |
CombineFnBase |
For internal use only; no backwards-compatibility guarantees.
|
CombineFns |
Static utility methods that create combine function instances.
|
CombineFns.CoCombineResult |
A tuple of outputs produced by a composed combine functions.
|
CombineFns.ComposeCombineFnBuilder |
A builder class to construct a composed
CombineFnBase.GlobalCombineFn . |
CombineFns.ComposedCombineFn<DataT> |
A composed
Combine.CombineFn that applies multiple CombineFns . |
CombineFns.ComposedCombineFnWithContext<DataT> |
A composed
CombineWithContext.CombineFnWithContext that applies multiple CombineFnWithContexts . |
CombineWithContext |
This class contains combine functions that have access to
PipelineOptions and side inputs
through CombineWithContext.Context . |
CombineWithContext.CombineFnWithContext<InputT,AccumT,OutputT> |
A combine function that has access to
PipelineOptions and side inputs through CombineWithContext.Context . |
CombineWithContext.Context |
Information accessible to all methods in
CombineFnWithContext and KeyedCombineFnWithContext . |
Contextful<ClosureT> |
Pair of a bit of user code (a "closure") and the
Requirements needed to run it. |
Contextful.Fn.Context |
An accessor for additional capabilities available in
Contextful.Fn.apply(InputT, org.apache.beam.sdk.transforms.Contextful.Fn.Context) . |
Count |
PTransforms to count the elements in a PCollection . |
Create<T> |
Create<T> takes a collection of elements of type T known when the pipeline is
constructed and returns a PCollection<T> containing the elements. |
Create.OfValueProvider<T> | |
Create.TimestampedValues<T> |
A
PTransform that creates a PCollection whose elements have associated
timestamps. |
Create.Values<T> |
A
PTransform that creates a PCollection from a set of in-memory objects. |
Distinct<T> |
Distinct<T> takes a PCollection<T> and returns a PCollection<T> that has
all distinct elements of the input. |
Distinct.WithRepresentativeValues<T,IdT> |
A
Distinct PTransform that uses a SerializableFunction to obtain a
representative value for each input element. |
DoFn<InputT,OutputT> |
The argument to
ParDo providing the code to use to process elements of the input PCollection . |
DoFn.ProcessContinuation |
When used as a return value of
DoFn.ProcessElement , indicates whether there is more work to
be done for the current element. |
DoFnOutputReceivers |
Common
DoFn.OutputReceiver and DoFn.MultiOutputReceiver classes. |
DoFnTester<InputT,OutputT> | Deprecated
Use
TestPipeline with the DirectRunner . |
Filter<T> |
PTransform s for filtering from a PCollection the elements satisfying a predicate,
or satisfying an inequality with a given value based on the elements' natural ordering. |
FlatMapElements<InputT,OutputT> |
PTransform s for mapping a simple function that returns iterables over the elements of a
PCollection and merging the results. |
Flatten |
Flatten<T> takes multiple PCollection<T> s bundled into a PCollectionList<T> and returns a single PCollection<T> containing all the elements in
all the input PCollection s. |
Flatten.Iterables<T> |
FlattenIterables<T> takes a PCollection<Iterable<T>> and returns a PCollection<T> that contains all the elements from each iterable. |
Flatten.PCollections<T> |
A
PTransform that flattens a PCollectionList into a PCollection
containing all the elements of all the PCollection s in its input. |
GroupByKey<K,V> |
GroupByKey<K, V> takes a PCollection<KV<K, V>> , groups the values by key and
windows, and returns a PCollection<KV<K, Iterable<V>>> representing a map from each
distinct key and window of the input PCollection to an Iterable over all the
values associated with that key in the input per window. |
GroupIntoBatches<K,InputT> |
A
PTransform that batches inputs to a desired batch size. |
Impulse |
For internal use only; no backwards-compatibility guarantees.
|
JsonToRow |
Experimental
|
Keys<K> |
Keys<K> takes a PCollection of KV<K, V> s and returns a PCollection<K> of the keys. |
KvSwap<K,V> |
KvSwap<K, V> takes a PCollection<KV<K, V>> and returns a PCollection<KV<V,
K>> , where all the keys and values have been swapped. |
Latest | |
MapElements<InputT,OutputT> |
PTransform s for mapping a simple function over the elements of a PCollection . |
Materializations |
For internal use only; no backwards-compatibility guarantees.
|
Max |
PTransform s for computing the maximum of the elements in a PCollection , or the
maximum of the values associated with each key in a PCollection of KV s. |
Mean |
PTransform s for computing the arithmetic mean (a.k.a. |
Min |
PTransform s for computing the minimum of the elements in a PCollection , or the
minimum of the values associated with each key in a PCollection of KV s. |
ParDo |
ParDo is the core element-wise transform in Apache Beam, invoking a user-specified
function on each of the elements of the input PCollection to produce zero or more output
elements, all of which are collected into the output PCollection . |
ParDo.MultiOutput<InputT,OutputT> |
A
PTransform that, when applied to a PCollection<InputT> , invokes a
user-specified DoFn<InputT, OutputT> on all its elements, which can emit elements to
any of the PTransform 's output PCollection s, which are bundled into a result
PCollectionTuple . |
ParDo.SingleOutput<InputT,OutputT> |
A
PTransform that, when applied to a PCollection<InputT> , invokes a
user-specified DoFn<InputT, OutputT> on all its elements, with all its outputs
collected into an output PCollection<OutputT> . |
Partition<T> |
Partition takes a PCollection<T> and a PartitionFn , uses the PartitionFn to split the elements of the input PCollection into N partitions,
and returns a PCollectionList<T> that bundles N PCollection<T> s
containing the split elements. |
PTransform<InputT extends PInput,OutputT extends POutput> | |
Regex |
PTransorm s to use Regular Expressions to process elements in a PCollection . |
Regex.AllMatches |
Regex.MatchesName<String> takes a PCollection<String> and returns a PCollection<List<String>> representing the value extracted from all the Regex groups of the
input PCollection to the number of times that element occurs in the input. |
Regex.Find |
Regex.Find<String> takes a PCollection<String> and returns a PCollection<String> representing the value extracted from the Regex groups of the input PCollection to the number of times that element occurs in the input. |
Regex.FindAll |
Regex.Find<String> takes a PCollection<String> and returns a PCollection<List<String>> representing the value extracted from the Regex groups of the input
PCollection to the number of times that element occurs in the input. |
Regex.FindKV |
Regex.MatchesKV<KV<String, String>> takes a PCollection<String> and returns a
PCollection<KV<String, String>> representing the key and value extracted from the Regex
groups of the input PCollection to the number of times that element occurs in the
input. |
Regex.FindName |
Regex.Find<String> takes a PCollection<String> and returns a PCollection<String> representing the value extracted from the Regex groups of the input PCollection to the number of times that element occurs in the input. |
Regex.FindNameKV |
Regex.MatchesKV<KV<String, String>> takes a PCollection<String> and returns a
PCollection<KV<String, String>> representing the key and value extracted from the Regex
groups of the input PCollection to the number of times that element occurs in the
input. |
Regex.Matches |
Regex.Matches<String> takes a PCollection<String> and returns a PCollection<String> representing the value extracted from the Regex groups of the input PCollection to the number of times that element occurs in the input. |
Regex.MatchesKV |
Regex.MatchesKV<KV<String, String>> takes a PCollection<String> and returns a
PCollection<KV<String, String>> representing the key and value extracted from the Regex
groups of the input PCollection to the number of times that element occurs in the
input. |
Regex.MatchesName |
Regex.MatchesName<String> takes a PCollection<String> and returns a PCollection<String> representing the value extracted from the Regex groups of the input PCollection to the number of times that element occurs in the input. |
Regex.MatchesNameKV |
Regex.MatchesNameKV<KV<String, String>> takes a PCollection<String> and returns
a PCollection<KV<String, String>> representing the key and value extracted from the
Regex groups of the input PCollection to the number of times that element occurs in the
input. |
Regex.ReplaceAll |
Regex.ReplaceAll<String> takes a PCollection<String> and returns a PCollection<String> with all Strings that matched the Regex being replaced with the
replacement string. |
Regex.ReplaceFirst |
Regex.ReplaceFirst<String> takes a PCollection<String> and returns a PCollection<String> with the first Strings that matched the Regex being replaced with the
replacement string. |
Regex.Split |
Regex.Split<String> takes a PCollection<String> and returns a PCollection<String> with the input string split into individual items in a list. |
Reify |
PTransforms for converting between explicit and implicit form of various Beam
values. |
Requirements |
Describes the run-time requirements of a
Contextful , such as access to side inputs. |
Reshuffle<K,V> | Deprecated
this transform's intended side effects are not portable; it will likely be removed
|
Reshuffle.ViaRandomKey<T> |
Implementation of
Reshuffle.viaRandomKey() . |
Sample |
PTransform s for taking samples of the elements in a PCollection , or samples of
the values associated with each key in a PCollection of KV s. |
Sample.FixedSizedSampleFn<T> |
CombineFn that computes a fixed-size sample of a collection of values. |
SerializableFunctions |
Useful
SerializableFunction overrides. |
SimpleFunction<InputT,OutputT> |
A
SerializableFunction which is not a functional interface. |
Sum |
PTransform s for computing the sum of the elements in a PCollection , or the sum of
the values associated with each key in a PCollection of KV s. |
Top |
PTransform s for finding the largest (or smallest) set of elements in a PCollection , or the largest (or smallest) set of values associated with each key in a PCollection of KV s. |
Top.Largest<T extends java.lang.Comparable<? super T>> | Deprecated
use
Top.Natural instead |
Top.Natural<T extends java.lang.Comparable<? super T>> |
A
Serializable Comparator that that uses the compared elements' natural
ordering. |
Top.Reversed<T extends java.lang.Comparable<? super T>> |
Serializable Comparator that that uses the reverse of the compared elements'
natural ordering. |
Top.Smallest<T extends java.lang.Comparable<? super T>> | Deprecated
use
Top.Reversed instead |
Top.TopCombineFn<T,ComparatorT extends java.util.Comparator<T> & java.io.Serializable> |
CombineFn for Top transforms that combines a bunch of T s into a single
count -long List<T> , using compareFn to choose the largest T s. |
ToString |
PTransforms for converting a PCollection<?> , PCollection<KV<?,?>> , or PCollection<Iterable<?>> to a PCollection<String> . |
Values<V> |
Values<V> takes a PCollection of KV<K, V> s and returns a PCollection<V> of the values. |
View |
Transforms for creating
PCollectionViews from PCollections (to read them as side inputs). |
View.AsIterable<T> |
For internal use only; no backwards-compatibility guarantees.
|
View.AsList<T> |
For internal use only; no backwards-compatibility guarantees.
|
View.AsMap<K,V> |
For internal use only; no backwards-compatibility guarantees.
|
View.AsMultimap<K,V> |
For internal use only; no backwards-compatibility guarantees.
|
View.AsSingleton<T> |
For internal use only; no backwards-compatibility guarantees.
|
View.CreatePCollectionView<ElemT,ViewT> |
For internal use only; no backwards-compatibility guarantees.
|
ViewFn<PrimitiveViewT,ViewT> |
For internal use only; no backwards-compatibility guarantees.
|
Wait |
Delays processing of each window in a
PCollection until signaled. |
Wait.OnSignal<T> |
Implementation of
Wait.on(org.apache.beam.sdk.values.PCollection<?>...) . |
Watch |
Given a "poll function" that produces a potentially growing set of outputs for an input, this
transform simultaneously continuously watches the growth of output sets of all inputs, until a
per-input termination condition is reached.
|
Watch.Growth<InputT,OutputT,KeyT> | |
Watch.Growth.PollFn<InputT,OutputT> |
A function that computes the current set of outputs for the given input, in the form of a
Watch.Growth.PollResult . |
Watch.Growth.PollResult<OutputT> |
The result of a single invocation of a
Watch.Growth.PollFn . |
WithKeys<K,V> |
WithKeys<K, V> takes a PCollection<V> , and either a constant key of type K or a function from V to K , and returns a PCollection<KV<K, V>> , where
each of the values in the input PCollection has been paired with either the constant key
or a key computed from the value. |
WithTimestamps<T> |
A
PTransform for assigning timestamps to all the elements of a PCollection . |
Enum | Description |
---|---|
DoFnTester.CloningBehavior | Deprecated
Use
TestPipeline with the DirectRunner . |
Annotation Type | Description |
---|---|
DoFn.BoundedPerElement |
Annotation on a splittable
DoFn
specifying that the DoFn performs a bounded amount of work per input element, so
applying it to a bounded PCollection will produce also a bounded PCollection . |
DoFn.Element |
Parameter annotation for the input element for a
DoFn.ProcessElement method. |
DoFn.FieldAccess |
Annotation for specifying specific fields that are accessed in a Schema PCollection.
|
DoFn.FinishBundle |
Annotation for the method to use to finish processing a batch of elements.
|
DoFn.GetInitialRestriction |
Annotation for the method that maps an element to an initial restriction for a splittable
DoFn . |
DoFn.GetRestrictionCoder |
Annotation for the method that returns the coder to use for the restriction of a splittable
DoFn . |
DoFn.NewTracker |
Annotation for the method that creates a new
RestrictionTracker for the restriction of
a splittable DoFn . |
DoFn.OnTimer |
Annotation for registering a callback for a timer.
|
DoFn.ProcessElement |
Annotation for the method to use for processing elements.
|
DoFn.RequiresStableInput |
Experimental - no backwards compatibility guarantees.
|
DoFn.Setup |
Annotation for the method to use to prepare an instance for processing bundles of elements.
|
DoFn.SplitRestriction |
Annotation for the method that splits restriction of a splittable
DoFn into multiple parts to
be processed in parallel. |
DoFn.StartBundle |
Annotation for the method to use to prepare an instance for processing a batch of elements.
|
DoFn.StateId |
Annotation for declaring and dereferencing state cells.
|
DoFn.Teardown |
Annotation for the method to use to clean up this instance before it is discarded.
|
DoFn.TimerId |
Annotation for declaring and dereferencing timers.
|
DoFn.Timestamp |
Parameter annotation for the input element timestamp for a
DoFn.ProcessElement method. |
DoFn.UnboundedPerElement |
Annotation on a splittable
DoFn
specifying that the DoFn performs an unbounded amount of work per input element, so
applying it to a bounded PCollection will produce an unbounded PCollection . |
PTransform
s for transforming data in a pipeline.
A PTransform
is an operation that takes an InputT
(some subtype of PInput
) and produces an OutputT
(some subtype of POutput
).
Common PTransforms include root PTransforms like TextIO.Read
and Create
, processing and conversion operations like
ParDo
, GroupByKey
,
CoGroupByKey
, Combine
, and Count
, and
outputting PTransforms like TextIO.Write
.
New PTransforms can be created by composing existing PTransforms. Most PTransforms in this package are composites, and users can also create composite PTransforms for their own application-specific logic.