Type Parameters:
InputT - type of input values
OutputT - type of output values
public static class Combine.Globally<InputT,OutputT> extends PTransform<PCollection<InputT>,PCollection<OutputT>>
Combine.Globally<InputT, OutputT> takes a PCollection<InputT> and returns a PCollection<OutputT> whose elements are the result of combining all the elements in each window of the input PCollection, using a specified CombineFn<InputT, AccumT, OutputT>. It is common for InputT == OutputT, but not required. Common combining functions include sums, mins, maxes, and averages of numbers, conjunctions and disjunctions of booleans, statistical aggregations, etc.
Example of use:
PCollection<Integer> pc = ...;
PCollection<Integer> sum = pc.apply(
    Combine.globally(Sum.ofIntegers()));
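The three type parameters of a CombineFn are easiest to see when they differ. The following is a minimal plain-Java sketch, not Beam code: the class MeanModel is a hypothetical stand-in whose four static methods mirror the CombineFn lifecycle (createAccumulator, addInput, mergeAccumulators, extractOutput) for an average, with InputT = Integer, AccumT = long[] holding {sum, count}, and OutputT = Double. Returning NaN for empty input is a choice made for this sketch only.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a CombineFn<Integer, long[], Double>; not a Beam class. */
public class MeanModel {
  /** Returns an empty accumulator laid out as {sum, count}. */
  public static long[] createAccumulator() {
    return new long[] {0L, 0L};
  }

  /** Folds one input value into a copy of the accumulator. */
  public static long[] addInput(long[] acc, int input) {
    return new long[] {acc[0] + input, acc[1] + 1};
  }

  /** Merges accumulators that were built from separate subsets of the input. */
  public static long[] mergeAccumulators(List<long[]> accs) {
    long[] merged = createAccumulator();
    for (long[] acc : accs) {
      merged[0] += acc[0];
      merged[1] += acc[1];
    }
    return merged;
  }

  /** Converts the final accumulator to the output value; NaN for empty input (sketch-only choice). */
  public static double extractOutput(long[] acc) {
    return acc[1] == 0 ? Double.NaN : (double) acc[0] / acc[1];
  }

  public static void main(String[] args) {
    // Two subsets of the input are accumulated separately, then merged.
    long[] left = addInput(addInput(createAccumulator(), 1), 2);
    long[] right = addInput(addInput(createAccumulator(), 3), 6);
    double mean = extractOutput(mergeAccumulators(Arrays.asList(left, right)));
    System.out.println(mean); // 3.0
  }
}
```

The accumulator keeps both the sum and the count because partial averages cannot be merged directly; only the final step converts to the output type.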
Combining can happen in parallel, with different subsets of the input PCollection
being combined separately, and their intermediate results combined further, in an arbitrary
tree reduction pattern, until a single result value is produced.
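The tree reduction described above can be modeled in plain Java. This is an illustrative sketch only: TreeReduceSketch and treeCombine are hypothetical names, and the even split used here is just one possible tree. Because a runner may group elements arbitrarily, the combining function should be associative and commutative so that every tree shape gives the same result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

/** Hypothetical illustration of tree-shaped combining; not part of Beam. */
public class TreeReduceSketch {
  /** Combines a non-empty list by combining each half separately, then combining the two results. */
  public static <T> T treeCombine(List<T> values, BinaryOperator<T> combiner) {
    if (values.isEmpty()) {
      throw new IllegalArgumentException("values must be non-empty");
    }
    if (values.size() == 1) {
      return values.get(0);
    }
    int mid = values.size() / 2;
    T left = treeCombine(values.subList(0, mid), combiner);
    T right = treeCombine(values.subList(mid, values.size()), combiner);
    return combiner.apply(left, right);
  }

  public static void main(String[] args) {
    List<Integer> input = Arrays.asList(5, 1, 4, 2, 3);
    int sum = treeCombine(input, Integer::sum);
    int max = treeCombine(input, Math::max);
    System.out.println(sum); // 15
    System.out.println(max); // 5
  }
}
```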
If the input PCollection is windowed into GlobalWindows, a default value in the GlobalWindow will be output if the input PCollection is empty. To use this transform with inputs that use any other windowing, either withoutDefaults() or asSingletonView() must be called, because the default value cannot be automatically assigned to any single window.
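A plain-Java model may help show why a default only makes sense for the global window. This sketch is not Beam code: DefaultValueSketch, globalSum, and sumPerFixedWindow are hypothetical names, elements are modeled as {timestamp, value} pairs, and fixed windows are assumed to start at multiples of the window size. With empty input, the global sum still has one obvious place for its default, while the per-window result has no window to attach a default to.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of global versus per-window sums; not Beam code. */
public class DefaultValueSketch {
  /** Sum over a single global window: empty input still yields the identity value 0. */
  public static int globalSum(List<int[]> timestampedValues) {
    int sum = 0;
    for (int[] tv : timestampedValues) {
      sum += tv[1];
    }
    return sum;
  }

  /** Sum per fixed window, keyed by window start; each element is {timestamp, value}. */
  public static Map<Integer, Integer> sumPerFixedWindow(List<int[]> timestampedValues, int size) {
    Map<Integer, Integer> sums = new TreeMap<>();
    for (int[] tv : timestampedValues) {
      int windowStart = Math.floorDiv(tv[0], size) * size;
      sums.merge(windowStart, tv[1], Integer::sum);
    }
    return sums;
  }

  public static void main(String[] args) {
    List<int[]> empty = Collections.emptyList();
    System.out.println(globalSum(empty));             // 0
    System.out.println(sumPerFixedWindow(empty, 10)); // {}
  }
}
```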
By default, the Coder of the output PValue<OutputT> is inferred from the concrete type of the CombineFn<InputT, AccumT, OutputT>'s output type OutputT.
See also Combine.perKey(org.apache.beam.sdk.transforms.SerializableFunction<java.lang.Iterable<V>, V>)/Combine.PerKey and Combine.groupedValues(org.apache.beam.sdk.transforms.SerializableFunction<java.lang.Iterable<V>, V>)/Combine.GroupedValues, which are useful for combining values associated with each key in a PCollection of KVs.
Fields inherited from class PTransform: name
Modifier and Type | Method and Description |
---|---|
Combine.GloballyAsSingletonView<InputT,OutputT> | asSingletonView() Returns a PTransform that produces a PCollectionView whose elements are the result of combining elements per-window in the input PCollection. |
PCollection<OutputT> | expand(PCollection<InputT> input) Override this method to specify how this PTransform should be expanded on the given InputT. |
java.util.Map<TupleTag<?>,PValue> | getAdditionalInputs() Returns the side inputs of this Combine, tagged with the tag of the PCollectionView. |
CombineFnBase.GlobalCombineFn<? super InputT,?,OutputT> | getFn() Returns the CombineFnBase.GlobalCombineFn used by this Combine operation. |
protected java.lang.String | getKindString() Returns the name to use by default for this PTransform (not including the names of any enclosing PTransforms). |
java.util.List<PCollectionView<?>> | getSideInputs() Returns the side inputs used by this Combine operation. |
boolean | isInsertDefault() Returns whether or not this transformation applies a default value. |
void | populateDisplayData(DisplayData.Builder builder) Register display data for the given transform or component. |
Combine.Globally<InputT,OutputT> | withFanout(int fanout) Returns a PTransform identical to this, but that uses an intermediate node to combine parts of the data to reduce load on the final global combine step. |
Combine.Globally<InputT,OutputT> | withoutDefaults() Returns a PTransform identical to this, but that does not attempt to provide a default value in the case of empty input. |
Combine.Globally<InputT,OutputT> | withSideInputs(java.lang.Iterable<? extends PCollectionView<?>> sideInputs) Returns a PTransform identical to this, but with the specified side inputs to use in CombineWithContext.CombineFnWithContext. |
Combine.Globally<InputT,OutputT> | withSideInputs(PCollectionView<?>... sideInputs) Returns a PTransform identical to this, but with the specified side inputs to use in CombineWithContext.CombineFnWithContext. |
Methods inherited from class PTransform: compose, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getName, toString, validate
protected java.lang.String getKindString()
Description copied from class: PTransform
Returns the name to use by default for this PTransform (not including the names of any enclosing PTransforms).
By default, returns the base name of this PTransform's class.
The caller is responsible for ensuring that names of applied PTransforms are unique, e.g., by adding a uniquifying suffix when needed.
Overrides: getKindString in class PTransform<PCollection<InputT>,PCollection<OutputT>>
public Combine.GloballyAsSingletonView<InputT,OutputT> asSingletonView()
Returns a PTransform that produces a PCollectionView whose elements are the result of combining elements per-window in the input PCollection. If a value is requested from the view for a window that is not present, the result of applying the CombineFn to an empty input set will be returned.
public Combine.Globally<InputT,OutputT> withoutDefaults()
Returns a PTransform identical to this, but that does not attempt to provide a default value in the case of empty input. Required when the input is not globally windowed and the output is not being used as a side input.
public Combine.Globally<InputT,OutputT> withFanout(int fanout)
Returns a PTransform identical to this, but that uses an intermediate node to combine parts of the data to reduce load on the final global combine step.
The fanout parameter determines the number of intermediate keys that will be used.
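The idea behind the fanout parameter can be modeled in plain Java. This is a sketch under stated assumptions, not how Beam implements withFanout: FanoutSketch and sumWithFanout are hypothetical names, and the round-robin assignment of elements to intermediate keys is assumed purely for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of combining through intermediate keys; not Beam's implementation. */
public class FanoutSketch {
  /** Sums values by first summing within fanout intermediate buckets, then summing the buckets. */
  public static long sumWithFanout(List<Integer> values, int fanout) {
    if (fanout < 1) {
      throw new IllegalArgumentException("fanout must be at least 1");
    }
    long[] partials = new long[fanout];
    for (int i = 0; i < values.size(); i++) {
      // Assumption: round-robin assignment; a runner may choose intermediate keys differently.
      partials[i % fanout] += values.get(i);
    }
    // The final step only has to combine fanout partial results instead of every element.
    long total = 0;
    for (long partial : partials) {
      total += partial;
    }
    return total;
  }

  public static void main(String[] args) {
    List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6);
    System.out.println(sumWithFanout(input, 3)); // 21
  }
}
```

A larger fanout spreads the first stage over more intermediate keys, at the cost of more partial results for the final step to merge.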
public Combine.Globally<InputT,OutputT> withSideInputs(PCollectionView<?>... sideInputs)
Returns a PTransform identical to this, but with the specified side inputs to use in CombineWithContext.CombineFnWithContext.
public Combine.Globally<InputT,OutputT> withSideInputs(java.lang.Iterable<? extends PCollectionView<?>> sideInputs)
Returns a PTransform identical to this, but with the specified side inputs to use in CombineWithContext.CombineFnWithContext.
public CombineFnBase.GlobalCombineFn<? super InputT,?,OutputT> getFn()
Returns the CombineFnBase.GlobalCombineFn used by this Combine operation.
public java.util.List<PCollectionView<?>> getSideInputs()
Returns the side inputs used by this Combine operation.
public java.util.Map<TupleTag<?>,PValue> getAdditionalInputs()
Returns the side inputs of this Combine, tagged with the tag of the PCollectionView. The values of the returned map will be equal to the result of getSideInputs().
Overrides: getAdditionalInputs in class PTransform<PCollection<InputT>,PCollection<OutputT>>
public boolean isInsertDefault()
Returns whether or not this transformation applies a default value.
public PCollection<OutputT> expand(PCollection<InputT> input)
Description copied from class: PTransform
Override this method to specify how this PTransform should be expanded on the given InputT.
NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
Specified by: expand in class PTransform<PCollection<InputT>,PCollection<OutputT>>
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: PTransform
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by: populateDisplayData in interface HasDisplayData
Overrides: populateDisplayData in class PTransform<PCollection<InputT>,PCollection<OutputT>>
Parameters: builder - The builder to populate with display data.
See Also: HasDisplayData