Class GroupCombineFunctions
java.lang.Object
org.apache.beam.runners.spark.translation.GroupCombineFunctions
A set of group/combine functions to apply to Spark
RDD
s.-
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionstatic <InputT,
OutputT, AccumT>
SparkCombineFn.WindowedAccumulator<InputT, InputT, AccumT, ?> combineGlobally
(org.apache.spark.api.java.JavaRDD<WindowedValue<InputT>> rdd, SparkCombineFn<InputT, InputT, AccumT, OutputT> sparkCombineFn, Coder<AccumT> aCoder, WindowingStrategy<?, ?> windowingStrategy) Apply a compositeCombine.Globally
transformation.static <K,
V, AccumT>
org.apache.spark.api.java.JavaPairRDD<K, SparkCombineFn.WindowedAccumulator<KV<K, V>, V, AccumT, ?>> combinePerKey
(org.apache.spark.api.java.JavaRDD<WindowedValue<KV<K, V>>> rdd, SparkCombineFn<KV<K, V>, V, AccumT, ?> sparkCombineFn, Coder<K> keyCoder, Coder<V> valueCoder, Coder<AccumT> aCoder, WindowingStrategy<?, ?> windowingStrategy) Apply a compositeCombine.PerKey
transformation.static <K,
V> org.apache.spark.api.java.JavaRDD <KV<K, Iterable<WindowedValue<V>>>> groupByKeyOnly
(org.apache.spark.api.java.JavaRDD<WindowedValue<KV<K, V>>> rdd, Coder<K> keyCoder, WindowedValues.WindowedValueCoder<V> wvCoder, @Nullable org.apache.spark.Partitioner partitioner) An implementation ofGroupByKeyViaGroupByKeyOnly.GroupByKeyOnly
for the Spark runner.static <T> org.apache.spark.api.java.JavaRDD
<WindowedValue<T>> reshuffle
(org.apache.spark.api.java.JavaRDD<WindowedValue<T>> rdd, WindowedValues.WindowedValueCoder<T> wvCoder) An implementation ofReshuffle
for the Spark runner.
-
Constructor Details
-
GroupCombineFunctions
public GroupCombineFunctions()
-
-
Method Details
-
groupByKeyOnly
public static <K,V> org.apache.spark.api.java.JavaRDD<KV<K,Iterable<WindowedValue<V>>>> groupByKeyOnly(org.apache.spark.api.java.JavaRDD<WindowedValue<KV<K, V>>> rdd, Coder<K> keyCoder, WindowedValues.WindowedValueCoder<V> wvCoder, @Nullable org.apache.spark.Partitioner partitioner) An implementation ofGroupByKeyViaGroupByKeyOnly.GroupByKeyOnly
for the Spark runner. -
combineGlobally
public static <InputT,OutputT, SparkCombineFn.WindowedAccumulator<InputT,AccumT> InputT, combineGloballyAccumT, ?> (org.apache.spark.api.java.JavaRDD<WindowedValue<InputT>> rdd, SparkCombineFn<InputT, InputT, AccumT, OutputT> sparkCombineFn, Coder<AccumT> aCoder, WindowingStrategy<?, ?> windowingStrategy) Apply a compositeCombine.Globally
transformation. -
combinePerKey
public static <K,V, org.apache.spark.api.java.JavaPairRDD<K,AccumT> SparkCombineFn.WindowedAccumulator<KV<K, combinePerKeyV>, V, AccumT, ?>> (org.apache.spark.api.java.JavaRDD<WindowedValue<KV<K, V>>> rdd, SparkCombineFn<KV<K, V>, V, AccumT, ?> sparkCombineFn, Coder<K> keyCoder, Coder<V> valueCoder, Coder<AccumT> aCoder, WindowingStrategy<?, ?> windowingStrategy) Apply a compositeCombine.PerKey
transformation.This aggregation will apply Beam's
Combine.CombineFn
via Spark'sJavaPairRDD.combineByKey(Function, Function2, Function2)
aggregation. For streaming, this will be called from within a serialized context (DStream's transform callback), so passed arguments need to be Serializable. -
reshuffle
public static <T> org.apache.spark.api.java.JavaRDD<WindowedValue<T>> reshuffle(org.apache.spark.api.java.JavaRDD<WindowedValue<T>> rdd, WindowedValues.WindowedValueCoder<T> wvCoder) An implementation ofReshuffle
for the Spark runner.
-