Class GroupCombineFunctions
java.lang.Object
org.apache.beam.runners.spark.translation.GroupCombineFunctions
A set of group/combine functions to apply to Spark
RDDs.-
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionstatic <InputT,OutputT, AccumT>
SparkCombineFn.WindowedAccumulator<InputT, InputT, AccumT, ?> combineGlobally(org.apache.spark.api.java.JavaRDD<WindowedValue<InputT>> rdd, SparkCombineFn<InputT, InputT, AccumT, OutputT> sparkCombineFn, Coder<AccumT> aCoder, WindowingStrategy<?, ?> windowingStrategy) Apply a compositeCombine.Globallytransformation.static <K,V, AccumT>
org.apache.spark.api.java.JavaPairRDD<K, SparkCombineFn.WindowedAccumulator<KV<K, V>, V, AccumT, ?>> combinePerKey(org.apache.spark.api.java.JavaRDD<WindowedValue<KV<K, V>>> rdd, SparkCombineFn<KV<K, V>, V, AccumT, ?> sparkCombineFn, Coder<K> keyCoder, Coder<V> valueCoder, Coder<AccumT> aCoder, WindowingStrategy<?, ?> windowingStrategy) Apply a compositeCombine.PerKeytransformation.static <K,V> org.apache.spark.api.java.JavaRDD <KV<K, Iterable<WindowedValue<V>>>> groupByKeyOnly(org.apache.spark.api.java.JavaRDD<WindowedValue<KV<K, V>>> rdd, Coder<K> keyCoder, WindowedValues.WindowedValueCoder<V> wvCoder, @Nullable org.apache.spark.Partitioner partitioner) An implementation ofGroupByKeyViaGroupByKeyOnly.GroupByKeyOnlyfor the Spark runner.static <T> org.apache.spark.api.java.JavaRDD<WindowedValue<T>> reshuffle(org.apache.spark.api.java.JavaRDD<WindowedValue<T>> rdd, WindowedValues.WindowedValueCoder<T> wvCoder) An implementation ofReshufflefor the Spark runner.
-
Constructor Details
-
GroupCombineFunctions
public GroupCombineFunctions()
-
-
Method Details
-
groupByKeyOnly
public static <K,V> org.apache.spark.api.java.JavaRDD<KV<K,Iterable<WindowedValue<V>>>> groupByKeyOnly(org.apache.spark.api.java.JavaRDD<WindowedValue<KV<K, V>>> rdd, Coder<K> keyCoder, WindowedValues.WindowedValueCoder<V> wvCoder, @Nullable org.apache.spark.Partitioner partitioner) An implementation ofGroupByKeyViaGroupByKeyOnly.GroupByKeyOnlyfor the Spark runner. -
combineGlobally
public static <InputT,OutputT, SparkCombineFn.WindowedAccumulator<InputT,AccumT> InputT, combineGloballyAccumT, ?> (org.apache.spark.api.java.JavaRDD<WindowedValue<InputT>> rdd, SparkCombineFn<InputT, InputT, AccumT, OutputT> sparkCombineFn, Coder<AccumT> aCoder, WindowingStrategy<?, ?> windowingStrategy) Apply a compositeCombine.Globallytransformation. -
combinePerKey
public static <K,V, org.apache.spark.api.java.JavaPairRDD<K,AccumT> SparkCombineFn.WindowedAccumulator<KV<K, combinePerKeyV>, V, AccumT, ?>> (org.apache.spark.api.java.JavaRDD<WindowedValue<KV<K, V>>> rdd, SparkCombineFn<KV<K, V>, V, AccumT, ?> sparkCombineFn, Coder<K> keyCoder, Coder<V> valueCoder, Coder<AccumT> aCoder, WindowingStrategy<?, ?> windowingStrategy) Apply a compositeCombine.PerKeytransformation.This aggregation will apply Beam's
Combine.CombineFnvia Spark'sJavaPairRDD.combineByKey(Function, Function2, Function2)aggregation. For streaming, this will be called from within a serialized context (DStream's transform callback), so passed arguments need to be Serializable. -
reshuffle
public static <T> org.apache.spark.api.java.JavaRDD<WindowedValue<T>> reshuffle(org.apache.spark.api.java.JavaRDD<WindowedValue<T>> rdd, WindowedValues.WindowedValueCoder<T> wvCoder) An implementation ofReshufflefor the Spark runner.
-