Package org.apache.beam.runners.spark.translation
package org.apache.beam.runners.spark.translation
Internal translators for running Beam pipelines on Spark.
-
ClassDescriptionAbstractInOutIterator<K,
InputT, OutputT> Abstract base class for iterators that process Spark input data and produce corresponding output values.Holds an RDD or values for deferred conversion to an RDD if needed.Holder for Spark RDD/DStream.DoFnRunnerWithMetrics<InputT,OutputT> DoFnRunner decorator which registersMetricsContainerImpl
.The EvaluationContext allows us to define pipeline instructions and translate betweenPObject<T>
s orPCollection<T>
s and Ts or DStreams/RDDs of Ts.Traverses the pipeline to populate the candidates for group by key.A set of group/combine functions to apply to SparkRDD
s.Functions for GroupByKey with Non-Merging windows translations to Spark.MultiDoFnFunction<InputT,OutputT> DoFunctions ignore outputs that are not the main output.SimpleFunction
to bring the windowing information into the value from the implicit background representation of thePCollection
.Metadata class for side inputs in Spark runner.A specializedConstantInputDStream
that emits its RDD exactly once.SparkAssignWindowFn<T,W extends BoundedWindow> An implementation ofWindow.Assign
for the Spark runner.Translates a bounded portable pipeline into a Spark job.Predicate to determine whether a URN is a Spark native transform.SparkCombineFn<InputT,ValueT, AccumT, OutputT> ACombineFnBase.GlobalCombineFn
with aCombineWithContext.Context
for the SparkRunner.SparkCombineFn.WindowedAccumulator<InputT,ValueT, AccumT, ImplT extends SparkCombineFn.WindowedAccumulator<InputT, ValueT, AccumT, ImplT>> Accumulator of WindowedValues holding values for different windows.Type of the accumulator.Singleton class that contains oneExecutableStageContext.Factory
per job.SparkInputDataProcessor<FnInputT,FnOutputT, OutputT> Processes Spark's input data iterators using Beam'sDoFnRunner
.SparkPCollectionView is used to pass serialized views to lambdas.Type of side input.Translator to support translation between Beam transformations and Spark transformations.Interface for portable Spark translators.SparkProcessContext<K,InputT, OutputT> Holds current processing context forSparkInputDataProcessor
.Translates an unbounded portable pipeline into a Spark job.Translation context used to lazily store Spark datasets during streaming portable pipeline translation and compute them after translation.Translation context used to lazily store Spark data sets during portable pipeline translation and compute them after translation.Describe aPTransform
evaluator.Supports translation between a Beam transform, and Spark's operations on RDDs.Translator matches Beam transformation with the appropriate evaluator.A set of utilities to help translating Beam transformations into Spark transformations.TranslationUtils.CombineGroupedValues<K,InputT, OutputT> A SparkCombineFn function applied to grouped KVs.A utility class to filterTupleTag
s.Kryo serializer forValueAndCoderLazySerializable
.A holder object that lets you serialize an element with a Coder with minimal wasted space.