org.apache.beam.runners.spark.translation (Apache Beam 2.69.0)

package org.apache.beam.runners.spark.translation

Internal translators for running Beam pipelines on Spark.
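
As a rough illustration of what "translation" means here, the sketch below walks a list of pipeline nodes and hands each one to an evaluator registered for its kind. It uses only the Java standard library. Every name in it (TranslationSketch, Node, Evaluator, Registry, the "read"/"pardo"/"gbk" kinds and the strings written to the plan) is hypothetical and chosen for illustration; none of them is a Beam or Spark API, and the real translators in this package operate on Beam transforms and Spark RDDs/DStreams rather than strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: none of these types are Beam or Spark classes. */
final class TranslationSketch {

  /** Stand-in for a pipeline node; kind plays the role of a transform type. */
  static final class Node {
    final String kind;
    final String name;

    Node(String kind, String name) {
      this.kind = kind;
      this.name = name;
    }
  }

  /** Stand-in for an evaluator: records the engine-side step it would create. */
  interface Evaluator {
    void evaluate(Node node, List<String> plan);
  }

  /** Maps each transform kind to the evaluator that knows how to translate it. */
  static final class Registry {
    private final Map<String, Evaluator> evaluators = new HashMap<>();

    Registry register(String kind, Evaluator evaluator) {
      evaluators.put(kind, evaluator);
      return this;
    }

    /** Translates nodes in order, failing fast on a kind with no evaluator. */
    List<String> translate(List<Node> pipeline) {
      List<String> plan = new ArrayList<>();
      for (Node node : pipeline) {
        Evaluator evaluator = evaluators.get(node.kind);
        if (evaluator == null) {
          throw new IllegalStateException("No evaluator registered for " + node.kind);
        }
        evaluator.evaluate(node, plan);
      }
      return plan;
    }
  }

  /** Builds a registry with three illustrative (assumed) mappings. */
  static Registry defaultRegistry() {
    return new Registry()
        .register("read", (node, plan) -> plan.add("source(" + node.name + ")"))
        .register("pardo", (node, plan) -> plan.add("mapPartitions(" + node.name + ")"))
        .register("gbk", (node, plan) -> plan.add("groupByKey(" + node.name + ")"));
  }

  public static void main(String[] args) {
    List<Node> pipeline = new ArrayList<>();
    pipeline.add(new Node("read", "lines"));
    pipeline.add(new Node("pardo", "tokenize"));
    pipeline.add(new Node("gbk", "perWord"));
    // Print the engine-side steps the registry produced, one per line.
    for (String step : defaultRegistry().translate(pipeline)) {
      System.out.println(step);
    }
  }
}
```

Keeping the lookup in one registry means an unsupported transform kind is reported at translation time, before any job would be submitted.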

Related Packages

Package

Description

org.apache.beam.runners.spark

Internal implementation of the Beam runner for Apache Spark.

org.apache.beam.runners.spark.translation.streaming

Internal utilities to translate Beam pipelines to Spark streaming.

Class

Description

AbstractInOutIterator<K,InputT,OutputT>

Abstract base class for iterators that process Spark input data and produce corresponding output values.

BoundedDataset<T>

Holds an RDD or values for deferred conversion to an RDD if needed.

Dataset

Holder for Spark RDD/DStream.

DoFnRunnerWithMetrics<InputT,OutputT>

DoFnRunner decorator which registers MetricsContainerImpl.

EvaluationContext

Defines pipeline instructions and translates between PObject<T>s or PCollection<T>s and the corresponding Ts or DStreams/RDDs of Ts.

GroupByKeyVisitor

Traverses the pipeline to populate the candidates for group by key.

GroupCombineFunctions

A set of group/combine functions to apply to Spark RDDs.

GroupNonMergingWindowsFunctions

Functions for translating GroupByKey with non-merging windows to Spark.

MultiDoFnFunction<InputT,OutputT>

Unlike plain DoFunctions, which ignore outputs that are not the main output, handles additional outputs by tagging the underlying data with TupleTags.

ReifyTimestampsAndWindowsFunction<K,V>

Simple Function to bring the windowing information into the value from the implicit background representation of the PCollection.

SideInputMetadata

Metadata class for side inputs in Spark runner.

SingleEmitInputDStream<T>

A specialized ConstantInputDStream that emits its RDD exactly once.

SparkAssignWindowFn<T,W extends BoundedWindow>

An implementation of Window.Assign for the Spark runner.

SparkBatchPortablePipelineTranslator

Translates a bounded portable pipeline into a Spark job.

SparkBatchPortablePipelineTranslator.IsSparkNativeTransform

Predicate to determine whether a URN is a Spark native transform.

SparkCombineFn<InputT,ValueT,AccumT,OutputT>

A CombineFnBase.GlobalCombineFn with a CombineWithContext.Context for the SparkRunner.

SparkCombineFn.WindowedAccumulator<InputT,ValueT,AccumT,ImplT extends SparkCombineFn.WindowedAccumulator<InputT,ValueT,AccumT,ImplT>>

Accumulator of WindowedValues holding values for different windows.

SparkCombineFn.WindowedAccumulator.Type

Type of the accumulator.

SparkContextFactory

SparkExecutableStageContextFactory

Singleton class that contains one ExecutableStageContext.Factory per job.

SparkInputDataProcessor<FnInputT,FnOutputT,OutputT>

Processes Spark's input data iterators using Beam's DoFnRunner.

SparkPCollectionView

SparkPCollectionView is used to pass serialized views to lambdas.

SparkPCollectionView.Type

Type of side input.

SparkPipelineTranslator

Translator to support translation between Beam transformations and Spark transformations.

SparkPortablePipelineTranslator<T extends SparkTranslationContext>

Interface for portable Spark translators.

SparkProcessContext<K,InputT,OutputT>

Holds current processing context for SparkInputDataProcessor.

SparkStreamingPortablePipelineTranslator

Translates an unbounded portable pipeline into a Spark job.

SparkStreamingTranslationContext

Translation context used to lazily store Spark datasets during streaming portable pipeline translation and compute them after translation.

SparkTranslationContext

Translation context used to lazily store Spark data sets during portable pipeline translation and compute them after translation.

TransformEvaluator<TransformT extends PTransform<?,?>>

Describes a PTransform evaluator.

TransformTranslator

Supports translation between a Beam transform and Spark's operations on RDDs.

TransformTranslator.Translator

Translator matches Beam transformation with the appropriate evaluator.

TranslationUtils

A set of utilities to help translate Beam transformations into Spark transformations.

TranslationUtils.CombineGroupedValues<K,InputT,OutputT>

A SparkCombineFn function applied to grouped KVs.

TranslationUtils.TupleTagFilter<V>

A utility class to filter TupleTags.

ValueAndCoderKryoSerializer<T>

Kryo serializer for ValueAndCoderLazySerializable.

ValueAndCoderLazySerializable<T>

A holder object that lets you serialize an element with a Coder with minimal wasted space.
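
The idea behind a holder that serializes an element with a coder can be sketched with the standard library alone. In the sketch below, the holder writes only a length prefix and the coder-encoded bytes during Java serialization, and defers decoding on the receiving side until the value is first requested. All names (LazyCodedSketch, SimpleCoder, LazyCoded, getOrDecode, isPending, roundTrip) are hypothetical stand-ins, and the assumption that the reader supplies the coder again is a simplification made for this example; it is not a description of how ValueAndCoderLazySerializable or Beam's Coder is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: these are not Beam classes. */
final class LazyCodedSketch {

  /** Minimal stand-in for a coder: turns a value into bytes and back. */
  interface SimpleCoder<T> {
    byte[] encode(T value);

    T decode(byte[] bytes);
  }

  /** Illustrative coder for strings using UTF-8. */
  static final SimpleCoder<String> UTF8 = new SimpleCoder<String>() {
    @Override
    public byte[] encode(String value) {
      return value.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public String decode(byte[] bytes) {
      return new String(bytes, StandardCharsets.UTF_8);
    }
  };

  /** Holds either a live value with its coder, or the still-encoded bytes. */
  static final class LazyCoded<T> implements Serializable {
    private static final long serialVersionUID = 1L;

    private transient T value;
    private transient SimpleCoder<T> coder;
    private transient byte[] bytes;

    LazyCoded(T value, SimpleCoder<T> coder) {
      this.value = value;
      this.coder = coder;
    }

    /** True while the holder carries encoded bytes that were not decoded yet. */
    boolean isPending() {
      return bytes != null;
    }

    /** Decodes on first access; the caller supplies the coder (an assumption). */
    T getOrDecode(SimpleCoder<T> readCoder) {
      if (bytes != null) {
        value = readCoder.decode(bytes);
        coder = readCoder; // keep the coder so the holder can be written again
        bytes = null;
      }
      return value;
    }

    /** Writes a length prefix and the encoded payload, nothing else. */
    private void writeObject(ObjectOutputStream out) throws IOException {
      byte[] payload = (bytes != null) ? bytes : coder.encode(value);
      out.writeInt(payload.length);
      out.write(payload);
    }

    /** Reads the payload back but leaves it encoded until it is needed. */
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
      bytes = new byte[in.readInt()];
      in.readFully(bytes);
    }
  }

  /** Serializes and deserializes a holder in memory. */
  static <T> LazyCoded<T> roundTrip(LazyCoded<T> holder) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
        out.writeObject(holder);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
        @SuppressWarnings("unchecked")
        LazyCoded<T> copy = (LazyCoded<T>) in.readObject();
        return copy;
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    LazyCoded<String> copy = roundTrip(new LazyCoded<>("hello", UTF8));
    System.out.println("pending before access: " + copy.isPending());
    System.out.println("value: " + copy.getOrDecode(UTF8));
    System.out.println("pending after access: " + copy.isPending());
  }
}
```

Deferring the decode keeps the receiving side from paying for deserialization of elements it only forwards, at the cost of requiring a matching coder at read time.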
