public abstract class PipelineTranslator extends Pipeline.PipelineVisitor.Defaults
Pipeline.PipelineVisitor
that translates the Beam operators to their Spark counterparts.
It also does the pipeline preparation: mode detection, transforms replacement, classpath
preparation. If we have a streaming job, it is instantiated as a PipelineTranslatorStreaming
. If we have a batch job, it is instantiated as a PipelineTranslatorBatch
.Pipeline.PipelineVisitor.CompositeBehavior, Pipeline.PipelineVisitor.Defaults
Modifier and Type | Field and Description |
---|---|
protected TranslationContext |
translationContext |
Constructor and Description |
---|
PipelineTranslator() |
Modifier and Type | Method and Description |
---|---|
static void |
detectTranslationMode(Pipeline pipeline,
StreamingOptions options)
Visit the pipeline to determine the translation mode (batch/streaming) and update options
accordingly.
|
Pipeline.PipelineVisitor.CompositeBehavior |
enterCompositeTransform(org.apache.beam.sdk.runners.TransformHierarchy.Node node)
Called for each composite transform after all topological predecessors have been visited but
before any of its component transforms.
|
protected abstract TransformTranslator<?> |
getTransformTranslator(org.apache.beam.sdk.runners.TransformHierarchy.Node node)
Get a
TransformTranslator for the given TransformHierarchy.Node . |
TranslationContext |
getTranslationContext() |
void |
leaveCompositeTransform(org.apache.beam.sdk.runners.TransformHierarchy.Node node)
Called for each composite transform after all of its component transforms and their outputs
have been visited.
|
static void |
prepareFilesToStageForRemoteClusterExecution(SparkStructuredStreamingPipelineOptions options)
Local configurations work in the same JVM and have no problems with improperly formatted files
on classpath (eg.
|
static void |
replaceTransforms(Pipeline pipeline,
StreamingOptions options) |
void |
translate(Pipeline pipeline)
Translates the pipeline by passing this class as a visitor.
|
void |
visitPrimitiveTransform(org.apache.beam.sdk.runners.TransformHierarchy.Node node)
Called for each primitive transform after all of its topological predecessors and inputs have
been visited.
|
enterPipeline, getPipeline, leavePipeline, visitValue
protected TranslationContext translationContext
public static void prepareFilesToStageForRemoteClusterExecution(SparkStructuredStreamingPipelineOptions options)
public static void replaceTransforms(Pipeline pipeline, StreamingOptions options)
public static void detectTranslationMode(Pipeline pipeline, StreamingOptions options)
protected abstract TransformTranslator<?> getTransformTranslator(org.apache.beam.sdk.runners.TransformHierarchy.Node node)
TransformTranslator
for the given TransformHierarchy.Node
.public void translate(Pipeline pipeline)
pipeline
- The pipeline to be translatedpublic Pipeline.PipelineVisitor.CompositeBehavior enterCompositeTransform(org.apache.beam.sdk.runners.TransformHierarchy.Node node)
Pipeline.PipelineVisitor
The return value controls whether or not child transforms are visited.
enterCompositeTransform
in interface Pipeline.PipelineVisitor
enterCompositeTransform
in class Pipeline.PipelineVisitor.Defaults
public void leaveCompositeTransform(org.apache.beam.sdk.runners.TransformHierarchy.Node node)
Pipeline.PipelineVisitor
leaveCompositeTransform
in interface Pipeline.PipelineVisitor
leaveCompositeTransform
in class Pipeline.PipelineVisitor.Defaults
public void visitPrimitiveTransform(org.apache.beam.sdk.runners.TransformHierarchy.Node node)
Pipeline.PipelineVisitor
visitPrimitiveTransform
in interface Pipeline.PipelineVisitor
visitPrimitiveTransform
in class Pipeline.PipelineVisitor.Defaults
public TranslationContext getTranslationContext()