Class SparkTranslationContext
java.lang.Object
org.apache.beam.runners.spark.translation.SparkTranslationContext
- Direct Known Subclasses:
SparkStreamingTranslationContext
Translation context used to lazily store Spark data sets during portable pipeline translation and
compute them after translation.
Constructor Summary
Constructors
SparkTranslationContext(org.apache.spark.api.java.JavaSparkContext jsc, PipelineOptions options, JobInfo jobInfo)
Method Summary
Modifier and Type / Method / Description
void computeOutputs()
    Compute the outputs for all RDDs that are leaves in the DAG.
org.apache.beam.runners.core.construction.SerializablePipelineOptions getSerializableOptions()
org.apache.spark.api.java.JavaSparkContext getSparkContext()
int nextSinkId()
    Generate a unique pCollection id number to identify runner-generated sinks.
Dataset popDataset(String pCollectionId)
    Retrieve the dataset for the pCollection id and remove it from the DAG's leaves.
void pushDataset(String pCollectionId, Dataset dataset)
    Add output of transform to context.
Constructor Details
SparkTranslationContext
public SparkTranslationContext(org.apache.spark.api.java.JavaSparkContext jsc, PipelineOptions options, JobInfo jobInfo)
Method Details
getSparkContext
public org.apache.spark.api.java.JavaSparkContext getSparkContext()
getSerializableOptions
public org.apache.beam.runners.core.construction.SerializablePipelineOptions getSerializableOptions()
pushDataset
public void pushDataset(String pCollectionId, Dataset dataset)
Add output of transform to context.
popDataset
public Dataset popDataset(String pCollectionId)
Retrieve the dataset for the pCollection id and remove it from the DAG's leaves.
computeOutputs
public void computeOutputs()
Compute the outputs for all RDDs that are leaves in the DAG.
nextSinkId
public int nextSinkId()
Generate a unique pCollection id number to identify runner-generated sinks.
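The lazy push/pop lifecycle described above can be sketched with plain JDK types. The following is an illustrative analogue, not Beam's implementation: `Dataset` here is a hypothetical single-method placeholder standing in for the runner's dataset abstraction, and the class name `TranslationContextSketch` is invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Self-contained sketch of a translation context that stores datasets
// lazily during translation and computes the remaining DAG leaves afterwards.
class TranslationContextSketch {

    // Placeholder for a lazily-defined dataset; action() forces computation.
    interface Dataset {
        void action();
    }

    private final Map<String, Dataset> datasets = new LinkedHashMap<>();
    private final Set<String> leaves = new LinkedHashSet<>();
    private int sinkId = 0;

    // Add output of a transform: register it and mark it as a DAG leaf.
    public void pushDataset(String pCollectionId, Dataset dataset) {
        datasets.put(pCollectionId, dataset);
        leaves.add(pCollectionId);
    }

    // Retrieve a dataset for downstream consumption; it is no longer a leaf.
    public Dataset popDataset(String pCollectionId) {
        leaves.remove(pCollectionId);
        return datasets.get(pCollectionId);
    }

    // Force computation of every dataset that is still a leaf in the DAG.
    public void computeOutputs() {
        for (String id : leaves) {
            datasets.get(id).action();
        }
    }

    // Generate a unique id number to identify runner-generated sinks.
    public int nextSinkId() {
        return sinkId++;
    }
}
```

A dataset that another transform pops is computed through its consumer; only datasets nobody consumed remain leaves and are forced by `computeOutputs()`.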