Class SparkTranslationContext
java.lang.Object
org.apache.beam.runners.spark.translation.SparkTranslationContext
- Direct Known Subclasses:
SparkStreamingTranslationContext
Translation context used to lazily store Spark data sets during portable pipeline translation and
compute them after translation.
Constructor Summary
Constructors
SparkTranslationContext(org.apache.spark.api.java.JavaSparkContext jsc, PipelineOptions options, JobInfo jobInfo)
Method Summary
Modifier and Type / Method / Description
void computeOutputs()
    Compute the outputs for all RDDs that are leaves in the DAG.
org.apache.beam.runners.core.construction.SerializablePipelineOptions getSerializableOptions()
org.apache.spark.api.java.JavaSparkContext getSparkContext()
int nextSinkId()
    Generate a unique pCollection id number to identify runner-generated sinks.
Dataset popDataset(String pCollectionId)
    Retrieve the dataset for the pCollection id and remove it from the DAG's leaves.
void pushDataset(String pCollectionId, Dataset dataset)
    Add output of transform to context.
Constructor Details
SparkTranslationContext
public SparkTranslationContext(org.apache.spark.api.java.JavaSparkContext jsc, PipelineOptions options, JobInfo jobInfo)
Method Details
getSparkContext
public org.apache.spark.api.java.JavaSparkContext getSparkContext()
getSerializableOptions
public org.apache.beam.runners.core.construction.SerializablePipelineOptions getSerializableOptions()
pushDataset
public void pushDataset(String pCollectionId, Dataset dataset)
Add output of transform to context.
popDataset
public Dataset popDataset(String pCollectionId)
Retrieve the dataset for the pCollection id and remove it from the DAG's leaves.
computeOutputs
public void computeOutputs()
Compute the outputs for all RDDs that are leaves in the DAG.
nextSinkId
public int nextSinkId()
Generate a unique pCollection id number to identify runner-generated sinks.
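The lazy push/pop lifecycle described above can be sketched with plain JDK types. The following is an illustrative analogue, not Beam's implementation: `Dataset` here is a hypothetical single-method placeholder standing in for the runner's dataset abstraction, and the class name `TranslationContextSketch` is invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Self-contained sketch of a translation context that stores datasets
// lazily during translation and computes the remaining DAG leaves afterwards.
class TranslationContextSketch {

    // Placeholder for a lazily-defined dataset; action() forces computation.
    interface Dataset {
        void action();
    }

    private final Map<String, Dataset> datasets = new LinkedHashMap<>();
    private final Set<String> leaves = new LinkedHashSet<>();
    private int sinkId = 0;

    // Add output of a transform: register it and mark it as a DAG leaf.
    public void pushDataset(String pCollectionId, Dataset dataset) {
        datasets.put(pCollectionId, dataset);
        leaves.add(pCollectionId);
    }

    // Retrieve a dataset for downstream consumption; it is no longer a leaf.
    public Dataset popDataset(String pCollectionId) {
        leaves.remove(pCollectionId);
        return datasets.get(pCollectionId);
    }

    // Force computation of every dataset that is still a leaf in the DAG.
    public void computeOutputs() {
        for (String id : leaves) {
            datasets.get(id).action();
        }
    }

    // Generate a unique id number to identify runner-generated sinks.
    public int nextSinkId() {
        return sinkId++;
    }
}
```

A dataset that another transform pops is computed through its consumer; only datasets nobody consumed remain leaves and are forced by `computeOutputs()`.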