org.apache.beam.sdk.PipelineRunner<SparkPipelineResult>

org.apache.beam.runners.spark.TestSparkRunner

public final class TestSparkRunner extends PipelineRunner<SparkPipelineResult>

The SparkRunner translate operations defined on a pipeline to a representation executable by Spark, and then submitting the job to Spark to be executed. If we wanted to run a Beam pipeline with the default options of a single threaded spark instance in local mode, we would do the following:

Pipeline p = [logic for pipeline creation] SparkPipelineResult result = (SparkPipelineResult) p.run();

To create a pipeline runner to run against a different spark cluster, with a custom master url we would do the following:

Pipeline p = [logic for pipeline creation] SparkPipelineOptions options = SparkPipelineOptionsFactory.create(); options.setSparkMaster("spark://host:port"); SparkPipelineResult result = (SparkPipelineResult) p.run();

Method Summary

Modifier and Type

Method

Description

static TestSparkRunner

fromOptions(PipelineOptions options)

SparkPipelineResult

run(Pipeline pipeline)

Processes the given Pipeline, potentially asynchronously, returning a runner-specific type of result.

Methods inherited from class org.apache.beam.sdk.PipelineRunner
create, run, run

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Details
- fromOptions
  
  public static TestSparkRunner fromOptions(PipelineOptions options)
- run
  
  public SparkPipelineResult run(Pipeline pipeline)
  
  Description copied from class: PipelineRunner
  
  Processes the given Pipeline, potentially asynchronously, returning a runner-specific type of result.
  
  Specified by:
  
  run in class PipelineRunner<SparkPipelineResult>

Class TestSparkRunner

Method Summary

Methods inherited from class org.apache.beam.sdk.PipelineRunner

Methods inherited from class java.lang.Object

Method Details

fromOptions

run