Class TestSparkRunner

java.lang.Object
org.apache.beam.sdk.PipelineRunner<SparkPipelineResult>
org.apache.beam.runners.spark.TestSparkRunner

public final class TestSparkRunner extends PipelineRunner<SparkPipelineResult>
The SparkRunner translate operations defined on a pipeline to a representation executable by Spark, and then submitting the job to Spark to be executed. If we wanted to run a Beam pipeline with the default options of a single threaded spark instance in local mode, we would do the following:

Pipeline p = [logic for pipeline creation] SparkPipelineResult result = (SparkPipelineResult) p.run();

To create a pipeline runner to run against a different spark cluster, with a custom master url we would do the following:

Pipeline p = [logic for pipeline creation] SparkPipelineOptions options = SparkPipelineOptionsFactory.create(); options.setSparkMaster("spark://host:port"); SparkPipelineResult result = (SparkPipelineResult) p.run();