Package org.apache.beam.runners.spark
Class TestSparkRunner
java.lang.Object
org.apache.beam.sdk.PipelineRunner<SparkPipelineResult>
org.apache.beam.runners.spark.TestSparkRunner
The SparkRunner translate operations defined on a pipeline to a representation executable by
Spark, and then submitting the job to Spark to be executed. If we wanted to run a Beam pipeline
with the default options of a single threaded spark instance in local mode, we would do the
following:
Pipeline p = [logic for pipeline creation] SparkPipelineResult result =
(SparkPipelineResult) p.run();
To create a pipeline runner to run against a different spark cluster, with a custom master url we would do the following:
Pipeline p = [logic for pipeline creation] SparkPipelineOptions options =
SparkPipelineOptionsFactory.create(); options.setSparkMaster("spark://host:port");
SparkPipelineResult result = (SparkPipelineResult) p.run();
-
Method Summary
Modifier and TypeMethodDescriptionstatic TestSparkRunner
fromOptions
(PipelineOptions options) Processes the givenPipeline
, potentially asynchronously, returning a runner-specific type of result.Methods inherited from class org.apache.beam.sdk.PipelineRunner
create, run, run
-
Method Details
-
fromOptions
-
run
Description copied from class:PipelineRunner
Processes the givenPipeline
, potentially asynchronously, returning a runner-specific type of result.- Specified by:
run
in classPipelineRunner<SparkPipelineResult>
-