public final class SparkRunner extends PipelineRunner<SparkPipelineResult>
Pipeline p = [logic for pipeline creation] SparkPipelineResult result =
(SparkPipelineResult) p.run();
To create a pipeline runner to run against a different spark cluster, with a custom master url we would do the following:
Pipeline p = [logic for pipeline creation] SparkPipelineOptions options =
SparkPipelineOptionsFactory.create(); options.setSparkMaster("spark://host:port");
SparkPipelineResult result = (SparkPipelineResult) p.run();
Modifier and Type | Class and Description |
---|---|
static class |
SparkRunner.Evaluator
Evaluator on the pipeline.
|
Modifier and Type | Method and Description |
---|---|
static SparkRunner |
create()
Creates and returns a new SparkRunner with default options.
|
static SparkRunner |
create(SparkPipelineOptions options)
Creates and returns a new SparkRunner with specified options.
|
static SparkRunner |
fromOptions(PipelineOptions options)
Creates and returns a new SparkRunner with specified options.
|
static void |
initAccumulators(SparkPipelineOptions opts,
org.apache.spark.api.java.JavaSparkContext jsc)
Init Metrics/Aggregators accumulators.
|
SparkPipelineResult |
run(Pipeline pipeline)
Processes the given
Pipeline , potentially asynchronously, returning a runner-specific
type of result. |
static void |
updateCacheCandidates(Pipeline pipeline,
org.apache.beam.runners.spark.translation.SparkPipelineTranslator translator,
org.apache.beam.runners.spark.translation.EvaluationContext evaluationContext)
Evaluator that update/populate the cache candidates.
|
run, run
public static SparkRunner create()
public static SparkRunner create(SparkPipelineOptions options)
options
- The SparkPipelineOptions to use when executing the job.public static SparkRunner fromOptions(PipelineOptions options)
options
- The PipelineOptions to use when executing the job.public SparkPipelineResult run(Pipeline pipeline)
PipelineRunner
Pipeline
, potentially asynchronously, returning a runner-specific
type of result.run
in class PipelineRunner<SparkPipelineResult>
public static void initAccumulators(SparkPipelineOptions opts, org.apache.spark.api.java.JavaSparkContext jsc)
public static void updateCacheCandidates(Pipeline pipeline, org.apache.beam.runners.spark.translation.SparkPipelineTranslator translator, org.apache.beam.runners.spark.translation.EvaluationContext evaluationContext)