apache_beam.runners.interactive.interactive_runner module

A runner that allows running of Beam pipelines interactively.

This module is experimental. No backwards-compatibility guarantees.

class apache_beam.runners.interactive.interactive_runner.InteractiveRunner(underlying_runner=None, render_option=None, skip_display=True, force_compute=True, blocking=True)[source]

Bases: PipelineRunner

An interactive runner for Beam Python pipelines.

Allows interactively building and running Beam Python pipelines.

Constructor of InteractiveRunner.

Parameters:
  • underlying_runner – (runner.PipelineRunner) the runner that actually executes the pipeline. If None, a DirectRunner is used.

  • render_option – (str) this parameter decides how the pipeline graph is rendered. See display.pipeline_graph_renderer for available options.

  • skip_display – (bool) whether to skip display operations when running the pipeline. Useful if running large pipelines when display is not needed.

  • force_compute – (bool) whether to recompute PCollections even when cached data from previous runs is available, including data cached by show API invocations from the interactive_beam module. If True, always runs the whole pipeline and recomputes the data for every PCollection. If False, reuses the available cached data and runs only the minimal pipeline fragment needed to compute the data that is not yet available.

  • blocking – (bool) whether the pipeline run should be blocking or not.
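The effect of force_compute can be pictured with a small toy model. This is only a sketch of the documented behavior, not Beam's implementation: plan_run, the cache dictionary, and the PCollection names are all hypothetical, and a real pipeline is a graph rather than a flat list.

```python
def plan_run(pcollections, cache, force_compute=True):
    """Return the names of the PCollections a run would have to compute.

    Toy model only: `pcollections` is an ordered list of names and `cache`
    maps a name to data left behind by an earlier run.
    """
    if force_compute:
        # Ignore the cache and recompute everything.
        return list(pcollections)
    # Reuse cached data and compute only what is missing.
    return [name for name in pcollections if name not in cache]


cache = {"words": ["a", "bb", "ccc"]}      # left over from a previous run
pipeline = ["words", "lengths", "total"]   # hypothetical PCollection names

full_run = plan_run(pipeline, cache, force_compute=True)
partial_run = plan_run(pipeline, cache, force_compute=False)
```

With force_compute=True the toy planner schedules all three PCollections; with force_compute=False it skips "words" because cached data for it already exists.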

is_fnapi_compatible()[source]

set_render_option(render_option)[source]

Sets the rendering option.

Parameters:
  • render_option – (str) this parameter decides how the pipeline graph is rendered. See display.pipeline_graph_renderer for available options.

start_session()[source]

Start the session that keeps backend managers and workers alive.

end_session()[source]

End the session that keeps backend managers and workers alive.
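Because start_session() and end_session() come as a pair, one way to make sure the session is always closed is to wrap them in a context manager. The sketch below assumes only that the runner exposes these two methods; interactive_session is a hypothetical helper and FakeRunner is a stand-in used for demonstration, neither is part of Beam.

```python
from contextlib import contextmanager


@contextmanager
def interactive_session(runner):
    """Keep the runner's session open for the duration of a with-block."""
    runner.start_session()
    try:
        yield runner
    finally:
        # Always release the managers and workers, even if the body raises.
        runner.end_session()


class FakeRunner:
    """Hypothetical stand-in that records calls; not a Beam class."""

    def __init__(self):
        self.calls = []

    def start_session(self):
        self.calls.append("start_session")

    def end_session(self):
        self.calls.append("end_session")


runner = FakeRunner()
try:
    with interactive_session(runner):
        raise RuntimeError("pipeline failed")  # simulate a failing run
except RuntimeError:
    pass
```

Even though the body of the with-block raises, the recorded calls show that the session was both started and ended.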

apply(transform, pvalueish, options)[source]

run_pipeline(pipeline, options)[source]

configure_for_flink(user_pipeline, options)[source]

Configures the pipeline options for running a job with Flink.

When running with a FlinkRunner, a job server started from an uber jar (locally built or remotely downloaded) hosting the beam_job_api will communicate with the Flink cluster located at the given flink_master in the pipeline options.

class apache_beam.runners.interactive.interactive_runner.PipelineResult(underlying_result, pipeline_instrument)[source]

Bases: PipelineResult

Provides access to information about a pipeline.

Constructor of PipelineResult.

Parameters:
  • underlying_result – (PipelineResult) the result returned by the underlying runner running the pipeline.

  • pipeline_instrument – (PipelineInstrument) pipeline instrument describing the pipeline being executed with interactivity applied and related metadata including where the interactivity-backing cache lies.

property state

wait_until_finish()[source]

get(pcoll, include_window_info=False)[source]

Materializes the PCollection into a list.

If include_window_info is True, returns the elements as WindowedValues. Otherwise, returns the elements as they are.

read(pcoll, include_window_info=False)[source]

Reads the PCollection one element at a time from cache.

If include_window_info is True, returns the elements as WindowedValues. Otherwise, returns the elements as they are.
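The practical difference between get() and read() is that get() builds the whole list, while read() can be consumed element by element and abandoned early. The sketch below previews a few elements through read(); first_n is a hypothetical helper, and FakeResult is a stand-in that only mimics the two documented method signatures and assumes read() yields lazily.

```python
from itertools import islice


def first_n(result, pcoll, n, include_window_info=False):
    """Return at most n elements of pcoll without materializing all of it.

    `result` is any object exposing read(pcoll, include_window_info=...).
    """
    elements = result.read(pcoll, include_window_info=include_window_info)
    return list(islice(elements, n))


class FakeResult:
    """Hypothetical stand-in for a PipelineResult, for demonstration only."""

    def __init__(self, cache):
        self._cache = cache        # maps a PCollection key to cached elements
        self.elements_read = 0     # counts how many elements were pulled

    def read(self, pcoll, include_window_info=False):
        # Yield lazily so that callers can stop early.
        for element in self._cache[pcoll]:
            self.elements_read += 1
            yield element

    def get(self, pcoll, include_window_info=False):
        # Materialize everything into a list.
        return list(self.read(pcoll, include_window_info))


fake = FakeResult({"squares": [n * n for n in range(1000)]})
preview = first_n(fake, "squares", 3)
```

In this stand-in, previewing three elements pulls only three elements from the cache, whereas get() would read all 1000.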

cancel()[source]