apache_beam.runners.interactive.interactive_runner module

A runner that allows running of Beam pipelines interactively.

This module is experimental. No backwards-compatibility guarantees.

class apache_beam.runners.interactive.interactive_runner.InteractiveRunner(underlying_runner=<apache_beam.runners.direct.direct_runner.BundleBasedDirectRunner object>)[source]

Bases: apache_beam.runners.runner.PipelineRunner

An interactive runner for Beam Python pipelines.

Allows interactively building and running Beam Python pipelines.

cleanup()[source]
apply(transform, pvalueish)[source]
run_pipeline(pipeline)[source]
class apache_beam.runners.interactive.interactive_runner.ReadCache(cache_manager, label)[source]

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that reads the PCollections from the cache.

expand(pbegin)[source]
class apache_beam.runners.interactive.interactive_runner.WriteCache(cache_manager, sample=False)[source]

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that writes the PCollections to the cache.

expand(pcolls_to_write)[source]
class apache_beam.runners.interactive.interactive_runner.CacheManager(temp_dir=None)[source]

Bases: object

Maps PCollections to files for materialization.

exists(*labels)[source]
read(prefix, cache_label)[source]
glob_path(*labels)[source]
path(*labels)[source]
cleanup()[source]
class apache_beam.runners.interactive.interactive_runner.PipelineInfo(proto)[source]

Bases: object

Provides access to pipeline metadata.

all_pcollections()[source]
leaf_pcollections()[source]
producer(pcoll_id)[source]
derivation(pcoll_id)[source]

Returns the Derivation corresponding to the PCollection.

class apache_beam.runners.interactive.interactive_runner.Derivation(inputs, transform_proto, output_tag)[source]

Bases: object

Records derivation info of a PCollection. Helper for PipelineInfo.

Constructor of Derivation.

Parameters:
  • inputs – (Dict[str, str]) a dict that contains input PCollections to the producing PTransform of the output PCollection. Maps local names to IDs.
  • transform_proto – (Transform proto) the producing PTransform of the output PCollection.
  • output_tag – (str) local name of the output PCollection; this is the PCollection in analysis.
cache_label()[source]
json()[source]
class apache_beam.runners.interactive.interactive_runner.SafeFastPrimitivesCoder[source]

Bases: apache_beam.coders.coders.Coder

This class add an quote/unquote step to escape special characters.

encode(value)[source]
decode(value)[source]
apache_beam.runners.interactive.interactive_runner.set_proto_map(proto_map, new_value)[source]
class apache_beam.runners.interactive.interactive_runner.PipelineResult(underlying_result, runner, pipeline_info, cache_manager, pcolls_to_pcoll_id)[source]

Bases: apache_beam.runners.runner.PipelineResult

Provides access to information about a pipeline.

wait_until_finish()[source]
get(pcoll)[source]
sample(pcoll)[source]