apache_beam.runners.interactive.display.interactive_pipeline_graph module

Helper to render pipeline graph in IPython when running interactively.

This module is experimental. No backwards-compatibility guarantees.

apache_beam.runners.interactive.display.interactive_pipeline_graph.format_sample(contents, count=1000)[source]
class apache_beam.runners.interactive.display.interactive_pipeline_graph.InteractivePipelineGraph(pipeline, required_transforms=None, referenced_pcollections=None, cached_pcollections=None)[source]

Bases: apache_beam.runners.interactive.display.pipeline_graph.PipelineGraph

Creates the DOT representation of an interactive pipeline. Thread-safe.

Constructor of PipelineGraph.

  • pipeline – (Pipeline proto) or (Pipeline) pipeline to be rendered.
  • required_transforms – (list/set of str) ID of top level PTransforms that lead to visible results.
  • referenced_pcollections – (list/set of str) ID of PCollections that are referenced by top level PTransforms executed (i.e. required_transforms)
  • cached_pcollections – (set of str) a set of PCollection IDs of those whose cached results are used in the execution.

Updates PCollection stats.

Parameters:pcollection_stats – (dict of dict) maps PCollection IDs to informations. In particular, we only care about the field ‘sample’ which should be a the PCollection result in as a list.