apache_beam.runners.interactive.user_pipeline_tracker module

Class that tracks derived/pipeline fragments from user pipelines.

For internal use only; no backwards-compatibility guarantees. In the InteractiveRunner the design is to keep the user pipeline unchanged, create a copy of the user pipeline, and modify the copy. When the derived pipeline runs, there should only be per-user pipeline state. This makes sure that derived pipelines can link back to the parent user pipeline.

class apache_beam.runners.interactive.user_pipeline_tracker.UserPipelineTracker[source]

Bases: object

Tracks user pipelines from derived pipelines.

This data structure is similar to a disjoint set data structure. A derived pipeline can only have one parent user pipeline. A user pipeline can have many derived pipelines.

evict(pipeline: apache_beam.pipeline.Pipeline) → None[source]

Evicts the pipeline.

Removes the given pipeline and derived pipelines if a user pipeline. Otherwise, removes the given derived pipeline.

clear() → None[source]

Clears the tracker of all user and derived pipelines.

get_pipeline(pid: str) → Optional[apache_beam.pipeline.Pipeline][source]

Returns the pipeline corresponding to the given pipeline id.

add_user_pipeline(p: apache_beam.pipeline.Pipeline) → apache_beam.pipeline.Pipeline[source]

Adds a user pipeline with an empty set of derived pipelines.

add_derived_pipeline(maybe_user_pipeline: apache_beam.pipeline.Pipeline, derived_pipeline: apache_beam.pipeline.Pipeline) → None[source]

Adds a derived pipeline with the user pipeline.

If the maybe_user_pipeline is a user pipeline, then the derived pipeline will be added to its set. Otherwise, the derived pipeline will be added to the user pipeline of the maybe_user_pipeline.

By doing the above one can do: p = beam.Pipeline()

derived1 = beam.Pipeline() derived2 = beam.Pipeline()

ut = UserPipelineTracker() ut.add_derived_pipeline(p, derived1) ut.add_derived_pipeline(derived1, derived2)

# Returns p. ut.get_user_pipeline(derived2)

get_user_pipeline(p: apache_beam.pipeline.Pipeline) → Optional[apache_beam.pipeline.Pipeline][source]

Returns the user pipeline of the given pipeline.

If the given pipeline has no user pipeline, i.e. not added to this tracker, then this returns None. If the given pipeline is a user pipeline then this returns the same pipeline. If the given pipeline is a derived pipeline then this returns the user pipeline.