apache_beam.runners.interactive.augmented_pipeline module

Module to augment interactive flavor into the given pipeline.

For internal use only; no backward-compatibility guarantees.

class apache_beam.runners.interactive.augmented_pipeline.AugmentedPipeline(user_pipeline: Pipeline, pcolls: Set[PCollection] | None = None)[source]

Bases: object

A pipeline with augmented interactive flavor that caches intermediate PCollections defined by the user, reads computed PCollections as source and prunes unnecessary pipeline parts for fast computation.

Initializes a pipelilne for augmenting interactive flavor.

Parameters:
  • user_pipeline – a beam.Pipeline instance defined by the user.

  • pcolls – cacheable pcolls to be computed/retrieved. If the set is empty, all intermediate pcolls assigned to variables are applicable.

property augmented_pipeline: Pipeline
property background_recording_pipeline: Pipeline
cacheables() Dict[PCollection, Cacheable][source]

Finds all the cacheable intermediate PCollections in the pipeline with their metadata.

augment() Pipeline[source]

Augments the pipeline with cache. Always calculates a new result.

For a cacheable PCollection, if cache exists, read cache; else, write cache.