apache_beam.runners.interactive.pipeline_analyzer module
Analyzes and modifies the pipeline that utilizes the PCollection cache.
This module is experimental. No backwards-compatibility guarantees.
class apache_beam.runners.interactive.pipeline_analyzer.PipelineAnalyzer(cache_manager, pipeline_proto, underlying_runner, options=None, desired_cache_labels=None)
Bases: object
Constructor of PipelineAnalyzer.
Parameters:
- cache_manager – (CacheManager)
- pipeline_proto – (Pipeline proto)
- underlying_runner – (PipelineRunner)
- options – (PipelineOptions)
- desired_cache_labels – (Set[str]) the set of labels of the PCollections queried by the user.
top_level_referenced_pcollection_ids()
Returns an array of top-level referenced PCollection IDs.
class apache_beam.runners.interactive.pipeline_analyzer.PipelineInfo(proto)
Bases: object
Provides access to pipeline metadata.
class Derivation(inputs, transform_proto, output_tag)
Bases: object
Records derivation info of a PCollection. Helper for PipelineInfo.
Constructor of Derivation.
Parameters:
- inputs – (Dict[str, Derivation]) maps PCollection names to Derivations.
- transform_proto – (Transform proto) the producing PTransform.
- output_tag – (str) local name of the PCollection in analysis.
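The recursive shape of these parameters (each input is itself a Derivation) can be illustrated with a minimal, self-contained sketch. It does not import Beam and is not Beam's implementation: `DerivationSketch`, `TransformStub`, the `fingerprint` method, and the example URNs are all hypothetical names introduced here, assuming only the three constructor arguments documented above.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TransformStub:
    """Hypothetical stand-in for a transform proto (not a Beam type)."""
    unique_name: str
    urn: str


@dataclass
class DerivationSketch:
    """Illustrative record of how one PCollection was produced."""
    inputs: Dict[str, "DerivationSketch"]  # input name -> its own derivation
    transform_proto: TransformStub         # the producing transform
    output_tag: str                        # local name of the output

    def fingerprint(self) -> str:
        """Hash the transform URN, the output tag, and every input lineage."""
        h = hashlib.sha256()
        h.update(self.transform_proto.urn.encode("utf-8"))
        h.update(self.output_tag.encode("utf-8"))
        # Sort input names so the result does not depend on dict order.
        for name in sorted(self.inputs):
            h.update(name.encode("utf-8"))
            h.update(self.inputs[name].fingerprint().encode("utf-8"))
        return h.hexdigest()


# A two-step lineage: a source, then a transform applied to its output.
read = DerivationSketch({}, TransformStub("Read", "example:read:v1"), "out")
square = DerivationSketch(
    {"in": read}, TransformStub("Square", "example:map:v1"), "out")
# Same URN and inputs under a different display name.
square_again = DerivationSketch(
    {"in": read}, TransformStub("Square2", "example:map:v1"), "out")
```

In this sketch, two derivations with the same transform URN, output tag, and input lineage yield the same fingerprint even when their display names differ. That is one plausible way to key a cache by lineage; whether Beam's Derivation compares or labels entries this way is not stated in this page.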
-
class