apache_beam.runners.interactive.cache_manager module

class apache_beam.runners.interactive.cache_manager.CacheManager[source]

Bases: object

Abstract class for caching PCollections.

A PCollection cache is identified by labels, which consist of a prefix (either ‘full’ or ‘sample’) and a cache_label which is a hash of the PCollection derivation.

exists(*labels)[source]

Returns if the PCollection cache exists.

is_latest_version(version, *labels)[source]

Returns if the given version number is the latest.

read(*labels)[source]

Return the PCollection as a list as well as the version number.

Returns:(List[PCollection]) (int) the version number

It is possible that the version numbers from read() and_latest_version() are different. This usually means that the cache’s been evicted (thus unavailable => read() returns version = -1), but it had reached version n before eviction.

source(*labels)[source]

Returns a beam.io.Source that reads the PCollection cache.

sink(*labels)[source]

Returns a beam.io.Sink that writes the PCollection cache.

cleanup()[source]

Cleans up all the PCollection caches.

class apache_beam.runners.interactive.cache_manager.FileBasedCacheManager(cache_dir=None)[source]

Bases: apache_beam.runners.interactive.cache_manager.CacheManager

Maps PCollections to local temp files for materialization.

exists(*labels)[source]
read(*labels)[source]
source(*labels)[source]
sink(*labels)[source]
cleanup()[source]
class apache_beam.runners.interactive.cache_manager.ReadCache(cache_manager, label)[source]

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that reads the PCollections from the cache.

expand(pbegin)[source]
class apache_beam.runners.interactive.cache_manager.WriteCache(cache_manager, label, sample=False, sample_size=0)[source]

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that writes the PCollections to the cache.

expand(pcoll)[source]
class apache_beam.runners.interactive.cache_manager.SafeFastPrimitivesCoder[source]

Bases: apache_beam.coders.coders.Coder

This class add an quote/unquote step to escape special characters.

encode(value)[source]
decode(value)[source]