apache_beam.ml.inference.utils module
Util/helper functions used in apache_beam.ml.inference.
- class apache_beam.ml.inference.utils.WatchFilePattern(file_pattern, interval=360, stop_timestamp=Timestamp(9223372036854.775000))
Bases: PTransform
Watches a directory for updates to files matching a given file pattern.
- Parameters:
file_pattern – The file path to read from, either a local file path or a GCS gs:// path. The path can contain glob characters (*, ?, and [...] sets).
interval – Interval, in seconds, at which to check for files matching file_pattern.
stop_timestamp – Timestamp after which no more files will be checked.
Note:
- Previously used filenames cannot be reused. If a file is added or updated under a previously used filename, this transform ignores that update. To trigger a model update, always upload a file with a unique name.
- Before the pipeline starts, WatchFilePattern expects at least one file matching file_pattern to be present.
- This transform is supported in streaming mode, since MatchContinuously produces an unbounded source. Running it in batch mode can produce undesired results or cause the pipeline to get stuck.
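The first note above can be illustrated without Beam. The following is a minimal sketch, not Beam's implementation: `unique_model_name` and `pick_new_files` are hypothetical helpers, and the standard-library `fnmatch` module stands in for Beam's own glob matching. It only shows why overwriting a file under an already-seen name does not count as an update, while a fresh name does.

```python
import fnmatch
import time


def unique_model_name(prefix, suffix=".pkl", now=None):
    """Hypothetical helper: embed a millisecond timestamp so that each
    uploaded model file gets a name that has not been used before."""
    seconds = time.time() if now is None else now
    return f"{prefix}-{int(seconds * 1000)}{suffix}"


def pick_new_files(listing, pattern, seen):
    """Return the names in `listing` that match `pattern` and have not
    been seen in an earlier poll. `seen` is updated in place."""
    fresh = [
        name for name in listing
        if fnmatch.fnmatchcase(name, pattern) and name not in seen
    ]
    seen.update(fresh)
    return fresh


seen = set()

# First poll: one matching file is present at startup.
first = pick_new_files(["model-1.pkl", "notes.txt"], "model-*.pkl", seen)

# model-1.pkl is overwritten in place: the name was already seen, so
# this poll reports nothing new.
second = pick_new_files(["model-1.pkl", "notes.txt"], "model-*.pkl", seen)

# A file uploaded under a new name is reported.
third = pick_new_files(["model-1.pkl", "model-2.pkl"], "model-*.pkl", seen)
```

In this sketch, only the third poll reports a new file (`model-2.pkl`), which mirrors the advice to upload each model under a unique name.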
- expand(pcoll) → PCollection[ModelMetadata]
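As a rough usage sketch, the transform can be applied at the pipeline root to obtain a PCollection[ModelMetadata]. The helper name `build_model_updates`, the step label, and the example pattern in the comment are assumptions made for illustration; only `WatchFilePattern`, its `file_pattern` and `interval` parameters, and the default of 360 seconds come from the signature above. The Beam import is deferred into the function so that the snippet can be loaded without Apache Beam installed.

```python
# Default polling interval in seconds, taken from the class signature.
DEFAULT_INTERVAL_SECONDS = 360


def build_model_updates(pipeline, file_pattern,
                        interval=DEFAULT_INTERVAL_SECONDS):
    """Hypothetical helper: apply WatchFilePattern to a pipeline and
    return the resulting PCollection of ModelMetadata elements."""
    # Deferred import: only needed when the helper is actually called.
    from apache_beam.ml.inference.utils import WatchFilePattern

    return pipeline | "WatchModelFiles" >> WatchFilePattern(
        file_pattern=file_pattern, interval=interval)


# Example call inside a streaming pipeline (pattern is a placeholder):
#   model_updates = build_model_updates(pipeline, "gs://<your-bucket>/*")
```

The resulting PCollection is typically consumed as a side input, for example through the `model_metadata_pcoll` argument of `RunInference`. Check the RunInference documentation for your Beam version before relying on that parameter.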