apache_beam.io.fileio module
PTransforms for manipulating files in Apache Beam.
Provides the reading PTransforms MatchFiles and MatchAll, which produce a
PCollection of records, each representing a file and its metadata; and
ReadMatches, which takes a PCollection of file metadata records and
produces a PCollection of ReadableFile objects.
These transforms currently do not support splitting by themselves.
No backward compatibility guarantees. Everything in this module is experimental.
class apache_beam.io.fileio.EmptyMatchTreatment
Bases: object
How to treat empty matches in MatchAll and MatchFiles transforms.
If empty matches are disallowed, an error will be thrown if a pattern does not match any files.
ALLOW = 'ALLOW'
DISALLOW = 'DISALLOW'
ALLOW_IF_WILDCARD = 'ALLOW_IF_WILDCARD'
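The three settings can be read as a small decision rule. The sketch below is a standalone illustration, not the Beam class itself: the helper name `allows_empty_match` is hypothetical, and it assumes that "wildcard" means the pattern contains a `*` character.

```python
# Illustrative stand-in, not the Beam class: the same three constants plus a
# hypothetical helper that decides whether an empty match is acceptable.
ALLOW = 'ALLOW'
DISALLOW = 'DISALLOW'
ALLOW_IF_WILDCARD = 'ALLOW_IF_WILDCARD'


def allows_empty_match(pattern, treatment):
    """Return True if zero matches for `pattern` should not raise an error."""
    if treatment == ALLOW:
        return True
    if treatment == DISALLOW:
        return False
    if treatment == ALLOW_IF_WILDCARD:
        # Assumption: a "wildcard" pattern is one that contains '*'.
        return '*' in pattern
    raise ValueError('Unknown treatment: %r' % (treatment,))
```

Under this reading, a literal path that matches nothing is an error with ALLOW_IF_WILDCARD, while a glob that matches nothing is accepted.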
apache_beam.io.fileio.MatchFiles(*args, **kwargs)
Matches a file pattern using FileSystems.match.
This PTransform returns a PCollection of matching files in the form of FileMetadata objects.
apache_beam.io.fileio.MatchAll(*args, **kwargs)
Matches file patterns from the input PCollection via FileSystems.match.
This PTransform returns a PCollection of matching files in the form of FileMetadata objects.