apache_beam.io.gcp.gcsfilesystem module¶
GCS file system implementation for accessing files on GCS.
- 
class apache_beam.io.gcp.gcsfilesystem.GCSFileSystem(pipeline_options)[source]¶
- Bases: - apache_beam.io.filesystem.FileSystem- A GCS - FileSystemimplementation for accessing files on GCS.- Parameters: - pipeline_options – Instance of - PipelineOptionsor dict of options and values (like- RuntimeValueProvider.runtime_options).- 
CHUNK_SIZE= 100¶
 - 
GCS_PREFIX= 'gs://'¶
 - 
join(basepath, *paths)[source]¶
- Join two or more pathname components for the filesystem - Parameters: - basepath – string path of the first component of the path
- paths – path components to be added
 - Returns: full path after combining all the passed components 
 - 
split(path)[source]¶
- Splits the given path into two parts. - Splits the path into a pair (head, tail) such that tail contains the last component of the path and head contains everything up to that. - Head will include the GCS prefix (‘gs://’). - Parameters: - path – path as a string - Returns: - a pair of path components as strings. 
 - 
mkdirs(path)[source]¶
- Recursively create directories for the provided path. - Parameters: - path – string path of the directory structure that should be created - Raises: - IOError if leaf directory already exists. 
 - 
create(path, mime_type='application/octet-stream', compression_type='auto')[source]¶
- Returns a write channel for the given file path. - Parameters: - path – string path of the file object to be written to the system
- mime_type – MIME type to specify the type of content in the file object
- compression_type – Type of compression to be used for this object
 - Returns: file handle with a close function for the user to use 
 - 
open(path, mime_type='application/octet-stream', compression_type='auto')[source]¶
- Returns a read channel for the given file path. - Parameters: - path – string path of the file object to be written to the system
- mime_type – MIME type to specify the type of content in the file object
- compression_type – Type of compression to be used for this object
 - Returns: file handle with a close function for the user to use 
 - 
copy(source_file_names, destination_file_names)[source]¶
- Recursively copy the file tree from the source to the destination - Parameters: - source_file_names – list of source file objects that needs to be copied
- destination_file_names – list of destination of the new object
 - Raises: - BeamIOErrorif any of the copy operations fail
 - 
rename(source_file_names, destination_file_names)[source]¶
- Rename the files at the source list to the destination list. Source and destination lists should be of the same size. - Parameters: - source_file_names – List of file paths that need to be moved
- destination_file_names – List of destination_file_names for the files
 - Raises: - BeamIOErrorif any of the rename operations fail
 - 
exists(path)[source]¶
- Check if the provided path exists on the FileSystem. - Parameters: - path – string path that needs to be checked. - Returns: boolean flag indicating if path exists 
 - 
size(path)[source]¶
- Get size of path on the FileSystem. - Parameters: - path – string path in question. - Returns: int size of path according to the FileSystem. - Raises: - BeamIOErrorif path doesn’t exist.
 - 
last_updated(path)[source]¶
- Get UNIX Epoch time in seconds on the FileSystem. - Parameters: - path – string path of file. - Returns: float UNIX Epoch time - Raises: - BeamIOErrorif path doesn’t exist.
 - 
checksum(path)[source]¶
- Fetch checksum metadata of a file on the - FileSystem.- Parameters: - path – string path of a file. - Returns: string containing checksum - Raises: - BeamIOErrorif path isn’t a file or doesn’t exist.
 
-