apache_beam.io.azure.blobstorageio module
Azure Blob Storage client.
-
apache_beam.io.azure.blobstorageio.parse_azfs_path(azfs_path, blob_optional=False, get_account=False)
Returns the storage account, container, and blob names of the given azfs:// path.
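To make the path layout concrete, here is a minimal sketch of how an azfs:// path breaks into its three parts. `split_azfs_path` is a hypothetical helper written for illustration; it is not the module's implementation, and it skips the account and container name validation a real parser would likely perform.

```python
def split_azfs_path(azfs_path):
    """Split 'azfs://<storage-account>/<container>/<blob>' into its parts.

    Hypothetical helper for illustration only; not part of apache_beam.
    """
    prefix = "azfs://"
    if not azfs_path.startswith(prefix):
        raise ValueError("expected a path starting with azfs://")
    # Split at most twice so that slashes inside the blob name are preserved.
    parts = azfs_path[len(prefix):].split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("expected azfs://<storage-account>/<container>/<blob>")
    account, container = parts[0], parts[1]
    # The blob part is empty when the path names only a container.
    blob = parts[2] if len(parts) == 3 else ""
    return account, container, blob


print(split_azfs_path("azfs://myaccount/mycontainer/dir/file.txt"))
```

The blob name keeps its internal slashes, which is what lets a flat blob namespace behave like a directory tree.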
-
apache_beam.io.azure.blobstorageio.get_azfs_url(storage_account, container, blob='')
Returns the URL in the form https://account.blob.core.windows.net/container/blob-name.
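The documented URL form can be sketched in a few lines. `build_blob_url` is a hypothetical stand-in that only mirrors the format stated above; it does no escaping or validation and should not be read as the module's own code.

```python
def build_blob_url(storage_account, container, blob=""):
    """Build a URL in the documented https://account.blob.core.windows.net form.

    Hypothetical helper for illustration only; not part of apache_beam.
    """
    return f"https://{storage_account}.blob.core.windows.net/{container}/{blob}"


print(build_blob_url("myaccount", "mycontainer", "dir/file.txt"))
```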
-
class apache_beam.io.azure.blobstorageio.Blob(etag, name, last_updated, size, mime_type)
Bases: object
A Blob in Azure Blob Storage.
-
exception apache_beam.io.azure.blobstorageio.BlobStorageIOError
Bases: OSError, apache_beam.utils.retry.PermanentException
Blob Storage IO error that should not be retried.
-
exception apache_beam.io.azure.blobstorageio.BlobStorageError(message=None, code=None)
Bases: Exception
Blob Storage client error.
-
class apache_beam.io.azure.blobstorageio.BlobStorageIO(client=None, pipeline_options=None)
Bases: object
Azure Blob Storage I/O client.
-
open(filename, mode='r', read_buffer_size=16777216, mime_type='application/octet-stream')
Opens an Azure Blob Storage file path for reading or writing.
Returns: Azure Blob Storage file object.
Raises: ValueError – Invalid open file mode.
-
copy(src, dest)
Copies a single Azure Blob Storage blob from src to dest.
Parameters:
- src – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
- dest – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
Raises: TimeoutError – on timeout.
-
copy_tree(src, dest)
Copies the given Azure Blob Storage directory and its contents recursively from src to dest.
Parameters:
- src – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
- dest – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
Returns: List of tuples of (src, dest, exception), where exception is None if the operation succeeded or the relevant exception if the operation failed.
-
copy_paths(src_dest_pairs)
Copies the given Azure Blob Storage blobs from src to dest. This can handle directory or file paths.
Parameters: src_dest_pairs – List of (src, dest) tuples of azfs://<storage-account>/<container>/[name] file paths to copy from src to dest.
Returns: List of tuples of (src, dest, exception) in the same order as the src_dest_pairs argument, where exception is None if the operation succeeded or the relevant exception if the operation failed.
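Because failures are reported in the returned list rather than raised, callers need to inspect each tuple. The sketch below shows one way to do that; `split_results` and the sample data are hypothetical, shaped only after the documented (src, dest, exception) return value.

```python
def split_results(results):
    """Separate (src, dest, exception) tuples into successes and failures."""
    succeeded = [(src, dest) for src, dest, exc in results if exc is None]
    failed = [(src, dest, exc) for src, dest, exc in results if exc is not None]
    return succeeded, failed


# Hypothetical results, shaped like the documented return value.
sample_results = [
    ("azfs://acct/in/a.txt", "azfs://acct/out/a.txt", None),
    ("azfs://acct/in/b.txt", "azfs://acct/out/b.txt", OSError("copy failed")),
]

succeeded, failed = split_results(sample_results)
for src, dest, exc in failed:
    print(f"failed: {src} -> {dest}: {exc}")
```

The same pattern applies to any method in this class that returns per-item exceptions instead of raising.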
-
rename(src, dest)
Renames the given Azure Blob Storage blob from src to dest.
Parameters:
- src – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
- dest – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
-
rename_files(src_dest_pairs)
Renames the given Azure Blob Storage blobs from src to dest.
Parameters: src_dest_pairs – List of (src, dest) tuples of azfs://<storage-account>/<container>/[name] file paths to rename from src to dest.
Returns: List of tuples of (src, dest, exception) in the same order as the src_dest_pairs argument, where exception is None if the operation succeeded or the relevant exception if the operation failed.
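A list of (src, dest) tuples is often derived by moving a set of blobs from one prefix to another. The following sketch builds such a list; `build_rename_pairs` and the example paths are hypothetical and not part of the module.

```python
def build_rename_pairs(paths, src_prefix, dest_prefix):
    """Map each path under src_prefix to the same relative path under dest_prefix."""
    pairs = []
    for path in paths:
        if not path.startswith(src_prefix):
            raise ValueError(f"{path} is not under {src_prefix}")
        # Keep the part after the source prefix and re-root it at the destination.
        pairs.append((path, dest_prefix + path[len(src_prefix):]))
    return pairs


pairs = build_rename_pairs(
    ["azfs://acct/data/tmp/part-0", "azfs://acct/data/tmp/part-1"],
    "azfs://acct/data/tmp/",
    "azfs://acct/data/final/",
)
print(pairs)
```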
-
exists(path)
Returns whether the given Azure Blob Storage blob exists.
Parameters: path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
-
size(path)
Returns the size of a single Blob Storage blob.
This method does not perform glob expansion, so the given path must refer to a single Blob Storage blob.
Returns: Size of the Blob Storage blob in bytes.
-
last_updated(path)
Returns the last updated epoch time of a single Azure Blob Storage blob.
This method does not perform glob expansion, so the given path must refer to a single Azure Blob Storage blob.
Returns: Last updated time of the Azure Blob Storage blob, in seconds since the epoch.
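An epoch value in seconds is usually easier to read once converted to a timezone-aware datetime. This sketch assumes the returned value is a Unix timestamp (seconds since 1970-01-01 UTC), which the description suggests but does not state outright; `epoch_to_utc` is a hypothetical helper.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Convert seconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Hypothetical value standing in for a last_updated() result.
print(epoch_to_utc(1_600_000_000.0).isoformat())
```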
-
checksum(path)
Looks up the checksum of an Azure Blob Storage blob.
Parameters: path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
-
delete(path)
Deletes a single blob at the given Azure Blob Storage path.
Parameters: path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
-
delete_paths(paths)
Deletes the given Azure Blob Storage blobs. This can handle directory or file paths.
Parameters: paths – List of Azure Blob Storage paths in the form azfs://<storage-account>/<container>/[name] that give the file blobs to be deleted.
Returns: The result for each path as a (path, exception) pairing, where exception is None if the operation succeeded or the relevant exception if the operation failed.
-
delete_tree(root)
Deletes all blobs under the given Azure Blob Storage virtual directory.
Parameters: root – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name] (ending with a "/").
Returns: List of tuples of (path, exception), where each path is a blob under the given root. exception is None if the operation succeeded or the relevant exception if the operation failed.
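Since the virtual directory path is expected to end with a "/", it can help to normalize paths before calling the deletion methods. The sketch below treats a trailing slash as the directory marker, which is an assumption drawn from the parameter note above; both helpers are hypothetical and not part of the module.

```python
def as_directory_path(path):
    """Append a trailing '/' if the path does not already end with one."""
    return path if path.endswith("/") else path + "/"


def partition_paths(paths):
    """Split paths into (directories, blobs) based on a trailing '/'."""
    directories = [p for p in paths if p.endswith("/")]
    blobs = [p for p in paths if not p.endswith("/")]
    return directories, blobs


print(as_directory_path("azfs://acct/container/logs"))
print(partition_paths(["azfs://acct/container/logs/", "azfs://acct/container/a.txt"]))
```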
-
delete_files(paths)
Deletes the given Azure Blob Storage blobs.
Parameters: paths – List of Azure Blob Storage paths in the form azfs://<storage-account>/<container>/[name] that give the file blobs to be deleted.
Returns: The result for each path as a (path, exception) pairing, where exception is None if the operation succeeded or the relevant exception if the operation failed.
-
list_prefix(path, with_metadata=False)
Lists files matching the prefix.
Parameters:
- path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
- with_metadata – Experimental. Specifies whether to return file metadata.
Returns: If with_metadata is False: dict of file name -> size; if with_metadata is True: dict of file name -> tuple(size, timestamp).
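Code that consumes the listing has to cope with both value shapes. The following sketch sums blob sizes from either form; `total_size` and the sample dictionaries are hypothetical, modeled only on the two return shapes described above.

```python
def total_size(listing):
    """Sum blob sizes from a listing dict.

    Accepts both documented shapes: name -> size, or
    name -> (size, timestamp) when metadata is included.
    """
    total = 0
    for value in listing.values():
        # With metadata, the size is the first element of the tuple.
        size = value[0] if isinstance(value, tuple) else value
        total += size
    return total


# Hypothetical listings in each of the two documented shapes.
plain = {"azfs://acct/c/a.txt": 10, "azfs://acct/c/b.txt": 20}
with_meta = {"azfs://acct/c/a.txt": (10, 1_600_000_000.0), "azfs://acct/c/b.txt": (20, 1_600_000_100.0)}

print(total_size(plain), total_size(with_meta))
```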
-
-
class apache_beam.io.azure.blobstorageio.BlobStorageDownloader(client, path, buffer_size)
Bases: apache_beam.io.filesystemio.Downloader
-
size
-