apache_beam.io.filebasedsink module¶
File-based sink.
-
class
apache_beam.io.filebasedsink.
FileBasedSink
(file_path_prefix, coder, file_name_suffix='', num_shards=0, shard_name_template=None, mime_type='application/octet-stream', compression_type='auto', *, max_records_per_shard=None, max_bytes_per_shard=None, skip_if_empty=False)[source]¶ Bases:
apache_beam.io.iobase.Sink
A sink to a GCS or local files.
To implement a file-based sink, extend this class and override either
write_record()
orwrite_encoded_record()
.If needed, also overwrite
open()
and/orclose()
to customize the file handling or write headers and footers.The output of this write is a
PCollection
of all written shards.Raises: TypeError
– if file path parameters are not astr
orValueProvider
, or if compression_type is not member ofCompressionTypes
.ValueError
– if shard_name_template is not of expected format.
-
open
(temp_path)[source]¶ Opens
temp_path
, returning an opaque file handle object.The returned file handle is passed to
write_[encoded_]record
andclose
.
-
write_record
(file_handle, value)[source]¶ Writes a single record go the file handle returned by
open()
.By default, calls
write_encoded_record
after encoding the record with this sink’s Coder.
-
write_encoded_record
(file_handle, encoded_value)[source]¶ Writes a single encoded record to the file handle returned by
open()
.