apache_beam.io.filebasedsink module

File-based sink.

class apache_beam.io.filebasedsink.FileBasedSink(file_path_prefix, coder, file_name_suffix='', num_shards=0, shard_name_template=None, mime_type='application/octet-stream', compression_type='auto', *, max_records_per_shard=None, max_bytes_per_shard=None, skip_if_empty=False)[source]

Bases: Sink

A sink to GCS or local files.

To implement a file-based sink, extend this class and override either write_record() or write_encoded_record().

If needed, also override open() and/or close() to customize the file handling or write headers and footers.

The output of this write is a PCollection of all written shards.

Raises:
display_data()[source]
open(temp_path)[source]

Opens temp_path, returning an opaque file handle object.

The returned file handle is passed to write_[encoded_]record and close.

write_record(file_handle, value)[source]

Writes a single record to the file handle returned by open().

By default, calls write_encoded_record after encoding the record with this sink’s Coder.

write_encoded_record(file_handle, encoded_value)[source]

Writes a single encoded record to the file handle returned by open().

close(file_handle)[source]

Finalizes and closes the file handle returned from open().

Called after all records are written.

By default, calls file_handle.close() iff it is not None.
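The default behaviors of write_record() and close() described above can be modeled in plain Python. This is a sketch under stated assumptions, not Beam's implementation: SketchSink and Utf8Coder are hypothetical stand-ins, and an in-memory buffer replaces the real file so the example runs without any dependencies.

```python
import io


class Utf8Coder:
    """Hypothetical stand-in for a coder; only encode() is modeled."""

    def encode(self, value):
        return value.encode("utf-8")


class SketchSink:
    """Plain-Python model of the documented hook defaults (not Beam code)."""

    def __init__(self, coder):
        self.coder = coder

    def open(self, temp_path):
        # Assumption: an in-memory buffer stands in for the file at temp_path.
        return io.BytesIO()

    def write_record(self, file_handle, value):
        # Default: encode with the sink's coder, then delegate.
        self.write_encoded_record(file_handle, self.coder.encode(value))

    def write_encoded_record(self, file_handle, encoded_value):
        # The subclass decides the framing; here, one record per line.
        file_handle.write(encoded_value + b"\n")

    def close(self, file_handle):
        # Default: close the handle if and only if it is not None.
        if file_handle is not None:
            file_handle.close()


sink = SketchSink(Utf8Coder())
handle = sink.open("ignored-temp-path")
for record in ["a", "b"]:
    sink.write_record(handle, record)
written = handle.getvalue()  # read before close(); a closed BytesIO cannot be read
sink.close(handle)
sink.close(None)  # a None handle is skipped without error
```

Because write_record() only encodes and delegates, a subclass that needs the unencoded value would override write_record(), while one that only controls framing would override write_encoded_record().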

initialize_write()[source]
open_writer(init_result, uid)[source]
pre_finalize(init_result, writer_results)[source]
finalize_write(init_result, writer_results, unused_pre_finalize_results)[source]