Class ParquetIO.Sink

java.lang.Object
org.apache.beam.sdk.io.parquet.ParquetIO.Sink
All Implemented Interfaces:
Serializable, FileIO.Sink<GenericRecord>
Enclosing class:
ParquetIO

public abstract static class ParquetIO.Sink extends Object implements FileIO.Sink<GenericRecord>
See Also:
  • Constructor Details

    • Sink

      public Sink()
  • Method Details

    • withCompressionCodec

      public ParquetIO.Sink withCompressionCodec(org.apache.parquet.hadoop.metadata.CompressionCodecName compressionCodecName)
      Specifies compression codec. By default, CompressionCodecName.SNAPPY.
    • withConfiguration

      public ParquetIO.Sink withConfiguration(Map<String,String> configuration)
      Specify Hadoop configuration for ParquetWriter.
    • withConfiguration

      public ParquetIO.Sink withConfiguration(org.apache.hadoop.conf.Configuration configuration)
      Specify Hadoop configuration for ParquetWriter.
    • withRowGroupSize

      public ParquetIO.Sink withRowGroupSize(int rowGroupSize)
      Specify row-group size; if not set or zero, a default is used by the underlying writer.
    • withAvroDataModel

      public ParquetIO.Sink withAvroDataModel(GenericData model)
      Define the Avro data model; see AvroParquetWriter.Builder.withDataModel(GenericData).
    • open

      public void open(WritableByteChannel channel) throws IOException
      Description copied from interface: FileIO.Sink
      Initializes writing to the given channel. Will be invoked once on a given FileIO.Sink instance.
      Specified by:
      open in interface FileIO.Sink<GenericRecord>
      Throws:
      IOException
    • write

      public void write(GenericRecord element) throws IOException
      Description copied from interface: FileIO.Sink
      Appends a single element to the file. May be invoked zero or more times.
      Specified by:
      write in interface FileIO.Sink<GenericRecord>
      Throws:
      IOException
    • flush

      public void flush() throws IOException
      Description copied from interface: FileIO.Sink
      Flushes the buffered state (if any) before the channel is closed. Does not need to close the channel. Will be invoked once.
      Specified by:
      flush in interface FileIO.Sink<GenericRecord>
      Throws:
      IOException