Interface HadoopFormatIO.Write.PartitionedWriterBuilder<KeyT,ValueT>

Type Parameters:
KeyT - Key type to write
ValueT - Value type to write
Enclosing class:
HadoopFormatIO.Write<KeyT,ValueT>

public static interface HadoopFormatIO.Write.PartitionedWriterBuilder<KeyT,ValueT>
Builder for determining how the output is partitioned.
  • Method Details

    • withPartitioning

      Writes to the sink with partitioning by Task Id.

      The following Hadoop configuration properties are required with this option:

      • mapreduce.job.reduces: Number of reduce tasks. The value equals the number of write tasks that will be generated.
      • mapreduce.job.partitioner.class: Hadoop partitioner class used to distribute records among partitions.
      Returns:
      WriteBuilder for write transformation
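
      As a rough illustration of what the two properties above control, the following plain-Java sketch mimics the formula used by Hadoop's default HashPartitioner: the key's hash code, with the sign bit cleared, modulo the number of reduce tasks. It does not call Beam or Hadoop. The class HashPartitionSketch and its methods partitionFor and assign are hypothetical names introduced only for this example, and the actual assignment depends on whichever partitioner class is configured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Beam or Hadoop.
final class HashPartitionSketch {

  // Same arithmetic as Hadoop's HashPartitioner: clear the sign bit of the
  // hash code, then take the remainder by the number of reduce tasks.
  static int partitionFor(Object key, int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }

  // Distributes keys into exactly numReduceTasks buckets, one per write task
  // (the role played by mapreduce.job.reduces).
  static List<List<String>> assign(List<String> keys, int numReduceTasks) {
    List<List<String>> partitions = new ArrayList<>();
    for (int i = 0; i < numReduceTasks; i++) {
      partitions.add(new ArrayList<>());
    }
    for (String key : keys) {
      partitions.get(partitionFor(key, numReduceTasks)).add(key);
    }
    return partitions;
  }

  public static void main(String[] args) {
    List<List<String>> partitions =
        assign(Arrays.asList("a", "b", "c", "d", "e"), 2);
    for (int i = 0; i < partitions.size(); i++) {
      System.out.println("partition " + i + ": " + partitions.get(i));
    }
    // prints:
    // partition 0: [b, d]
    // partition 1: [a, c, e]
  }
}
```

      In this sketch the number of buckets is fixed up front, which corresponds to the fixed number of write tasks produced by withPartitioning; empty buckets are possible when no key hashes to them.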
    • withoutPartitioning

      Writes to the sink without partitioning the output into a specified number of partitions.

      This write operation does not shuffle records by partition, which saves transfer time before the write itself. As a consequence, it generates a random number of partitions.

      Note: Works only for a bounded PCollection (PCollection.IsBounded.BOUNDED) with the global WindowingStrategy.

      Returns:
      WriteBuilder for write transformation