public static interface HadoopFormatIO.Write.PartitionedWriterBuilder&lt;KeyT,ValueT&gt;

Type Parameters:
KeyT - Key type to write
ValueT - Value type to write
| Modifier and Type | Method and Description |
|---|---|
| HadoopFormatIO.Write.ExternalSynchronizationBuilder&lt;KeyT,ValueT&gt; | withoutPartitioning() Writes to the sink without the need to partition output into a specified number of partitions. |
| HadoopFormatIO.Write.ExternalSynchronizationBuilder&lt;KeyT,ValueT&gt; | withPartitioning() Writes to the sink with partitioning by Task Id. |
HadoopFormatIO.Write.ExternalSynchronizationBuilder&lt;KeyT,ValueT&gt; withPartitioning()

Writes to the sink with partitioning by Task Id.

The following Hadoop configuration properties are required with this option:

- mapreduce.job.reduces: Number of reduce tasks. This equals the number of write tasks that will be generated.
- mapreduce.job.partitioner.class: Hadoop Partitioner class that will be used to distribute records among partitions.
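A minimal configuration sketch for this option, assuming the rest of the sink setup (OutputFormat class, key/value classes, output path) is done elsewhere; the partition count, the HashPartitioner choice, and the "/locks" directory are illustrative, not prescribed by the API:

```java
import org.apache.beam.sdk.io.hadoop.format.HDFSSynchronization;
import org.apache.beam.sdk.io.hadoop.format.HadoopFormatIO;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.lib.partition.HashPartitioner;

public class PartitionedWriteExample {
  public static HadoopFormatIO.Write<Text, LongWritable> buildWrite() {
    // Hadoop configuration for the sink; OutputFormat-specific
    // properties are assumed to be set elsewhere in real code.
    Configuration conf = new Configuration(false);

    // Required by withPartitioning(): number of partitions,
    // i.e. number of write tasks that will be generated.
    conf.setInt("mapreduce.job.reduces", 4); // 4 is an arbitrary example

    // Required by withPartitioning(): partitioner class used to
    // distribute records among partitions.
    conf.setClass("mapreduce.job.partitioner.class",
        HashPartitioner.class, Partitioner.class);

    // "/locks" is a hypothetical directory used by the external
    // synchronization to coordinate write tasks.
    return HadoopFormatIO.<Text, LongWritable>write()
        .withConfiguration(conf)
        .withPartitioning()
        .withExternalSynchronization(new HDFSSynchronization("/locks"));
  }
}
```

The resulting transform is then applied to a keyed PCollection, e.g. `data.apply("write", PartitionedWriteExample.buildWrite())`.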
HadoopFormatIO.Write.ExternalSynchronizationBuilder&lt;KeyT,ValueT&gt; withoutPartitioning()

Writes to the sink without the need to partition output into a specified number of partitions.

This write operation does not shuffle by partition, so it saves transfer time before the write itself. As a consequence, it generates a random number of partitions.

Note: Works only for a PCollection.IsBounded.BOUNDED PCollection with a global WindowingStrategy.
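A sketch of the non-partitioned variant, assuming the same externally configured sink properties as above; note that neither mapreduce.job.reduces nor mapreduce.job.partitioner.class is needed here, and the "/locks" directory is again illustrative:

```java
import org.apache.beam.sdk.io.hadoop.format.HDFSSynchronization;
import org.apache.beam.sdk.io.hadoop.format.HadoopFormatIO;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

public class UnpartitionedWriteExample {
  public static HadoopFormatIO.Write<Text, LongWritable> buildWrite() {
    // No partitioning-related properties required; OutputFormat
    // properties are assumed to be set elsewhere in real code.
    Configuration conf = new Configuration(false);

    // Skips the shuffle by partition; the number of output
    // partitions is therefore not fixed in advance. Input must be
    // a BOUNDED PCollection with a global WindowingStrategy.
    return HadoopFormatIO.<Text, LongWritable>write()
        .withConfiguration(conf)
        .withoutPartitioning()
        .withExternalSynchronization(new HDFSSynchronization("/locks"));
  }
}
```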