KeyT
- Key type to write
ValueT
- Value type to write
public static interface HadoopFormatIO.Write.PartitionedWriterBuilder<KeyT,ValueT>
Modifier and Type | Method and Description |
---|---|
HadoopFormatIO.Write.ExternalSynchronizationBuilder<KeyT,ValueT> | withoutPartitioning() Writes to the sink without partitioning the output into a specified number of partitions. |
HadoopFormatIO.Write.ExternalSynchronizationBuilder<KeyT,ValueT> | withPartitioning() Writes to the sink with partitioning by Task Id. |
HadoopFormatIO.Write.ExternalSynchronizationBuilder<KeyT,ValueT> withPartitioning()
The following Hadoop configuration properties are required with this option:
mapreduce.job.reduces
: Number of reduce tasks. The value equals the number of write tasks that will be generated.
mapreduce.job.partitioner.class
: Hadoop partitioner class used to distribute records among partitions.
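To make the role of the two properties concrete, here is a minimal sketch in plain Java, with no Hadoop or Beam dependency. It assumes the partitioner behaves like Hadoop's default HashPartitioner (non-negative key hash modulo the partition count); the class PartitioningSketch, its methods, and the use of a plain Map in place of a Hadoop Configuration are hypothetical and exist only for illustration. It is not how HadoopFormatIO is implemented internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: mimics hash partitioning without depending on Hadoop or Beam. */
final class PartitioningSketch {

  /** Assumed formula (as in Hadoop's HashPartitioner): non-negative hash modulo partition count. */
  static int partitionFor(Object key, int numPartitions) {
    return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
  }

  /** Groups keys by the partition (write task) they would be routed to. */
  static Map<Integer, List<String>> assign(List<String> keys, Map<String, String> conf) {
    // mapreduce.job.reduces fixes the number of partitions, hence of write tasks.
    int numPartitions = Integer.parseInt(conf.get("mapreduce.job.reduces"));
    Map<Integer, List<String>> byPartition = new LinkedHashMap<>();
    for (int p = 0; p < numPartitions; p++) {
      byPartition.put(p, new ArrayList<>());
    }
    for (String key : keys) {
      byPartition.get(partitionFor(key, numPartitions)).add(key);
    }
    return byPartition;
  }

  public static void main(String[] args) {
    // A plain map stands in for the Hadoop Configuration in this sketch.
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("mapreduce.job.reduces", "3");
    // Recorded for completeness; this sketch hard-codes hash partitioning instead of loading the class.
    conf.put("mapreduce.job.partitioner.class",
        "org.apache.hadoop.mapreduce.lib.partition.HashPartitioner");
    System.out.println(assign(Arrays.asList("alpha", "beta", "gamma", "delta"), conf));
  }
}
```

With mapreduce.job.reduces set to 3, every key lands in one of exactly three partitions, and the same key always maps to the same partition. Some partitions may stay empty if no key hashes to them.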
HadoopFormatIO.Write.ExternalSynchronizationBuilder<KeyT,ValueT> withoutPartitioning()
This write operation does not shuffle records by partition, so it saves the transfer time that a shuffle would add before the write itself. As a consequence, the number of partitions it generates is random.
Note: Works only for a PCollection.IsBounded.BOUNDED PCollection with the global WindowingStrategy.