Interface HadoopFormatIO.Write.PartitionedWriterBuilder<KeyT,ValueT>
- Type Parameters:
  KeyT - Key type to write
  ValueT - Value type to write
- Enclosing class:
  HadoopFormatIO.Write<KeyT,ValueT>
public static interface HadoopFormatIO.Write.PartitionedWriterBuilder<KeyT,ValueT>
Builder for determining how output will be partitioned.
-
Method Summary
Modifier and Type, Method, Description:
HadoopFormatIO.Write.ExternalSynchronizationBuilder<KeyT,ValueT> withoutPartitioning()
Writes to the sink without the need to partition output into a specified number of partitions.
HadoopFormatIO.Write.ExternalSynchronizationBuilder<KeyT,ValueT> withPartitioning()
Writes to the sink with partitioning by Task Id.
-
Method Details
-
withPartitioning
HadoopFormatIO.Write.ExternalSynchronizationBuilder<KeyT,ValueT> withPartitioning()
Writes to the sink with partitioning by Task Id.
The following Hadoop configuration properties are required with this option:
- mapreduce.job.reduces: Number of reduce tasks. The value is equal to the number of write tasks which will be generated.
- mapreduce.job.partitioner.class: Hadoop partitioner class which will be used for distributing records among partitions.
- Returns:
- WriteBuilder for write transformation
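To make the two required properties concrete, here is a minimal, self-contained sketch. It uses a plain `Map` standing in for `org.apache.hadoop.conf.Configuration`, and a `getPartition` method that mirrors the logic of Hadoop's default `HashPartitioner` (partition = bounded key hash modulo the number of reduce tasks). The class name and the map stand-in are illustrative assumptions, not part of the Beam API.

```java
import java.util.HashMap;
import java.util.Map;

public class PartitioningSketch {
    // Mirrors Hadoop's default HashPartitioner logic: the record's partition
    // is derived from the key's hash, bounded to [0, numReduceTasks).
    static int getPartition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // The two properties withPartitioning() requires, shown on a plain
        // map standing in for org.apache.hadoop.conf.Configuration.
        Map<String, String> conf = new HashMap<>();
        conf.put("mapreduce.job.reduces", "3");
        conf.put("mapreduce.job.partitioner.class",
                 "org.apache.hadoop.mapreduce.lib.partition.HashPartitioner");

        int reduces = Integer.parseInt(conf.get("mapreduce.job.reduces"));
        for (String key : new String[] {"alpha", "beta", "gamma"}) {
            System.out.println(key + " -> partition " + getPartition(key, reduces));
        }
    }
}
```

With `mapreduce.job.reduces` set to 3, every record lands in one of exactly three partitions, which is why the number of generated write tasks equals that property's value.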
-
withoutPartitioning
HadoopFormatIO.Write.ExternalSynchronizationBuilder<KeyT,ValueT> withoutPartitioning()
Writes to the sink without the need to partition output into a specified number of partitions.
This write operation does not shuffle by partition, so it saves transfer time before the write operation itself. As a consequence, it generates a random number of partitions.
Note: Works only for a PCollection.IsBounded.BOUNDED PCollection with a global WindowingStrategy.
- Returns:
- WriteBuilder for write transformation
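For context, this builder is reached through HadoopFormatIO.Write's fluent chain. The following is a non-compilable sketch, assuming Beam's `beam-sdks-java-io-hadoop-format` module, its `HDFSSynchronization` implementation, and hypothetical `input`, `hadoopConf`, and `locksDir` values:

```java
// Sketch only: requires org.apache.beam:beam-sdks-java-io-hadoop-format on
// the classpath. When withPartitioning() is used, `hadoopConf` must carry
// mapreduce.job.reduces and mapreduce.job.partitioner.class.
input.apply(
    "Write via Hadoop OutputFormat",
    HadoopFormatIO.<Text, LongWritable>write()
        .withConfiguration(hadoopConf)   // yields this PartitionedWriterBuilder
        .withPartitioning()              // or withoutPartitioning() for bounded,
                                         // globally windowed input
        .withExternalSynchronization(new HDFSSynchronization(locksDir)));
```

Either partitioning choice returns an ExternalSynchronizationBuilder, so the chain always finishes by supplying an ExternalSynchronization.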
-