Type Parameters:
T - the type of the elements of the input and output PCollections
public class Partition<T> extends PTransform<PCollection<T>,PCollectionList<T>>
Partition takes a PCollection<T> and a
PartitionFn, uses the PartitionFn to split the
elements of the input PCollection into N partitions, and
returns a PCollectionList<T> that bundles N
PCollection<T>s containing the split elements.
Example of use:
  PCollection<Student> students = ...;
  // Split students up into 10 partitions, by percentile:
  PCollectionList<Student> studentsByPercentile =
      students.apply(Partition.of(10, new PartitionFn<Student>() {
          public int partitionFor(Student student, int numPartitions) {
              return student.getPercentile()  // 0..99
                  * numPartitions / 100;
          }}));
  for (int i = 0; i < 10; i++) {
    PCollection<Student> partition = studentsByPercentile.get(i);
    ...
  }
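The index arithmetic used by the PartitionFn in the example can be checked in isolation. The following is a minimal plain-Java sketch, not Beam code: the class name PercentilePartitionSketch and its static method are hypothetical and only mirror the expression from the example. It illustrates how integer division maps a percentile in 0..99 to a partition index in 0..numPartitions-1.

```java
final class PercentilePartitionSketch {
    // Hypothetical helper mirroring the example's expression:
    // percentile (assumed to be in 0..99) * numPartitions / 100, using integer division.
    static int partitionFor(int percentile, int numPartitions) {
        return percentile * numPartitions / 100;
    }

    public static void main(String[] args) {
        int numPartitions = 10;
        // Print the partition index chosen for a few sample percentiles.
        for (int percentile : new int[] {0, 9, 10, 55, 99}) {
            System.out.println(percentile + " -> " + partitionFor(percentile, numPartitions));
        }
    }
}
```

With this expression, a percentile of 100 would produce the index numPartitions, which is outside the range 0..numPartitions-1, so the example depends on getPercentile() staying within 0..99.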
By default, the Coder of each of the
PCollections in the output PCollectionList is the
same as the Coder of the input PCollection.
Each output element has the same timestamp and is in the same windows
as its corresponding input element, and each output PCollection
has the same WindowFn associated with it as the input.
| Modifier and Type | Class and Description |
|---|---|
| static interface | Partition.PartitionFn<T>: A function object that chooses an output partition for an element. |
| Modifier and Type | Method and Description |
|---|---|
| PCollectionList<T> | expand(PCollection<T> in): Override this method to specify how this PTransform should be expanded on the given InputT. |
| static <T> Partition<T> | of(int numPartitions, Partition.PartitionFn<? super T> partitionFn): Returns a new Partition PTransform that divides its input PCollection into the given number of partitions, using the given partitioning function. |
| void | populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component. |
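The contract of of(numPartitions, partitionFn) can be modeled on an ordinary in-memory list. The sketch below is a hedged illustration in plain Java, not the Beam implementation: ListPartitionSketch, IndexFn, and partition are hypothetical names introduced here. It reproduces the documented IllegalArgumentException for numPartitions <= 0 and returns N output lists, one per partition. Rejecting an out-of-range index with IndexOutOfBoundsException is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ListPartitionSketch {
    // Hypothetical stand-in for Partition.PartitionFn<T>: picks a partition index for an element.
    interface IndexFn<T> {
        int partitionFor(T element, int numPartitions);
    }

    // In-memory analogue of splitting an input into numPartitions outputs.
    static <T> List<List<T>> partition(List<T> input, int numPartitions, IndexFn<? super T> fn) {
        if (numPartitions <= 0) {
            // Mirrors the documented precondition on numPartitions.
            throw new IllegalArgumentException("numPartitions must be > 0");
        }
        List<List<T>> outputs = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            outputs.add(new ArrayList<>());
        }
        for (T element : input) {
            int index = fn.partitionFor(element, numPartitions);
            // Assumption of this sketch: an index outside 0..numPartitions-1 is an error.
            if (index < 0 || index >= numPartitions) {
                throw new IndexOutOfBoundsException("partition index out of range: " + index);
            }
            outputs.get(index).add(element);
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<Integer> percentiles = Arrays.asList(3, 42, 47, 99);
        List<List<Integer>> parts = partition(percentiles, 10, (p, n) -> p * n / 100);
        System.out.println(parts.get(4)); // prints "[42, 47]"
    }
}
```

Every element lands in exactly one output list, and empty partitions are still present in the result, which parallels the PCollectionList<T> always bundling N PCollection<T>s.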
Methods inherited from class PTransform: getAdditionalInputs, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, toString, validate
public static <T> Partition<T> of(int numPartitions, Partition.PartitionFn<? super T> partitionFn)
Returns a new Partition PTransform that divides
its input PCollection into the given number of partitions,
using the given partitioning function.
Parameters:
numPartitions - the number of partitions to divide the input PCollection into
partitionFn - the function to invoke on each element to choose its output partition
Throws:
java.lang.IllegalArgumentException - if numPartitions <= 0
public PCollectionList<T> expand(PCollection<T> in)
Description copied from class: PTransform
Override this method to specify how this PTransform should be expanded
on the given InputT.
NOTE: This method should not be called directly. Instead, the
PTransform should be applied to the InputT using the apply
method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
Specified by:
expand in class PTransform<PCollection<T>,PCollectionList<T>>
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: PTransform
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect
display data via DisplayData.from(HasDisplayData). Implementations may call
super.populateDisplayData(builder) in order to register display data in the current
namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use
the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by:
populateDisplayData in interface HasDisplayData
Overrides:
populateDisplayData in class PTransform<PCollection<T>,PCollectionList<T>>
Parameters:
builder - The builder to populate with display data.
See Also:
HasDisplayData