public class Top
extends java.lang.Object
PTransform
s for finding the largest (or smallest) set of elements in a PCollection
, or the largest (or smallest) set of values associated with each key in a PCollection
of KV
s.Modifier and Type | Class and Description |
---|---|
static class |
Top.Largest<T extends java.lang.Comparable<? super T>>
Deprecated.
use
Top.Natural instead |
static class |
Top.Natural<T extends java.lang.Comparable<? super T>>
A
Serializable Comparator that that uses the compared elements' natural
ordering. |
static class |
Top.Reversed<T extends java.lang.Comparable<? super T>>
Serializable Comparator that that uses the reverse of the compared elements'
natural ordering. |
static class |
Top.Smallest<T extends java.lang.Comparable<? super T>>
Deprecated.
use
Top.Reversed instead |
static class |
Top.TopCombineFn<T,ComparatorT extends java.util.Comparator<T> & java.io.Serializable>
CombineFn for Top transforms that combines a bunch of T s into a single
count -long List<T> , using compareFn to choose the largest T s. |
Modifier and Type | Method and Description |
---|---|
static <T extends java.lang.Comparable<T>> |
largest(int count)
Returns a
PTransform that takes an input PCollection<T> and returns a PCollection<List<T>> with a single element containing the largest count elements of
the input PCollection<T> , in decreasing order, sorted according to their natural order. |
static <K,V extends java.lang.Comparable<V>> |
largestPerKey(int count)
Returns a
PTransform that takes an input PCollection<KV<K, V>> and returns a
PCollection<KV<K, List<V>>> that contains an output element mapping each distinct key
in the input PCollection to the largest count values associated with that key
in the input PCollection<KV<K, V>> , in decreasing order, sorted according to their
natural order. |
static <T,ComparatorT extends java.util.Comparator<T> & java.io.Serializable> |
of(int count,
ComparatorT compareFn)
Returns a
PTransform that takes an input PCollection<T> and returns a PCollection<List<T>> with a single element containing the largest count elements of
the input PCollection<T> , in decreasing order, sorted using the given Comparator<T> . |
static <K,V,ComparatorT extends java.util.Comparator<V> & java.io.Serializable> |
perKey(int count,
ComparatorT compareFn)
Returns a
PTransform that takes an input PCollection<KV<K, V>> and returns a
PCollection<KV<K, List<V>>> that contains an output element mapping each distinct key
in the input PCollection to the largest count values associated with that key
in the input PCollection<KV<K, V>> , in decreasing order, sorted using the given Comparator<V> . |
static <T extends java.lang.Comparable<T>> |
smallest(int count)
Returns a
PTransform that takes an input PCollection<T> and returns a PCollection<List<T>> with a single element containing the smallest count elements of
the input PCollection<T> , in increasing order, sorted according to their natural order. |
static <K,V extends java.lang.Comparable<V>> |
smallestPerKey(int count)
Returns a
PTransform that takes an input PCollection<KV<K, V>> and returns a
PCollection<KV<K, List<V>>> that contains an output element mapping each distinct key
in the input PCollection to the smallest count values associated with that key
in the input PCollection<KV<K, V>> , in increasing order, sorted according to their
natural order. |
public static <T,ComparatorT extends java.util.Comparator<T> & java.io.Serializable> Combine.Globally<T,java.util.List<T>> of(int count, ComparatorT compareFn)
PTransform
that takes an input PCollection<T>
and returns a PCollection<List<T>>
with a single element containing the largest count
elements of
the input PCollection<T>
, in decreasing order, sorted using the given Comparator<T>
. The Comparator<T>
must also be Serializable
.
If count
>
the number of elements in the input PCollection
, then all
the elements of the input PCollection
will be in the resulting List
, albeit in
sorted order.
All the elements of the result's List
must fit into the memory of a single machine.
Example of use:
PCollection<Student> students = ...;
PCollection<List<Student>> top10Students =
students.apply(Top.of(10, new CompareStudentsByAvgGrade()));
By default, the Coder
of the output PCollection
is a ListCoder
of
the Coder
of the elements of the input PCollection
.
If the input PCollection
is windowed into GlobalWindows
, an empty List<T>
in the GlobalWindow
will be output if the input PCollection
is empty.
To use this with inputs with other windowing, either withoutDefaults
or asSingletonView
must be called.
See also smallest(int)
and largest(int)
, which sort Comparable
elements
using their natural ordering.
See also perKey(int, ComparatorT)
, smallestPerKey(int)
, and largestPerKey(int)
, which take a
PCollection
of KV
s and return the top values associated with each key.
public static <T extends java.lang.Comparable<T>> Combine.Globally<T,java.util.List<T>> smallest(int count)
PTransform
that takes an input PCollection<T>
and returns a PCollection<List<T>>
with a single element containing the smallest count
elements of
the input PCollection<T>
, in increasing order, sorted according to their natural order.
If count
>
the number of elements in the input PCollection
, then all
the elements of the input PCollection
will be in the resulting PCollection
's
List
, albeit in sorted order.
All the elements of the result List
must fit into the memory of a single machine.
Example of use:
PCollection<Integer> values = ...;
PCollection<List<Integer>> smallest10Values = values.apply(Top.smallest(10));
By default, the Coder
of the output PCollection
is a ListCoder
of
the Coder
of the elements of the input PCollection
.
If the input PCollection
is windowed into GlobalWindows
, an empty List<T>
in the GlobalWindow
will be output if the input PCollection
is empty.
To use this with inputs with other windowing, either withoutDefaults
or asSingletonView
must be called.
See also largest(int)
.
See also of(int, ComparatorT)
, which sorts using a user-specified Comparator
function.
See also perKey(int, ComparatorT)
, smallestPerKey(int)
, and largestPerKey(int)
, which take a
PCollection
of KV
s and return the top values associated with each key.
public static <T extends java.lang.Comparable<T>> Combine.Globally<T,java.util.List<T>> largest(int count)
PTransform
that takes an input PCollection<T>
and returns a PCollection<List<T>>
with a single element containing the largest count
elements of
the input PCollection<T>
, in decreasing order, sorted according to their natural order.
If count
>
the number of elements in the input PCollection
, then all
the elements of the input PCollection
will be in the resulting PCollection
's
List
, albeit in sorted order.
All the elements of the result's List
must fit into the memory of a single machine.
Example of use:
PCollection<Integer> values = ...;
PCollection<List<Integer>> largest10Values = values.apply(Top.largest(10));
By default, the Coder
of the output PCollection
is a ListCoder
of
the Coder
of the elements of the input PCollection
.
If the input PCollection
is windowed into GlobalWindows
, an empty List<T>
in the GlobalWindow
will be output if the input PCollection
is empty.
To use this with inputs with other windowing, either withoutDefaults
or asSingletonView
must be called.
See also smallest(int)
.
See also of(int, ComparatorT)
, which sorts using a user-specified Comparator
function.
See also perKey(int, ComparatorT)
, smallestPerKey(int)
, and largestPerKey(int)
, which take a
PCollection
of KV
s and return the top values associated with each key.
public static <K,V,ComparatorT extends java.util.Comparator<V> & java.io.Serializable> PTransform<PCollection<KV<K,V>>,PCollection<KV<K,java.util.List<V>>>> perKey(int count, ComparatorT compareFn)
PTransform
that takes an input PCollection<KV<K, V>>
and returns a
PCollection<KV<K, List<V>>>
that contains an output element mapping each distinct key
in the input PCollection
to the largest count
values associated with that key
in the input PCollection<KV<K, V>>
, in decreasing order, sorted using the given Comparator<V>
. The Comparator<V>
must also be Serializable
.
If there are fewer than count
values associated with a particular key, then all
those values will be in the result mapping for that key, albeit in sorted order.
All the values associated with a single key must fit into the memory of a single machine,
but there can be many more KV
s in the resulting PCollection
than can fit into
the memory of a single machine.
Example of use:
PCollection<KV<School, Student>> studentsBySchool = ...;
PCollection<KV<School, List<Student>>> top10StudentsBySchool =
studentsBySchool.apply(
Top.perKey(10, new CompareStudentsByAvgGrade()));
By default, the Coder
of the keys of the output PCollection
is the same as
that of the keys of the input PCollection
, and the Coder
of the values of the
output PCollection
is a ListCoder
of the Coder
of the values of the
input PCollection
.
See also smallestPerKey(int)
and largestPerKey(int)
, which sort Comparable<V>
values using their natural ordering.
See also of(int, ComparatorT)
, smallest(int)
, and largest(int)
, which take a PCollection
and return the top elements.
public static <K,V extends java.lang.Comparable<V>> PTransform<PCollection<KV<K,V>>,PCollection<KV<K,java.util.List<V>>>> smallestPerKey(int count)
PTransform
that takes an input PCollection<KV<K, V>>
and returns a
PCollection<KV<K, List<V>>>
that contains an output element mapping each distinct key
in the input PCollection
to the smallest count
values associated with that key
in the input PCollection<KV<K, V>>
, in increasing order, sorted according to their
natural order.
If there are fewer than count
values associated with a particular key, then all
those values will be in the result mapping for that key, albeit in sorted order.
All the values associated with a single key must fit into the memory of a single machine,
but there can be many more KV
s in the resulting PCollection
than can fit into
the memory of a single machine.
Example of use:
PCollection<KV<String, Integer>> keyedValues = ...;
PCollection<KV<String, List<Integer>>> smallest10ValuesPerKey =
keyedValues.apply(Top.smallestPerKey(10));
By default, the Coder
of the keys of the output PCollection
is the same as
that of the keys of the input PCollection
, and the Coder
of the values of the
output PCollection
is a ListCoder
of the Coder
of the values of the
input PCollection
.
See also largestPerKey(int)
.
See also perKey(int, ComparatorT)
, which sorts values using a user-specified Comparator
function.
See also of(int, ComparatorT)
, smallest(int)
, and largest(int)
, which take a PCollection
and return the top elements.
public static <K,V extends java.lang.Comparable<V>> Combine.PerKey<K,V,java.util.List<V>> largestPerKey(int count)
PTransform
that takes an input PCollection<KV<K, V>>
and returns a
PCollection<KV<K, List<V>>>
that contains an output element mapping each distinct key
in the input PCollection
to the largest count
values associated with that key
in the input PCollection<KV<K, V>>
, in decreasing order, sorted according to their
natural order.
If there are fewer than count
values associated with a particular key, then all
those values will be in the result mapping for that key, albeit in sorted order.
All the values associated with a single key must fit into the memory of a single machine,
but there can be many more KV
s in the resulting PCollection
than can fit into
the memory of a single machine.
Example of use:
PCollection<KV<String, Integer>> keyedValues = ...;
PCollection<KV<String, List<Integer>>> largest10ValuesPerKey =
keyedValues.apply(Top.largestPerKey(10));
By default, the Coder
of the keys of the output PCollection
is the same as
that of the keys of the input PCollection
, and the Coder
of the values of the
output PCollection
is a ListCoder
of the Coder
of the values of the
input PCollection
.
See also smallestPerKey(int)
.
See also perKey(int, ComparatorT)
, which sorts values using a user-specified Comparator
function.
See also of(int, ComparatorT)
, smallest(int)
, and largest(int)
, which take a PCollection
and return the top elements.