public class Count
extends java.lang.Object
PTransforms to count the elements in a PCollection.

perElement() can be used to count the number of occurrences of each distinct element in the PCollection, perKey() can be used to count the number of values per key, and globally() can be used to count the total number of elements in a PCollection.
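
The difference between the three counts can be modeled with ordinary Java collections. The sketch below is not Beam code: the class CountSemanticsSketch and its methods are hypothetical stand-ins that reuse the transform names only to show what each one computes, with a List playing the role of the PCollection and Map.Entry playing the role of KV.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Plain-Java model of the three counting semantics (hypothetical, not Beam API).
final class CountSemanticsSketch {

    // Models perElement(): one entry per distinct element, mapped to its occurrence count.
    static <T> Map<T, Long> perElement(List<T> elements) {
        return elements.stream()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Models perKey(): one entry per distinct key, mapped to the number of values seen for it.
    static <K, V> Map<K, Long> perKey(List<Map.Entry<K, V>> pairs) {
        return pairs.stream()
            .collect(Collectors.groupingBy(Map.Entry::getKey, Collectors.counting()));
    }

    // Models globally(): the total number of elements, duplicates included.
    static <T> long globally(List<T> elements) {
        return elements.size();
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "a", "c", "a");
        List<Map.Entry<String, Integer>> pairs =
            List.of(Map.entry("x", 1), Map.entry("y", 2), Map.entry("x", 3));

        System.out.println(perElement(words));
        System.out.println(perKey(pairs));
        System.out.println(globally(words));
    }
}
```

Note that perKey() ignores the values themselves and only counts how many are attached to each key, while perElement() treats the whole element as the key.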

combineFn() can also be used manually, in combination with state and with the Combine transform.
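
To illustrate why a counting combiner suits parallel evaluation, here is a minimal plain-Java sketch of the accumulate-and-merge pattern. The method names mirror the lifecycle of Combine.CombineFn (createAccumulator, addInput, mergeAccumulators, extractOutput), but the class CountingCombinerSketch is hypothetical and does not depend on Beam. The accumulator type of the real combineFn() is unspecified (the ? in its signature); using a long here is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of a counting combiner (hypothetical, not the Beam implementation).
final class CountingCombinerSketch<T> {

    // An empty accumulator has seen no inputs.
    long createAccumulator() {
        return 0L;
    }

    // Each input adds one to the count; the input's value is irrelevant.
    long addInput(long accumulator, T input) {
        return accumulator + 1;
    }

    // Partial counts computed independently are combined by summing them.
    long mergeAccumulators(List<Long> accumulators) {
        long total = 0L;
        for (long partial : accumulators) {
            total += partial;
        }
        return total;
    }

    // The final output is the accumulated count itself.
    long extractOutput(long accumulator) {
        return accumulator;
    }

    // Simulates a runner that splits the input into bundles, counts each bundle
    // separately, and then merges the partial counts.
    static <T> long countInBundles(List<T> input, int bundleSize) {
        if (bundleSize <= 0) {
            throw new IllegalArgumentException("bundleSize must be positive");
        }
        CountingCombinerSketch<T> fn = new CountingCombinerSketch<>();
        List<Long> partials = new ArrayList<>();
        for (int start = 0; start < input.size(); start += bundleSize) {
            long accumulator = fn.createAccumulator();
            int end = Math.min(start + bundleSize, input.size());
            for (T element : input.subList(start, end)) {
                accumulator = fn.addInput(accumulator, element);
            }
            partials.add(accumulator);
        }
        return fn.extractOutput(fn.mergeAccumulators(partials));
    }

    public static void main(String[] args) {
        System.out.println(countInBundles(List.of(10, 20, 30, 40, 50), 2));
    }
}
```

Because merging is a plain sum, the result does not depend on how the input is split into bundles.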
Modifier and Type | Method and Description |
---|---|
static <T> Combine.CombineFn<T,?,java.lang.Long> | combineFn() Returns a Combine.CombineFn that counts the number of its inputs. |
static <T> PTransform<PCollection<T>,PCollection<java.lang.Long>> | globally() Returns a PTransform that counts the number of elements in its input PCollection. |
static <T> PTransform<PCollection<T>,PCollection<KV<T,java.lang.Long>>> | perElement() Returns a PTransform that counts the number of occurrences of each element in its input PCollection. |
static <K,V> PTransform<PCollection<KV<K,V>>,PCollection<KV<K,java.lang.Long>>> | perKey() Returns a PTransform that counts the number of elements associated with each key of its input PCollection. |

public static <T> Combine.CombineFn<T,?,java.lang.Long> combineFn()
Returns a Combine.CombineFn that counts the number of its inputs.

public static <T> PTransform<PCollection<T>,PCollection<java.lang.Long>> globally()
Returns a PTransform that counts the number of elements in its input PCollection.
Note: if the input collection uses a windowing strategy other than GlobalWindows, use Combine.globally(Count.<T>combineFn()).withoutDefaults() instead.

public static <K,V> PTransform<PCollection<KV<K,V>>,PCollection<KV<K,java.lang.Long>>> perKey()
Returns a PTransform that counts the number of elements associated with each key of its input PCollection.

public static <T> PTransform<PCollection<T>,PCollection<KV<T,java.lang.Long>>> perElement()
Returns a PTransform that counts the number of occurrences of each element in its input PCollection.
The returned PTransform takes a PCollection<T> and returns a PCollection<KV<T, Long>> representing a map from each distinct element of the input PCollection to the number of times that element occurs in the input. Each key in the output PCollection is unique.
The returned transform compares two values of type T by first encoding each element using the input PCollection's Coder, then comparing the encoded bytes. Because of this, the input coder must be deterministic. (See Coder.verifyDeterministic() for more detail.) Performing the comparison in this manner admits efficient parallel evaluation.
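
The effect of comparing encoded bytes, and the reason the encoding must be deterministic, can be seen in a small plain-Java model. Nothing here is Beam API: EncodedEqualitySketch and its two encoders are hypothetical, with a function from element to byte[] standing in for a Coder. The example assumes that a deterministic encoding is one where equal values always produce identical bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Plain-Java model of counting by encoded bytes (hypothetical, not Beam code).
final class EncodedEqualitySketch {

    // Counts elements by the bytes they encode to, never calling equals() on T.
    // ByteBuffer is used as the map key because it compares by content.
    static <T> Map<ByteBuffer, Long> countByEncodedBytes(
            List<T> elements, Function<? super T, byte[]> encoder) {
        Map<ByteBuffer, Long> counts = new HashMap<>();
        for (T element : elements) {
            ByteBuffer key = ByteBuffer.wrap(encoder.apply(element));
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    // Not deterministic for sets: the bytes depend on iteration order.
    static byte[] encodeInIterationOrder(Set<String> set) {
        return String.join(",", set).getBytes(StandardCharsets.UTF_8);
    }

    // Deterministic: equal sets always yield the same bytes.
    static byte[] encodeSorted(Set<String> set) {
        return String.join(",", new TreeSet<>(set)).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two sets that are equal according to equals() but iterate in different orders.
        Set<String> ab = new LinkedHashSet<>(List.of("a", "b"));
        Set<String> ba = new LinkedHashSet<>(List.of("b", "a"));
        List<Set<String>> input = List.of(ab, ba);

        // With the order-dependent encoding the equal sets land in separate groups.
        System.out.println(
            countByEncodedBytes(input, EncodedEqualitySketch::encodeInIterationOrder).size());
        // With the sorted encoding they are counted as one element.
        System.out.println(
            countByEncodedBytes(input, EncodedEqualitySketch::encodeSorted).size());
    }
}
```

With an order-dependent encoding, two equal sets are split into separate groups, which is the kind of miscount a deterministic coder rules out.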
By default, the Coder of the keys of the output PCollection is the same as the Coder of the elements of the input PCollection.
Example of use:
PCollection<String> words = ...;
PCollection<KV<String, Long>> wordCounts =
words.apply(Count.<String>perElement());