public class Count
extends java.lang.Object
PTransforms to count the elements in a PCollection.
perElement() can be used to count the number of occurrences of each distinct element in the PCollection, perKey() can be used to count the number of values per key, and globally() can be used to count the total number of elements in a PCollection.
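The three result shapes described above can be illustrated with a small plain-Java model. This is only a sketch under stated assumptions: it uses in-memory lists and maps instead of PCollections, and the class name CountShapesSketch and its methods are hypothetical stand-ins, not part of the Beam API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

/** Plain-Java model of the three counting shapes; hypothetical, it does not use Beam. */
final class CountShapesSketch {

  /** Models globally(): a single total for the whole collection. */
  static <T> long globally(List<T> elements) {
    return elements.size();
  }

  /** Models perElement(): one count per distinct element. */
  static <T> Map<T, Long> perElement(List<T> elements) {
    Map<T, Long> counts = new LinkedHashMap<>();
    for (T element : elements) {
      counts.merge(element, 1L, Long::sum);
    }
    return counts;
  }

  /** Models perKey(): one count per key; the values themselves are ignored. */
  static <K, V> Map<K, Long> perKey(List<? extends Entry<K, V>> pairs) {
    Map<K, Long> counts = new LinkedHashMap<>();
    for (Entry<K, V> pair : pairs) {
      counts.merge(pair.getKey(), 1L, Long::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    List<String> words = Arrays.asList("a", "b", "a");
    System.out.println(globally(words));   // prints 3
    System.out.println(perElement(words)); // prints {a=2, b=1}

    List<Entry<String, Integer>> pairs =
        Arrays.<Entry<String, Integer>>asList(
            new SimpleEntry<>("x", 1), new SimpleEntry<>("x", 7), new SimpleEntry<>("y", 1));
    System.out.println(perKey(pairs));     // prints {x=2, y=1}
  }
}
```

In the model, perKey() counts both ("x", 1) and ("x", 7) toward key "x": only the number of values per key matters, not what the values are.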
combineFn() can also be used manually, in combination with state and with the Combine transform.
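To give a feel for what a counting combine function does, here is a minimal plain-Java model of an accumulator lifecycle. It is a sketch, not Beam's implementation: the class CountingCombineSketch is hypothetical, the method names are chosen to mirror the create / add / merge / extract steps of a Combine.CombineFn, and the accumulator is assumed to be a plain long.

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of a counting combine function; hypothetical, not Beam's class. */
final class CountingCombineSketch {

  /** A fresh accumulator: nothing has been counted yet. */
  static long createAccumulator() {
    return 0L;
  }

  /** Folding in one input adds one, regardless of the input's value. */
  static <T> long addInput(long accumulator, T input) {
    return accumulator + 1;
  }

  /** Partial counts computed independently are merged by summing them. */
  static long mergeAccumulators(List<Long> accumulators) {
    long total = 0L;
    for (long partial : accumulators) {
      total += partial;
    }
    return total;
  }

  /** The final output is the merged accumulator itself. */
  static long extractOutput(long accumulator) {
    return accumulator;
  }

  /** Counts one bundle of inputs by folding each input into an accumulator. */
  static <T> long countBundle(List<T> bundle) {
    long accumulator = createAccumulator();
    for (T input : bundle) {
      accumulator = addInput(accumulator, input);
    }
    return accumulator;
  }

  public static void main(String[] args) {
    // Two bundles counted separately, as two workers might do.
    long left = countBundle(Arrays.asList("a", "b"));
    long right = countBundle(Arrays.asList("c", "d", "e"));
    System.out.println(extractOutput(mergeAccumulators(Arrays.asList(left, right)))); // prints 5
  }
}
```

Because merging is just addition, the partial counts can be combined in any order or grouping and still give the same total.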
Modifier and Type | Method and Description |
---|---|
static <T> Combine.CombineFn<T,?,java.lang.Long> | combineFn(): Returns a Combine.CombineFn that counts the number of its inputs. |
static <T> PTransform<PCollection<T>,PCollection<java.lang.Long>> | globally(): Returns a PTransform that counts the number of elements in its input PCollection. |
static <T> PTransform<PCollection<T>,PCollection<KV<T,java.lang.Long>>> | perElement(): Returns a PTransform that counts the number of occurrences of each element in its input PCollection. |
static <K,V> PTransform<PCollection<KV<K,V>>,PCollection<KV<K,java.lang.Long>>> | perKey(): Returns a PTransform that counts the number of elements associated with each key of its input PCollection. |
public static <T> Combine.CombineFn<T,?,java.lang.Long> combineFn()
Returns a Combine.CombineFn that counts the number of its inputs.
public static <T> PTransform<PCollection<T>,PCollection<java.lang.Long>> globally()
Returns a PTransform that counts the number of elements in its input PCollection.
Note: if the input collection uses a windowing strategy other than GlobalWindows, use Combine.globally(Count.<T>combineFn()).withoutDefaults() instead.
public static <K,V> PTransform<PCollection<KV<K,V>>,PCollection<KV<K,java.lang.Long>>> perKey()
Returns a PTransform that counts the number of elements associated with each key of its input PCollection.
public static <T> PTransform<PCollection<T>,PCollection<KV<T,java.lang.Long>>> perElement()
Returns a PTransform that counts the number of occurrences of each element in its input PCollection.
The returned PTransform takes a PCollection<T> and returns a PCollection<KV<T, Long>> representing a map from each distinct element of the input PCollection to the number of times that element occurs in the input. Each key in the output PCollection is unique.
The returned transform compares two values of type T by first encoding each element using the input PCollection's Coder, then comparing the encoded bytes. Because of this, the input coder must be deterministic (see Coder.verifyDeterministic() for more detail). Performing the comparison in this manner admits efficient parallel evaluation.
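The comparison by encoded bytes can be modeled in plain Java as follows. This is a hedged sketch rather than Beam's implementation: the class EncodedCountSketch is hypothetical, and a java.util.function.Function producing a byte[] stands in for the Coder.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Plain-Java model of counting by encoded form; hypothetical, not Beam's implementation. */
final class EncodedCountSketch {

  /**
   * Counts elements, treating two elements as the same exactly when the
   * encoder (standing in for a Coder) produces identical bytes for them.
   */
  static <T> List<Map.Entry<T, Long>> countByEncoding(
      List<T> elements, Function<T, byte[]> encoder) {
    // ByteBuffer compares and hashes by content, unlike a raw byte[].
    Map<ByteBuffer, T> representatives = new LinkedHashMap<>();
    Map<ByteBuffer, Long> counts = new LinkedHashMap<>();
    for (T element : elements) {
      ByteBuffer key = ByteBuffer.wrap(encoder.apply(element));
      representatives.putIfAbsent(key, element); // keep the first element seen per encoding
      counts.merge(key, 1L, Long::sum);
    }
    // Pair each group's representative element with its count.
    List<Map.Entry<T, Long>> result = new ArrayList<>();
    for (Map.Entry<ByteBuffer, Long> entry : counts.entrySet()) {
      result.add(
          new SimpleImmutableEntry<>(representatives.get(entry.getKey()), entry.getValue()));
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> words = Arrays.asList("to", "be", "or", "not", "to", "be");
    System.out.println(
        countByEncoding(words, w -> w.getBytes(StandardCharsets.UTF_8)));
    // prints [to=2, be=2, or=1, not=1]
  }
}
```

In this model, an encoder that returned different bytes for equal values would split them into separate groups, which is why the encoding must be deterministic for the counts to be meaningful.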
By default, the Coder of the keys of the output PCollection is the same as the Coder of the elements of the input PCollection.
Example of use:
PCollection<String> words = ...;
PCollection<KV<String, Long>> wordCounts =
    words.apply(Count.<String>perElement());