public class Count
extends java.lang.Object
PTransforms to count the elements in a PCollection.
Use perElement() to count the number of occurrences of each distinct
element in the PCollection, perKey() to count the number of values per
key, and globally() to count the total number of elements in a
PCollection.
combineFn() can also be used manually, in combination with state and with the Combine transform.
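To make the counting behaviour concrete, the following is a minimal plain-Java model of a counting combiner. It is a sketch, not Beam's implementation: the class name CountingCombinerSketch and the helper countBundle are hypothetical, and only the four step names (createAccumulator, addInput, mergeAccumulators, extractOutput) mirror the Combine.CombineFn lifecycle.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical plain-Java model of a counting combiner; not Beam's source code. */
final class CountingCombinerSketch {

    // The accumulator is a running count that starts at zero.
    static long createAccumulator() {
        return 0L;
    }

    // Every input adds one to the count, regardless of its value.
    static <T> long addInput(long accumulator, T input) {
        return accumulator + 1;
    }

    // Partial counts computed independently are summed into one count.
    static long mergeAccumulators(List<Long> accumulators) {
        long total = 0L;
        for (long partial : accumulators) {
            total += partial;
        }
        return total;
    }

    // The final output is the accumulated count itself.
    static long extractOutput(long accumulator) {
        return accumulator;
    }

    // Hypothetical helper: counts one bundle of inputs with the steps above.
    static <T> long countBundle(List<T> bundle) {
        long accumulator = createAccumulator();
        for (T element : bundle) {
            accumulator = addInput(accumulator, element);
        }
        return accumulator;
    }

    public static void main(String[] args) {
        long left = countBundle(Arrays.asList("a", "b", "a"));
        long right = countBundle(Arrays.asList("c", "a"));
        long total = extractOutput(mergeAccumulators(Arrays.asList(left, right)));
        System.out.println(total); // prints 5
    }
}
```

Because partial counts merge by addition, the inputs can be split across workers in any way without changing the result.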
| Modifier and Type | Method and Description |
|---|---|
| static <T> Combine.CombineFn<T,?,java.lang.Long> | combineFn(): Returns a Combine.CombineFn that counts the number of its inputs. |
| static <T> PTransform<PCollection<T>,PCollection<java.lang.Long>> | globally(): Returns a PTransform that counts the number of elements in its input PCollection. |
| static <T> PTransform<PCollection<T>,PCollection<KV<T,java.lang.Long>>> | perElement(): Returns a PTransform that counts the number of occurrences of each element in its input PCollection. |
| static <K,V> PTransform<PCollection<KV<K,V>>,PCollection<KV<K,java.lang.Long>>> | perKey(): Returns a PTransform that counts the number of elements associated with each key of its input PCollection. |
public static <T> Combine.CombineFn<T,?,java.lang.Long> combineFn()
Returns a Combine.CombineFn that counts the number of its inputs.

public static <T> PTransform<PCollection<T>,PCollection<java.lang.Long>> globally()
Returns a PTransform that counts the number of elements in its input PCollection.
Note: if the input collection uses a windowing strategy other than GlobalWindows,
use Combine.globally(Count.<T>combineFn()).withoutDefaults() instead.

public static <K,V> PTransform<PCollection<KV<K,V>>,PCollection<KV<K,java.lang.Long>>> perKey()
Returns a PTransform that counts the number of elements associated with each key of its
input PCollection.

public static <T> PTransform<PCollection<T>,PCollection<KV<T,java.lang.Long>>> perElement()
Returns a PTransform that counts the number of occurrences of each element in its input
PCollection.
The returned PTransform takes a PCollection<T> and returns a PCollection<KV<T, Long>> representing a map from each distinct element of the input PCollection to the number of times that element occurs in the input. Each key in the output
PCollection is unique.
The returned transform compares two values of type T by first encoding each element
using the input PCollection's Coder, then comparing the encoded bytes. Because
of this, the input coder must be deterministic (see Coder.verifyDeterministic() for more detail). Performing the
comparison in this manner admits efficient parallel evaluation.
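The byte-level comparison described above can be illustrated with a small plain-Java sketch. It does not use Beam: the class EncodedCountSketch, the method countByEncoding, and the use of a Function as a stand-in for a Coder are all assumptions made for illustration. Elements are grouped by their encoded bytes, so two elements are counted together only when their encodings are identical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: counts elements by comparing their encoded bytes. */
final class EncodedCountSketch {

    // The encoder parameter stands in for a Coder; it is not a Beam type.
    static <T> Map<ByteBuffer, Long> countByEncoding(
            List<T> elements, Function<T, byte[]> encoder) {
        Map<ByteBuffer, Long> counts = new LinkedHashMap<>();
        for (T element : elements) {
            // ByteBuffer equality and hashing use the remaining bytes,
            // so elements with identical encodings share a single key.
            ByteBuffer key = ByteBuffer.wrap(encoder.apply(element));
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<ByteBuffer, Long> counts = countByEncoding(
                Arrays.asList("a", "b", "a"),
                s -> s.getBytes(StandardCharsets.UTF_8));
        for (Map.Entry<ByteBuffer, Long> entry : counts.entrySet()) {
            // Decode a duplicate so the key buffer's position is left untouched.
            String decoded = StandardCharsets.UTF_8.decode(entry.getKey().duplicate()).toString();
            System.out.println(decoded + "=" + entry.getValue());
        }
    }
}
```

If the encoder could produce different bytes for equal values, those values would land under different keys and be counted separately, which is why a deterministic encoding is required.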
By default, the Coder of the keys of the output PCollection is the same as
the Coder of the elements of the input PCollection.
Example of use:
PCollection<String> words = ...;
PCollection<KV<String, Long>> wordCounts =
words.apply(Count.<String>perElement());