Class Count
PTransforms to count the elements in a PCollection.
perElement() can be used to count the number of occurrences of each distinct element in the PCollection, perKey() can be used to count the number of values per key, and globally() can be used to count the total number of elements in a PCollection.

combineFn() can also be used manually, in combination with state and with the Combine transform.
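The three counting modes described above can be modeled on ordinary in-memory collections. The following is a minimal plain-Java sketch, not Beam code: the class name CountModes and its methods are hypothetical, a List stands in for a PCollection, and Map.Entry stands in for KV. It also groups by equals() and hashCode(), whereas perElement() in Beam compares encoded bytes.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical plain-Java model of the three counting modes; this is not Beam code.
final class CountModes {
  // Models globally(): the total number of elements in the input.
  static <T> long globally(List<T> input) {
    return input.size();
  }

  // Models perElement(): the number of occurrences of each distinct element.
  // Assumption: grouping uses equals()/hashCode(), unlike Beam's encoded-byte comparison.
  static <T> Map<T, Long> perElement(List<T> input) {
    return input.stream()
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
  }

  // Models perKey(): the number of values associated with each key.
  static <K, V> Map<K, Long> perKey(List<Map.Entry<K, V>> input) {
    return input.stream()
        .collect(Collectors.groupingBy(Map.Entry::getKey, Collectors.counting()));
  }
}
```

For the input ["a", "b", "a"], this model yields 3 for the global count and the mapping a to 2, b to 1 for the per-element count.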
Method Summary
Modifier and Type, Method
    Description

static <T> Combine.CombineFn<T, ?, Long> combineFn()
    Returns a Combine.CombineFn that counts the number of its inputs.

static <T> PTransform<PCollection<T>, PCollection<Long>> globally()
    Returns a PTransform that counts the number of elements in its input PCollection.

static <T> PTransform<PCollection<T>, PCollection<KV<T, Long>>> perElement()
    Returns a PTransform that counts the number of occurrences of each element in its input PCollection.

static <K, V> PTransform<PCollection<KV<K, V>>, PCollection<KV<K, Long>>> perKey()
    Returns a PTransform that counts the number of elements associated with each key of its input PCollection.
Method Details
combineFn
Returns a Combine.CombineFn that counts the number of its inputs.
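To illustrate what a counting combine function does, here is a minimal plain-Java sketch of the accumulator lifecycle that a Beam CombineFn follows (createAccumulator, addInput, mergeAccumulators, extractOutput). It is not Beam's implementation: the class name CountingCombine and the helper countBundle are hypothetical, and the accumulator is assumed to be a plain long, whereas the actual accumulator type is hidden behind the ? in the signature.

```java
import java.util.List;

// Hypothetical plain-Java model of a counting combine function; not Beam's implementation.
final class CountingCombine {
  // An empty accumulator: nothing has been counted yet.
  static long createAccumulator() {
    return 0L;
  }

  // Every input adds one to the count, regardless of its value.
  static <T> long addInput(long accumulator, T input) {
    return accumulator + 1;
  }

  // Partial counts computed independently are summed into one accumulator.
  static long mergeAccumulators(List<Long> accumulators) {
    long total = 0L;
    for (long partial : accumulators) {
      total += partial;
    }
    return total;
  }

  // The final output is the accumulated count itself.
  static long extractOutput(long accumulator) {
    return accumulator;
  }

  // Hypothetical helper: folds one bundle of inputs into a partial count.
  static <T> long countBundle(List<T> bundle) {
    long accumulator = createAccumulator();
    for (T input : bundle) {
      accumulator = addInput(accumulator, input);
    }
    return accumulator;
  }
}
```

Because partial counts merge by addition, the inputs can be split into bundles in any way and counted in parallel without changing the result.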
globally
Returns a PTransform that counts the number of elements in its input PCollection.

Note: if the input collection uses a windowing strategy other than GlobalWindows, use Combine.globally(Count.<T>combineFn()).withoutDefaults() instead.
perKey
Returns a PTransform that counts the number of elements associated with each key of its input PCollection.
perElement
Returns a PTransform that counts the number of occurrences of each element in its input PCollection.

The returned PTransform takes a PCollection<T> and returns a PCollection<KV<T, Long>> representing a map from each distinct element of the input PCollection to the number of times that element occurs in the input. Each key in the output PCollection is unique.

The returned transform compares two values of type T by first encoding each element using the input PCollection's Coder, then comparing the encoded bytes. Because of this, the input coder must be deterministic. (See Coder.verifyDeterministic() for more detail.) Performing the comparison in this manner admits efficient parallel evaluation.

By default, the Coder of the keys of the output PCollection is the same as the Coder of the elements of the input PCollection.

Example of use:
PCollection<String> words = ...;
PCollection<KV<String, Long>> wordCounts = words.apply(Count.<String>perElement());
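The encoded-bytes comparison described for perElement() can be illustrated with a minimal plain-Java sketch. It is not Beam code: the class EncodedCount and its methods are hypothetical, a Function producing a byte array stands in for the Coder, and UTF-8 is assumed as the deterministic encoding for strings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical plain-Java model of counting by encoded bytes; this is not Beam code.
final class EncodedCount {
  // Stand-in for a deterministic coder: equal strings always yield equal bytes.
  static byte[] utf8(String value) {
    return value.getBytes(StandardCharsets.UTF_8);
  }

  // Counts elements by grouping on their encoded form instead of on equals().
  static <T> Map<ByteBuffer, Long> countByEncoding(List<T> input, Function<T, byte[]> encode) {
    Map<ByteBuffer, Long> counts = new HashMap<>();
    for (T element : input) {
      // ByteBuffer.wrap provides content-based equals() and hashCode() over the bytes.
      ByteBuffer key = ByteBuffer.wrap(encode.apply(element));
      counts.merge(key, 1L, Long::sum);
    }
    return counts;
  }
}
```

In this model, two elements fall into the same group only if their encodings are byte-for-byte identical. An encoding that could produce different bytes for equal values would split one logical element across several keys, which is why a deterministic coder is required.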