K - type of the keys mapping the elements
V - type of the values being combined per key

public abstract static class ApproximateDistinct.PerKeyDistinct<K,V> extends PTransform<PCollection<KV<K,V>>,PCollection<KV<K,java.lang.Long>>>
ApproximateDistinct.perKey()
.name
Constructor and Description |
---|
PerKeyDistinct() |
Modifier and Type | Method and Description |
---|---|
PCollection<KV<K,java.lang.Long>> | expand(PCollection<KV<K,V>> input): Override this method to specify how this PTransform should be expanded on the given InputT. |
ApproximateDistinct.PerKeyDistinct<K,V> | withPrecision(int p): Sets the precision p. |
ApproximateDistinct.PerKeyDistinct<K,V> | withSparsePrecision(int sp): Sets the sparse representation's precision sp. |
getAdditionalInputs, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, populateDisplayData, toString, validate
public ApproximateDistinct.PerKeyDistinct<K,V> withPrecision(int p)
Sets the precision p.
Keep in mind that p cannot be lower than 4, because the estimation would be too inaccurate.
See ApproximateDistinct.precisionForRelativeError(double) and ApproximateDistinct.relativeErrorForPrecision(int) for more information about the relationship between precision and relative error.
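The relationship between precision and relative error can be illustrated with a minimal sketch. It assumes the textbook HyperLogLog estimate, in which the standard error is roughly 1.04 / sqrt(2^p); the two ApproximateDistinct methods named above may use different constants or rounding. The class and method names below are hypothetical and are not part of the Beam API.

```java
// Hypothetical helper, not part of Apache Beam. It uses the textbook
// HyperLogLog approximation: standard error ~= 1.04 / sqrt(m), with m = 2^p registers.
final class PrecisionErrorSketch {
    // Assumption: the classic HyperLogLog constant; Beam's own methods may differ.
    private static final double HLL_CONSTANT = 1.04;

    // Approximate relative error obtained with precision p.
    static double relativeErrorFor(int p) {
        if (p < 4) {
            throw new IllegalArgumentException("p must be at least 4, got " + p);
        }
        return HLL_CONSTANT / Math.sqrt(Math.pow(2.0, p));
    }

    // Smallest precision (never below 4) whose approximate error
    // does not exceed the requested relative error.
    static int precisionFor(double relativeError) {
        if (!(relativeError > 0.0)) {
            throw new IllegalArgumentException("relativeError must be positive");
        }
        double registers = Math.pow(HLL_CONSTANT / relativeError, 2.0);
        int p = (int) Math.ceil(Math.log(registers) / Math.log(2.0));
        return Math.max(p, 4);
    }

    public static void main(String[] args) {
        for (int p : new int[] {4, 12, 16}) {
            System.out.printf("p=%d -> ~%.4f relative error%n", p, relativeErrorFor(p));
        }
        System.out.println("1% error needs p >= " + precisionFor(0.01));
    }
}
```

Under this approximation, each increase of p by 2 halves the expected error while quadrupling the number of registers, which is why very small values of p give estimates too coarse to be useful.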
p - the precision value for the normal representation

public ApproximateDistinct.PerKeyDistinct<K,V> withSparsePrecision(int sp)
Sets the sparse representation's precision sp.
Values above 32 are not yet supported by the AddThis version of HyperLogLog+.
For more information about the sparse representation, read Google's paper.
sp - the precision of HyperLogLog+'s sparse representation

public PCollection<KV<K,java.lang.Long>> expand(PCollection<KV<K,V>> input)
Description copied from class: PTransform
Override this method to specify how this PTransform should be expanded on the given InputT.
NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
Specified by: expand in class PTransform<PCollection<KV<K,V>>,PCollection<KV<K,java.lang.Long>>>