Type Parameters:
InputT - the type of the elements in the input PCollection

public static class ApproximateDistinct.ApproximateDistinctFn<InputT> extends Combine.CombineFn<InputT,com.clearspring.analytics.stream.cardinality.HyperLogLogPlus,com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>
Implements the Combine.CombineFn of ApproximateDistinct transforms.

| Modifier and Type | Method and Description |
|---|---|
| com.clearspring.analytics.stream.cardinality.HyperLogLogPlus | addInput(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus acc, InputT record): Adds the given input value to the given accumulator, returning the new accumulator value. |
| static <InputT> ApproximateDistinct.ApproximateDistinctFn<InputT> | create(Coder<InputT> coder): Returns an ApproximateDistinct.ApproximateDistinctFn combiner with the given input coder. |
| com.clearspring.analytics.stream.cardinality.HyperLogLogPlus | createAccumulator(): Returns a new, mutable accumulator value, representing the accumulation of zero input values. |
| com.clearspring.analytics.stream.cardinality.HyperLogLogPlus | extractOutput(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus accumulator): Output the whole structure so it can be queried, reused or stored easily. |
| java.lang.reflect.TypeVariable<?> | getAccumTVariable(): Returns the TypeVariable of AccumT. |
| Coder<AccumT> | getAccumulatorCoder(CoderRegistry registry, Coder<InputT> inputCoder): Returns the Coder to use for accumulator AccumT values, or null if it is not able to be inferred. |
| Coder<OutputT> | getDefaultOutputCoder(CoderRegistry registry, Coder<InputT> inputCoder): Returns the Coder to use by default for output OutputT values, or null if it is not able to be inferred. |
| java.lang.String | getIncompatibleGlobalWindowErrorMessage(): Returns the error message for not supported default values in Combine.globally(). |
| java.lang.reflect.TypeVariable<?> | getInputTVariable(): Returns the TypeVariable of InputT. |
| java.lang.reflect.TypeVariable<?> | getOutputTVariable(): Returns the TypeVariable of OutputT. |
| com.clearspring.analytics.stream.cardinality.HyperLogLogPlus | mergeAccumulators(java.lang.Iterable<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> accumulators): Returns an accumulator representing the accumulation of all the input values accumulated in the merging accumulators. |
| void | populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component. |
| ApproximateDistinct.ApproximateDistinctFn<InputT> | withPrecision(int p): Returns an ApproximateDistinct.ApproximateDistinctFn combiner with a new precision p. |
| ApproximateDistinct.ApproximateDistinctFn<InputT> | withSparseRepresentation(int sp): Returns an ApproximateDistinct.ApproximateDistinctFn combiner with a new sparse representation's precision sp. |
Methods inherited from class Combine.CombineFn: apply, compact, defaultValue, getInputType, getOutputType

public static <InputT> ApproximateDistinct.ApproximateDistinctFn<InputT> create(Coder<InputT> coder)
Returns an ApproximateDistinct.ApproximateDistinctFn combiner with the given input coder.
Parameters:
coder - the coder that encodes the elements' type

public ApproximateDistinct.ApproximateDistinctFn<InputT> withPrecision(int p)
Returns an ApproximateDistinct.ApproximateDistinctFn combiner with a new precision p.
Keep in mind that p cannot be lower than 4, because the estimation would be too inaccurate.
See ApproximateDistinct.precisionForRelativeError(double) and ApproximateDistinct.relativeErrorForPrecision(int) for more information about the relationship between precision and relative error.
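The relationship between precision and relative error mentioned above can be illustrated with the textbook HyperLogLog approximation, in which the relative standard error is roughly 1.04 / sqrt(2^p). The following is a minimal sketch under that assumption. The class PrecisionSketch and its methods are hypothetical names, and the sketch does not claim to reproduce the values returned by ApproximateDistinct.relativeErrorForPrecision(int) or ApproximateDistinct.precisionForRelativeError(double), which may use different constants.

```java
/**
 * Illustration only: textbook HyperLogLog error approximation.
 * This is NOT Beam's implementation; names here are hypothetical.
 */
final class PrecisionSketch {
    private PrecisionSketch() {}

    /** Approximate relative standard error for 2^p registers: 1.04 / sqrt(2^p). */
    static double approxRelativeError(int p) {
        if (p < 4) {
            // Mirrors the documented lower bound on p.
            throw new IllegalArgumentException("p must be at least 4, got " + p);
        }
        return 1.04 / Math.sqrt(Math.pow(2.0, p));
    }

    /** Smallest p >= 4 whose approximate error does not exceed the target. */
    static int approxPrecisionFor(double targetError) {
        if (!(targetError > 0.0)) {
            throw new IllegalArgumentException("targetError must be positive");
        }
        int p = 4;
        while (approxRelativeError(p) > targetError) {
            p++;
        }
        return p;
    }

    public static void main(String[] args) {
        // Print the approximate error for a few precisions.
        for (int p = 4; p <= 18; p += 2) {
            System.out.printf("p=%d -> approx. relative error %.5f%n", p, approxRelativeError(p));
        }
    }
}
```

Under this approximation, raising p by 2 quadruples the number of registers and roughly halves the error.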
Parameters:
p - the precision value for the normal representation

public ApproximateDistinct.ApproximateDistinctFn<InputT> withSparseRepresentation(int sp)
Returns an ApproximateDistinct.ApproximateDistinctFn combiner with a new precision sp for the sparse representation.
Values above 32 are not yet supported by the AddThis version of HyperLogLog+.
For more information about the sparse representation, see Google's paper on the algorithm.
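The two bounds stated in this section (p no lower than 4, sp no higher than 32) can be checked before a combiner is configured. The following is a minimal sketch that encodes only those two documented bounds. PrecisionCheck and isSupported are hypothetical names that belong to neither Beam nor the HyperLogLog+ library, and the library may enforce further constraints that are not listed here.

```java
/**
 * Illustration only: checks the two bounds stated in this documentation.
 * Hypothetical helper; it does not call Beam or the HyperLogLog+ library.
 */
final class PrecisionCheck {
    /** Documented lower bound for the normal representation's precision p. */
    static final int MIN_PRECISION = 4;
    /** Documented upper bound for the sparse representation's precision sp. */
    static final int MAX_SPARSE_PRECISION = 32;

    private PrecisionCheck() {}

    /** Returns true if p and sp satisfy the two documented bounds. */
    static boolean isSupported(int p, int sp) {
        return p >= MIN_PRECISION && sp <= MAX_SPARSE_PRECISION;
    }

    public static void main(String[] args) {
        System.out.println("p=12, sp=25 supported: " + isSupported(12, 25));
        System.out.println("p=3,  sp=25 supported: " + isSupported(3, 25));
        System.out.println("p=12, sp=33 supported: " + isSupported(12, 33));
    }
}
```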
Parameters:
sp - the precision of HyperLogLog+'s sparse representation

public com.clearspring.analytics.stream.cardinality.HyperLogLogPlus createAccumulator()
Description copied from class: Combine.CombineFn
Returns a new, mutable accumulator value, representing the accumulation of zero input values.
Specified by: createAccumulator in class Combine.CombineFn<InputT,com.clearspring.analytics.stream.cardinality.HyperLogLogPlus,com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>

public com.clearspring.analytics.stream.cardinality.HyperLogLogPlus addInput(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus acc, InputT record)
Description copied from class: Combine.CombineFn
Adds the given input value to the given accumulator, returning the new accumulator value.
Specified by: addInput in class Combine.CombineFn<InputT,com.clearspring.analytics.stream.cardinality.HyperLogLogPlus,com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>
Parameters:
acc - may be modified and returned for efficiency
record - should not be mutated

public com.clearspring.analytics.stream.cardinality.HyperLogLogPlus extractOutput(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus accumulator)
Outputs the whole structure so it can be queried, reused or stored easily.
Specified by: extractOutput in class Combine.CombineFn<InputT,com.clearspring.analytics.stream.cardinality.HyperLogLogPlus,com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>
Parameters:
accumulator - can be modified for efficiency

public com.clearspring.analytics.stream.cardinality.HyperLogLogPlus mergeAccumulators(java.lang.Iterable<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> accumulators)
Description copied from class: Combine.CombineFn
Returns an accumulator representing the accumulation of all the input values accumulated in the merging accumulators.
Specified by: mergeAccumulators in class Combine.CombineFn<InputT,com.clearspring.analytics.stream.cardinality.HyperLogLogPlus,com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>
Parameters:
accumulators - only the first accumulator may be modified and returned for efficiency; the other accumulators should not be mutated, because they may be shared with other code and mutating them could lead to incorrect results or data corruption.

public void populateDisplayData(DisplayData.Builder builder)
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by: populateDisplayData in interface HasDisplayData
Parameters:
builder - The builder to populate with display data.
See Also: HasDisplayData

public Coder<AccumT> getAccumulatorCoder(CoderRegistry registry, Coder<InputT> inputCoder) throws CannotProvideCoderException
Description copied from interface: CombineFnBase.GlobalCombineFn
Returns the Coder to use for accumulator AccumT values, or null if it is not able to be inferred.
By default, uses the knowledge of the Coder being used for InputT values and the enclosing Pipeline's CoderRegistry to try to infer the Coder for AccumT values.
This is the Coder used to send data through a communication-intensive shuffle step, so a compact and efficient representation may have significant performance benefits.
Specified by: getAccumulatorCoder in interface CombineFnBase.GlobalCombineFn<InputT,AccumT,OutputT>
Throws: CannotProvideCoderException

public Coder<OutputT> getDefaultOutputCoder(CoderRegistry registry, Coder<InputT> inputCoder) throws CannotProvideCoderException
Description copied from interface: CombineFnBase.GlobalCombineFn
Returns the Coder to use by default for output OutputT values, or null if it is not able to be inferred.
By default, uses the knowledge of the Coder being used for input InputT values and the enclosing Pipeline's CoderRegistry to try to infer the Coder for OutputT values.
Specified by: getDefaultOutputCoder in interface CombineFnBase.GlobalCombineFn<InputT,AccumT,OutputT>
Throws: CannotProvideCoderException

public java.lang.String getIncompatibleGlobalWindowErrorMessage()
Description copied from interface: CombineFnBase.GlobalCombineFn
Returns the error message for unsupported default values in Combine.globally().
Specified by: getIncompatibleGlobalWindowErrorMessage in interface CombineFnBase.GlobalCombineFn<InputT,AccumT,OutputT>

public java.lang.reflect.TypeVariable<?> getInputTVariable()
Returns the TypeVariable of InputT.

public java.lang.reflect.TypeVariable<?> getAccumTVariable()
Returns the TypeVariable of AccumT.

public java.lang.reflect.TypeVariable<?> getOutputTVariable()
Returns the TypeVariable of OutputT.
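The accumulator lifecycle described above (createAccumulator, addInput, mergeAccumulators, extractOutput) can be sketched without any Beam dependency. The following is a minimal illustration, assuming a toy dense HyperLogLog as a stand-in for HyperLogLogPlus. ToySketch and ToyDistinctFn are hypothetical classes, the hash is a SplitMix64-style mixer chosen for brevity, and none of this reflects how the real accumulator is implemented internally.

```java
import java.util.Arrays;

/** Toy dense HyperLogLog; a stand-in for HyperLogLogPlus, not the real class. */
final class ToySketch {
    private final int p;           // precision: the sketch holds 2^p registers
    private final byte[] registers;

    ToySketch(int p) {
        if (p < 4 || p > 16) throw new IllegalArgumentException("p must be in [4, 16]");
        this.p = p;
        this.registers = new byte[1 << p];
    }

    /** SplitMix64-style mixer used as a simple 64-bit hash for long values. */
    private static long mix(long x) {
        long z = x + 0x9e3779b97f4a7c15L;
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    /** Adds one value: top p hash bits pick a register, the rest give the rank. */
    void offer(long value) {
        long h = mix(value);
        int idx = (int) (h >>> (64 - p));
        int rank = Math.min(Long.numberOfLeadingZeros(h << p), 64 - p) + 1;
        if (rank > registers[idx]) registers[idx] = (byte) rank;
    }

    /** Merging two sketches is an element-wise maximum of their registers. */
    void mergeFrom(ToySketch other) {
        if (other.p != p) throw new IllegalArgumentException("precision mismatch");
        for (int i = 0; i < registers.length; i++) {
            if (other.registers[i] > registers[i]) registers[i] = other.registers[i];
        }
    }

    boolean sameRegisters(ToySketch other) {
        return Arrays.equals(registers, other.registers);
    }

    /** Raw HyperLogLog estimate with the small-range linear-counting correction. */
    long cardinality() {
        int m = registers.length;
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.scalb(1.0, -r);   // 2^(-r)
            if (r == 0) zeros++;
        }
        double raw = alpha(m) * m * m / sum;
        if (raw <= 2.5 * m && zeros > 0) {
            return Math.round(m * Math.log((double) m / zeros));
        }
        return Math.round(raw);
    }

    private static double alpha(int m) {
        switch (m) {
            case 16: return 0.673;
            case 32: return 0.697;
            case 64: return 0.709;
            default: return 0.7213 / (1.0 + 1.079 / m);
        }
    }
}

/** Mirrors the four CombineFn lifecycle methods for long inputs (hypothetical). */
final class ToyDistinctFn {
    private final int p;

    ToyDistinctFn(int p) { this.p = p; }

    ToySketch createAccumulator() { return new ToySketch(p); }

    ToySketch addInput(ToySketch acc, long record) {
        acc.offer(record);   // the accumulator is mutated and returned
        return acc;
    }

    ToySketch mergeAccumulators(Iterable<ToySketch> accumulators) {
        ToySketch merged = createAccumulator();   // fresh sketch, so no input is mutated
        for (ToySketch acc : accumulators) merged.mergeFrom(acc);
        return merged;
    }

    ToySketch extractOutput(ToySketch acc) { return acc; }   // output the whole structure

    public static void main(String[] args) {
        ToyDistinctFn fn = new ToyDistinctFn(12);
        ToySketch a = fn.createAccumulator();
        ToySketch b = fn.createAccumulator();
        for (long i = 0; i < 60_000; i++) fn.addInput(a, i);
        for (long i = 40_000; i < 100_000; i++) fn.addInput(b, i);
        ToySketch out = fn.extractOutput(fn.mergeAccumulators(Arrays.asList(a, b)));
        System.out.println("estimate for 100000 distinct values: " + out.cardinality());
    }
}
```

Because merging is an element-wise maximum, the merged sketch is identical to one built from all inputs directly, which is what lets partial accumulators be combined in any order. This sketch merges into a fresh accumulator rather than reusing the first one, a simpler but less efficient choice than the one the mergeAccumulators contract permits.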