Class ApproximateDistinct.ApproximateDistinctFn<InputT>
- Type Parameters:
InputT - the type of the elements in the input PCollection
- All Implemented Interfaces:
Serializable, CombineFnBase.GlobalCombineFn<InputT, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>, HasDisplayData
- Enclosing class:
ApproximateDistinct
Implements the Combine.CombineFn of ApproximateDistinct transforms.
-
Method Summary
Modifier and Type | Method | Description
com.clearspring.analytics.stream.cardinality.HyperLogLogPlus | addInput(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus acc, InputT record) | Adds the given input value to the given accumulator, returning the new accumulator value.
static <InputT> ApproximateDistinct.ApproximateDistinctFn<InputT> | create(Coder<InputT> coder) | Returns an ApproximateDistinct.ApproximateDistinctFn combiner with the given input coder.
com.clearspring.analytics.stream.cardinality.HyperLogLogPlus | createAccumulator() | Returns a new, mutable accumulator value, representing the accumulation of zero input values.
com.clearspring.analytics.stream.cardinality.HyperLogLogPlus | extractOutput(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus accumulator) | Output the whole structure so it can be queried, reused or stored easily.
TypeVariable<?> | getAccumTVariable() | Returns the TypeVariable of AccumT.
Coder<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> | getAccumulatorCoder(CoderRegistry registry, Coder<InputT> inputCoder) | Returns the Coder to use for accumulator AccumT values, or null if it is not able to be inferred.
Coder<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> | getDefaultOutputCoder(CoderRegistry registry, Coder<InputT> inputCoder) | Returns the Coder to use by default for output OutputT values, or null if it is not able to be inferred.
String | getIncompatibleGlobalWindowErrorMessage() | Returns the error message for not supported default values in Combine.globally().
TypeVariable<?> | getInputTVariable() | Returns the TypeVariable of InputT.
TypeVariable<?> | getOutputTVariable() | Returns the TypeVariable of OutputT.
com.clearspring.analytics.stream.cardinality.HyperLogLogPlus | mergeAccumulators(Iterable<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> accumulators) | Returns an accumulator representing the accumulation of all the input values accumulated in the merging accumulators.
void | populateDisplayData(DisplayData.Builder builder) | Register display data for the given transform or component.
ApproximateDistinct.ApproximateDistinctFn<InputT> | withPrecision(int p) | Returns an ApproximateDistinct.ApproximateDistinctFn combiner with a new precision p.
ApproximateDistinct.ApproximateDistinctFn<InputT> | withSparseRepresentation(int sp) | Returns an ApproximateDistinct.ApproximateDistinctFn combiner with a new sparse representation's precision sp.
Methods inherited from class org.apache.beam.sdk.transforms.Combine.CombineFn
apply, compact, defaultValue, getInputType, getOutputType
-
Method Details
-
create
public static <InputT> ApproximateDistinct.ApproximateDistinctFn<InputT> create(Coder<InputT> coder)
Returns an ApproximateDistinct.ApproximateDistinctFn combiner with the given input coder.
- Parameters:
coder - the coder that encodes the elements' type
-
withPrecision
public ApproximateDistinct.ApproximateDistinctFn<InputT> withPrecision(int p)
Returns an ApproximateDistinct.ApproximateDistinctFn combiner with a new precision p.
Keep in mind that p cannot be lower than 4, because the estimation would be too inaccurate.
See ApproximateDistinct.precisionForRelativeError(double) and ApproximateDistinct.relativeErrorForPrecision(int) for more information about the relationship between precision and relative error.
- Parameters:
p - the precision value for the normal representation
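The trade-off between precision and relative error can be sketched with the textbook HyperLogLog standard error, roughly 1.04 / sqrt(2^p). This is an illustration under that assumption only: the class and method names below are hypothetical, and ApproximateDistinct.relativeErrorForPrecision(int) may use a different constant.

```java
// Sketch only: uses the textbook HyperLogLog standard error 1.04 / sqrt(2^p).
// PrecisionSketch and its methods are hypothetical; they are not part of Beam.
final class PrecisionSketch {
    private PrecisionSketch() {}

    /** Approximate relative error of a sketch with 2^p registers. */
    static double relativeError(int p) {
        if (p < 4) {
            throw new IllegalArgumentException("p must be at least 4, but was " + p);
        }
        return 1.04 / Math.sqrt(Math.pow(2.0, p));
    }

    /** Smallest precision p >= 4 whose approximate error does not exceed targetError. */
    static int precisionFor(double targetError) {
        if (!(targetError > 0.0)) {
            throw new IllegalArgumentException("targetError must be positive");
        }
        int p = 4;
        while (relativeError(p) > targetError) {
            p++;
        }
        return p;
    }

    public static void main(String[] args) {
        // Each extra bit of precision doubles the register count
        // and shrinks the error by a factor of sqrt(2).
        for (int p = 4; p <= 16; p += 4) {
            System.out.printf("p=%d -> ~%.2f%% relative error%n", p, 100 * relativeError(p));
        }
        System.out.println("precision for a 2% target: " + precisionFor(0.02));
    }
}
```

Under this approximation, raising p by 2 halves the expected error while quadrupling the number of registers the sketch keeps.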
-
withSparseRepresentation
public ApproximateDistinct.ApproximateDistinctFn<InputT> withSparseRepresentation(int sp)
Returns an ApproximateDistinct.ApproximateDistinctFn combiner with a new precision sp for the sparse representation.
Values above 32 are not yet supported by the AddThis version of HyperLogLog+.
For more information about the sparse representation, read Google's paper.
- Parameters:
sp - the precision of HyperLogLog+'s sparse representation
-
createAccumulator
public com.clearspring.analytics.stream.cardinality.HyperLogLogPlus createAccumulator()
Description copied from class: Combine.CombineFn
Returns a new, mutable accumulator value, representing the accumulation of zero input values.
- Specified by:
createAccumulator in class Combine.CombineFn<InputT, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>
-
addInput
public com.clearspring.analytics.stream.cardinality.HyperLogLogPlus addInput(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus acc, InputT record)
Description copied from class: Combine.CombineFn
Adds the given input value to the given accumulator, returning the new accumulator value.
- Specified by:
addInput in class Combine.CombineFn<InputT, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>
- Parameters:
acc - may be modified and returned for efficiency
record - should not be mutated
-
extractOutput
public com.clearspring.analytics.stream.cardinality.HyperLogLogPlus extractOutput(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus accumulator)
Output the whole structure so it can be queried, reused or stored easily.
- Specified by:
extractOutput in class Combine.CombineFn<InputT, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>
- Parameters:
accumulator - can be modified for efficiency
-
mergeAccumulators
public com.clearspring.analytics.stream.cardinality.HyperLogLogPlus mergeAccumulators(Iterable<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> accumulators)
Description copied from class: Combine.CombineFn
Returns an accumulator representing the accumulation of all the input values accumulated in the merging accumulators.
- Specified by:
mergeAccumulators in class Combine.CombineFn<InputT, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>
- Parameters:
accumulators - only the first accumulator may be modified and returned for efficiency; the other accumulators should not be mutated, because they may be shared with other code and mutating them could lead to incorrect results or data corruption.
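The accumulator contract described for createAccumulator, addInput, mergeAccumulators and extractOutput can be sketched without any Beam dependency. The class below is hypothetical: an exact HashSet stands in for the HyperLogLogPlus accumulator, and it does not extend Combine.CombineFn. It only illustrates which arguments may be mutated, not how ApproximateDistinctFn is actually implemented.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Sketch only: DistinctFnSketch is a hypothetical stand-in, not Beam code.
// A HashSet plays the role of the HyperLogLogPlus accumulator.
final class DistinctFnSketch<T> {

    /** New, mutable accumulator representing zero input values. */
    Set<T> createAccumulator() {
        return new HashSet<>();
    }

    /** The accumulator may be modified and returned; the record is not mutated. */
    Set<T> addInput(Set<T> acc, T record) {
        acc.add(record);
        return acc;
    }

    /** Only the first accumulator is mutated; the others are read, never changed. */
    Set<T> mergeAccumulators(Iterable<Set<T>> accumulators) {
        Iterator<Set<T>> it = accumulators.iterator();
        if (!it.hasNext()) {
            return createAccumulator();
        }
        Set<T> merged = it.next();
        while (it.hasNext()) {
            merged.addAll(it.next());
        }
        return merged;
    }

    /** Returns the whole structure so the caller can query or store it. */
    Set<T> extractOutput(Set<T> accumulator) {
        return accumulator;
    }

    public static void main(String[] args) {
        DistinctFnSketch<String> fn = new DistinctFnSketch<>();
        Set<String> a = fn.createAccumulator();
        for (String s : Arrays.asList("x", "y", "x")) {
            a = fn.addInput(a, s);
        }
        Set<String> b = fn.createAccumulator();
        for (String s : Arrays.asList("y", "z")) {
            b = fn.addInput(b, s);
        }
        Set<String> out = fn.extractOutput(fn.mergeAccumulators(Arrays.asList(a, b)));
        System.out.println("distinct elements: " + out.size());
    }
}
```

As with the real combiner, the output is the accumulator structure itself rather than a single number, so the caller decides when to read the cardinality.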
-
populateDisplayData
public void populateDisplayData(DisplayData.Builder builder)
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
- Specified by:
populateDisplayData in interface HasDisplayData
- Parameters:
builder - The builder to populate with display data.
-
getAccumulatorCoder
public Coder<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> getAccumulatorCoder(CoderRegistry registry, Coder<InputT> inputCoder) throws CannotProvideCoderException
Description copied from interface: CombineFnBase.GlobalCombineFn
Returns the Coder to use for accumulator AccumT values, or null if it is not able to be inferred.
By default, uses the knowledge of the Coder being used for InputT values and the enclosing Pipeline's CoderRegistry to try to infer the Coder for AccumT values.
This is the Coder used to send data through a communication-intensive shuffle step, so a compact and efficient representation may have significant performance benefits.
- Specified by:
getAccumulatorCoder in interface CombineFnBase.GlobalCombineFn<InputT, AccumT, OutputT>
- Throws:
CannotProvideCoderException
-
getDefaultOutputCoder
public Coder<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> getDefaultOutputCoder(CoderRegistry registry, Coder<InputT> inputCoder) throws CannotProvideCoderException
Description copied from interface: CombineFnBase.GlobalCombineFn
Returns the Coder to use by default for output OutputT values, or null if it is not able to be inferred.
By default, uses the knowledge of the Coder being used for input InputT values and the enclosing Pipeline's CoderRegistry to try to infer the Coder for OutputT values.
- Specified by:
getDefaultOutputCoder in interface CombineFnBase.GlobalCombineFn<InputT, AccumT, OutputT>
- Throws:
CannotProvideCoderException
-
getIncompatibleGlobalWindowErrorMessage
public String getIncompatibleGlobalWindowErrorMessage()
Description copied from interface: CombineFnBase.GlobalCombineFn
Returns the error message for unsupported default values in Combine.globally().
- Specified by:
getIncompatibleGlobalWindowErrorMessage in interface CombineFnBase.GlobalCombineFn<InputT, AccumT, OutputT>
-
getInputTVariable
public TypeVariable<?> getInputTVariable()
Returns the TypeVariable of InputT.
-
getAccumTVariable
public TypeVariable<?> getAccumTVariable()
Returns the TypeVariable of AccumT.
-
getOutputTVariable
public TypeVariable<?> getOutputTVariable()
Returns the TypeVariable of OutputT.
-