Class ApproximateDistinct.ApproximateDistinctFn<InputT>
- Type Parameters:
InputT - the type of the elements in the input PCollection
- All Implemented Interfaces:
Serializable, CombineFnBase.GlobalCombineFn<InputT, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>, HasDisplayData
- Enclosing class:
ApproximateDistinct
Implements the Combine.CombineFn of ApproximateDistinct transforms.
-
Method Summary
Modifier and Type, Method: Description
- com.clearspring.analytics.stream.cardinality.HyperLogLogPlus addInput(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus acc, InputT record): Adds the given input value to the given accumulator, returning the new accumulator value.
- static <InputT> ApproximateDistinct.ApproximateDistinctFn<InputT> create(Coder<InputT> coder): Returns an ApproximateDistinct.ApproximateDistinctFn combiner with the given input coder.
- com.clearspring.analytics.stream.cardinality.HyperLogLogPlus createAccumulator(): Returns a new, mutable accumulator value, representing the accumulation of zero input values.
- com.clearspring.analytics.stream.cardinality.HyperLogLogPlus extractOutput(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus accumulator): Output the whole structure so it can be queried, reused or stored easily.
- TypeVariable<?> getAccumTVariable(): Returns the TypeVariable of AccumT.
- Coder<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> getAccumulatorCoder(CoderRegistry registry, Coder<InputT> inputCoder): Returns the Coder to use for accumulator AccumT values, or null if it is not able to be inferred.
- Coder<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> getDefaultOutputCoder(CoderRegistry registry, Coder<InputT> inputCoder): Returns the Coder to use by default for output OutputT values, or null if it is not able to be inferred.
- String getIncompatibleGlobalWindowErrorMessage(): Returns the error message for not supported default values in Combine.globally().
- TypeVariable<?> getInputTVariable(): Returns the TypeVariable of InputT.
- TypeVariable<?> getOutputTVariable(): Returns the TypeVariable of OutputT.
- com.clearspring.analytics.stream.cardinality.HyperLogLogPlus mergeAccumulators(Iterable<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> accumulators): Returns an accumulator representing the accumulation of all the input values accumulated in the merging accumulators.
- void populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component.
- ApproximateDistinct.ApproximateDistinctFn<InputT> withPrecision(int p): Returns an ApproximateDistinct.ApproximateDistinctFn combiner with a new precision p.
- ApproximateDistinct.ApproximateDistinctFn<InputT> withSparseRepresentation(int sp): Returns an ApproximateDistinct.ApproximateDistinctFn combiner with a new sparse representation's precision sp.
Methods inherited from class org.apache.beam.sdk.transforms.Combine.CombineFn
apply, compact, defaultValue, getInputType, getOutputType
-
Method Details
-
create
public static <InputT> ApproximateDistinct.ApproximateDistinctFn<InputT> create(Coder<InputT> coder)
Returns an ApproximateDistinct.ApproximateDistinctFn combiner with the given input coder.
- Parameters:
coder - the coder that encodes the elements' type
-
withPrecision
Returns an ApproximateDistinct.ApproximateDistinctFn combiner with a new precision p.
Keep in mind that p cannot be lower than 4, because the estimation would be too inaccurate.
See ApproximateDistinct.precisionForRelativeError(double) and ApproximateDistinct.relativeErrorForPrecision(int) for more information about the relationship between precision and relative error.
- Parameters:
p - the precision value for the normal representation
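To make the relationship between precision and relative error concrete, here is a minimal sketch based on the standard HyperLogLog estimate, in which the standard error is roughly 1.04 / sqrt(m) for m = 2^p registers. The class HllPrecisionSketch and its methods are hypothetical helpers, not part of Beam, and the constant used by ApproximateDistinct.relativeErrorForPrecision(int) may differ.

```java
/** Hypothetical helper for illustration only; not part of Apache Beam. */
final class HllPrecisionSketch {

  /** Standard HyperLogLog estimate: error is about 1.04 / sqrt(m), with m = 2^p registers. */
  static double approxRelativeError(int p) {
    if (p < 4) {
      throw new IllegalArgumentException("p must be at least 4");
    }
    return 1.04 / Math.sqrt(Math.pow(2, p));
  }

  /** Smallest precision p >= 4 whose estimated error is at or below the target. */
  static int smallestPrecisionFor(double targetError) {
    if (!(targetError > 0)) {
      throw new IllegalArgumentException("targetError must be positive");
    }
    // 30 is an arbitrary upper bound chosen for this sketch.
    for (int p = 4; p <= 30; p++) {
      if (approxRelativeError(p) <= targetError) {
        return p;
      }
    }
    throw new IllegalArgumentException("targetError too small for this sketch");
  }

  public static void main(String[] args) {
    for (int p : new int[] {4, 12, 18}) {
      System.out.printf("p=%d -> estimated relative error %.5f%n", p, approxRelativeError(p));
    }
    System.out.println("p for 1% error: " + smallestPrecisionFor(0.01));
  }
}
```

Each extra unit of precision doubles the number of registers and shrinks the estimated error by a factor of about sqrt(2), which is why very low values of p give estimates too rough to be useful.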
-
withSparseRepresentation
Returns an ApproximateDistinct.ApproximateDistinctFn combiner with a new sparse representation's precision sp.
Values above 32 are not yet supported by the AddThis version of HyperLogLog+.
For more information about the sparse representation, read Google's paper on HyperLogLog+.
- Parameters:
sp - the precision of HyperLogLog+'s sparse representation
-
createAccumulator
public com.clearspring.analytics.stream.cardinality.HyperLogLogPlus createAccumulator()
Description copied from class: Combine.CombineFn
Returns a new, mutable accumulator value, representing the accumulation of zero input values.
- Specified by:
createAccumulator in class Combine.CombineFn<InputT, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>
-
addInput
public com.clearspring.analytics.stream.cardinality.HyperLogLogPlus addInput(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus acc, InputT record)
Description copied from class: Combine.CombineFn
Adds the given input value to the given accumulator, returning the new accumulator value.
- Specified by:
addInput in class Combine.CombineFn<InputT, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>
- Parameters:
acc - may be modified and returned for efficiency
record - should not be mutated
-
extractOutput
public com.clearspring.analytics.stream.cardinality.HyperLogLogPlus extractOutput(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus accumulator)
Output the whole structure so it can be queried, reused or stored easily.
- Specified by:
extractOutput in class Combine.CombineFn<InputT, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>
- Parameters:
accumulator - can be modified for efficiency
-
mergeAccumulators
public com.clearspring.analytics.stream.cardinality.HyperLogLogPlus mergeAccumulators(Iterable<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> accumulators)
Description copied from class: Combine.CombineFn
Returns an accumulator representing the accumulation of all the input values accumulated in the merging accumulators.
- Specified by:
mergeAccumulators in class Combine.CombineFn<InputT, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus, com.clearspring.analytics.stream.cardinality.HyperLogLogPlus>
- Parameters:
accumulators - only the first accumulator may be modified and returned for efficiency; the other accumulators should not be mutated, because they may be shared with other code and mutating them could lead to incorrect results or data corruption.
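The accumulator contract described for createAccumulator, addInput, mergeAccumulators and extractOutput can be illustrated without any Beam dependency. The sketch below is a hypothetical stand-in: ExactDistinctSketch is not a Beam class, and it uses an exact HashSet as the accumulator where ApproximateDistinctFn uses a HyperLogLogPlus structure. It only shows the mutation rules: the first accumulator may be modified and returned, the others are left untouched.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

/** Hypothetical stand-in for a CombineFn; a HashSet plays the accumulator role. */
final class ExactDistinctSketch {

  /** A new, mutable accumulator representing zero input values. */
  static Set<String> createAccumulator() {
    return new HashSet<>();
  }

  /** Mutates and returns the accumulator; the record itself is not modified. */
  static Set<String> addInput(Set<String> acc, String record) {
    acc.add(record);
    return acc;
  }

  /** Merges into the first accumulator only; the others are read but never mutated. */
  static Set<String> mergeAccumulators(Iterable<Set<String>> accumulators) {
    Iterator<Set<String>> it = accumulators.iterator();
    if (!it.hasNext()) {
      return createAccumulator();
    }
    Set<String> merged = it.next();
    while (it.hasNext()) {
      merged.addAll(it.next());
    }
    return merged;
  }

  /** Returns the whole structure, mirroring how ApproximateDistinctFn outputs its accumulator. */
  static Set<String> extractOutput(Set<String> acc) {
    return acc;
  }

  public static void main(String[] args) {
    Set<String> a = addInput(addInput(createAccumulator(), "x"), "y");
    Set<String> b = addInput(addInput(createAccumulator(), "y"), "z");
    Set<String> out = extractOutput(mergeAccumulators(Arrays.asList(a, b)));
    System.out.println("distinct elements: " + out.size());
  }
}
```

Returning the whole structure from extractOutput, as the real combiner does, lets downstream code query the cardinality later or keep merging the result with other accumulators.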
-
populateDisplayData
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
- Specified by:
populateDisplayData in interface HasDisplayData
- Parameters:
builder - The builder to populate with display data.
-
getAccumulatorCoder
public Coder<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> getAccumulatorCoder(CoderRegistry registry, Coder<InputT> inputCoder) throws CannotProvideCoderException
Description copied from interface: CombineFnBase.GlobalCombineFn
Returns the Coder to use for accumulator AccumT values, or null if it is not able to be inferred.
By default, uses the knowledge of the Coder being used for InputT values and the enclosing Pipeline's CoderRegistry to try to infer the Coder for AccumT values.
This is the Coder used to send data through a communication-intensive shuffle step, so a compact and efficient representation may have significant performance benefits.
- Specified by:
getAccumulatorCoder in interface CombineFnBase.GlobalCombineFn<InputT, AccumT, OutputT>
- Throws:
CannotProvideCoderException
-
getDefaultOutputCoder
public Coder<com.clearspring.analytics.stream.cardinality.HyperLogLogPlus> getDefaultOutputCoder(CoderRegistry registry, Coder<InputT> inputCoder) throws CannotProvideCoderException
Description copied from interface: CombineFnBase.GlobalCombineFn
Returns the Coder to use by default for output OutputT values, or null if it is not able to be inferred.
By default, uses the knowledge of the Coder being used for input InputT values and the enclosing Pipeline's CoderRegistry to try to infer the Coder for OutputT values.
- Specified by:
getDefaultOutputCoder in interface CombineFnBase.GlobalCombineFn<InputT, AccumT, OutputT>
- Throws:
CannotProvideCoderException
-
getIncompatibleGlobalWindowErrorMessage
Description copied from interface: CombineFnBase.GlobalCombineFn
Returns the error message for not supported default values in Combine.globally().
- Specified by:
getIncompatibleGlobalWindowErrorMessage in interface CombineFnBase.GlobalCombineFn<InputT, AccumT, OutputT>
-
getInputTVariable
Returns the TypeVariable of InputT.
-
getAccumTVariable
Returns the TypeVariable of AccumT.
-
getOutputTVariable
Returns the TypeVariable of OutputT.
-