Class ApproximateDistinct.GloballyDistinct<InputT>

java.lang.Object
org.apache.beam.sdk.transforms.PTransform<PCollection<InputT>,PCollection<Long>>
org.apache.beam.sdk.extensions.sketching.ApproximateDistinct.GloballyDistinct<InputT>
Type Parameters:
InputT - the type of the elements in the input PCollection
All Implemented Interfaces:
Serializable, HasDisplayData
Enclosing class:
ApproximateDistinct

public abstract static class ApproximateDistinct.GloballyDistinct<InputT> extends PTransform<PCollection<InputT>,PCollection<Long>>
  • Constructor Details

    • GloballyDistinct

      public GloballyDistinct()
  • Method Details

    • withPrecision

      public ApproximateDistinct.GloballyDistinct<InputT> withPrecision(int p)
      Sets the precision p.

      Keep in mind that p cannot be lower than 4, because the estimation would be too inaccurate.

      See ApproximateDistinct.precisionForRelativeError(double) and ApproximateDistinct.relativeErrorForPrecision(int) for more information about the relationship between precision and relative error.

      Parameters:
      p - the precision value for the normal representation
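
      To give a feel for how precision trades memory for accuracy, here is a minimal standalone sketch. It assumes the textbook HyperLogLog figures: 2^p registers and a standard error of roughly 1.04 / sqrt(2^p). The class and method names are hypothetical, and the constants used by ApproximateDistinct.relativeErrorForPrecision(int) may differ, so treat the output as an approximation rather than what Beam reports.

```java
/**
 * Illustrative only: textbook HyperLogLog sizing, not Beam's implementation.
 * HllPrecisionSketch, registers and standardError are hypothetical names.
 */
public final class HllPrecisionSketch {
    private HllPrecisionSketch() {}

    /** Number of registers for precision p, i.e. m = 2^p. */
    public static long registers(int p) {
        // The documentation above states that p cannot be lower than 4.
        if (p < 4) {
            throw new IllegalArgumentException("p must be at least 4, got " + p);
        }
        return 1L << p;
    }

    /** Assumed textbook standard error: 1.04 / sqrt(2^p). */
    public static double standardError(int p) {
        return 1.04 / Math.sqrt((double) registers(p));
    }

    public static void main(String[] args) {
        // Print the trade-off for a few precision values.
        for (int p = 4; p <= 16; p += 4) {
            System.out.printf("p=%d registers=%d error=%.4f%n",
                    p, registers(p), standardError(p));
        }
    }
}
```

      Each increment of p doubles the number of registers and shrinks the assumed error by a factor of sqrt(2), which is why small changes in p have a noticeable effect on both memory and accuracy.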
    • withSparsePrecision

      public ApproximateDistinct.GloballyDistinct<InputT> withSparsePrecision(int sp)
      Sets the sparse representation's precision sp.

      Values above 32 are not yet supported by the AddThis version of HyperLogLog+.

      For more information about the sparse representation, see Google's paper on HyperLogLog++.

      Parameters:
      sp - the precision of HyperLogLog+'s sparse representation
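
      The two limits stated on this page (p not lower than 4, sp not above 32) can be checked before a pipeline is built. The following is a minimal sketch of such a pre-check; HllParamsCheck and its members are hypothetical names, and the underlying library may enforce additional constraints that are not described here.

```java
/**
 * Illustrative pre-check of the limits documented for withPrecision and
 * withSparsePrecision. Hypothetical helper, not part of Beam.
 */
public final class HllParamsCheck {
    /** Documented lower bound for the normal precision p. */
    public static final int MIN_PRECISION = 4;
    /** Documented upper bound for the sparse precision sp. */
    public static final int MAX_SPARSE_PRECISION = 32;

    private HllParamsCheck() {}

    /** Returns true when both documented limits are respected. */
    public static boolean isValid(int p, int sp) {
        return p >= MIN_PRECISION && sp <= MAX_SPARSE_PRECISION;
    }

    /** Throws with a descriptive message when a documented limit is violated. */
    public static void check(int p, int sp) {
        if (p < MIN_PRECISION) {
            throw new IllegalArgumentException(
                    "p must be at least " + MIN_PRECISION + ", got " + p);
        }
        if (sp > MAX_SPARSE_PRECISION) {
            throw new IllegalArgumentException(
                    "sp must be at most " + MAX_SPARSE_PRECISION + ", got " + sp);
        }
    }

    public static void main(String[] args) {
        check(12, 25); // passes silently: both values are within the documented limits
        System.out.println("valid(12, 25) = " + isValid(12, 25));
        System.out.println("valid(3, 25)  = " + isValid(3, 25));
        System.out.println("valid(12, 40) = " + isValid(12, 40));
    }
}
```

      Failing fast with a clear message at pipeline-construction time is usually easier to debug than an exception raised inside a worker.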
    • expand

      public PCollection<Long> expand(PCollection<InputT> input)
      Description copied from class: PTransform
      Override this method to specify how this PTransform should be expanded on the given InputT.

      NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.

      Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).

      Specified by:
      expand in class PTransform<PCollection<InputT>,PCollection<Long>>