Class HllCount.Extract

java.lang.Object
org.apache.beam.sdk.extensions.zetasketch.HllCount.Extract
Enclosing class:
HllCount

public static final class HllCount.Extract extends Object
Provides PTransforms to extract the estimated count of distinct elements (as Longs) from each HLL++ sketch.

When extracting from an "empty sketch", represented by a byte array of length 0, the result returned is 0.

Corresponds to the HLL_COUNT.EXTRACT(sketch) function in BigQuery.

  • Method Details

    • globally

      public static PTransform<PCollection<byte[]>,PCollection<Long>> globally()
      Returns a PTransform that takes an input PCollection<byte[]> of HLL++ sketches and returns a PCollection<Long> of the estimated count of distinct elements extracted from each sketch.

      Returns 0 if the input element is an "empty sketch" (byte array of length 0).
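
      A minimal usage sketch, not taken from the original documentation. It assumes the beam-sdks-java-extensions-zetasketch artifact and a runner (for example the direct runner) are on the classpath. The class name ExtractGloballyExample and the sample data are hypothetical. HllCount.Init.forStrings().globally() is used only to produce a sketch to extract from; PAssert checks the documented empty-sketch behavior.

      ```java
      import org.apache.beam.sdk.Pipeline;
      import org.apache.beam.sdk.coders.ByteArrayCoder;
      import org.apache.beam.sdk.extensions.zetasketch.HllCount;
      import org.apache.beam.sdk.options.PipelineOptionsFactory;
      import org.apache.beam.sdk.testing.PAssert;
      import org.apache.beam.sdk.transforms.Create;
      import org.apache.beam.sdk.values.PCollection;

      // Hypothetical example class; not part of the Beam SDK.
      public class ExtractGloballyExample {
        public static void main(String[] args) {
          Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

          // Hypothetical sample input: four strings, three of them distinct.
          PCollection<String> words = p.apply("Words", Create.of("a", "b", "a", "c"));

          // Build one HLL++ sketch over the whole collection, then extract its estimate.
          // The result is an estimate of the number of distinct strings.
          PCollection<Long> estimate =
              words
                  .apply("InitSketch", HllCount.Init.forStrings().globally())
                  .apply("ExtractEstimate", HllCount.Extract.globally());

          // An "empty sketch" is a byte array of length 0; its extracted count is 0.
          PCollection<Long> emptyCount =
              p.apply("EmptySketch", Create.of(new byte[0]).withCoder(ByteArrayCoder.of()))
                  .apply("ExtractEmpty", HllCount.Extract.globally());
          PAssert.that(emptyCount).containsInAnyOrder(0L);

          p.run().waitUntilFinish();
        }
      }
      ```

      Each input sketch yields exactly one output Long, so a PCollection holding several sketches produces several counts rather than one combined count.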

    • perKey

      public static <K> PTransform<PCollection<KV<K,byte[]>>,PCollection<KV<K,Long>>> perKey()
      Returns a PTransform that takes an input PCollection<KV<K, byte[]>> of (key, HLL++ sketch) pairs and returns a PCollection<KV<K, Long>> of (key, estimated count of distinct elements) pairs, one for each input sketch.
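
      A minimal usage sketch for the keyed variant, under the same assumptions as any Beam pipeline using this extension (the zetasketch extension and a runner on the classpath). The class name ExtractPerKeyExample and the page/user sample data are hypothetical. HllCount.Init.forStrings().perKey() is used only to produce one sketch per key.

      ```java
      import org.apache.beam.sdk.Pipeline;
      import org.apache.beam.sdk.extensions.zetasketch.HllCount;
      import org.apache.beam.sdk.options.PipelineOptionsFactory;
      import org.apache.beam.sdk.transforms.Create;
      import org.apache.beam.sdk.values.KV;
      import org.apache.beam.sdk.values.PCollection;

      // Hypothetical example class; not part of the Beam SDK.
      public class ExtractPerKeyExample {
        public static void main(String[] args) {
          Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

          // Hypothetical sample input: (page, user) pairs with a repeated user on "home".
          PCollection<KV<String, String>> visits =
              p.apply(
                  Create.of(
                      KV.of("home", "alice"),
                      KV.of("home", "bob"),
                      KV.of("home", "alice"),
                      KV.of("about", "alice")));

          // Build one HLL++ sketch per key.
          PCollection<KV<String, byte[]>> sketches =
              visits.apply(HllCount.Init.forStrings().perKey());

          // Extract the estimated number of distinct users for each page.
          PCollection<KV<String, Long>> estimates =
              sketches.apply(HllCount.Extract.perKey());

          p.run().waitUntilFinish();
        }
      }
      ```

      The key passes through unchanged; only the sketch value is replaced by its estimated distinct count.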