Class HllCount.MergePartial

java.lang.Object
org.apache.beam.sdk.extensions.zetasketch.HllCount.MergePartial
Enclosing class:
HllCount

public static final class HllCount.MergePartial extends Object
Provides PTransforms to merge HLL++ sketches into a new sketch.

Only sketches of the same type can be merged together. If incompatible sketches are provided, a runtime error will occur.

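If sketches of different precisions are merged, the merged sketch will get the minimum precision encountered among all the input sketches. For example, merging a precision-15 sketch with a precision-12 sketch yields a precision-12 sketch.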

An "empty sketch", represented by a byte array of length 0, is returned if the input PCollection is empty.
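Code that consumes the merged output may want to recognize the empty sketch before passing the bytes on. The following is a minimal sketch in plain Java, based only on the zero-length convention described above; the class and method names are hypothetical and are not part of the Beam API.

```java
public final class EmptySketchCheck {

  private EmptySketchCheck() {}

  // Returns true if the bytes are the "empty sketch" placeholder,
  // that is, a byte array of length 0.
  public static boolean isEmptySketch(byte[] sketch) {
    return sketch.length == 0;
  }

  public static void main(String[] args) {
    byte[] fromEmptyInput = new byte[0];
    System.out.println(isEmptySketch(fromEmptyInput));
  }
}
```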

Corresponds to the HLL_COUNT.MERGE_PARTIAL(sketch) function in BigQuery.

  • Method Details

    • globally

      public static Combine.Globally<byte @Nullable [],byte[]> globally()
      Returns a Combine.Globally PTransform that takes an input PCollection<byte[]> of HLL++ sketches and returns a PCollection<byte[]> of a new sketch merged from the input sketches.

      Only sketches of the same type can be merged together. If incompatible sketches are provided, a runtime error will occur.

      If sketches of different precisions are merged, the merged sketch will get the minimum precision encountered among all the input sketches.

      Returns a singleton PCollection with an "empty sketch" (byte array of length 0) if the input PCollection is empty.
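As an illustration, the sketch below builds two sketches over hypothetical in-memory data, flattens them into one PCollection, and merges them with globally(). It assumes the Beam Java core SDK, the zetasketch extension, and the direct runner are on the classpath; the class name, step names, and sample values are made up for the example. Both inputs are created with HllCount.Init.forStrings(), so they have the same type and can be merged.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.extensions.zetasketch.HllCount;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Flatten;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionList;

public class MergePartialGloballyExample {

  // Merges every sketch in the input PCollection into one sketch.
  static PCollection<byte[]> mergeAll(PCollection<byte[]> sketches) {
    return sketches.apply("MergeSketches", HllCount.MergePartial.globally());
  }

  // Builds a small pipeline over hypothetical sample data.
  static Pipeline buildPipeline() {
    Pipeline pipeline = Pipeline.create();

    // One sketch per day, both over strings.
    PCollection<byte[]> monday = pipeline
        .apply("MondayUsers", Create.of("alice", "bob", "carol"))
        .apply("MondaySketch", HllCount.Init.forStrings().globally());
    PCollection<byte[]> tuesday = pipeline
        .apply("TuesdayUsers", Create.of("bob", "dave"))
        .apply("TuesdaySketch", HllCount.Init.forStrings().globally());

    // Put both sketches in a single PCollection, then merge them.
    PCollection<byte[]> merged = mergeAll(
        PCollectionList.of(monday).and(tuesday)
            .apply("FlattenSketches", Flatten.<byte[]>pCollections()));

    // The merged sketch can be passed to HllCount.Extract to estimate
    // the number of distinct users across both days.
    merged.apply("ExtractCount", HllCount.Extract.globally());
    return pipeline;
  }

  public static void main(String[] args) {
    buildPipeline().run().waitUntilFinish();
  }
}
```

Keeping the merged result as a sketch, rather than extracting a count right away, lets later stages merge it again with further sketches.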

    • perKey

      public static <K> Combine.PerKey<K,byte @Nullable [],byte[]> perKey()
      Returns a Combine.PerKey PTransform that takes an input PCollection<KV<K, byte[]>> of (key, HLL++ sketch) pairs and returns a PCollection<KV<K, byte[]>> of (key, new sketch merged from the input sketches under the key).

      If sketches of different precisions are merged, the merged sketch will get the minimum precision encountered among all the input sketches.

      Only sketches of the same type can be merged together. If incompatible sketches are provided, a runtime error will occur.
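A keyed variant might look like the sketch below, which merges per-page visitor sketches from two hypothetical days. It assumes the Beam Java core SDK, the zetasketch extension, and the direct runner are on the classpath; the class name, step names, keys, and values are illustrative only. The explicit `<String>` type arguments are there for readability and could likely be left to type inference.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.extensions.zetasketch.HllCount;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Flatten;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionList;

public class MergePartialPerKeyExample {

  // Merges all sketches that share a key into one sketch per key.
  static PCollection<KV<String, byte[]>> mergeByKey(
      PCollection<KV<String, byte[]>> keyedSketches) {
    return keyedSketches.apply("MergePerKey", HllCount.MergePartial.<String>perKey());
  }

  // Builds a small pipeline over hypothetical (page, visitor) pairs.
  static Pipeline buildPipeline() {
    Pipeline pipeline = Pipeline.create();

    // One sketch per page per day, all over strings.
    PCollection<KV<String, byte[]>> monday = pipeline
        .apply("MondayVisits", Create.of(
            KV.of("home", "alice"), KV.of("home", "bob"), KV.of("docs", "alice")))
        .apply("MondaySketches", HllCount.Init.forStrings().<String>perKey());
    PCollection<KV<String, byte[]>> tuesday = pipeline
        .apply("TuesdayVisits", Create.of(
            KV.of("home", "bob"), KV.of("docs", "carol")))
        .apply("TuesdaySketches", HllCount.Init.forStrings().<String>perKey());

    // Combine both days, then merge the sketches under each page key.
    PCollection<KV<String, byte[]>> merged = mergeByKey(
        PCollectionList.of(monday).and(tuesday)
            .apply("FlattenSketches", Flatten.<KV<String, byte[]>>pCollections()));

    // Estimate distinct visitors per page from the merged sketches.
    merged.apply("ExtractCounts", HllCount.Extract.<String>perKey());
    return pipeline;
  }

  public static void main(String[] args) {
    buildPipeline().run().waitUntilFinish();
  }
}
```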