Class ZstdCoder<T>

java.lang.Object
org.apache.beam.sdk.coders.Coder<T>
org.apache.beam.sdk.coders.ZstdCoder<T>
All Implemented Interfaces:
Serializable

public class ZstdCoder<T> extends Coder<T>
Wraps an existing coder with Zstandard compression. Use this coder when the encoded value is likely to be large and compressible, or when a dictionary is available to improve compression performance.

This coder uses the Zstandard compression library's direct compression methods (from byte[] to byte[]) and thus requires that the inner coder's encoded value fit in a byte[].
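
The buffer-then-compress pattern described above can be sketched with the JDK alone. This is a minimal illustration, not ZstdCoder's implementation: java.util.zip's Deflater and Inflater stand in for Zstandard (which is not part of the JDK), and every name in the block (BufferThenCompressSketch, innerEncode, compress, and so on) is hypothetical. It shows why the inner encoding has to fit in a single byte[]: the whole encoded value is held in memory before it is compressed in one pass.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Illustrative only: java.util.zip stands in for Zstandard here.
final class BufferThenCompressSketch {

  // Stand-in for the inner coder: produces the complete encoding as one byte[].
  static byte[] innerEncode(String value) {
    return value.getBytes(StandardCharsets.UTF_8);
  }

  static String innerDecode(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Compresses a fully buffered byte[] into another byte[].
  static byte[] compress(byte[] input) {
    Deflater deflater = new Deflater();
    deflater.setInput(input);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] chunk = new byte[4096];
    while (!deflater.finished()) {
      int n = deflater.deflate(chunk);
      out.write(chunk, 0, n);
    }
    deflater.end();
    return out.toByteArray();
  }

  // Reverses compress(); rejects truncated or corrupt input.
  static byte[] decompress(byte[] compressed) {
    Inflater inflater = new Inflater();
    inflater.setInput(compressed);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] chunk = new byte[4096];
    try {
      while (!inflater.finished()) {
        int n = inflater.inflate(chunk);
        out.write(chunk, 0, n);
        if (n == 0 && !inflater.finished()
            && (inflater.needsInput() || inflater.needsDictionary())) {
          throw new IllegalArgumentException("Truncated or unsupported input");
        }
      }
    } catch (DataFormatException e) {
      throw new IllegalArgumentException("Corrupt compressed data", e);
    } finally {
      inflater.end();
    }
    return out.toByteArray();
  }

  // Wrapper encode: inner encoding first, then compression of the whole buffer.
  static byte[] encode(String value) {
    return compress(innerEncode(value));
  }

  // Wrapper decode: decompress the whole buffer, then hand it to the inner decoder.
  static String decode(byte[] bytes) {
    return innerDecode(decompress(bytes));
  }

  public static void main(String[] args) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 1000; i++) {
      sb.append("abc");
    }
    byte[] encoded = encode(sb.toString());
    System.out.println("raw chars: " + sb.length() + ", compressed bytes: " + encoded.length);
    System.out.println("round trip ok: " + decode(encoded).equals(sb.toString()));
  }
}
```

Because both directions work on complete arrays, a value whose inner encoding exceeds the maximum Java array size cannot be handled by this pattern.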

  • Method Details

    • of

      public static <T> ZstdCoder<T> of(Coder<T> innerCoder, byte[] dict, int level)
      Wraps the given coder into a ZstdCoder that uses the given dictionary and compression level.
    • of

      public static <T> ZstdCoder<T> of(Coder<T> innerCoder, byte[] dict)
      Wraps the given coder into a ZstdCoder that uses the given dictionary and the default compression level.
    • of

      public static <T> ZstdCoder<T> of(Coder<T> innerCoder, int level)
      Wraps the given coder into a ZstdCoder that uses the given compression level and no dictionary.
    • of

      public static <T> ZstdCoder<T> of(Coder<T> innerCoder)
      Wraps the given coder into a ZstdCoder that uses the default compression level and no dictionary.
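
The dict and level parameters of the of overloads above trade CPU time and shared state for smaller output. The following sketch illustrates the effect of a preset dictionary on a small record. It is an analogy rather than the Zstandard API: java.util.zip's Deflater.setDictionary and Inflater.setDictionary stand in for a Zstandard dictionary, Deflater levels (0 to 9) are not the same scale as Zstandard levels, and DictionaryCompressionSketch and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Illustrative only: a DEFLATE preset dictionary stands in for a Zstandard dictionary.
final class DictionaryCompressionSketch {

  // Compresses input at the given level; dict may be null for "no dictionary".
  static byte[] compress(byte[] input, byte[] dict, int level) {
    Deflater deflater = new Deflater(level);
    if (dict != null) {
      deflater.setDictionary(dict);
    }
    deflater.setInput(input);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] chunk = new byte[4096];
    while (!deflater.finished()) {
      int n = deflater.deflate(chunk);
      out.write(chunk, 0, n);
    }
    deflater.end();
    return out.toByteArray();
  }

  // Decompresses; the same dictionary used for compression must be supplied.
  static byte[] decompress(byte[] compressed, byte[] dict) {
    Inflater inflater = new Inflater();
    inflater.setInput(compressed);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] chunk = new byte[4096];
    try {
      while (!inflater.finished()) {
        int n = inflater.inflate(chunk);
        out.write(chunk, 0, n);
        if (n > 0 || inflater.finished()) {
          continue;
        }
        if (inflater.needsDictionary()) {
          if (dict == null) {
            throw new IllegalArgumentException("Stream requires a dictionary");
          }
          inflater.setDictionary(dict);
        } else if (inflater.needsInput()) {
          throw new IllegalArgumentException("Truncated compressed data");
        }
      }
    } catch (DataFormatException e) {
      throw new IllegalArgumentException("Corrupt compressed data", e);
    } finally {
      inflater.end();
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    String prefix = "{\"event\":\"page_view\",\"country\":\"US\",\"device\":\"mobile\",\"user\":\"";
    byte[] dict = prefix.getBytes(StandardCharsets.UTF_8);
    byte[] value = (prefix + "alice\"}").getBytes(StandardCharsets.UTF_8);
    System.out.println("without dictionary: " + compress(value, null, 6).length + " bytes");
    System.out.println("with dictionary:    " + compress(value, dict, 6).length + " bytes");
  }
}
```

A dictionary helps most when individual values are small and share structure, since each value is compressed on its own and cannot otherwise reference earlier values. Encoder and decoder must agree on the dictionary, which is why it is part of the coder's configuration.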
    • encode

      public void encode(T value, OutputStream os) throws IOException
      Description copied from class: Coder
      Encodes the given value of type T onto the given output stream. Multiple elements can be encoded next to each other on the output stream, so each coder should encode enough information to know how many bytes to read when decoding. A common approach is to prefix the encoding with the element's encoded length.
      Specified by:
      encode in class Coder<T>
      Throws:
      IOException - if writing to the OutputStream fails for some reason
    • decode

      public T decode(InputStream is) throws IOException
      Description copied from class: Coder
      Decodes a value of type T from the given input stream and returns it. Multiple elements can be encoded next to each other on the input stream, so each coder should encode enough information to know how many bytes to read when decoding. A common approach is to prefix the encoding with the element's encoded length.
      Specified by:
      decode in class Coder<T>
      Throws:
      IOException - if reading from the InputStream fails for some reason
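
The length-prefix approach mentioned for encode and decode can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a fixed 4-byte big-endian length via DataOutputStream, whereas real coders often use a variable-length integer, and LengthPrefixFramingSketch, writeFramed, readFramed and roundTrip are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: fixed-width length prefix so elements can share one stream.
final class LengthPrefixFramingSketch {

  // Writes a 4-byte big-endian length followed by the payload bytes.
  static void writeFramed(byte[] payload, OutputStream os) throws IOException {
    DataOutputStream out = new DataOutputStream(os);
    out.writeInt(payload.length);
    out.write(payload);
    out.flush();
  }

  // Reads exactly one framed payload and leaves the stream at the next element.
  static byte[] readFramed(InputStream is) throws IOException {
    DataInputStream in = new DataInputStream(is);
    int length = in.readInt();
    if (length < 0) {
      throw new IOException("Negative length prefix: " + length);
    }
    byte[] payload = new byte[length];
    in.readFully(payload);
    return payload;
  }

  // Writes all values to one stream, then reads them back one by one.
  static List<String> roundTrip(List<String> values) {
    try {
      ByteArrayOutputStream os = new ByteArrayOutputStream();
      for (String v : values) {
        writeFramed(v.getBytes(StandardCharsets.UTF_8), os);
      }
      InputStream is = new ByteArrayInputStream(os.toByteArray());
      List<String> decoded = new ArrayList<>();
      for (int i = 0; i < values.size(); i++) {
        decoded.add(new String(readFramed(is), StandardCharsets.UTF_8));
      }
      return decoded;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip(Arrays.asList("first", "", "third element")));
  }
}
```

Without the prefix, the reader could not tell where one compressed element ends and the next begins, because compressed payloads have no fixed size.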
    • getCoderArguments

      public List<? extends Coder<?>> getCoderArguments()
      Description copied from class: Coder
      If this is a Coder for a parameterized type, returns the list of Coders being used for each of the parameters in the same order they appear within the parameterized type's type signature. If this cannot be done, or this Coder does not encode/decode a parameterized type, returns the empty list.
      Specified by:
      getCoderArguments in class Coder<T>
    • verifyDeterministic

      public void verifyDeterministic() throws Coder.NonDeterministicException
      Throws Coder.NonDeterministicException if the coding is not deterministic.

      In order for a Coder to be considered deterministic, the following must be true:

      • two values that compare as equal (via Object.equals() or Comparable.compareTo(), if supported) have the same encoding.
      • the Coder always produces a canonical encoding, which is the same for an instance of an object even if produced on different computers at different times.

      ZstdCoder is deterministic if the inner coder is deterministic.

      Specified by:
      verifyDeterministic in class Coder<T>
      Throws:
      Coder.NonDeterministicException - if this coder is not deterministic.
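
The two determinism conditions above can be illustrated with a small JDK-only sketch. It is not Beam code: DeterministicEncodingSketch and its methods are hypothetical, and the comma-joined text format is only a toy encoding. Two sets that are equal but iterate in different orders get different bytes from the naive encoder and identical bytes from the canonical one. A compressing wrapper cannot repair the first case, since it only ever sees the inner coder's bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: contrasts an order-dependent encoding with a canonical one.
final class DeterministicEncodingSketch {

  // Non-canonical: the bytes depend on the set's iteration order.
  static byte[] encodeInIterationOrder(Set<String> values) {
    return String.join(",", values).getBytes(StandardCharsets.UTF_8);
  }

  // Canonical: sorting first makes equal sets produce identical bytes.
  static byte[] encodeSorted(Set<String> values) {
    List<String> sorted = new ArrayList<>(values);
    Collections.sort(sorted);
    return String.join(",", sorted).getBytes(StandardCharsets.UTF_8);
  }

  // Builds a set that remembers insertion order.
  static Set<String> orderedSet(String... items) {
    return new LinkedHashSet<>(Arrays.asList(items));
  }

  public static void main(String[] args) {
    Set<String> a = orderedSet("x", "y");
    Set<String> b = orderedSet("y", "x");
    System.out.println("sets equal: " + a.equals(b));
    System.out.println("iteration-order bytes equal: "
        + Arrays.equals(encodeInIterationOrder(a), encodeInIterationOrder(b)));
    System.out.println("sorted bytes equal: "
        + Arrays.equals(encodeSorted(a), encodeSorted(b)));
  }
}
```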
    • consistentWithEquals

      public boolean consistentWithEquals()
      Returns true if this Coder is injective with respect to Object.equals(java.lang.Object).

      Whenever the encoded bytes of two values are equal, then the original values are equal according to Objects.equals(). Note that this is well-defined for null.

      This condition is most notably false for arrays. More generally, this condition is false whenever equals() compares object identity, rather than performing a semantic/structural comparison.

      By default, returns false.

      ZstdCoder is consistent with equals if the inner coder is consistent with equals.

      Overrides:
      consistentWithEquals in class Coder<T>
      Returns:
      The same value as the inner coder.
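
The array case called out above can be shown in a few lines. This is a hedged, JDK-only sketch with hypothetical names (ConsistentWithEqualsSketch, encode, holdsFor); the encodings are trivial stand-ins for a real coder. It checks the implication "equal encoded bytes imply equals()" for one pair of values at a time.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: checks "same encoding implies equals()" for a pair of values.
final class ConsistentWithEqualsSketch {

  // Toy encoding for byte[] values: the array contents themselves.
  static byte[] encode(byte[] value) {
    return value.clone();
  }

  // Toy encoding for String values: UTF-8 bytes.
  static byte[] encode(String value) {
    return value.getBytes(StandardCharsets.UTF_8);
  }

  // True unless the encodings match while equals() says the values differ.
  static boolean holdsFor(Object x, Object y, byte[] encodedX, byte[] encodedY) {
    return !Arrays.equals(encodedX, encodedY) || x.equals(y);
  }

  public static void main(String[] args) {
    byte[] a = {1, 2, 3};
    byte[] b = {1, 2, 3};
    // Arrays use identity-based equals(), so the condition fails.
    System.out.println("byte[]: " + holdsFor(a, b, encode(a), encode(b)));

    String s = "beam";
    String t = new String("beam");
    // Strings compare by content, so the condition holds.
    System.out.println("String: " + holdsFor(s, t, encode(s), encode(t)));
  }
}
```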
    • structuralValue

      public Object structuralValue(T value)
      Returns an object with an Object.equals() method that represents structural equality on the argument.

      For any two values x and y of type T, if their encoded bytes are the same, then it must be the case that structuralValue(x).equals(structuralValue(y)).

      Most notably:

      • The structural value for an array coder should perform a structural comparison of the contents of the arrays, rather than the default behavior of comparing according to object identity.
      • The structural value for a coder accepting null should be a proper object with an equals() method, even if the input value is null.

      See also Coder.consistentWithEquals().

      By default, if this coder is Coder.consistentWithEquals(), and the value is not null, returns the provided object. Otherwise, encodes the value into a byte[], and returns an object that performs array equality on the encoded bytes.

      ZstdCoder uses the structural value of the inner coder.

      Overrides:
      structuralValue in class Coder<T>
      Returns:
      The structural value of the inner coder.
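
The fallback described above (encode the value, then compare the encoded bytes with array equality) can be sketched like this. The names StructuralValueSketch, EncodedBytes and encodeNullable are hypothetical, and the one-byte null marker is an assumption made for this example only, not a description of any Beam coder's wire format.

```java
import java.util.Arrays;

// Illustrative only: a structural value that compares encoded bytes by content.
final class StructuralValueSketch {

  // Wrapper whose equals() and hashCode() use the array contents, not identity.
  static final class EncodedBytes {
    private final byte[] bytes;

    EncodedBytes(byte[] bytes) {
      this.bytes = bytes.clone();
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof EncodedBytes && Arrays.equals(bytes, ((EncodedBytes) o).bytes);
    }

    @Override
    public int hashCode() {
      return Arrays.hashCode(bytes);
    }
  }

  // Toy nullable encoding: marker byte 0 for null, 1 followed by the contents otherwise.
  static byte[] encodeNullable(byte[] value) {
    if (value == null) {
      return new byte[] {0};
    }
    byte[] out = new byte[value.length + 1];
    out[0] = 1;
    System.arraycopy(value, 0, out, 1, value.length);
    return out;
  }

  // Returns a proper object with equals(), even when the input is null.
  static Object structuralValue(byte[] value) {
    return new EncodedBytes(encodeNullable(value));
  }

  public static void main(String[] args) {
    Object x = structuralValue(new byte[] {1, 2});
    Object y = structuralValue(new byte[] {1, 2});
    System.out.println("same contents equal: " + x.equals(y));
    System.out.println("null vs empty equal: "
        + structuralValue(null).equals(structuralValue(new byte[0])));
  }
}
```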
    • equals

      public boolean equals(@Nullable Object o)
      Overrides:
      equals in class Object
      Returns:
      true if the two ZstdCoder instances have the same class, inner coder, dictionary and compression level.
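
An equality contract over class, inner coder, dictionary and level could look like the sketch below. It is not the ZstdCoder source: CompressionSettingsSketch is a hypothetical class, the inner coder is represented by a plain Object, and the dictionary is compared by content with Arrays.equals, which is an assumption about what "same dictionary" means.

```java
import java.util.Arrays;
import java.util.Objects;

// Illustrative only: equality over (class, inner, dictionary contents, level).
final class CompressionSettingsSketch {
  private final Object inner; // stands in for the inner coder
  private final byte[] dict;  // may be null when no dictionary is used
  private final int level;

  CompressionSettingsSketch(Object inner, byte[] dict, int level) {
    this.inner = inner;
    this.dict = dict == null ? null : dict.clone();
    this.level = level;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (o == null || getClass() != o.getClass()) {
      return false;
    }
    CompressionSettingsSketch other = (CompressionSettingsSketch) o;
    return level == other.level
        && Objects.equals(inner, other.inner)
        && Arrays.equals(dict, other.dict); // content comparison, null-safe
  }

  @Override
  public int hashCode() {
    // Arrays.hashCode keeps hashCode consistent with the content-based equals.
    return Objects.hash(inner, Arrays.hashCode(dict), level);
  }

  public static void main(String[] args) {
    CompressionSettingsSketch a = new CompressionSettingsSketch("utf8", new byte[] {1, 2}, 3);
    CompressionSettingsSketch b = new CompressionSettingsSketch("utf8", new byte[] {1, 2}, 3);
    CompressionSettingsSketch c = new CompressionSettingsSketch("utf8", new byte[] {1, 2}, 5);
    System.out.println("a equals b: " + a.equals(b));
    System.out.println("a equals c: " + a.equals(c));
  }
}
```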
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • toString

      public String toString()
      Overrides:
      toString in class Object