Class CoderHelpers

java.lang.Object
org.apache.beam.runners.spark.coders.CoderHelpers

public final class CoderHelpers extends Object
Serialization utility class.
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static class 
    A function for converting a byte array pair to a key-value pair.
  • Method Summary

    Modifier and Type
    Method
    Description
    static <T> T
    fromByteArray(byte[] serialized, Coder<T> coder)
    Utility method for deserializing a byte array using the specified coder.
    static <T> Iterable<T>
    fromByteArrays(Collection<byte[]> serialized, Coder<T> coder)
    Utility method for deserializing a Iterable of byte arrays using the specified coder.
    static <T> org.apache.spark.api.java.function.Function<byte[],T>
    A function wrapper for converting a byte array to an object.
    static <K, V> org.apache.spark.api.java.function.PairFunction<scala.Tuple2<ByteArray,Iterable<byte[]>>,K,Iterable<V>>
    fromByteFunctionIterable(Coder<K> keyCoder, Coder<V> valueCoder)
    A function wrapper for converting a byte array pair to a key-value pair, where values are Iterable.
    static <T> byte[]
    toByteArray(T value, Coder<T> coder)
    Utility method for serializing an object using the specified coder.
    static <T> List<byte[]>
    toByteArrays(Iterable<T> values, Coder<T> coder)
    Utility method for serializing a Iterable of values using the specified coder.
    static <T> List<byte[]>
    toByteArrays(Iterator<T> values, Coder<T> coder)
    Utility method for serializing a Iterator of values using the specified coder.
    static <T> byte[]
    toByteArrayWithTs(T value, Coder<T> coder, Instant timestamp)
    Utility method for serializing an object using the specified coder, appending timestamp representation.
    static <K, V> org.apache.spark.api.java.function.PairFunction<scala.Tuple2<K,V>,ByteArray,byte[]>
    toByteFunction(Coder<K> keyCoder, Coder<V> valueCoder)
    A function wrapper for converting a key-value pair to a byte array pair.
    static <T> org.apache.spark.api.java.function.Function<T,byte[]>
    A function wrapper for converting an object to a bytearray.
    static <K, V> org.apache.spark.api.java.function.PairFunction<scala.Tuple2<K,V>,ByteArray,byte[]>
    toByteFunctionWithTs(Coder<K> keyCoder, Coder<V> valueCoder, org.apache.spark.api.java.function.Function<scala.Tuple2<K,V>,Instant> timestamp)
    A function wrapper for converting a key-value pair to a byte array pair, where the key in resulting ByteArray contains (key, timestamp).

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • toByteArray

      public static <T> byte[] toByteArray(T value, Coder<T> coder)
      Utility method for serializing an object using the specified coder.
      Type Parameters:
      T - type of value that is serialized
      Parameters:
      value - Value to serialize.
      coder - Coder to serialize with.
      Returns:
      Byte array representing serialized object.
    • toByteArrayWithTs

      public static <T> byte[] toByteArrayWithTs(T value, Coder<T> coder, Instant timestamp)
      Utility method for serializing an object using the specified coder, appending timestamp representation. This is useful when sorting by timestamp
      Type Parameters:
      T - type of value that is serialized
      Parameters:
      value - Value to serialize.
      coder - Coder to serialize with.
      timestamp - timestamp to be bundled into key's ByteArray representation
      Returns:
      Byte array representing serialized object.
    • toByteArrays

      public static <T> List<byte[]> toByteArrays(Iterable<T> values, Coder<T> coder)
      Utility method for serializing a Iterable of values using the specified coder.
      Type Parameters:
      T - type of value that is serialized
      Parameters:
      values - Values to serialize.
      coder - Coder to serialize with.
      Returns:
      List of bytes representing serialized objects.
    • toByteArrays

      public static <T> List<byte[]> toByteArrays(Iterator<T> values, Coder<T> coder)
      Utility method for serializing a Iterator of values using the specified coder.
      Type Parameters:
      T - type of value that is serialized
      Parameters:
      values - Values to serialize.
      coder - Coder to serialize with.
      Returns:
      List of bytes representing serialized objects.
    • fromByteArray

      public static <T> T fromByteArray(byte[] serialized, Coder<T> coder)
      Utility method for deserializing a byte array using the specified coder.
      Type Parameters:
      T - Type of object to be returned.
      Parameters:
      serialized - bytearray to be deserialized.
      coder - Coder to deserialize with.
      Returns:
      Deserialized object.
    • fromByteArrays

      public static <T> Iterable<T> fromByteArrays(Collection<byte[]> serialized, Coder<T> coder)
      Utility method for deserializing a Iterable of byte arrays using the specified coder.
      Type Parameters:
      T - Type of object to be returned.
      Parameters:
      serialized - bytearrays to be deserialized.
      coder - Coder to deserialize with.
      Returns:
      Iterable of deserialized objects.
    • toByteFunction

      public static <T> org.apache.spark.api.java.function.Function<T,byte[]> toByteFunction(Coder<T> coder)
      A function wrapper for converting an object to a bytearray.
      Type Parameters:
      T - The type of the object being serialized.
      Parameters:
      coder - Coder to serialize with.
      Returns:
      A function that accepts an object and returns its coder-serialized form.
    • fromByteFunction

      public static <T> org.apache.spark.api.java.function.Function<byte[],T> fromByteFunction(Coder<T> coder)
      A function wrapper for converting a byte array to an object.
      Type Parameters:
      T - The type of the object being deserialized.
      Parameters:
      coder - Coder to deserialize with.
      Returns:
      A function that accepts a byte array and returns its corresponding object.
    • toByteFunction

      public static <K, V> org.apache.spark.api.java.function.PairFunction<scala.Tuple2<K,V>,ByteArray,byte[]> toByteFunction(Coder<K> keyCoder, Coder<V> valueCoder)
      A function wrapper for converting a key-value pair to a byte array pair.
      Type Parameters:
      K - The type of the key being serialized.
      V - The type of the value being serialized.
      Parameters:
      keyCoder - Coder to serialize keys.
      valueCoder - Coder to serialize values.
      Returns:
      A function that accepts a key-value pair and returns a pair of byte arrays.
    • toByteFunctionWithTs

      public static <K, V> org.apache.spark.api.java.function.PairFunction<scala.Tuple2<K,V>,ByteArray,byte[]> toByteFunctionWithTs(Coder<K> keyCoder, Coder<V> valueCoder, org.apache.spark.api.java.function.Function<scala.Tuple2<K,V>,Instant> timestamp)
      A function wrapper for converting a key-value pair to a byte array pair, where the key in resulting ByteArray contains (key, timestamp).
      Type Parameters:
      K - The type of the key being serialized.
      V - The type of the value being serialized.
      Parameters:
      keyCoder - Coder to serialize keys.
      valueCoder - Coder to serialize values.
      timestamp - timestamp of the input Tuple2
      Returns:
      A function that accepts a key-value pair and returns a pair of byte arrays.
    • fromByteFunctionIterable

      public static <K, V> org.apache.spark.api.java.function.PairFunction<scala.Tuple2<ByteArray,Iterable<byte[]>>,K,Iterable<V>> fromByteFunctionIterable(Coder<K> keyCoder, Coder<V> valueCoder)
      A function wrapper for converting a byte array pair to a key-value pair, where values are Iterable.
      Type Parameters:
      K - The type of the key being deserialized.
      V - The type of the value being deserialized.
      Parameters:
      keyCoder - Coder to deserialize keys.
      valueCoder - Coder to deserialize values.
      Returns:
      A function that accepts a pair of byte arrays and returns a key-value pair.