Class Coder<T>
- Type Parameters:
T
- the type of values being encoded and decoded
- All Implemented Interfaces:
Serializable
- Direct Known Subclasses:
CustomCoder, StructuredCoder, ZstdCoder
Coder<T> defines how to encode and decode values of type T into byte streams.
Coder instances are serialized during job creation and deserialized before use. This will generally be performed by serializing the object via Java Serialization.
Coder classes for compound types are often composed from coder classes for the types contained therein. The composition of Coder instances into a coder for the compound class is the subject of the CoderProvider type, which enables automatic generic composition of Coder classes within the CoderRegistry. See CoderProvider and CoderRegistry for more information about how coders are inferred.
All methods of a Coder are required to be thread safe.
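As a rough illustration of the composition described above, the sketch below builds a coder for a list out of the coder for its elements. It deliberately avoids the SDK classes so that it runs with the JDK alone: MiniCoder, MiniIntCoder, MiniListCoder and CompositionSketch are hypothetical stand-ins, not part of the API documented here, and the wire format (a four-byte element count followed by the elements) is an assumption chosen for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the encode/decode pair of a coder (not the SDK class). */
interface MiniCoder<T> {
  void encode(T value, OutputStream out) throws IOException;

  T decode(InputStream in) throws IOException;
}

/** Encodes an Integer as four big-endian bytes. */
final class MiniIntCoder implements MiniCoder<Integer> {
  @Override
  public void encode(Integer value, OutputStream out) throws IOException {
    new DataOutputStream(out).writeInt(value);
  }

  @Override
  public Integer decode(InputStream in) throws IOException {
    return new DataInputStream(in).readInt();
  }
}

/** Coder for a List, composed from the coder of its element type. */
final class MiniListCoder<E> implements MiniCoder<List<E>> {
  private final MiniCoder<E> elementCoder;

  MiniListCoder(MiniCoder<E> elementCoder) {
    this.elementCoder = elementCoder;
  }

  @Override
  public void encode(List<E> value, OutputStream out) throws IOException {
    new DataOutputStream(out).writeInt(value.size()); // element count first
    for (E element : value) {
      elementCoder.encode(element, out); // delegate to the component coder
    }
  }

  @Override
  public List<E> decode(InputStream in) throws IOException {
    int size = new DataInputStream(in).readInt();
    List<E> result = new ArrayList<>();
    for (int i = 0; i < size; i++) {
      result.add(elementCoder.decode(in));
    }
    return result;
  }
}

final class CompositionSketch {
  /** Encodes and then decodes a list of lists of integers with the composed coder. */
  static List<List<Integer>> roundTrip(List<List<Integer>> value) {
    MiniCoder<List<Integer>> inner = new MiniListCoder<>(new MiniIntCoder());
    MiniCoder<List<List<Integer>>> coder = new MiniListCoder<>(inner);
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      coder.encode(value, out);
      return coder.decode(new ByteArrayInputStream(out.toByteArray()));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The nested coder never inspects the integers itself; it only knows how to frame a sequence and hands each element to the coder it was constructed with, which is the same division of labor the composition paragraph describes.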
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | Coder.Context | Deprecated.
static class | Coder.NonDeterministicException | Exception thrown by verifyDeterministic() if the encoding is not deterministic, including details of why the encoding is not deterministic.
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
boolean | consistentWithEquals() | Returns true if this Coder is injective with respect to Object.equals(java.lang.Object).
abstract T | decode(InputStream inStream) | Decodes a value of type T from the given input stream.
T | decode(InputStream inStream, Coder.Context context) | Deprecated. Only implement and call decode(InputStream).
abstract void | encode(T value, OutputStream outStream) | Encodes the given value of type T onto the given output stream.
void | encode(T value, OutputStream outStream, Coder.Context context) | Deprecated. Only implement and call encode(Object value, OutputStream).
protected long | getEncodedElementByteSize(T value) | Returns the size in bytes of the encoded value using this coder.
static <T> long | getEncodedElementByteSizeUsingCoder(Coder<T> target, T value) |
TypeDescriptor<T> | getEncodedTypeDescriptor() | Returns the TypeDescriptor for the type encoded.
boolean | isRegisterByteSizeObserverCheap(T value) | Returns whether registerByteSizeObserver(T, org.apache.beam.sdk.util.common.ElementByteSizeObserver) is cheap enough to call for every element, that is, if this Coder can calculate the byte size of the element to be coded in roughly constant time (or lazily).
void | registerByteSizeObserver(T value, org.apache.beam.sdk.util.common.ElementByteSizeObserver observer) | Notifies the ElementByteSizeObserver about the byte size of the encoded value using this Coder.
Object | structuralValue(T value) | Returns an object with an Object.equals() method that represents structural equality on the argument.
abstract void | verifyDeterministic() | Throws Coder.NonDeterministicException if the coding is not deterministic.
static void | verifyDeterministic(Coder<?> target, String message, Iterable<Coder<?>> coders) | Verifies all of the provided coders are deterministic.
static void | verifyDeterministic(Coder<?> target, String message, Coder<?>... coders) | Verifies all of the provided coders are deterministic.
-
Constructor Details
-
Coder
public Coder()
-
-
Method Details
-
encode
Encodes the given value of type T onto the given output stream. Multiple elements can be encoded next to each other on the output stream, so each coder should encode enough information to know how many bytes to read when decoding. A common approach is to prefix the encoding with the element's encoded length.
- Throws:
IOException
- if writing to the OutputStream fails for some reason
CoderException
- if the value could not be encoded for some reason
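The length-prefix approach mentioned above can be sketched with plain JDK streams. This is an illustration only, not the encoding any particular coder in the SDK uses: LengthPrefixSketch and its methods are hypothetical names, and the four-byte big-endian length followed by UTF-8 bytes is an assumed format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class LengthPrefixSketch {
  /** Writes the UTF-8 byte length as a 4-byte prefix, then the bytes themselves. */
  static void encode(String value, OutputStream out) throws IOException {
    byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
    DataOutputStream data = new DataOutputStream(out);
    data.writeInt(utf8.length);
    data.write(utf8);
  }

  /** Reads one prefix and exactly that many bytes, leaving the rest of the stream unread. */
  static String decode(InputStream in) throws IOException {
    DataInputStream data = new DataInputStream(in);
    byte[] utf8 = new byte[data.readInt()];
    data.readFully(utf8);
    return new String(utf8, StandardCharsets.UTF_8);
  }

  /** Encodes two values next to each other on one stream and decodes them in order. */
  static String[] roundTripPair(String first, String second) {
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      encode(first, out);
      encode(second, out);
      InputStream in = new ByteArrayInputStream(out.toByteArray());
      return new String[] {decode(in), decode(in)};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Without the prefix, the decoder would have no way to tell where the first string ends and the second begins, since the two encodings are simply concatenated on the stream.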
-
encode
@Deprecated public void encode(T value, OutputStream outStream, Coder.Context context) throws CoderException, IOException
Deprecated. Only implement and call encode(Object value, OutputStream).
Encodes the given value of type T onto the given output stream in the given context.
- Throws:
IOException
- if writing to the OutputStream fails for some reason
CoderException
- if the value could not be encoded for some reason
-
decode
Decodes a value of type T from the given input stream. Returns the decoded value. Multiple elements can be encoded next to each other on the input stream, so each coder should encode enough information to know how many bytes to read when decoding. A common approach is to prefix the encoding with the element's encoded length.
- Throws:
IOException
- if reading from the InputStream fails for some reason
CoderException
- if the value could not be decoded for some reason
-
decode
@Deprecated public T decode(InputStream inStream, Coder.Context context) throws CoderException, IOException
Deprecated. Only implement and call decode(InputStream).
Decodes a value of type T from the given input stream in the given context. Returns the decoded value.
- Throws:
IOException
- if reading from the InputStream fails for some reason
CoderException
- if the value could not be decoded for some reason
-
getCoderArguments
-
verifyDeterministic
Throws Coder.NonDeterministicException if the coding is not deterministic.
In order for a Coder to be considered deterministic, the following must be true:
- two values that compare as equal (via Object.equals() or Comparable.compareTo(), if supported) have the same encoding.
- the Coder always produces a canonical encoding, which is the same for an instance of an object even if produced on different computers at different times.
- Throws:
Coder.NonDeterministicException
- if this coder is not deterministic.
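To make the two conditions for determinism concrete, the following sketch encodes a Map in two ways using only the JDK. DeterminismSketch and its methods are hypothetical, and the format (entry count, then each key and value) is an assumption. Writing entries in iteration order breaks the first condition, because two maps that are equal under equals() may iterate in different orders; sorting the keys first yields a canonical encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

final class DeterminismSketch {
  /** Encodes entries in whatever order the map iterates: not deterministic in general. */
  static byte[] encodeInIterationOrder(Map<String, Integer> map) {
    return write(map);
  }

  /** Encodes entries in sorted key order, so equal maps always produce equal bytes. */
  static byte[] encodeCanonically(Map<String, Integer> map) {
    return write(new TreeMap<>(map));
  }

  /** Writes the entry count followed by each key and value, in the map's iteration order. */
  private static byte[] write(Map<String, Integer> map) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeInt(map.size());
      for (Map.Entry<String, Integer> entry : map.entrySet()) {
        out.writeUTF(entry.getKey());
        out.writeInt(entry.getValue());
      }
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Builds a map from each key to its length; LinkedHashMap iterates in insertion order. */
  static Map<String, Integer> mapOf(String... keys) {
    Map<String, Integer> map = new LinkedHashMap<>();
    for (String key : keys) {
      map.put(key, key.length());
    }
    return map;
  }
}
```

For example, mapOf("a", "bb") and mapOf("bb", "a") are equal according to Map.equals(), yet only encodeCanonically produces the same bytes for both.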
-
verifyDeterministic
public static void verifyDeterministic(Coder<?> target, String message, Iterable<Coder<?>> coders) throws Coder.NonDeterministicException
Verifies all of the provided coders are deterministic. If any are not, throws a Coder.NonDeterministicException for the target Coder.
- Throws:
Coder.NonDeterministicException
-
getEncodedElementByteSizeUsingCoder
public static <T> long getEncodedElementByteSizeUsingCoder(Coder<T> target, T value) throws Exception
- Throws:
Exception
-
verifyDeterministic
public static void verifyDeterministic(Coder<?> target, String message, Coder<?>... coders) throws Coder.NonDeterministicException
Verifies all of the provided coders are deterministic. If any are not, throws a Coder.NonDeterministicException for the target Coder.
- Throws:
Coder.NonDeterministicException
-
consistentWithEquals
public boolean consistentWithEquals()
Returns true if this Coder is injective with respect to Object.equals(java.lang.Object).
Whenever the encoded bytes of two values are equal, then the original values are equal according to Objects.equals(). Note that this is well-defined for null.
This condition is most notably false for arrays. More generally, this condition is false whenever equals() compares object identity, rather than performing a semantic/structural comparison.
By default, returns false.
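The array case can be checked with the JDK alone. In this sketch, the class EqualsConsistencySketch and its methods are hypothetical, and the two encodings (a copy of the array contents, and the UTF-8 bytes of a string) are assumptions made only for illustration. It tests, for one pair of values each, whether equal encoded bytes imply equality under equals().

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class EqualsConsistencySketch {
  /** Stand-in encoding of a byte[]: simply a copy of its contents (an assumption). */
  static byte[] encodeBytes(byte[] value) {
    return value.clone();
  }

  /** Stand-in encoding of a String: its UTF-8 bytes (an assumption). */
  static byte[] encodeString(String value) {
    return value.getBytes(StandardCharsets.UTF_8);
  }

  /** Two distinct arrays with equal contents: equal encodings, but equals() is identity. */
  static boolean arrayPairIsConsistent() {
    byte[] x = {1, 2, 3};
    byte[] y = {1, 2, 3};
    boolean sameEncoding = Arrays.equals(encodeBytes(x), encodeBytes(y));
    return !sameEncoding || x.equals(y); // false: same bytes, yet x.equals(y) is false
  }

  /** Two distinct String objects with equal contents: equals() compares the characters. */
  static boolean stringPairIsConsistent() {
    String x = new String("abc");
    String y = new String("abc");
    boolean sameEncoding = Arrays.equals(encodeString(x), encodeString(y));
    return !sameEncoding || x.equals(y); // true: same bytes and x.equals(y) is true
  }
}
```

A single pair cannot prove that an encoding is injective with respect to equals(), but one counterexample such as the array pair is enough to show that it is not.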
-
structuralValue
Returns an object with an Object.equals() method that represents structural equality on the argument.
For any two values x and y of type T, if their encoded bytes are the same, then it must be the case that structuralValue(x).equals(structuralValue(y)).
Most notably:
- The structural value for an array coder should perform a structural comparison of the contents of the arrays, rather than the default behavior of comparing according to object identity.
- The structural value for a coder accepting null should be a proper object with an equals() method, even if the input value is null.
See also consistentWithEquals().
By default, if this coder is consistentWithEquals(), and the value is not null, returns the provided object. Otherwise, encodes the value into a byte[], and returns an object that performs array equality on the encoded bytes.
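One way to picture the fallback described above is a small wrapper around the encoded bytes. EncodedBytesValue below is a hypothetical class written for this illustration, not the SDK's own implementation; it assumes the caller has already encoded the value (including a null value, if the coder accepts one) into a byte[].

```java
import java.util.Arrays;

/** Hypothetical structural value whose equality is defined on the encoded bytes. */
final class EncodedBytesValue {
  private final byte[] encoded;

  EncodedBytesValue(byte[] encoded) {
    this.encoded = encoded.clone(); // defensive copy keeps the wrapper immutable
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof EncodedBytesValue
        && Arrays.equals(encoded, ((EncodedBytesValue) other).encoded);
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(encoded); // consistent with equals(): content-based
  }
}
```

Because both equals() and hashCode() depend only on the byte contents, two wrappers built from identical encodings compare equal and can be used as keys in hash-based collections, regardless of whether the original values were arrays or null.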
isRegisterByteSizeObserverCheap
Returns whether registerByteSizeObserver(T, org.apache.beam.sdk.util.common.ElementByteSizeObserver) is cheap enough to call for every element, that is, if this Coder can calculate the byte size of the element to be coded in roughly constant time (or lazily).
Not intended to be called by user code, but instead by PipelineRunner implementations.
By default, returns false. The default registerByteSizeObserver(T, org.apache.beam.sdk.util.common.ElementByteSizeObserver) implementation invokes getEncodedElementByteSize(T), which requires re-encoding an element unless it is overridden. This is considered expensive.
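The cost difference can be sketched as follows. ByteSizeSketch and its members are hypothetical and use only the JDK, and the encodings are assumptions: a string written as a four-byte length plus UTF-8 bytes, and a long written as eight bytes. The first method measures the size by actually encoding the value into a counting stream, which is the kind of work that makes the default path expensive; the second returns the size of a fixed-width encoding without encoding anything.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ByteSizeSketch {
  /** Discards every byte written to it and only counts them. */
  static final class CountingStream extends OutputStream {
    long count;

    @Override
    public void write(int b) {
      count++;
    }

    @Override
    public void write(byte[] buffer, int offset, int length) {
      count += length;
    }
  }

  /** Expensive path: re-encodes the string (length prefix plus UTF-8 bytes) to measure it. */
  static long sizeByEncoding(String value) {
    try {
      CountingStream counter = new CountingStream();
      DataOutputStream out = new DataOutputStream(counter);
      byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
      out.writeInt(utf8.length);
      out.write(utf8);
      return counter.count;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Cheap path: a long is assumed to encode to eight bytes, so no encoding is needed. */
  static long sizeOfFixedWidthLong(long value) {
    return Long.BYTES;
  }
}
```

A coder in the second situation is the kind that could reasonably report its size computation as cheap, since the answer does not depend on walking or serializing the value.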
registerByteSizeObserver
public void registerByteSizeObserver(T value, org.apache.beam.sdk.util.common.ElementByteSizeObserver observer) throws Exception
Notifies the ElementByteSizeObserver about the byte size of the encoded value using this Coder.
Not intended to be called by user code, but instead by PipelineRunner implementations.
By default, this notifies observer about the byte size of the encoded value using this coder as returned by getEncodedElementByteSize(T).
- Throws:
Exception
-
getEncodedElementByteSize
Returns the size in bytes of the encoded value using this coder.
- Throws:
Exception
-
getEncodedTypeDescriptor
Returns the TypeDescriptor for the type encoded.
-