T
- the type of values being encoded and decodedpublic abstract class Coder<T>
extends java.lang.Object
implements java.io.Serializable
Coder<T>
defines how to encode and decode values of type T
into
byte streams.
Coder
instances are serialized during job creation and deserialized before use. This
will generally be performed by serializing the object via Java Serialization.
Coder
classes for compound types are often composed from coder classes for types
contains therein. The composition of Coder
instances into a coder for the compound class
is the subject of the CoderProvider
type, which enables automatic generic composition of
Coder
classes within the CoderRegistry
. See CoderProvider
and CoderRegistry
for more information about how coders are inferred.
All methods of a Coder
are required to be thread safe.
Modifier and Type | Class and Description |
---|---|
static class |
Coder.Context
Deprecated.
To implement a coder, do not use any
Coder.Context . Just implement only those
abstract methods which do not accept a Coder.Context and leave the default
implementations for methods accepting a Coder.Context . |
static class |
Coder.NonDeterministicException
Exception thrown by
verifyDeterministic() if the encoding is not deterministic,
including details of why the encoding is not deterministic. |
Constructor and Description |
---|
Coder() |
Modifier and Type | Method and Description |
---|---|
boolean |
consistentWithEquals()
|
abstract T |
decode(java.io.InputStream inStream)
Decodes a value of type
T from the given input stream in the given context. |
T |
decode(java.io.InputStream inStream,
Coder.Context context)
Deprecated.
only implement and call
decode(InputStream) |
abstract void |
encode(T value,
java.io.OutputStream outStream)
Encodes the given value of type
T onto the given output stream. |
void |
encode(T value,
java.io.OutputStream outStream,
Coder.Context context)
Deprecated.
only implement and call
encode(Object value, OutputStream) |
abstract java.util.List<? extends Coder<?>> |
getCoderArguments()
|
protected long |
getEncodedElementByteSize(T value)
Returns the size in bytes of the encoded value using this coder.
|
TypeDescriptor<T> |
getEncodedTypeDescriptor()
Returns the
TypeDescriptor for the type encoded. |
boolean |
isRegisterByteSizeObserverCheap(T value)
Returns whether
registerByteSizeObserver(T, org.apache.beam.sdk.util.common.ElementByteSizeObserver) cheap enough to call for every element, that
is, if this Coder can calculate the byte size of the element to be coded in roughly
constant time (or lazily). |
void |
registerByteSizeObserver(T value,
org.apache.beam.sdk.util.common.ElementByteSizeObserver observer)
Notifies the
ElementByteSizeObserver about the byte size of the encoded value using
this Coder . |
java.lang.Object |
structuralValue(T value)
Returns an object with an
Object.equals() method that represents structural equality on
the argument. |
abstract void |
verifyDeterministic()
Throw
Coder.NonDeterministicException if the coding is not deterministic. |
static void |
verifyDeterministic(Coder<?> target,
java.lang.String message,
Coder<?>... coders)
Verifies all of the provided coders are deterministic.
|
static void |
verifyDeterministic(Coder<?> target,
java.lang.String message,
java.lang.Iterable<Coder<?>> coders)
Verifies all of the provided coders are deterministic.
|
public abstract void encode(T value, java.io.OutputStream outStream) throws CoderException, java.io.IOException
T
onto the given output stream.java.io.IOException
- if writing to the OutputStream
fails for some reasonCoderException
- if the value could not be encoded for some reason@Deprecated public void encode(T value, java.io.OutputStream outStream, Coder.Context context) throws CoderException, java.io.IOException
encode(Object value, OutputStream)
T
onto the given output stream in the given context.java.io.IOException
- if writing to the OutputStream
fails for some reasonCoderException
- if the value could not be encoded for some reasonpublic abstract T decode(java.io.InputStream inStream) throws CoderException, java.io.IOException
T
from the given input stream in the given context. Returns the
decoded value.java.io.IOException
- if reading from the InputStream
fails for some reasonCoderException
- if the value could not be decoded for some reason@Deprecated public T decode(java.io.InputStream inStream, Coder.Context context) throws CoderException, java.io.IOException
decode(InputStream)
T
from the given input stream in the given context. Returns the
decoded value.java.io.IOException
- if reading from the InputStream
fails for some reasonCoderException
- if the value could not be decoded for some reasonpublic abstract java.util.List<? extends Coder<?>> getCoderArguments()
public abstract void verifyDeterministic() throws Coder.NonDeterministicException
Coder.NonDeterministicException
if the coding is not deterministic.
In order for a Coder
to be considered deterministic, the following must be true:
Object.equals()
or Comparable.compareTo()
, if supported) have the same encoding.
Coder
always produces a canonical encoding, which is the same for an instance
of an object even if produced on different computers at different times.
Coder.NonDeterministicException
- if this coder is not deterministic.public static void verifyDeterministic(Coder<?> target, java.lang.String message, java.lang.Iterable<Coder<?>> coders) throws Coder.NonDeterministicException
Coder.NonDeterministicException
for the target
Coder
.Coder.NonDeterministicException
public static void verifyDeterministic(Coder<?> target, java.lang.String message, Coder<?>... coders) throws Coder.NonDeterministicException
Coder.NonDeterministicException
for the target
Coder
.Coder.NonDeterministicException
public boolean consistentWithEquals()
true
if this Coder
is injective with respect to Object.equals(java.lang.Object)
.
Whenever the encoded bytes of two values are equal, then the original values are equal
according to Objects.equals()
. Note that this is well-defined for null
.
This condition is most notably false for arrays. More generally, this condition is false
whenever equals()
compares object identity, rather than performing a
semantic/structural comparison.
By default, returns false.
public java.lang.Object structuralValue(T value)
Object.equals()
method that represents structural equality on
the argument.
For any two values x
and y
of type T
, if their encoded bytes are the
same, then it must be the case that structuralValue(x).equals(structuralValue(y))
.
Most notably:
null
should be a proper object with an
equals()
method, even if the input value is null
.
See also consistentWithEquals()
.
By default, if this coder is consistentWithEquals()
, and the value is not null,
returns the provided object. Otherwise, encodes the value into a byte[]
, and returns an
object that performs array equality on the encoded bytes.
public boolean isRegisterByteSizeObserverCheap(T value)
registerByteSizeObserver(T, org.apache.beam.sdk.util.common.ElementByteSizeObserver)
cheap enough to call for every element, that
is, if this Coder
can calculate the byte size of the element to be coded in roughly
constant time (or lazily).
Not intended to be called by user code, but instead by PipelineRunner
implementations.
By default, returns false. The default registerByteSizeObserver(T, org.apache.beam.sdk.util.common.ElementByteSizeObserver)
implementation
invokes getEncodedElementByteSize(T)
which requires re-encoding an element unless it is
overridden. This is considered expensive.
public void registerByteSizeObserver(T value, org.apache.beam.sdk.util.common.ElementByteSizeObserver observer) throws java.lang.Exception
ElementByteSizeObserver
about the byte size of the encoded value using
this Coder
.
Not intended to be called by user code, but instead by PipelineRunner
implementations.
By default, this notifies observer
about the byte size of the encoded value using
this coder as returned by getEncodedElementByteSize(T)
.
java.lang.Exception
protected long getEncodedElementByteSize(T value) throws java.lang.Exception
java.lang.Exception
public TypeDescriptor<T> getEncodedTypeDescriptor()
TypeDescriptor
for the type encoded.