Type Parameters:
T - the type of the elements of the Iterables being transcoded
IterableT - the type of the Iterables being transcoded

public abstract class IterableLikeCoder<T,IterableT extends java.lang.Iterable<T>> extends StructuredCoder<IterableT>
An abstract base Coder for a class that implements Iterable.

To complete a subclass, implement the decodeToIterable(java.util.List<T>) method. This superclass decodes the elements in the input stream into a List and then passes them to that method to be converted into the appropriate iterable type. Note that this means the input iterables must fit into memory.
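The division of labor described above can be sketched without the Beam SDK. The classes below are hypothetical stand-ins, not Beam's IterableLikeCoder: the names ListBackedDecoderSketch and SetDecoderSketch, the fixed Integer element type, and the count-prefixed 4-byte layout are all assumptions made for illustration. The sketch shows a superclass buffering every element into a List before a subclass converts that list into its own iterable type.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the superclass; this is NOT Beam's IterableLikeCoder.
abstract class ListBackedDecoderSketch<IterableT extends Iterable<Integer>> {

  // Subclasses convert the fully buffered list into their own iterable type.
  protected abstract IterableT decodeToIterable(List<Integer> decodedElements);

  // Reads a big-endian element count, then that many big-endian ints.
  // Every element is held in memory before the subclass sees the list.
  public final IterableT decode(InputStream inStream) throws IOException {
    DataInputStream in = new DataInputStream(inStream);
    int size = in.readInt();
    List<Integer> elements = new ArrayList<>(size);
    for (int i = 0; i < size; i++) {
      elements.add(in.readInt());
    }
    return decodeToIterable(elements);
  }
}

// Hypothetical subclass: only the list-to-iterable conversion is supplied.
final class SetDecoderSketch extends ListBackedDecoderSketch<Set<Integer>> {
  @Override
  protected Set<Integer> decodeToIterable(List<Integer> decodedElements) {
    return new LinkedHashSet<>(decodedElements);
  }
}
```

Because the whole list is materialized before the conversion runs, the memory cost grows with the number of elements, which is the constraint the note above refers to.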
The format of this coder is as follows:
- If the input Iterable has a known and finite size, then the size is written to the output stream in big endian format, followed by all of the encoded elements.
- If the input Iterable is not known to have a finite size, then each element of the input is preceded by true encoded as a byte (indicating "more data") followed by the encoded element, and terminated by false encoded as a byte.
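The two layouts described above can be sketched with the standard library alone. This is an illustration of the format as documented here, not Beam's implementation: the class name IterableFormatSketch is hypothetical, and encoding each element as a 4-byte big-endian int is an assumption standing in for whatever the element coder would write.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Collection;
import java.util.Iterator;

// Illustrative only: mirrors the two documented layouts for Integer elements.
final class IterableFormatSketch {

  // Known, finite size: big-endian element count, then every encoded element.
  static byte[] encodeKnownSize(Collection<Integer> elements) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeInt(elements.size()); // DataOutputStream writes ints big-endian
    for (int element : elements) {
      out.writeInt(element); // assumed element encoding: 4-byte int
    }
    out.flush();
    return bytes.toByteArray();
  }

  // Size not known up front: a true byte before each element, a false byte at the end.
  static byte[] encodeUnknownSize(Iterator<Integer> elements) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    while (elements.hasNext()) {
      out.writeBoolean(true); // "more data" marker, written as the byte 1
      out.writeInt(elements.next());
    }
    out.writeBoolean(false); // terminator, written as the byte 0
    out.flush();
    return bytes.toByteArray();
  }
}
```

For the elements 1 and 2, the first layout takes 12 bytes (a 4-byte count plus two 4-byte elements) and the second takes 11 bytes (two marker-plus-element pairs of 5 bytes each, then a 1-byte terminator).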
Nested classes/interfaces inherited from class Coder: Coder.Context, Coder.NonDeterministicException
Modifier | Constructor and Description |
---|---|
protected | IterableLikeCoder(Coder<T> elementCoder, java.lang.String iterableName) |
Modifier and Type | Method and Description |
---|---|
IterableT | decode(java.io.InputStream inStream): Decodes a value of type T from the given input stream in the given context. |
protected abstract IterableT | decodeToIterable(java.util.List<T> decodedElements): Builds an instance of IterableT, this coder's associated Iterable-like subtype, from a list of decoded elements. |
protected IterableT | decodeToIterable(java.util.List<T> decodedElements, long terminatorValue, java.io.InputStream in): Builds an instance of IterableT, this coder's associated Iterable-like subtype, from a list of decoded elements with the InputStream at the position where this coder detected the end of the stream. |
void | encode(IterableT iterable, java.io.OutputStream outStream): Encodes the given value of type T onto the given output stream. |
java.util.List<? extends Coder<?>> | getCoderArguments() |
Coder<T> | getElemCoder() |
boolean | isRegisterByteSizeObserverCheap(IterableT iterable): Returns whether Coder.registerByteSizeObserver(T, org.apache.beam.sdk.util.common.ElementByteSizeObserver) is cheap enough to call for every element, that is, if this Coder can calculate the byte size of the element to be coded in roughly constant time (or lazily). |
void | registerByteSizeObserver(IterableT iterable, org.apache.beam.sdk.util.common.ElementByteSizeObserver observer): Notifies the ElementByteSizeObserver about the byte size of the encoded value using this Coder. |
void | verifyDeterministic(): Throws Coder.NonDeterministicException if the coding is not deterministic. |
Methods inherited from class StructuredCoder: equals, getComponents, hashCode, toString
Methods inherited from class Coder: consistentWithEquals, decode, encode, getEncodedElementByteSize, getEncodedTypeDescriptor, structuralValue, verifyDeterministic, verifyDeterministic

protected abstract IterableT decodeToIterable(java.util.List<T> decodedElements)
Builds an instance of IterableT, this coder's associated Iterable-like subtype, from a list of decoded elements.

Override decodeToIterable(List, long, InputStream) if you need access to the terminator value and the InputStream.

protected IterableT decodeToIterable(java.util.List<T> decodedElements, long terminatorValue, java.io.InputStream in) throws java.io.IOException
Builds an instance of IterableT, this coder's associated Iterable-like subtype, from a list of decoded elements with the InputStream at the position where this coder detected the end of the stream.
Throws:
java.io.IOException

public void encode(IterableT iterable, java.io.OutputStream outStream) throws java.io.IOException, CoderException
Description copied from class: Coder
Encodes the given value of type T onto the given output stream. Multiple elements can be encoded next to each other on the output stream, so each coder should encode enough information to know how many bytes to read when decoding. A common approach is to prefix the encoding with the element's encoded length.
Specified by:
encode in class Coder<IterableT extends java.lang.Iterable<T>>
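The length-prefix approach mentioned in the description might look roughly like the following. This is a minimal sketch using only the standard library: the class LengthPrefixSketch and its methods are hypothetical, and the choice of a 4-byte big-endian length in front of UTF-8 bytes is an assumption, not the layout any particular Beam coder uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Illustrative only: length-prefixed strings written back to back on one stream.
final class LengthPrefixSketch {

  // Writes the payload length first so a reader knows where the value ends.
  static void writeString(DataOutputStream out, String value) throws IOException {
    byte[] payload = value.getBytes(StandardCharsets.UTF_8);
    out.writeInt(payload.length);
    out.write(payload);
  }

  // Reads exactly as many bytes as the prefix announces, leaving the rest untouched.
  static String readString(DataInputStream in) throws IOException {
    byte[] payload = new byte[in.readInt()];
    in.readFully(payload);
    return new String(payload, StandardCharsets.UTF_8);
  }

  // Writes two values next to each other and reads them back in order.
  static String[] roundTrip(String first, String second) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    writeString(out, first);
    writeString(out, second);
    out.flush();
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
    return new String[] {readString(in), readString(in)};
  }
}
```

Without the prefix, the reader would have no way to tell where the first string stops and the second begins.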
Throws:
java.io.IOException - if writing to the OutputStream fails for some reason
CoderException - if the value could not be encoded for some reason

public IterableT decode(java.io.InputStream inStream) throws java.io.IOException, CoderException
Description copied from class: Coder
Decodes a value of type T from the given input stream in the given context. Returns the decoded value. Multiple elements can be encoded next to each other on the input stream, so each coder should encode enough information to know how many bytes to read when decoding. A common approach is to prefix the encoding with the element's encoded length.
Specified by:
decode in class Coder<IterableT extends java.lang.Iterable<T>>
Throws:
java.io.IOException - if reading from the InputStream fails for some reason
CoderException - if the value could not be decoded for some reason

public java.util.List<? extends Coder<?>> getCoderArguments()
Description copied from class: Coder
If this is a Coder for a parameterized type, returns the list of Coders being used for each of the parameters in the same order they appear within the parameterized type's type signature. If this cannot be done, or this Coder does not encode/decode a parameterized type, returns the empty list.
Specified by:
getCoderArguments in class Coder<IterableT extends java.lang.Iterable<T>>

public void verifyDeterministic() throws Coder.NonDeterministicException
Throws Coder.NonDeterministicException if the coding is not deterministic.

In order for a Coder to be considered deterministic, the following must be true:
- Two values that compare as equal (via Object.equals() or Comparable.compareTo(), if supported) have the same encoding.
- The Coder always produces a canonical encoding, which is the same for an instance of an object even if produced on different computers at different times.
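As an illustration of how the requirement that equal values share one encoding can fail for an iterable type, the sketch below encodes two sets that compare as equal but iterate in different orders. It uses only the standard library; the class IterationOrderSketch, its methods, and the count-plus-4-byte-int layout are hypothetical choices for the example, not Beam code.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative only: equal sets can encode differently when encoding follows iteration order.
final class IterationOrderSketch {

  // Encodes the element count, then each element in the order the set iterates.
  static byte[] encodeInIterationOrder(Set<Integer> elements) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeInt(elements.size());
    for (int element : elements) {
      out.writeInt(element);
    }
    out.flush();
    return bytes.toByteArray();
  }

  // Builds a set whose iteration order is the insertion order of its arguments.
  static Set<Integer> setOf(int first, int second) {
    Set<Integer> set = new LinkedHashSet<>();
    set.add(first);
    set.add(second);
    return set;
  }
}
```

Here setOf(1, 2) and setOf(2, 1) are equal according to Set.equals, yet encodeInIterationOrder writes their elements in opposite orders, so the two byte arrays differ.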
Specified by:
verifyDeterministic in class Coder<IterableT extends java.lang.Iterable<T>>
Throws:
NonDeterministicException - always. Encoding is not deterministic for the general Iterable case, as it depends upon the type of iterable. This may allow two objects to compare as equal while the encoding differs.
Coder.NonDeterministicException - if this coder is not deterministic.

public boolean isRegisterByteSizeObserverCheap(IterableT iterable)
Returns whether Coder.registerByteSizeObserver(T, org.apache.beam.sdk.util.common.ElementByteSizeObserver) is cheap enough to call for every element, that is, if this Coder can calculate the byte size of the element to be coded in roughly constant time (or lazily).

Not intended to be called by user code, but instead by PipelineRunner implementations.

By default, returns false. The default Coder.registerByteSizeObserver(T, org.apache.beam.sdk.util.common.ElementByteSizeObserver) implementation invokes Coder.getEncodedElementByteSize(T), which requires re-encoding an element unless it is overridden. This is considered expensive.
Overrides:
isRegisterByteSizeObserverCheap in class Coder<IterableT extends java.lang.Iterable<T>>
Returns:
true if the iterable is of a known class that supports lazy counting of byte size, since that requires minimal extra computation.

public void registerByteSizeObserver(IterableT iterable, org.apache.beam.sdk.util.common.ElementByteSizeObserver observer) throws java.lang.Exception
Description copied from class: Coder
Notifies the ElementByteSizeObserver about the byte size of the encoded value using this Coder.

Not intended to be called by user code, but instead by PipelineRunner implementations.

By default, this notifies observer about the byte size of the encoded value using this coder as returned by Coder.getEncodedElementByteSize(T).
Overrides:
registerByteSizeObserver in class Coder<IterableT extends java.lang.Iterable<T>>
Throws:
java.lang.Exception