T
- the type of elements handled by this coderpublic class AvroCoder<T> extends CustomCoder<T>
Coder
using Avro binary format.
Each instance of AvroCoder<T>
encapsulates an Avro schema for objects of type
T
.
The Avro schema may be provided explicitly via of(Class, Schema)
or
omitted via of(Class)
, in which case it will be inferred
using Avro's ReflectData
.
For complete details about schema generation and how it can be controlled please see
the org.apache.avro.reflect
package.
Only concrete classes with a no-argument constructor can be mapped to Avro records.
All inherited fields that are not static or transient are included. Fields are not permitted to
be null unless annotated by Nullable
or a Union
schema
containing "null"
.
To use, specify the Coder
type on a PCollection:
PCollection<MyCustomElement> records =
input.apply(...)
.setCoder(AvroCoder.of(MyCustomElement.class);
or annotate the element class using @DefaultCoder
.
@DefaultCoder(AvroCoder.class)
public class MyCustomElement {
...
}
The implementation attempts to determine if the Avro encoding of the given type will satisfy
the criteria of Coder.verifyDeterministic()
by inspecting both the type and the
Schema provided or generated by Avro. Only coders that are deterministic can be used in
GroupByKey
operations.
Coder.Context, Coder.NonDeterministicException
Modifier | Constructor and Description |
---|---|
protected |
AvroCoder(java.lang.Class<T> type,
Schema schema) |
Modifier and Type | Method and Description |
---|---|
T |
decode(java.io.InputStream inStream)
Decodes a value of type
T from the given input stream in
the given context. |
void |
encode(T value,
java.io.OutputStream outStream)
Encodes the given value of type
T onto the given output stream. |
boolean |
equals(java.lang.Object other) |
static CoderProvider |
getCoderProvider()
Returns a
CoderProvider which uses the AvroCoder if possible for
all types. |
TypeDescriptor<T> |
getEncodedTypeDescriptor()
Returns the
TypeDescriptor for the type encoded. |
Schema |
getSchema()
Returns the schema used by this coder.
|
java.lang.Class<T> |
getType()
Returns the type this coder encodes/decodes.
|
int |
hashCode() |
static <T> AvroCoder<T> |
of(java.lang.Class<T> clazz)
Returns an
AvroCoder instance for the provided element class. |
static <T> AvroCoder<T> |
of(java.lang.Class<T> type,
Schema schema)
Returns an
AvroCoder instance for the provided element type
using the provided Avro schema. |
static AvroCoder<GenericRecord> |
of(Schema schema)
Returns an
AvroCoder instance for the Avro schema. |
static <T> AvroCoder<T> |
of(TypeDescriptor<T> type)
Returns an
AvroCoder instance for the provided element type. |
void |
verifyDeterministic()
Throw
Coder.NonDeterministicException if the coding is not deterministic. |
getCoderArguments
consistentWithEquals, decode, encode, getEncodedElementByteSize, isRegisterByteSizeObserverCheap, registerByteSizeObserver, structuralValue, verifyDeterministic, verifyDeterministic
public static <T> AvroCoder<T> of(TypeDescriptor<T> type)
AvroCoder
instance for the provided element type.T
- the element typepublic static <T> AvroCoder<T> of(java.lang.Class<T> clazz)
AvroCoder
instance for the provided element class.T
- the element typepublic static AvroCoder<GenericRecord> of(Schema schema)
AvroCoder
instance for the Avro schema. The implicit
type is GenericRecord.public static <T> AvroCoder<T> of(java.lang.Class<T> type, Schema schema)
AvroCoder
instance for the provided element type
using the provided Avro schema.
If the type argument is GenericRecord, the schema may be arbitrary. Otherwise, the schema must correspond to the type provided.
T
- the element typepublic static CoderProvider getCoderProvider()
CoderProvider
which uses the AvroCoder
if possible for
all types.
It is unsafe to register this as a CoderProvider
because Avro will reflectively
accept dangerous types such as Object
.
This method is invoked reflectively from DefaultCoder
.
public java.lang.Class<T> getType()
public void encode(T value, java.io.OutputStream outStream) throws java.io.IOException
Coder
T
onto the given output stream.encode
in class Coder<T>
java.io.IOException
- if writing to the OutputStream
fails
for some reasonCoderException
- if the value could not be encoded for some reasonpublic T decode(java.io.InputStream inStream) throws java.io.IOException
Coder
T
from the given input stream in
the given context. Returns the decoded value.decode
in class Coder<T>
java.io.IOException
- if reading from the InputStream
fails
for some reasonCoderException
- if the value could not be decoded for some reasonpublic void verifyDeterministic() throws Coder.NonDeterministicException
CustomCoder
Coder.NonDeterministicException
if the coding is not deterministic.
In order for a Coder
to be considered deterministic,
the following must be true:
Object.equals()
or Comparable.compareTo()
, if supported) have the same
encoding.
Coder
always produces a canonical encoding, which is the
same for an instance of an object even if produced on different
computers at different times.
verifyDeterministic
in class CustomCoder<T>
NonDeterministicException
- when the type may not be deterministically
encoded using the given Schema
, the directBinaryEncoder
, and the
ReflectDatumWriter
or GenericDatumWriter
.Coder.NonDeterministicException
- if this coder is not deterministic.public Schema getSchema()
public TypeDescriptor<T> getEncodedTypeDescriptor()
Coder
TypeDescriptor
for the type encoded.getEncodedTypeDescriptor
in class Coder<T>
public boolean equals(java.lang.Object other)
equals
in class java.lang.Object
public int hashCode()
hashCode
in class java.lang.Object