T
- the type of elements handled by this coderpublic class AvroCoder<T> extends CustomCoder<T>
Coder
using Avro binary format.
Each instance of AvroCoder<T>
encapsulates an Avro datum factory and schema for
objects of type T
.
The Avro datum factory and schema may be provided explicitly via of(AvroDatumFactory, Schema)
or omitted via specific(Class)
or
reflect(Class)
in which case it will be inferred using Avro's SpecificData
or ReflectData
For complete details about schema generation and how it can be controlled please see the
org.apache.avro.specific
and org.apache.avro.reflect
packages.
To use, specify the Coder
type on a PCollection:
PCollection<MyCustomElement> records =
input.apply(...)
.setCoder(AvroCoder.of(MyCustomElement.class));
or annotate the element class using @DefaultCoder
.
@DefaultCoder(AvroCoder.class)
public class MyCustomElement {
...
}
The implementation attempts to determine if the Avro encoding of the given type will satisfy
the criteria of Coder.verifyDeterministic()
by inspecting both the type and the Schema
provided or generated by Avro. Only coders that are deterministic can be used in GroupByKey
operations.
Coder.Context, Coder.NonDeterministicException
Modifier | Constructor and Description |
---|---|
protected |
AvroCoder(AvroDatumFactory<T> datumFactory,
Schema schema) |
protected |
AvroCoder(java.lang.Class<T> type,
Schema schema) |
protected |
AvroCoder(java.lang.Class<T> type,
Schema schema,
boolean useReflectApi) |
Modifier and Type | Method and Description |
---|---|
T |
decode(java.io.InputStream inStream)
Decodes a value of type
T from the given input stream in the given context. |
void |
encode(T value,
java.io.OutputStream outStream)
Encodes the given value of type
T onto the given output stream. |
boolean |
equals(@Nullable java.lang.Object other) |
static AvroCoder<GenericRecord> |
generic(Schema schema)
Returns an
AvroCoder instance for the Avro schema. |
static CoderProvider |
getCoderProvider()
Returns a
CoderProvider which uses the AvroCoder if possible for all types. |
AvroDatumFactory<T> |
getDatumFactory()
Returns the datum factory used for encoding/decoding.
|
TypeDescriptor<T> |
getEncodedTypeDescriptor()
Returns the
TypeDescriptor for the type encoded. |
Schema |
getSchema()
Returns the schema used by this coder.
|
java.lang.Class<T> |
getType()
Returns the type this coder encodes/decodes.
|
int |
hashCode() |
static <T> AvroCoder<T> |
of(AvroDatumFactory<T> datumFactory,
Schema schema)
Returns an
AvroCoder instance for the provided AvroDatumFactory using the
provided Avro schema. |
static <T> AvroCoder<T> |
of(java.lang.Class<T> clazz)
Returns an
AvroCoder instance for the provided element class. |
static <T> AvroCoder<T> |
of(java.lang.Class<T> type,
boolean useReflectApi)
Returns an
AvroCoder instance for the given class, respecting whether to use Avro's
Reflect* or Specific* suite for encoding and decoding. |
static <T> AvroCoder<T> |
of(java.lang.Class<T> type,
Schema schema)
Returns an
AvroCoder instance for the provided element type using the provided Avro
schema |
static <T> AvroCoder<T> |
of(java.lang.Class<T> type,
Schema schema,
boolean useReflectApi)
Returns an
AvroCoder instance for the provided element type using the provided Avro
schema, respecting whether to use Avro's Reflect* or Specific* suite for encoding and decoding. |
static AvroGenericCoder |
of(Schema schema)
Returns an
AvroGenericCoder instance for the Avro schema. |
static <T> AvroCoder<T> |
of(TypeDescriptor<T> type)
Returns an
AvroCoder instance for the provided element type. |
static <T> AvroCoder<T> |
of(TypeDescriptor<T> type,
boolean useReflectApi)
Returns an
AvroCoder instance for the provided element type, respecting whether to use
Avro's Reflect* or Specific* suite for encoding and decoding. |
static <T> AvroCoder<T> |
reflect(java.lang.Class<T> type)
Returns an
AvroCoder instance for the provided element type respecting Avro's Reflect*
suite for encoding and decoding. |
static <T> AvroCoder<T> |
reflect(java.lang.Class<T> type,
Schema schema)
Returns an
AvroCoder instance for the provided element type respecting Avro's Reflect*
suite for encoding and decoding. |
static <T> AvroCoder<T> |
reflect(TypeDescriptor<T> type)
Returns an
AvroCoder instance for the provided element type respecting Avro's Reflect*
suite for encoding and decoding. |
static <T> AvroCoder<T> |
specific(java.lang.Class<T> type)
Returns an
AvroCoder instance for the provided element type respecting Avro's Specific*
suite for encoding and decoding. |
static <T> AvroCoder<T> |
specific(java.lang.Class<T> type,
Schema schema)
Returns an
AvroCoder instance for the provided element type respecting Avro's Specific*
suite for encoding and decoding. |
static <T> AvroCoder<T> |
specific(TypeDescriptor<T> type)
Returns an
AvroCoder instance for the provided element type respecting Avro's Specific*
suite for encoding and decoding. |
boolean |
useReflectApi()
Deprecated.
kept for backward API compatibility only.
|
void |
verifyDeterministic()
Throw
Coder.NonDeterministicException if the coding is not deterministic. |
getCoderArguments
consistentWithEquals, decode, encode, getEncodedElementByteSize, isRegisterByteSizeObserverCheap, registerByteSizeObserver, structuralValue, verifyDeterministic, verifyDeterministic
protected AvroCoder(AvroDatumFactory<T> datumFactory, Schema schema)
public static AvroCoder<GenericRecord> generic(Schema schema)
AvroCoder
instance for the Avro schema. The implicit type is GenericRecord.public static <T> AvroCoder<T> specific(TypeDescriptor<T> type)
AvroCoder
instance for the provided element type respecting Avro's Specific*
suite for encoding and decoding.public static <T> AvroCoder<T> specific(java.lang.Class<T> type)
AvroCoder
instance for the provided element type respecting Avro's Specific*
suite for encoding and decoding.public static <T> AvroCoder<T> specific(java.lang.Class<T> type, Schema schema)
AvroCoder
instance for the provided element type respecting Avro's Specific*
suite for encoding and decoding.
The schema must correspond to the type provided.
public static <T> AvroCoder<T> reflect(TypeDescriptor<T> type)
AvroCoder
instance for the provided element type respecting Avro's Reflect*
suite for encoding and decoding.public static <T> AvroCoder<T> reflect(java.lang.Class<T> type)
AvroCoder
instance for the provided element type respecting Avro's Reflect*
suite for encoding and decoding.public static <T> AvroCoder<T> reflect(java.lang.Class<T> type, Schema schema)
AvroCoder
instance for the provided element type respecting Avro's Reflect*
suite for encoding and decoding.
The schema must correspond to the type provided.
public static AvroGenericCoder of(Schema schema)
AvroGenericCoder
instance for the Avro schema. The implicit type is
GenericRecord.public static <T> AvroCoder<T> of(TypeDescriptor<T> type)
AvroCoder
instance for the provided element type.T
- the element typepublic static <T> AvroCoder<T> of(TypeDescriptor<T> type, boolean useReflectApi)
AvroCoder
instance for the provided element type, respecting whether to use
Avro's Reflect* or Specific* suite for encoding and decoding.T
- the element typepublic static <T> AvroCoder<T> of(java.lang.Class<T> clazz)
AvroCoder
instance for the provided element class.T
- the element typepublic static <T> AvroCoder<T> of(java.lang.Class<T> type, boolean useReflectApi)
AvroCoder
instance for the given class, respecting whether to use Avro's
Reflect* or Specific* suite for encoding and decoding.T
- the element typepublic static <T> AvroCoder<T> of(java.lang.Class<T> type, Schema schema)
AvroCoder
instance for the provided element type using the provided Avro
schema
The schema must correspond to the type provided.
T
- the element typepublic static <T> AvroCoder<T> of(AvroDatumFactory<T> datumFactory, Schema schema)
AvroCoder
instance for the provided AvroDatumFactory
using the
provided Avro schema.
The schema must correspond to the provided datumFactory's type.
T
- the element typepublic static <T> AvroCoder<T> of(java.lang.Class<T> type, Schema schema, boolean useReflectApi)
AvroCoder
instance for the provided element type using the provided Avro
schema, respecting whether to use Avro's Reflect* or Specific* suite for encoding and decoding.
The schema must correspond to the type provided.
T
- the element typepublic static CoderProvider getCoderProvider()
CoderProvider
which uses the AvroCoder
if possible for all types.
It is unsafe to register this as a CoderProvider
because Avro will reflectively
accept dangerous types such as Object
.
This method is invoked reflectively from DefaultCoder
.
public java.lang.Class<T> getType()
public AvroDatumFactory<T> getDatumFactory()
@Deprecated public boolean useReflectApi()
AvroDatumFactory
is an instance of AvroDatumFactory.ReflectDatumFactory
public void encode(T value, java.io.OutputStream outStream) throws java.io.IOException
Coder
T
onto the given output stream. Multiple elements can
be encoded next to each other on the output stream, each coder should encode information to
know how many bytes to read when decoding. A common approach is to prefix the encoding with the
element's encoded length.encode
in class Coder<T>
java.io.IOException
- if writing to the OutputStream
fails for some reasonCoderException
- if the value could not be encoded for some reasonpublic T decode(java.io.InputStream inStream) throws java.io.IOException
Coder
T
from the given input stream in the given context. Returns the
decoded value. Multiple elements can be encoded next to each other on the input stream, each
coder should encode information to know how many bytes to read when decoding. A common approach
is to prefix the encoding with the element's encoded length.decode
in class Coder<T>
java.io.IOException
- if reading from the InputStream
fails for some reasonCoderException
- if the value could not be decoded for some reasonpublic void verifyDeterministic() throws Coder.NonDeterministicException
CustomCoder
Coder.NonDeterministicException
if the coding is not deterministic.
In order for a Coder
to be considered deterministic, the following must be true:
Object.equals()
or Comparable.compareTo()
, if supported) have the same encoding.
Coder
always produces a canonical encoding, which is the same for an instance
of an object even if produced on different computers at different times.
verifyDeterministic
in class CustomCoder<T>
NonDeterministicException
- when the type may not be deterministically encoded using the
given Schema
, the directBinaryEncoder
, and the ReflectDatumWriter
or GenericDatumWriter
.Coder.NonDeterministicException
- if this coder is not deterministic.public Schema getSchema()
public TypeDescriptor<T> getEncodedTypeDescriptor()
Coder
TypeDescriptor
for the type encoded.getEncodedTypeDescriptor
in class Coder<T>
public boolean equals(@Nullable java.lang.Object other)
equals
in class java.lang.Object
true
if the two AvroCoder
instances have the same class, type and
schema.public int hashCode()
hashCode
in class java.lang.Object