T - the Protocol Buffers Message handled by this Coder.
public class ProtoCoder<T extends com.google.protobuf.Message> extends CustomCoder<T>
A Coder using Google Protocol Buffers binary format. ProtoCoder supports both Protocol Buffers syntax versions 2 and 3.
To learn more about Protocol Buffers, visit: https://developers.google.com/protocol-buffers
ProtoCoder is registered in the global CoderRegistry as the default Coder for any Message object. Custom message extensions are also supported, but these extensions must be registered for a particular ProtoCoder instance, and that instance must be set as the coder on the PCollection that needs the extensions:
import MyProtoFile;
import MyProtoFile.MyMessage;
Coder<MyMessage> coder = ProtoCoder.of(MyMessage.class).withExtensionsFrom(MyProtoFile.class);
PCollection<MyMessage> records = input.apply(...).setCoder(coder);
ProtoCoder supports both versions 2 and 3 of the Protocol Buffers syntax. However, the Java runtime version of the com.google.protobuf library must exactly match the version of protoc that was used to produce the JAR files containing the compiled .proto messages.
For more information, see the Protocol Buffers documentation.
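ProtoCoder and Determinism
In general, Protocol Buffers messages can be encoded deterministically within a single pipeline as long as:
- The encoded messages (and any transitively linked messages) do not use map fields.
- Every Java VM that encodes or decodes the messages uses the same runtime version of the Protocol Buffers library and the same compiled .proto file JAR.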
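The map-field condition can be illustrated without Protocol Buffers at all. The following is a minimal sketch, assuming a toy byte format built on `DataOutputStream`; the class `MapEncodingOrderSketch` and its methods are hypothetical and are not part of Beam or the protobuf library. It shows that two maps that compare as equal can still produce different bytes when entries are written in iteration order, and that writing entries in sorted key order gives one canonical encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; this is not how ProtoCoder encodes messages.
class MapEncodingOrderSketch {

  // Writes the entry count, then each entry in whatever order the map iterates.
  static byte[] encodeInIterationOrder(Map<String, Integer> map) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeInt(map.size());
      for (Map.Entry<String, Integer> entry : map.entrySet()) {
        out.writeUTF(entry.getKey());
        out.writeInt(entry.getValue());
      }
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Copies the entries into a TreeMap first, so equal maps always yield identical bytes.
  static byte[] encodeInKeyOrder(Map<String, Integer> map) {
    return encodeInIterationOrder(new TreeMap<>(map));
  }

  public static void main(String[] args) {
    // Same entries, different insertion order.
    Map<String, Integer> a = new LinkedHashMap<>();
    a.put("x", 1);
    a.put("y", 2);
    Map<String, Integer> b = new LinkedHashMap<>();
    b.put("y", 2);
    b.put("x", 1);

    System.out.println("maps equal: " + a.equals(b));
    System.out.println("iteration-order bytes equal: "
        + Arrays.equals(encodeInIterationOrder(a), encodeInIterationOrder(b)));
    System.out.println("key-order bytes equal: "
        + Arrays.equals(encodeInKeyOrder(a), encodeInKeyOrder(b)));
  }
}
```

A coder whose output depends on iteration order fails the "equal values have equal encodings" requirement, which is why such fields are excluded from the deterministic case.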
ProtoCoder and Encoding Stability
When changing Protocol Buffers messages, follow the rules in the Protocol Buffers language guides for proto2 and proto3 syntaxes, depending on your message type. Following these guidelines ensures that old encoded data can be read by new versions of the code.
Generally, any change to the message type, registered extensions, runtime library, or compiled proto JARs may change the encoding. Thus even if both the original and updated messages can be encoded deterministically within a single job, these deterministic encodings may not be the same across jobs.
Nested classes/interfaces inherited from class Coder: Coder.Context, Coder.NonDeterministicException
Modifier and Type | Field and Description |
---|---|
static long | serialVersionUID |
Modifier | Constructor and Description |
---|---|
protected | ProtoCoder(java.lang.Class<T> protoMessageClass, java.util.Set<java.lang.Class<?>> extensionHostClasses): Private constructor. |
Modifier and Type | Method and Description |
---|---|
T | decode(java.io.InputStream inStream): Decodes a value of type T from the given input stream. |
T | decode(java.io.InputStream inStream, Coder.Context context): Decodes a value of type T from the given input stream in the given context. |
void | encode(T value, java.io.OutputStream outStream): Encodes the given value of type T onto the given output stream. |
void | encode(T value, java.io.OutputStream outStream, Coder.Context context): Encodes the given value of type T onto the given output stream in the given context. |
boolean | equals(@Nullable java.lang.Object other) |
static CoderProvider | getCoderProvider() |
java.util.Set<java.lang.Class<?>> | getExtensionHosts() |
com.google.protobuf.ExtensionRegistry | getExtensionRegistry(): Returns the ExtensionRegistry listing all known Protocol Buffers extension messages to T registered with this ProtoCoder. |
java.lang.Class<T> | getMessageType(): Returns the Protocol Buffers Message type this ProtoCoder supports. |
protected com.google.protobuf.Parser<T> | getParser(): Gets the memoized Parser, possibly initializing it lazily. |
int | hashCode() |
static <T extends com.google.protobuf.Message> ProtoCoder<T> | of(java.lang.Class<T> protoMessageClass): Returns a ProtoCoder for the given Protocol Buffers Message. |
static <T extends com.google.protobuf.Message> ProtoCoder<T> | of(TypeDescriptor<T> protoMessageType) |
void | verifyDeterministic(): Throws Coder.NonDeterministicException if the coding is not deterministic. |
ProtoCoder<T> | withExtensionsFrom(java.lang.Class<?>... moreExtensionHosts) |
ProtoCoder<T> | withExtensionsFrom(java.lang.Iterable<java.lang.Class<?>> moreExtensionHosts): Returns a ProtoCoder like this one, but with the extensions from the given classes registered. |
Methods inherited from class CustomCoder: getCoderArguments
Methods inherited from class Coder: consistentWithEquals, getEncodedElementByteSize, getEncodedTypeDescriptor, isRegisterByteSizeObserverCheap, registerByteSizeObserver, structuralValue, verifyDeterministic, verifyDeterministic
public static final long serialVersionUID
protected ProtoCoder(java.lang.Class<T> protoMessageClass, java.util.Set<java.lang.Class<?>> extensionHostClasses)
public static <T extends com.google.protobuf.Message> ProtoCoder<T> of(java.lang.Class<T> protoMessageClass)
Returns a ProtoCoder for the given Protocol Buffers Message.
public static <T extends com.google.protobuf.Message> ProtoCoder<T> of(TypeDescriptor<T> protoMessageType)
public ProtoCoder<T> withExtensionsFrom(java.lang.Iterable<java.lang.Class<?>> moreExtensionHosts)
Returns a ProtoCoder like this one, but with the extensions from the given classes registered.
Each of the extension host classes must be a class automatically generated by the Protocol Buffers compiler, protoc, that contains messages.
Does not modify this object.
public ProtoCoder<T> withExtensionsFrom(java.lang.Class<?>... moreExtensionHosts)
See withExtensionsFrom(Iterable).
Does not modify this object.
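The "does not modify this object" contract of both withExtensionsFrom overloads is a common immutable "with" pattern. The following is a minimal sketch of that pattern, assuming a simplified stand-in class; `ImmutableCoderSketch` is hypothetical and does not reproduce the ProtoCoder source. Each call copies the current set of extension hosts, adds the new ones, and returns a fresh instance.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a coder with registered extension hosts; not the Beam class.
final class ImmutableCoderSketch {
  private final Class<?> messageType;
  private final Set<Class<?>> extensionHosts;

  private ImmutableCoderSketch(Class<?> messageType, Set<Class<?>> extensionHosts) {
    this.messageType = messageType;
    // Wrap the set so callers cannot mutate the state of an existing instance.
    this.extensionHosts = Collections.unmodifiableSet(extensionHosts);
  }

  static ImmutableCoderSketch of(Class<?> messageType) {
    return new ImmutableCoderSketch(messageType, new LinkedHashSet<>());
  }

  // Returns a new instance with the extra hosts added; this instance is left unchanged.
  ImmutableCoderSketch withExtensionsFrom(Iterable<Class<?>> moreHosts) {
    Set<Class<?>> merged = new LinkedHashSet<>(extensionHosts);
    for (Class<?> host : moreHosts) {
      merged.add(host);
    }
    return new ImmutableCoderSketch(messageType, merged);
  }

  // Varargs convenience overload that delegates to the Iterable version.
  ImmutableCoderSketch withExtensionsFrom(Class<?>... moreHosts) {
    return withExtensionsFrom(Arrays.asList(moreHosts));
  }

  Class<?> getMessageType() {
    return messageType;
  }

  Set<Class<?>> getExtensionHosts() {
    return extensionHosts;
  }

  public static void main(String[] args) {
    ImmutableCoderSketch base = ImmutableCoderSketch.of(String.class);
    ImmutableCoderSketch extended = base.withExtensionsFrom(Integer.class, Long.class);
    System.out.println("base hosts: " + base.getExtensionHosts().size());
    System.out.println("extended hosts: " + extended.getExtensionHosts().size());
  }
}
```

Because the original instance never changes, a coder that is already attached to one PCollection is unaffected when another part of the pipeline derives a variant with more extensions.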
public void encode(T value, java.io.OutputStream outStream) throws java.io.IOException
Description copied from class: Coder
Encodes the given value of type T onto the given output stream. Multiple elements can be encoded next to each other on the output stream, so each coder should encode enough information to know how many bytes to read when decoding. A common approach is to prefix the encoding with the element's encoded length.
Specified by: encode in class Coder<T extends com.google.protobuf.Message>
Throws:
java.io.IOException - if writing to the OutputStream fails for some reason
CoderException - if the value could not be encoded for some reason
public void encode(T value, java.io.OutputStream outStream, Coder.Context context) throws java.io.IOException
Description copied from class: Coder
Encodes the given value of type T onto the given output stream in the given context.
Overrides: encode in class Coder<T extends com.google.protobuf.Message>
Throws:
java.io.IOException - if writing to the OutputStream fails for some reason
CoderException - if the value could not be encoded for some reason
public T decode(java.io.InputStream inStream) throws java.io.IOException
Description copied from class: Coder
Decodes a value of type T from the given input stream. Returns the decoded value. Multiple elements can be encoded next to each other on the input stream, so each coder should encode enough information to know how many bytes to read when decoding. A common approach is to prefix the encoding with the element's encoded length.
Specified by: decode in class Coder<T extends com.google.protobuf.Message>
Throws:
java.io.IOException - if reading from the InputStream fails for some reason
CoderException - if the value could not be decoded for some reason
public T decode(java.io.InputStream inStream, Coder.Context context) throws java.io.IOException
Description copied from class: Coder
Decodes a value of type T from the given input stream in the given context. Returns the decoded value.
Overrides: decode in class Coder<T extends com.google.protobuf.Message>
Throws:
java.io.IOException - if reading from the InputStream fails for some reason
CoderException - if the value could not be decoded for some reason
public boolean equals(@Nullable java.lang.Object other)
Overrides: equals in class java.lang.Object
public int hashCode()
Overrides: hashCode in class java.lang.Object
public void verifyDeterministic() throws Coder.NonDeterministicException
Description copied from class: CustomCoder
Throws Coder.NonDeterministicException if the coding is not deterministic.
In order for a Coder to be considered deterministic, the following must be true:
- Two values that compare as equal (via Object.equals() or Comparable.compareTo(), if supported) have the same encoding.
- The Coder always produces a canonical encoding, which is the same for an instance of an object even if produced on different computers at different times.
Overrides: verifyDeterministic in class CustomCoder<T extends com.google.protobuf.Message>
Throws:
Coder.NonDeterministicException - if this coder is not deterministic.
public java.lang.Class<T> getMessageType()
Returns the Protocol Buffers Message type this ProtoCoder supports.
public java.util.Set<java.lang.Class<?>> getExtensionHosts()
public com.google.protobuf.ExtensionRegistry getExtensionRegistry()
Returns the ExtensionRegistry listing all known Protocol Buffers extension messages to T registered with this ProtoCoder.
protected com.google.protobuf.Parser<T> getParser()
Gets the memoized Parser, possibly initializing it lazily.
public static CoderProvider getCoderProvider()
Returns a CoderProvider which uses the ProtoCoder for proto messages.
This method is invoked reflectively from DefaultCoder.
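The encode and decode descriptions above mention prefixing each element with its encoded length so that several elements can share one stream. The following is a minimal sketch of that idea, assuming a fixed 4-byte big-endian length prefix and UTF-8 string payloads; the class `LengthPrefixSketch` is hypothetical, and the actual prefix format used by any particular coder may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of length-prefixed framing; not the ProtoCoder wire format.
class LengthPrefixSketch {

  // Writes the payload length first, then the payload bytes.
  static void encode(String value, OutputStream outStream) throws IOException {
    byte[] payload = value.getBytes(StandardCharsets.UTF_8);
    DataOutputStream out = new DataOutputStream(outStream);
    out.writeInt(payload.length);
    out.write(payload);
    out.flush();
  }

  // Reads the length, then exactly that many bytes, leaving the rest of the stream untouched.
  static String decode(InputStream inStream) throws IOException {
    DataInputStream in = new DataInputStream(inStream);
    int length = in.readInt();
    byte[] payload = new byte[length];
    in.readFully(payload);
    return new String(payload, StandardCharsets.UTF_8);
  }

  // Encodes two values back to back on one stream and decodes them again in order.
  static String[] roundTrip(String first, String second) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      encode(first, bytes);
      encode(second, bytes);
      InputStream in = new ByteArrayInputStream(bytes.toByteArray());
      return new String[] {decode(in), decode(in)};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    String[] decoded = roundTrip("hello", "protocol buffers");
    System.out.println(decoded[0]);
    System.out.println(decoded[1]);
  }
}
```

Without the prefix, the decoder would have no way to tell where the first element ends and the second begins.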