apache_beam.coders.coders module¶
Collection of useful coders.
Only those coders listed in __all__ are part of the public API of this module.
-
class
apache_beam.coders.coders.
Coder
[source]¶ Bases:
future.types.newobject.newobject
Base class for coders.
-
is_deterministic
()[source]¶ Whether this coder is guaranteed to encode values deterministically.
A deterministic coder is required for key coders in GroupByKey operations to produce consistent results.
For example, note that the default coder, the PickleCoder, is not deterministic: the ordering of picked entries in maps may vary across executions since there is no defined order, and such a coder is not in general suitable for usage as a key coder in GroupByKey operations, since each instance of the same key may be encoded differently.
Returns: Whether coder is deterministic.
-
as_deterministic_coder
(step_label, error_message=None)[source]¶ Returns a deterministic version of self, if possible.
Otherwise raises a value error.
-
estimate_size
(value)[source]¶ Estimates the encoded size of the given value, in bytes.
Dataflow estimates the encoded size of a PCollection processed in a pipeline step by using the estimated size of a random sample of elements in that PCollection.
The default implementation encodes the given value and returns its byte size. If a coder can provide a fast estimate of the encoded size of a value (e.g., if the encoding has a fixed size), it can provide its estimate here to improve performance.
Parameters: value – the value whose encoded size is to be estimated. Returns: The estimated encoded size of the given value.
-
get_impl
()[source]¶ For internal use only; no backwards-compatibility guarantees.
Returns the CoderImpl backing this Coder.
-
as_cloud_object
(coders_context=None)[source]¶ For internal use only; no backwards-compatibility guarantees.
Returns Google Cloud Dataflow API description of this coder.
-
classmethod
register_urn
(urn, parameter_type, fn=None)[source]¶ Registers a urn with a constructor.
For example, if ‘beam:fn:foo’ had parameter type FooPayload, one could write RunnerApiFn.register_urn(‘bean:fn:foo’, FooPayload, foo_from_proto) where foo_from_proto took as arguments a FooPayload and a PipelineContext. This function can also be used as a decorator rather than passing the callable in as the final parameter.
A corresponding to_runner_api_parameter method would be expected that returns the tuple (‘beam:fn:foo’, FooPayload)
-
-
class
apache_beam.coders.coders.
StrUtf8Coder
[source]¶ Bases:
apache_beam.coders.coders.Coder
A coder used for reading and writing strings as UTF-8.
-
to_runner_api_parameter
(unused_context)¶
-
-
class
apache_beam.coders.coders.
BytesCoder
[source]¶ Bases:
apache_beam.coders.coders.FastCoder
Byte string coder.
-
to_runner_api_parameter
(unused_context)¶
-
-
class
apache_beam.coders.coders.
VarIntCoder
[source]¶ Bases:
apache_beam.coders.coders.FastCoder
Variable-length integer coder.
-
to_runner_api_parameter
(unused_context)¶
-
-
class
apache_beam.coders.coders.
FloatCoder
[source]¶ Bases:
apache_beam.coders.coders.FastCoder
A coder used for floating-point values.
-
to_runner_api_parameter
(unused_context)¶
-
-
class
apache_beam.coders.coders.
TimestampCoder
[source]¶ Bases:
apache_beam.coders.coders.FastCoder
A coder used for timeutil.Timestamp values.
-
class
apache_beam.coders.coders.
SingletonCoder
(value)[source]¶ Bases:
apache_beam.coders.coders.FastCoder
A coder that always encodes exactly one value.
-
class
apache_beam.coders.coders.
PickleCoder
[source]¶ Bases:
apache_beam.coders.coders._PickleCoderBase
Coder using Python’s pickle functionality.
-
class
apache_beam.coders.coders.
DillCoder
[source]¶ Bases:
apache_beam.coders.coders._PickleCoderBase
Coder using dill’s pickle functionality.
-
class
apache_beam.coders.coders.
FastPrimitivesCoder
(fallback_coder=PickleCoder)[source]¶ Bases:
apache_beam.coders.coders.FastCoder
Encodes simple primitives (e.g. str, int) efficiently.
For unknown types, falls back to another coder (e.g. PickleCoder).
-
class
apache_beam.coders.coders.
ProtoCoder
(proto_message_type)[source]¶ Bases:
apache_beam.coders.coders.FastCoder
A Coder for Google Protocol Buffers.
It supports both Protocol Buffers syntax versions 2 and 3. However, the runtime version of the python protobuf library must exactly match the version of the protoc compiler what was used to generate the protobuf messages.
ProtoCoder is registered in the global CoderRegistry as the default coder for any protobuf Message object.
-
class
apache_beam.coders.coders.
TupleCoder
(components)[source]¶ Bases:
apache_beam.coders.coders.FastCoder
Coder of tuple objects.
-
class
apache_beam.coders.coders.
TupleSequenceCoder
(elem_coder)[source]¶ Bases:
apache_beam.coders.coders.FastCoder
Coder of homogeneous tuple objects.
-
class
apache_beam.coders.coders.
IterableCoder
(elem_coder)[source]¶ Bases:
apache_beam.coders.coders.FastCoder
Coder of iterables of homogeneous objects.
-
to_runner_api_parameter
(unused_context)¶
-