apache_beam.typehints.schemas module

Support for mapping python types to proto Schemas and back again.

Imposes a mapping between common Python types and Beam portable schemas (https://s.apache.org/beam-schemas):

Python              Schema
np.int8     <-----> BYTE
np.int16    <-----> INT16
np.int32    <-----> INT32
np.int64    <-----> INT64
int         ------> INT64
np.float32  <-----> FLOAT
np.float64  <-----> DOUBLE
float       ------> DOUBLE
bool        <-----> BOOLEAN
str         <-----> STRING
bytes       <-----> BYTES
ByteString  ------> BYTES
Timestamp   <-----> LogicalType(urn="beam:logical_type:micros_instant:v1")
Mapping     <-----> MapType
Sequence    <-----> ArrayType
NamedTuple  <-----> RowType
beam.Row    ------> RowType

Note that some of these mappings are provided as conveniences, but they are lossy and will not survive a roundtrip from python to Beam schemas and back. For example, the Python type int will map to INT64 in Beam schemas but converting that back to a Python type will yield np.int64.

nullable=True on a Beam FieldType is represented in Python by wrapping the type in Optional.

This module is intended for internal use only. Nothing defined here provides any backwards-compatibility guarantee.

class apache_beam.typehints.schemas.SchemaTypeRegistry[source]

Bases: object

add(typing, schema)[source]
apache_beam.typehints.schemas.schema_from_element_type(element_type: type) → schema_pb2.Schema[source]

Get a schema for the given PCollection element_type.

Returns schema as a list of (name, python_type) tuples

apache_beam.typehints.schemas.named_fields_from_element_type(element_type: type) → List[Tuple[str, type]][source]
class apache_beam.typehints.schemas.LogicalTypeRegistry[source]

Bases: object

add(urn, logical_type)[source]
class apache_beam.typehints.schemas.LogicalType[source]

Bases: typing.Generic

classmethod urn()[source]

Return the URN used to identify this logical type

classmethod language_type()[source]

Return the language type this LogicalType encodes.

The returned type should match LanguageT

classmethod representation_type()[source]

Return the type of the representation this LogicalType uses to encode the language type.

The returned type should match RepresentationT

classmethod argument_type()[source]

Return the type of the argument used for variations of this LogicalType.

The returned type should match ArgT


Return the argument for this instance of the LogicalType.


Convert an instance of LanguageT to RepresentationT.


Convert an instance of RepresentationT to LanguageT.

classmethod register_logical_type(logical_type_cls)[source]

Register an implementation of LogicalType.

classmethod from_typing(typ)[source]

Construct an instance of a registered LogicalType implementation given a typing.

Raises ValueError if no registered LogicalType implementation can encode the given typing.

classmethod from_runner_api(logical_type_proto)[source]

Construct an instance of a registered LogicalType implementation given a proto LogicalType.

Raises ValueError if no LogicalType registered for the given URN.

class apache_beam.typehints.schemas.NoArgumentLogicalType[source]

Bases: apache_beam.typehints.schemas.LogicalType

classmethod argument_type()[source]
class apache_beam.typehints.schemas.MicrosInstantRepresentation(seconds, micros)

Bases: tuple

Create new instance of MicrosInstantRepresentation(seconds, micros)


Alias for field number 1


Alias for field number 0