apache_beam.typehints.row_type module

class apache_beam.typehints.row_type.RowTypeConstraint(fields: Sequence[Tuple[str, type]], user_type, schema_options: Optional[Sequence[Tuple[str, Any]]] = None, field_options: Optional[Dict[str, Sequence[Tuple[str, Any]]]] = None)

Bases: apache_beam.typehints.typehints.TypeConstraint

For internal use only; no backwards-compatibility guarantees. See https://beam.apache.org/documentation/programming-guide/#schemas-for-pl-types for guidance on creating PCollections with inferred schemas.

Note that RowTypeConstraint does not currently store arbitrary functions for converting to/from the user type. Instead, we only support NamedTuple user types and make the following assumptions:

  • The user type can be constructed with field values as arguments in order (i.e. constructor(*field_values)).
  • Field values can be accessed from instances of the user type by attribute (i.e. with getattr(obj, field_name)).
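
The two assumptions above can be illustrated with plain Python. This is a minimal sketch that does not use Beam; the Purchase type and its fields are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class Purchase(NamedTuple):
    """Hypothetical user type, not part of Beam."""
    item: str
    price: float


# (name, type) pairs in declaration order, mirroring the `fields` parameter.
fields = [("item", str), ("price", float)]

original = Purchase("widget", 9.99)

# Second assumption: each field value is readable by attribute name.
field_values = [getattr(original, name) for name, _ in fields]

# First assumption: the constructor accepts the field values positionally,
# in field order.
rebuilt = Purchase(*field_values)
```

Any type for which both steps work can be converted to a list of field values and back without a custom conversion function.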

In the future we will add support for dataclasses ([#22085](https://github.com/apache/beam/issues/22085)), which also satisfy these assumptions.

The RowTypeConstraint constructor should not be called directly (even internally to Beam). Prefer the static methods from_user_type or from_fields.

Parameters:
  • fields – a list of (name, type) tuples, representing the schema inferred from user_type.
  • user_type – constructor for a user type (e.g. NamedTuple class) that is used to represent this schema in user code.
  • schema_options – A list of (key, value) tuples representing schema-level options.
  • field_options – A dictionary representing field-level options. Dictionary keys are field names, and dictionary values are lists of (key, value) tuples representing field-level options for that field.
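
As a rough illustration of the shapes described above, the following sketch builds the three arguments as plain Python data without calling Beam. The field names, option keys, option values, and the options_for helper are all hypothetical.

```python
from typing import Any, Dict, List, Sequence, Tuple

# Hypothetical schema: a list of (name, type) tuples.
fields: Sequence[Tuple[str, type]] = [("item", str), ("price", float)]

# Schema-level options: a list of (key, value) tuples.
schema_options: Sequence[Tuple[str, Any]] = [("owner", "billing-team")]

# Field-level options: field name -> list of (key, value) tuples.
field_options: Dict[str, Sequence[Tuple[str, Any]]] = {
    "price": [("unit", "USD"), ("precision", 2)],
}


def options_for(field_name: str) -> List[Tuple[str, Any]]:
    """Hypothetical helper: return the options attached to one field.

    Returning an empty list for fields without options is an assumption
    made for this sketch, not documented Beam behavior.
    """
    return list(field_options.get(field_name, []))
```

Only fields that carry options need an entry in the dictionary; here "item" has none.
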
static from_user_type(user_type: type, schema_options: Optional[Sequence[Tuple[str, Any]]] = None, field_options: Optional[Dict[str, Sequence[Tuple[str, Any]]]] = None) → Optional[apache_beam.typehints.row_type.RowTypeConstraint]
static from_fields(fields: Sequence[Tuple[str, type]], schema_id: Optional[str] = None, schema_options: Optional[Sequence[Tuple[str, Any]]] = None, field_options: Optional[Dict[str, Sequence[Tuple[str, Any]]]] = None, schema_registry: Optional[apache_beam.typehints.schema_registry.SchemaTypeRegistry] = None) → apache_beam.typehints.row_type.RowTypeConstraint
user_type
set_schema_id(schema_id)
schema_id
schema_options
field_options(field_name)
type_check(instance)
get_type_for(name)
class apache_beam.typehints.row_type.GeneratedClassRowTypeConstraint(fields, schema_id: Optional[str] = None, schema_options: Optional[Sequence[Tuple[str, Any]]] = None, field_options: Optional[Dict[str, Sequence[Tuple[str, Any]]]] = None, schema_registry: Optional[apache_beam.typehints.schema_registry.SchemaTypeRegistry] = None)

Bases: apache_beam.typehints.row_type.RowTypeConstraint

Specialization of RowTypeConstraint which relies on a generated user_type.

Since the generated user_type cannot be pickled, we supply a custom __reduce__ function that will regenerate the user_type.
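
The general technique can be sketched in plain Python. This is not Beam's implementation: GeneratedRow is a hypothetical stand-in that shows how a `__reduce__` method can describe an object by the inputs needed to rebuild it, so that the runtime-generated class never has to be serialized itself.

```python
from typing import NamedTuple


class GeneratedRow:
    """Hypothetical stand-in for a constraint with a generated user type."""

    def __init__(self, fields):
        self.fields = list(fields)
        # Created at runtime, so there is no module-level name that pickle
        # could use to find this class again by reference.
        self.user_type = NamedTuple("GeneratedUserType", self.fields)

    def __reduce__(self):
        # Describe the object as "call GeneratedRow(fields)". Unpickling then
        # runs __init__ again, which regenerates user_type from the fields.
        return (GeneratedRow, (self.fields,))


row = GeneratedRow([("item", str), ("price", float)])

# This mirrors what unpickling does with the tuple returned by __reduce__.
factory, args = row.__reduce__()
clone = factory(*args)
```

The rebuilt object gets a freshly generated user_type with the same field names, rather than a reference to the original class.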