apache_beam.yaml.yaml_provider module

This module defines Providers usable from YAML. A Provider is a specification for where to find, and how to invoke, services that vend implementations of various PTransforms.

class apache_beam.yaml.yaml_provider.Provider[source]

Bases: object

Maps transform type names and args to concrete PTransform instances.

available() → bool[source]

Returns whether this provider is available to use in this environment.

cache_artifacts() → Optional[Iterable[str]][source]
provided_transforms() → Iterable[str][source]

Returns the transform type names this provider can handle.

config_schema(type)[source]
requires_inputs(typ: str, args: Mapping[str, Any]) → bool[source]

Returns whether this transform requires inputs.

Specifically, if this returns True and inputs are not provided, then an error will be raised.

This is best-effort, primarily for better and earlier error messages.
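The early check described above could look roughly like the following. This is a minimal sketch, not Beam's code: `SourceAwareProvider` and `check_inputs` are hypothetical names, and treating every type not listed as a source as one that needs inputs is an assumption made for illustration.

```python
from typing import Any, Mapping, Sequence


class SourceAwareProvider:
    """Hypothetical stand-in for a provider; not the Beam Provider class."""

    def __init__(self, no_input_transforms=()):
        # Transform types known to act as sources (they take no inputs).
        self._no_input_transforms = set(no_input_transforms)

    def requires_inputs(self, typ: str, args: Mapping[str, Any]) -> bool:
        # Best-effort: anything not explicitly marked as a source needs inputs.
        return typ not in self._no_input_transforms


def check_inputs(provider, typ: str, args: Mapping[str, Any],
                 inputs: Sequence[Any]) -> None:
    """Fails early, before any transform is constructed."""
    if provider.requires_inputs(typ, args) and not inputs:
        raise ValueError(f"Missing inputs for transform of type {typ!r}")


provider = SourceAwareProvider(no_input_transforms=["Create"])
check_inputs(provider, "Create", {}, [])  # A source: no inputs needed.
check_inputs(provider, "Filter", {}, ["some_input"])  # Inputs supplied.
```

Because the check is only best-effort, a caller should still expect the transform itself to fail later if it is applied to the wrong kind of input.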

create_transform(typ: str, args: Mapping[str, Any], yaml_create_transform: Callable[[Mapping[str, Any], Iterable[apache_beam.pvalue.PCollection]], apache_beam.transforms.ptransform.PTransform]) → apache_beam.transforms.ptransform.PTransform[source]

Creates a PTransform instance for the given transform type and arguments.
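To make the mapping from type names and arguments to instances concrete, here is a minimal sketch that avoids importing Beam. `FactoryProvider` and `make_upper` are hypothetical, and plain Python callables stand in for PTransforms; the real provider classes may resolve transforms quite differently.

```python
from typing import Any, Callable, Iterable, Mapping


class FactoryProvider:
    """Hypothetical provider backed by a dict of factories (not Beam's class)."""

    def __init__(self, factories: Mapping[str, Callable[..., Any]]):
        self._factories = dict(factories)

    def available(self) -> bool:
        # Everything lives in-process, so nothing external is required.
        return True

    def provided_transforms(self) -> Iterable[str]:
        return self._factories.keys()

    def create_transform(self, typ: str, args: Mapping[str, Any],
                         yaml_create_transform=None) -> Any:
        if typ not in self._factories:
            raise KeyError(f"Unknown transform type: {typ!r}")
        # The args mapping becomes keyword arguments of the factory.
        return self._factories[typ](**args)


def make_upper(prefix: str = "") -> Callable[[str], str]:
    # Stand-in for a PTransform constructor.
    return lambda s: prefix + s.upper()


provider = FactoryProvider({"Upper": make_upper})
transform = provider.create_transform("Upper", {"prefix": ">> "})
```

In this sketch `yaml_create_transform` is accepted but unused; in the signature above it is a callback for constructing further transforms from a spec and its inputs.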

underlying_provider()[source]

If this provider is simply a proxy to another provider, return the provider that should actually be used for affinity checking.

affinity(other: apache_beam.yaml.yaml_provider.Provider)[source]

Returns a value approximating how good it would be for this provider to be used immediately following a transform from the other provider (e.g. to encourage fusion).
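The interplay between `underlying_provider()` and `affinity()` can be sketched as follows. Everything here is an illustrative assumption: the class names are hypothetical and the scores 200, 100 and 0 are arbitrary values chosen for the example, not the values Beam uses.

```python
class AffinityProvider:
    """Hypothetical base class; not the Beam Provider."""

    def underlying_provider(self):
        # By default a provider stands for itself.
        return self

    def affinity(self, other) -> int:
        mine = self.underlying_provider()
        theirs = other.underlying_provider()
        if mine is theirs:
            return 200  # Same provider: best candidate for fusion.
        if type(mine) is type(theirs):
            return 100  # Same kind of provider: still likely cheap.
        return 0  # Unrelated providers.


class JavaLike(AffinityProvider):
    pass


class PythonLike(AffinityProvider):
    pass


class Proxy(AffinityProvider):
    """Delegates affinity checks to the provider it wraps."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def underlying_provider(self):
        return self._wrapped.underlying_provider()


def pick_next(previous, candidates):
    # Prefer the candidate with the highest affinity to the previous provider.
    return max(candidates, key=lambda c: c.affinity(previous))


java = JavaLike()
python_like = PythonLike()
proxy = Proxy(java)
chosen = pick_next(java, [python_like, proxy])
```

Here `chosen` is the proxy, because the affinity check looks through it to the wrapped `java` provider rather than comparing the proxy object itself.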

apache_beam.yaml.yaml_provider.as_provider(name, provider_or_constructor)[source]
apache_beam.yaml.yaml_provider.as_provider_list(name, lst)[source]
class apache_beam.yaml.yaml_provider.ExternalProvider(urns, service)[source]

Bases: apache_beam.yaml.yaml_provider.Provider

A Provider implemented via the cross language transform service.

provided_transforms()[source]
schema_transforms()[source]
config_schema(type)[source]
requires_inputs(typ, args)[source]
create_transform(type, args, yaml_create_transform)[source]
create_external_transform(urn, args)[source]
classmethod provider_from_spec(spec)[source]
classmethod register_provider_type(type_name)[source]
apache_beam.yaml.yaml_provider.java_jar(urns, jar: str)[source]
apache_beam.yaml.yaml_provider.maven_jar(urns, *, artifact_id, group_id, version, repository='https://repo.maven.apache.org/maven2', classifier=None, appendix=None)[source]
apache_beam.yaml.yaml_provider.beam_jar(urns, *, gradle_target, appendix=None, version='2.52.0', artifact_id=None)[source]
apache_beam.yaml.yaml_provider.docker(urns, **config)[source]
class apache_beam.yaml.yaml_provider.RemoteProvider(urns, address: str)[source]

Bases: apache_beam.yaml.yaml_provider.ExternalProvider

available()[source]
cache_artifacts()[source]
class apache_beam.yaml.yaml_provider.ExternalJavaProvider(urns, jar_provider)[source]

Bases: apache_beam.yaml.yaml_provider.ExternalProvider

available()[source]
cache_artifacts()[source]
apache_beam.yaml.yaml_provider.python(urns, packages=())[source]
class apache_beam.yaml.yaml_provider.ExternalPythonProvider(urns, packages)[source]

Bases: apache_beam.yaml.yaml_provider.ExternalProvider

available()[source]
cache_artifacts()[source]
create_external_transform(urn, args)[source]
apache_beam.yaml.yaml_provider.fix_pycallable()[source]
class apache_beam.yaml.yaml_provider.InlineProvider(transform_factories, no_input_transforms=())[source]

Bases: apache_beam.yaml.yaml_provider.Provider

available()[source]
cache_artifacts()[source]
provided_transforms()[source]
config_schema(typ)[source]
create_transform(type, args, yaml_create_transform)[source]
to_json()[source]
requires_inputs(typ, args)[source]
class apache_beam.yaml.yaml_provider.MetaInlineProvider(transform_factories, no_input_transforms=())[source]

Bases: apache_beam.yaml.yaml_provider.InlineProvider

create_transform(type, args, yaml_create_transform)[source]
class apache_beam.yaml.yaml_provider.SqlBackedProvider(transforms: Mapping[str, Callable[[...], apache_beam.transforms.ptransform.PTransform]], sql_provider: Optional[apache_beam.yaml.yaml_provider.Provider] = None)[source]

Bases: apache_beam.yaml.yaml_provider.Provider

sql_provider()[source]
provided_transforms()[source]
available()[source]
cache_artifacts()[source]
underlying_provider()[source]
to_json()[source]
create_transform(typ: str, args: Mapping[str, Any], yaml_create_transform: Any) → apache_beam.transforms.ptransform.PTransform[source]
apache_beam.yaml.yaml_provider.element_to_rows(e)[source]
apache_beam.yaml.yaml_provider.dicts_to_rows(o)[source]
apache_beam.yaml.yaml_provider.create_builtin_provider()[source]
class apache_beam.yaml.yaml_provider.PypiExpansionService(packages, base_python=sys.executable)[source]

Bases: object

Expands transforms by fully qualified name in a virtual environment with the given dependencies.

VENV_CACHE = os.path.expanduser('~/.apache_beam/cache/venvs')
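One plausible way to reuse virtual environments across runs is to derive a cache directory from the requested packages and the base interpreter. The sketch below only computes such a path and does not create an environment. The function `venv_cache_path`, its `cache_root` default, and the hashing scheme are assumptions for illustration, not a description of how PypiExpansionService is implemented.

```python
import hashlib
import json
import os
import sys


def venv_cache_path(packages, base_python=sys.executable,
                    cache_root="~/.apache_beam/cache/venvs"):
    """Returns a deterministic directory for a (python, packages) pair."""
    # Sort the packages so that their order does not affect the key.
    key_material = json.dumps(
        {"binary": base_python, "packages": sorted(packages)},
        sort_keys=True)
    digest = hashlib.sha256(key_material.encode("utf-8")).hexdigest()
    return os.path.join(os.path.expanduser(cache_root), digest)


path_a = venv_cache_path(["numpy", "pandas"])
path_b = venv_cache_path(["pandas", "numpy"])
path_c = venv_cache_path(["numpy"])
```

With this scheme `path_a` and `path_b` are identical, so the same environment would be reused regardless of package order, while `path_c` points at a different directory.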
class apache_beam.yaml.yaml_provider.RenamingProvider(transforms, mappings, underlying_provider, defaults=None)[source]

Bases: apache_beam.yaml.yaml_provider.Provider

static expand_mappings(mappings)[source]
available() → bool[source]
provided_transforms() → Iterable[str][source]
config_schema(type)[source]
requires_inputs(typ, args)[source]
create_transform(typ: str, args: Mapping[str, Any], yaml_create_transform: Callable[[Mapping[str, Any], Iterable[apache_beam.pvalue.PCollection]], apache_beam.transforms.ptransform.PTransform]) → apache_beam.transforms.ptransform.PTransform[source]

Creates a PTransform instance for the given transform type and arguments.

underlying_provider()[source]
cache_artifacts()[source]
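The idea behind RenamingProvider, exposing an underlying provider's transforms under different argument names, can be illustrated with a small helper. This is a sketch under assumptions: `rename_args` is a hypothetical function, and the shapes of `mapping` (exposed name to underlying name) and `defaults` (underlying name to value) are guesses made for the example.

```python
from typing import Any, Mapping, Optional


def rename_args(args: Mapping[str, Any],
                mapping: Mapping[str, str],
                defaults: Optional[Mapping[str, Any]] = None) -> dict:
    """Translates user-facing argument names to the underlying ones."""
    # Names without an entry in the mapping pass through unchanged.
    renamed = {mapping.get(name, name): value for name, value in args.items()}
    # Fill in defaults only for arguments the caller did not supply.
    for name, value in (defaults or {}).items():
        renamed.setdefault(name, value)
    return renamed


underlying_args = rename_args(
    {"path": "/tmp/input.csv"},
    mapping={"path": "file_pattern"},
    defaults={"compression": "auto"})
```

The resulting `underlying_args` would then be passed to the wrapped provider's `create_transform` in place of the original arguments.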
apache_beam.yaml.yaml_provider.parse_providers(provider_specs)[source]
apache_beam.yaml.yaml_provider.merge_providers(*provider_sets)[source]
apache_beam.yaml.yaml_provider.standard_providers()[source]
apache_beam.yaml.yaml_provider.list_providers()[source]