apache_beam.runners package¶
Subpackages¶
- apache_beam.runners.dataflow package- Subpackages
- Submodules
- apache_beam.runners.dataflow.dataflow_metrics module
- apache_beam.runners.dataflow.dataflow_runner module
- apache_beam.runners.dataflow.ptransform_overrides module
- apache_beam.runners.dataflow.test_dataflow_runner module
- Module contents
 
- apache_beam.runners.direct package- Submodules
- apache_beam.runners.direct.bundle_factory module
- apache_beam.runners.direct.clock module
- apache_beam.runners.direct.consumer_tracking_pipeline_visitor module
- apache_beam.runners.direct.direct_metrics module
- apache_beam.runners.direct.direct_runner module
- apache_beam.runners.direct.evaluation_context module
- apache_beam.runners.direct.executor module
- apache_beam.runners.direct.helper_transforms module
- apache_beam.runners.direct.transform_evaluator module
- apache_beam.runners.direct.util module
- apache_beam.runners.direct.watermark_manager module
- Module contents
 
Submodules¶
apache_beam.runners.common module¶
Worker operations executor.
For internal use only; no backwards-compatibility guarantees.
- 
class apache_beam.runners.common.DoFnContext(label, element=None, state=None)[source]¶
- Bases: - object- For internal use only; no backwards-compatibility guarantees. - 
element¶
 - 
timestamp¶
 - 
windows¶
 
- 
- 
class apache_beam.runners.common.DoFnInvoker(output_processor, signature)[source]¶
- Bases: - object- An abstraction that can be used to execute DoFn methods. - A DoFnInvoker describes a particular way for invoking methods of a DoFn represented by a given DoFnSignature. - 
static create_invoker(output_processor, signature, context, side_inputs, input_args, input_kwargs)[source]¶
- Creates a new DoFnInvoker based on given arguments. - Parameters: - signature – a DoFnSignature for the DoFn being invoked.
- context – Context to be used when invoking the DoFn (deprecated).
- side_inputs – side inputs to be used when invoking th process method.
- input_args – arguments to be used when invoking the process method
- input_kwargs – kwargs to be used when invoking the process method.
 
 
- 
static 
- 
class apache_beam.runners.common.DoFnMethodWrapper(do_fn, method_name)[source]¶
- Bases: - object- For internal use only; no backwards-compatibility guarantees. - Represents a method of a DoFn object. 
- 
class apache_beam.runners.common.DoFnRunner(fn, args, kwargs, side_inputs, windowing, context=None, tagged_receivers=None, logger=None, step_name=None, logging_context=None, state=None, scoped_metrics_container=None)[source]¶
- Bases: - apache_beam.runners.common.Receiver- For internal use only; no backwards-compatibility guarantees. - A helper class for executing ParDo operations. 
- 
class apache_beam.runners.common.DoFnSignature(do_fn)[source]¶
- Bases: - object- Represents the signature of a given - DoFnobject.- Signature of a - DoFnprovides a view of the properties of a given- DoFn. Among other things, this will give an extensible way for for (1) accessing the structure of the- DoFnincluding methods and method parameters (2) identifying features that a given- DoFnsupport, for example, whether a given- DoFnis a Splittable- DoFn( https://s.apache.org/splittable-do-fn) (3) validating a- DoFnbased on the feature set offered by it.
- 
class apache_beam.runners.common.DoFnState(counter_factory)[source]¶
- Bases: - object- For internal use only; no backwards-compatibility guarantees. - Keeps track of state that DoFns want, currently, user counters. 
- 
class apache_beam.runners.common.LoggingContext[source]¶
- Bases: - object- For internal use only; no backwards-compatibility guarantees. 
- 
class apache_beam.runners.common.PerWindowInvoker(output_processor, signature, context, side_inputs, input_args, input_kwargs)[source]¶
- Bases: - apache_beam.runners.common.DoFnInvoker- An invoker that processes elements considering windowing information. 
- 
class apache_beam.runners.common.Receiver[source]¶
- Bases: - object- For internal use only; no backwards-compatibility guarantees. - An object that consumes a WindowedValue. - This class can be efficiently used to pass values between the sdk and worker harnesses. 
- 
class apache_beam.runners.common.SimpleInvoker(output_processor, signature)[source]¶
- Bases: - apache_beam.runners.common.DoFnInvoker- An invoker that processes elements ignoring windowing information. 
apache_beam.runners.pipeline_context module¶
Utility class for serializing pipelines via the runner API.
For internal use only; no backwards-compatibility guarantees.
apache_beam.runners.runner module¶
PipelineRunner, an abstract base runner object.
- 
class apache_beam.runners.runner.PipelineRunner[source]¶
- Bases: - object- A runner of a pipeline object. - The base runner provides a run() method for visiting every node in the pipeline’s DAG and executing the transforms computing the PValue in the node. - A custom runner will typically provide implementations for some of the transform methods (ParDo, GroupByKey, Create, etc.). It may also provide a new implementation for clear_pvalue(), which is used to wipe out materialized values in order to reduce footprint. - 
apply(transform, input)[source]¶
- Runner callback for a pipeline.apply call. - Parameters: - transform – the transform to apply.
- input – transform’s input (typically a PCollection).
 - A concrete implementation of the Runner class may want to do custom pipeline construction for a given transform. To override the behavior for a transform class Xyz, implement an apply_Xyz method with this same signature. 
 - 
run_transform(transform_node)[source]¶
- Runner callback for a pipeline.run call. - Parameters: - transform_node – transform node for the transform to run. - A concrete implementation of the Runner class must implement run_Abc for some class Abc in the method resolution order for every non-composite transform Xyz in the pipeline. 
 
- 
- 
class apache_beam.runners.runner.PipelineState[source]¶
- Bases: - object- State of the Pipeline, as returned by PipelineResult.state. - This is meant to be the union of all the states any runner can put a pipeline in. Currently, it represents the values of the dataflow API JobState enum. - 
CANCELLED= 'CANCELLED'¶
 - 
DONE= 'DONE'¶
 - 
DRAINED= 'DRAINED'¶
 - 
DRAINING= 'DRAINING'¶
 - 
FAILED= 'FAILED'¶
 - 
RUNNING= 'RUNNING'¶
 - 
STOPPED= 'STOPPED'¶
 - 
UNKNOWN= 'UNKNOWN'¶
 - 
UPDATED= 'UPDATED'¶
 
- 
- 
class apache_beam.runners.runner.PipelineResult(state)[source]¶
- Bases: - object- A PipelineResult provides access to info about a pipeline. - 
aggregated_values(aggregator_or_name)[source]¶
- Return a dict of step names to values of the Aggregator. 
 - 
cancel()[source]¶
- Cancels the pipeline execution. - Raises: - IOError– If there is a persistent problem getting job information.
- NotImplementedError– If the runner does not support this operation.
 - Returns: - The final state of the pipeline. 
 - 
metrics()[source]¶
- Returns MetricsResult object to query metrics from the runner. - Raises: - NotImplementedError– If the runner does not support this operation.
 - 
state¶
- Return the current state of the pipeline execution. 
 - 
wait_until_finish(duration=None)[source]¶
- Waits until the pipeline finishes and returns the final status. - Parameters: - duration – The time to wait (in milliseconds) for job to finish. If it is set to None, it will wait indefinitely until the job is finished. - Raises: - IOError– If there is a persistent problem getting job information.
- NotImplementedError– If the runner does not support this operation.
 - Returns: - The final state of the pipeline, or None on timeout. 
 
- 
Module contents¶
Runner objects execute a Pipeline.
This package defines runners, which are used to execute a pipeline.