apache_beam.runners.dataflow.dataflow_runner module
A runner implementation that submits a job for remote execution.
The runner will create a JSON description of the job graph and then submit it
to the Dataflow Service for remote execution by a worker.
-
class
apache_beam.runners.dataflow.dataflow_runner.
DataflowRunner
(cache=None)[source]
Bases: apache_beam.runners.runner.PipelineRunner
A runner that creates job graphs and submits them for remote execution.
Every execution of the run() method will submit an independent job for
remote execution that consists of the nodes reachable from the passed in
node argument or entire graph if node is None. The run() method returns
after the service created the job and will not wait for the job to finish
if blocking is set to False.
-
static
poll_for_job_completion
(runner, result, duration)[source]
Polls for the specified job to finish running (successfully or not).
Updates the result with the new job information before returning.
Parameters: |
- runner – DataflowRunner instance to use for polling job state.
- result – DataflowPipelineResult instance used for job information.
- duration (int) – The time to wait (in milliseconds) for job to finish.
If it is set to
None , it will wait indefinitely until the job
is finished.
|
-
static
group_by_key_input_visitor
()[source]
-
static
flatten_input_visitor
()[source]
-
run
(pipeline)[source]
Remotely executes entire pipeline or parts reachable from node.
-
run_Impulse
(transform_node)[source]
-
run_Flatten
(transform_node)[source]
-
apply_WriteToBigQuery
(transform, pcoll)[source]
-
apply_GroupByKey
(transform, pcoll)[source]
-
run_GroupByKey
(transform_node)[source]
-
run_ParDo
(transform_node)[source]
-
apply_CombineValues
(transform, pcoll)[source]
-
run_CombineValues
(transform_node)[source]
-
run_Read
(transform_node)[source]
-
run__NativeWrite
(transform_node)[source]
-
classmethod
serialize_windowing_strategy
(windowing)[source]
-
classmethod
deserialize_windowing_strategy
(serialized_data)[source]
-
static
byte_array_to_json_string
(raw_bytes)[source]
Implements org.apache.beam.sdk.util.StringUtils.byteArrayToJsonString.
-
static
json_string_to_byte_array
(encoded_string)[source]
Implements org.apache.beam.sdk.util.StringUtils.jsonStringToByteArray.
-
class
CreatePTransformOverride
Bases: apache_beam.pipeline.PTransformOverride
A PTransformOverride
for Create
in streaming mode.
-
get_matcher
()
-
get_replacement_transform
(ptransform)
-
static
is_streaming_create
(applied_ptransform)