apache_beam.runners.direct.sdf_direct_runner module¶
This module contains Splittable DoFn logic that is specific to DirectRunner.
-
class
apache_beam.runners.direct.sdf_direct_runner.
SplittableParDoOverride
[source]¶ Bases:
apache_beam.pipeline.PTransformOverride
A transform override for ParDo transforms of SplittableDoFns.
Replaces the ParDo transform with a SplittableParDo transform that performs SDF-specific logic.
-
class
apache_beam.runners.direct.sdf_direct_runner.
SplittableParDo
(ptransform)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
A transform that processes a PCollection using a Splittable DoFn.
-
class
apache_beam.runners.direct.sdf_direct_runner.
ElementAndRestriction
(element, restriction)[source]¶ Bases:
future.types.newobject.newobject
A holder for an element and a restriction.
-
class
apache_beam.runners.direct.sdf_direct_runner.
PairWithRestrictionFn
(do_fn)[source]¶ Bases:
apache_beam.transforms.core.DoFn
A transform that pairs each element with a restriction.
-
class
apache_beam.runners.direct.sdf_direct_runner.
SplitRestrictionFn
(do_fn)[source]¶ Bases:
apache_beam.transforms.core.DoFn
A transform that performs the initial splitting of Splittable DoFn inputs.
-
class
apache_beam.runners.direct.sdf_direct_runner.
ExplodeWindowsFn
(*unused_args, **unused_kwargs)[source]¶ Bases:
apache_beam.transforms.core.DoFn
A transform that forces the runner to explode windows.
This is done to make sure that a Splittable DoFn processes an element for each of the windows that the element belongs to.
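As a rough illustration of what exploding windows means, the sketch below turns one value tagged with several windows into one value per window. It uses plain Python only; `WindowedValue` here is a hypothetical namedtuple stand-in and `explode_windows` is a made-up helper, neither is the Beam class or the code of ExplodeWindowsFn.

```python
from collections import namedtuple

# Hypothetical stand-in for a value that belongs to one or more windows.
WindowedValue = namedtuple("WindowedValue", ["value", "windows"])


def explode_windows(windowed_values):
    """Yield one single-window copy of each value per window it belongs to."""
    for wv in windowed_values:
        for window in wv.windows:
            # Each emitted value carries exactly one window.
            yield WindowedValue(wv.value, (window,))


exploded = list(explode_windows([
    WindowedValue("x", ("w1", "w2")),
    WindowedValue("y", ("w1",)),
]))
```

After this step, downstream processing sees "x" twice, once for each of its two windows, which is the per-window behavior described above.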
-
class
apache_beam.runners.direct.sdf_direct_runner.
RandomUniqueKeyFn
(*unused_args, **unused_kwargs)[source]¶ Bases:
apache_beam.transforms.core.DoFn
A transform that assigns a unique key to each element.
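A minimal sketch of the idea, assuming random UUIDs are an acceptable source of keys. `pair_with_unique_key` is a hypothetical plain-Python helper, not the DoFn itself; the point is only that equal elements still receive distinct keys.

```python
import uuid


def pair_with_unique_key(elements):
    """Yield (key, element) pairs, giving every element its own random key."""
    for element in elements:
        # uuid4() is randomly generated, so key collisions are extremely unlikely.
        yield (uuid.uuid4().bytes, element)


keyed = list(pair_with_unique_key(["a", "b", "a"]))
```

Because the key does not depend on the element's value, the two "a" elements end up under different keys and can be tracked independently.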
-
class
apache_beam.runners.direct.sdf_direct_runner.
ProcessKeyedElements
(sdf, element_coder, restriction_coder, windowing_strategy, ptransform_args, ptransform_kwargs, ptransform_side_inputs)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
A primitive transform that performs SplittableDoFn magic.
Input to this transform should be a PCollection of keyed ElementAndRestriction objects.
-
class
apache_beam.runners.direct.sdf_direct_runner.
ProcessKeyedElementsViaKeyedWorkItemsOverride
[source]¶ Bases:
apache_beam.pipeline.PTransformOverride
A transform override for the ProcessElements transform.
-
class
apache_beam.runners.direct.sdf_direct_runner.
ProcessKeyedElementsViaKeyedWorkItems
(process_keyed_elements_transform)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
A transform that processes Splittable DoFn input via KeyedWorkItems.
-
class
apache_beam.runners.direct.sdf_direct_runner.
ProcessElements
(process_keyed_elements_transform)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
A primitive transform for processing keyed elements or KeyedWorkItems.
Will be evaluated by runners.direct.transform_evaluator._ProcessElementsEvaluator.
-
class
apache_beam.runners.direct.sdf_direct_runner.
ProcessFn
(sdf, args_for_invoker, kwargs_for_invoker)[source]¶ Bases:
apache_beam.transforms.core.DoFn
A DoFn that executes the machinery for invoking a Splittable DoFn.
Input to the ParDo step that includes a ProcessFn will be a PCollection of ElementAndRestriction objects.
This class is mainly responsible for the following:
(1) Setting up the environment for properly invoking a Splittable DoFn.
(2) Invoking the process() method of a Splittable DoFn.
(3) After the process() invocation of the Splittable DoFn, determining whether a re-invocation of the element is needed. If so, setting state and a timer for the re-invocation and holding the output watermark until that re-invocation.
(4) After the final invocation of a given element, clearing any previous state set for re-invoking the element and releasing the output watermark.
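Steps (2) to (4) can be pictured with the simplified sketch below. It is not the ProcessFn implementation: a plain loop stands in for the timer-driven re-invocation, a dict stands in for runner state, a boolean stands in for the watermark hold, and `run_until_done` and `process_two_at_a_time` are hypothetical names. Restrictions are assumed to be `(start, stop)` offset pairs.

```python
def run_until_done(process, element, restriction):
    """Invoke `process` repeatedly until no residual restriction remains.

    `process(element, restriction)` is a hypothetical callable returning
    (outputs, residual), where residual is None once the element is done.
    """
    state = {"restriction": restriction}  # stands in for per-key state
    outputs = []
    watermark_held = False
    invocations = 0
    while state:
        produced, residual = process(element, state["restriction"])  # step (2)
        invocations += 1
        outputs.extend(produced)
        if residual is not None:
            # Step (3): remember where to resume and hold the watermark.
            state["restriction"] = residual
            watermark_held = True
        else:
            # Step (4): final invocation, so clear state and release the hold.
            state.clear()
            watermark_held = False
    return outputs, watermark_held, invocations


def process_two_at_a_time(element, restriction):
    """Toy SDF body: emit at most two items, then hand back the remainder."""
    start, stop = restriction
    end = min(start + 2, stop)
    produced = [element[i] for i in range(start, end)]
    residual = (end, stop) if end < stop else None
    return produced, residual


outputs, held, invocations = run_until_done(
    process_two_at_a_time, "abcde", (0, 5))
```

In this toy run the element "abcde" needs three invocations, and the hold is released only after the last one.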
-
step_context
¶
-
-
class
apache_beam.runners.direct.sdf_direct_runner.
SDFProcessElementInvoker
(max_num_outputs, max_duration)[source]¶ Bases:
future.types.newobject.newobject
A utility that invokes the SDF process() method and requests checkpoints.
This class is responsible for invoking the process() method of a Splittable DoFn and making sure that the invocation terminates properly. Based on the input configuration, this class may decide to request a checkpoint for a process() execution so that the runner can process the current output and resume the invocation at a later time.
More specifically, when initializing an SDFProcessElementInvoker, the caller may specify the number of output elements or the processing time after which a checkpoint should be requested. This class is responsible for properly requesting a checkpoint based on either of these criteria. When the process() call of the Splittable DoFn ends, this class performs validations to make sure that processing ended gracefully and returns an SDFProcessElementInvoker.Result that contains information the caller can use to perform another process() invocation for the residual.
A process() invocation may decide to give up processing voluntarily by returning a ProcessContinuation object (see the documentation of ProcessContinuation for more details). If a ProcessContinuation is produced, this class ends the execution and performs the steps needed to finalize the current invocation.
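The two checkpoint criteria can be sketched as follows. This is a simplified, single-threaded illustration under stated assumptions, not the invoker's actual code: `invoke_with_checkpoint` is a hypothetical function, the restriction is assumed to be an offset range `[start, stop)`, and `max_duration` is assumed to be in seconds.

```python
import time


def invoke_with_checkpoint(process, start, stop, max_num_outputs, max_duration):
    """Process offsets in [start, stop) until done or a limit is reached.

    Returns (outputs, residual): residual is the unprocessed (start, stop)
    range if a checkpoint was taken, or None if processing finished.
    """
    outputs = []
    deadline = time.monotonic() + max_duration
    position = start
    while position < stop:
        # Checkpoint when either the output-count or the time limit is hit.
        if len(outputs) >= max_num_outputs or time.monotonic() >= deadline:
            return outputs, (position, stop)
        outputs.append(process(position))
        position += 1
    return outputs, None


def square(i):
    return i * i


# First invocation stops after three outputs and reports the residual range.
first_outputs, residual = invoke_with_checkpoint(square, 0, 10, 3, 60.0)
# A later invocation resumes from the residual and runs to completion.
rest_outputs, final_residual = invoke_with_checkpoint(
    square, residual[0], residual[1], 100, 60.0)
```

The returned residual plays the role that residual_restriction plays in the Result object: it tells the caller what is left to process in a later invocation.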
-
class
Result
(residual_restriction=None, process_continuation=None, future_output_watermark=None)[source]¶ Bases:
future.types.newobject.newobject
Returned as a result of an invoke_process_element() invocation.
Parameters: - residual_restriction – a restriction for the unprocessed part of the element.
- process_continuation – a ProcessContinuation if one was returned as the last element of the SDF process() invocation.
- future_output_watermark – output watermark of the results that will be produced when invoking the Splittable DoFn for the current element with residual_restriction.
-
invoke_process_element
(sdf_invoker, element, tracker, *args, **kwargs)[source]¶ Invokes the process() method of a Splittable DoFn for a given element.
Parameters: - sdf_invoker – a DoFnInvoker for the Splittable DoFn.
- element – the element to process
- tracker – a RestrictionTracker for the element that will be passed when invoking the process() method of the Splittable DoFn.
Returns: an SDFProcessElementInvoker.Result object.
-
class