apache_beam.runners.dataflow.native_io package
Submodules
apache_beam.runners.dataflow.native_io.iobase module
Dataflow native sources and sinks.
For internal use only; no backwards-compatibility guarantees.
-
class apache_beam.runners.dataflow.native_io.iobase.ConcatPosition(index, position)
Bases: object
A position that encapsulates an inner position and an index.
This is used to represent the position of a source that encapsulates several other sources.
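The idea of pairing an index with an inner position can be illustrated with a minimal, self-contained sketch. It does not use the Beam class itself: `ToyConcatPosition` and `locate` are hypothetical names introduced here, and plain lists stand in for the concatenated sources.

```python
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass(frozen=True)
class ToyConcatPosition:
    """Hypothetical stand-in: which inner source, and where inside it."""
    index: int     # index of the inner source within the concatenation
    position: Any  # position inside that inner source


def locate(sources: Sequence[Sequence[str]], global_offset: int) -> ToyConcatPosition:
    """Map an offset into the concatenation onto (source index, inner offset)."""
    for index, source in enumerate(sources):
        if global_offset < len(source):
            return ToyConcatPosition(index, global_offset)
        # Skip this source entirely and keep looking in the next one.
        global_offset -= len(source)
    raise IndexError("offset is past the end of the concatenated sources")


sources = [["a", "b"], ["c"], ["d", "e", "f"]]
pos = locate(sources, 4)
record = sources[pos.index][pos.position]
```

Here the fifth record of the concatenation is addressed as the second record of the third inner source.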
-
class apache_beam.runners.dataflow.native_io.iobase.DynamicSplitRequest(progress)
Bases: object
Specifies how ‘NativeSourceReader.request_dynamic_split’ should split.
-
class apache_beam.runners.dataflow.native_io.iobase.DynamicSplitResultWithPosition(stop_position)
Bases: apache_beam.runners.dataflow.native_io.iobase.DynamicSplitResult
-
class apache_beam.runners.dataflow.native_io.iobase.NativeSink
Bases: apache_beam.transforms.display.HasDisplayData
A sink implemented by the Dataflow service.
This class is meant to be inherited only by sinks natively implemented by the Cloud Dataflow service, so users should not subclass it.
-
class apache_beam.runners.dataflow.native_io.iobase.NativeSinkWriter
Bases: object
A writer for a sink implemented by the Dataflow service.
-
takes_windowed_values
Returns whether this writer takes windowed values.
-
-
class apache_beam.runners.dataflow.native_io.iobase.NativeSource
Bases: apache_beam.transforms.display.HasDisplayData
A source implemented by the Dataflow service.
This class is meant to be inherited only by sources natively implemented by the Cloud Dataflow service, so users should not subclass it.
This class is deprecated and should not be used to define new sources.
-
class apache_beam.runners.dataflow.native_io.iobase.NativeSourceReader
Bases: object
A reader for a source implemented by the Dataflow service.
-
get_progress()
Returns a representation of how far the reader has read.
Returns: A SourceReaderProgress object that gives the current progress of the reader.
-
request_dynamic_split(dynamic_split_request)
Attempts to split the input in two parts.
The two parts are named the “primary” part and the “residual” part. The current ‘NativeSourceReader’ keeps processing the primary part, while the residual part will be processed elsewhere (e.g. perhaps on a different worker).
The primary and residual parts, if concatenated, must represent the same input as the current input of this ‘NativeSourceReader’ before this call.
The boundary between the primary part and the residual part is specified in a framework-specific way using ‘DynamicSplitRequest’. For example, if the framework supports the notion of positions, it might be a position at which the input is asked to split itself (which is not necessarily the position at which it actually splits); it might also be an approximate fraction of the input, or something else.
This function returns a ‘DynamicSplitResult’, which encodes, in a framework-specific way, the information sufficient to construct a description of the resulting primary and residual inputs. For example, it might, again, be a position demarcating these parts, or it might be a pair of fully-specified input descriptions, or something else.
After a successful call to ‘request_dynamic_split()’, subsequent calls should be interpreted relative to the new primary.
Parameters: dynamic_split_request – A ‘DynamicSplitRequest’ describing the split request.
Returns: ‘None’ if the ‘DynamicSplitRequest’ cannot be honored (in that case the input represented by this ‘NativeSourceReader’ stays the same), or a ‘DynamicSplitResult’ describing how the input was split into a primary and residual part.
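The contract described above can be sketched with a toy reader over an integer range. This is only an illustration under stated assumptions, not the Dataflow implementation: `ToySplitRequest`, `ToySplitResult` and `ToyRangeReader` are hypothetical names, the request is assumed to carry an approximate fraction of the current input, and the result is assumed to carry the first index of the residual.

```python
from typing import Iterator, Optional


class ToySplitRequest:
    """Hypothetical request: split at roughly this fraction of the current input."""
    def __init__(self, fraction: float) -> None:
        self.fraction = fraction


class ToySplitResult:
    """Hypothetical result: the first index that now belongs to the residual."""
    def __init__(self, stop_position: int) -> None:
        self.stop_position = stop_position


class ToyRangeReader:
    """Reads the integers in [start, stop); stop shrinks on a successful split."""
    def __init__(self, start: int, stop: int) -> None:
        self.start = start
        self.stop = stop
        self.next_index = start  # first index that has not been emitted yet

    def __iter__(self) -> Iterator[int]:
        while self.next_index < self.stop:
            value = self.next_index
            self.next_index += 1
            yield value

    def request_dynamic_split(self, request: ToySplitRequest) -> Optional[ToySplitResult]:
        # The fraction is interpreted relative to the current primary [start, stop).
        split_at = self.start + int(request.fraction * (self.stop - self.start))
        # Refuse if the split point was already read or would leave an empty residual.
        if split_at < self.next_index or split_at >= self.stop:
            return None
        self.stop = split_at  # keep [start, split_at) as the primary
        return ToySplitResult(split_at)


reader = ToyRangeReader(0, 10)
records = iter(reader)
first = [next(records) for _ in range(3)]                        # reads 0, 1, 2
result = reader.request_dynamic_split(ToySplitRequest(0.5))      # primary becomes [0, 5)
rest = list(records)                                             # remaining primary records
rejected = reader.request_dynamic_split(ToySplitRequest(0.2))    # index 1 was already read
residual = list(range(result.stop_position, 10))                 # would be read elsewhere
```

In this sketch the primary and residual together cover exactly the original range, and the second request is evaluated against the new, smaller primary, which is why it is refused.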
-
returns_windowed_values
Returns whether this reader returns windowed values.
-
-
class apache_beam.runners.dataflow.native_io.iobase.ReaderPosition(end=None, key=None, byte_offset=None, record_index=None, shuffle_position=None, concat_position=None)
Bases: object
A representation of position in an iteration of a ‘NativeSourceReader’.
-
class apache_beam.runners.dataflow.native_io.iobase.ReaderProgress(position=None, percent_complete=None, remaining_time=None, consumed_split_points=None, remaining_split_points=None)
Bases: object
A representation of how far a NativeSourceReader has read.
-
consumed_split_points
-
percent_complete
Returns progress, represented as a fraction of total work.
Progress ranges from 0.0 (beginning, nothing complete) to 1.0 (end of the work range, entire WorkItem complete).
Returns: Progress represented as a fraction of total work.
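As a small illustration of the 0.0 to 1.0 scale, the following sketch computes such a value for a reader over a byte range. `fraction_complete` and its offset parameters are hypothetical and are not part of this module.

```python
def fraction_complete(start_offset: int, current_offset: int, end_offset: int) -> float:
    """Hypothetical helper: share of [start_offset, end_offset) already consumed."""
    if end_offset <= start_offset:
        raise ValueError("end_offset must be greater than start_offset")
    fraction = (current_offset - start_offset) / (end_offset - start_offset)
    # Clamp so that callers always see a value between 0.0 and 1.0.
    return max(0.0, min(1.0, fraction))


quarter_done = fraction_complete(100, 150, 300)
```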
-
position
Returns progress, represented as a ReaderPosition object.
-
remaining_split_points
-
remaining_time
Returns progress, represented as an estimated time remaining.
-
apache_beam.runners.dataflow.native_io.streaming_create module
Create transform for streaming.
-
class apache_beam.runners.dataflow.native_io.streaming_create.StreamingCreate(values, coder)
Bases: apache_beam.transforms.ptransform.PTransform
A specialized implementation of the Create transform in streaming mode.
Note: There is no unbounded source API in Python to wrap the Create source, so this is mapped to a composite of the Impulse primitive and an SDF.
-
class DecodeAndEmitDoFn(encoded_values, coder)
Bases: apache_beam.transforms.core.DoFn
A DoFn which stores encoded versions of elements.
It also stores a Coder to decode and emit those elements. TODO: BEAM-2422 - Make this a SplittableDoFn.
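The store-encoded, decode-on-emit pattern can be sketched without Beam as follows. This is an illustration only: `ToyPickleCoder` and `ToyDecodeAndEmit` are hypothetical stand-ins, `pickle` is swapped in for a Beam Coder, and the single empty-bytes trigger element is an assumption about what the impulse step delivers.

```python
import pickle
from typing import Any, Iterable, Iterator, List


class ToyPickleCoder:
    """Hypothetical coder with the encode/decode shape assumed here."""
    def encode(self, value: Any) -> bytes:
        return pickle.dumps(value)

    def decode(self, encoded: bytes) -> Any:
        return pickle.loads(encoded)


class ToyDecodeAndEmit:
    """Holds encoded elements and decodes them when a trigger element arrives."""
    def __init__(self, encoded_values: Iterable[bytes], coder: ToyPickleCoder) -> None:
        self.encoded_values: List[bytes] = list(encoded_values)
        self.coder = coder

    def process(self, unused_element: Any) -> Iterator[Any]:
        # The incoming element carries no data; it only triggers emission.
        for encoded in self.encoded_values:
            yield self.coder.decode(encoded)


coder = ToyPickleCoder()
values = ["a", 1, (2, 3)]
fn = ToyDecodeAndEmit([coder.encode(v) for v in values], coder)
emitted = list(fn.process(b""))  # b"" plays the role of the single impulse element
```

Keeping the values encoded means the object only holds bytes plus a coder, which is the design the description above suggests; the values are materialized again only at emission time.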
-
class Impulse(label=None)
Bases: apache_beam.transforms.ptransform.PTransform
The Dataflow-specific override for the impulse primitive.
-
class