Package org.apache.beam.runners.dataflow
package org.apache.beam.runners.dataflow
Provides a Beam runner that executes pipelines on the Google Cloud Dataflow service.
-
ClassDescription
PTransformOverrideFactories
that expands to correctly implement statefulParDo
using window-unawareBatchViewOverrides.GroupByKeyAndSortValuesOnly
to linearize processing per key.BatchStatefulParDoOverrides.BatchStatefulDoFn<K,V, OutputT> A key-preservingDoFn
that explodes an iterable that has been grouped by key and window.CreateDataflowView<ElemT,ViewT> ADataflowRunner
marker class for creating aPCollectionView
.Wrapper around the generatedDataflow
client to provide common functionality.An exception that is thrown if the unique job name constraint of the Dataflow service is broken because an existing job with the same job name is currently active.An exception that is thrown if the existing job has already been updated within the Dataflow service and is no longer able to be updated.ARuntimeException
that contains information about aDataflowPipelineJob
.A DataflowPipelineJob represents a job submitted to Dataflow usingDataflowRunner
.Register theDataflowPipelineOptions
.Register theDataflowRunner
.DataflowPipelineTranslator
knows how to translatePipeline
objects into Cloud Dataflow Service APIJob
s.The result of a job translation.APipelineRunner
that executes the operations in the pipeline by first translating them to the Dataflow representation using theDataflowPipelineTranslator
and then submitting them to a Dataflow service for execution.A markerDoFn
for writing the contents of aPCollection
to a streamingPCollectionView
backend implementation.An instance of this class can be passed to theDataflowRunner
to add user defined hooks to be invoked at various times during pipeline execution.Populates versioning and other information forDataflowRunner
.Signals there was an error retrieving information about a job from the Cloud Dataflow Service.PrimitiveParDoSingleFactory<InputT,OutputT> APTransformOverrideFactory
that producesPrimitiveParDoSingleFactory.ParDoSingle
instances fromParDo.SingleOutput
instances.PrimitiveParDoSingleFactory.ParDoSingle<InputT,OutputT> A single-output primitiveParDo
.A translator forPrimitiveParDoSingleFactory.ParDoSingle
.A set of options used to configure theTestPipeline
.TestDataflowRunner
is a pipeline runner that wraps aDataflowRunner
when running tests against theTestPipeline
.TransformTranslator<TransformT extends PTransform>ATransformTranslator
knows how to translate a particular subclass ofPTransform
for the Cloud Dataflow service.The interface for aTransformTranslator
to build a Dataflow step.The interface provided to registered callbacks for interacting with theDataflowRunner
, including reading and writing the values ofPCollection
s and side inputs.