Package org.apache.beam.runners.dataflow
Provides a Beam runner that executes pipelines on the Google Cloud Dataflow service.
Class: Description
BatchStatefulParDoOverrides: PTransformOverrideFactories that expand to correctly implement stateful ParDo using the window-unaware BatchViewOverrides.GroupByKeyAndSortValuesOnly to linearize processing per key.
BatchStatefulParDoOverrides.BatchStatefulDoFn<K,V,OutputT>: A key-preserving DoFn that explodes an iterable that has been grouped by key and window.
CreateDataflowView<ElemT,ViewT>: A DataflowRunner marker class for creating a PCollectionView.
DataflowClient: Wrapper around the generated Dataflow client to provide common functionality.
DataflowJobAlreadyExistsException: An exception that is thrown if the unique job name constraint of the Dataflow service is broken because an existing job with the same job name is currently active.
DataflowJobAlreadyUpdatedException: An exception that is thrown if the existing job has already been updated within the Dataflow service and can no longer be updated.
DataflowJobException: A RuntimeException that contains information about a DataflowPipelineJob.
DataflowPipelineJob: A DataflowPipelineJob represents a job submitted to Dataflow using DataflowRunner.
DataflowPipelineRegistrar.Options: Registers the DataflowPipelineOptions.
DataflowPipelineRegistrar.Runner: Registers the DataflowRunner.
DataflowPipelineTranslator: DataflowPipelineTranslator knows how to translate Pipeline objects into Cloud Dataflow Service API Jobs.
DataflowPipelineTranslator.JobSpecification: The result of a job translation.
DataflowRunner: A PipelineRunner that executes the operations in the pipeline by first translating them to the Dataflow representation using the DataflowPipelineTranslator and then submitting them to a Dataflow service for execution.
DataflowRunner.StreamingPCollectionViewWriterFn: A marker DoFn for writing the contents of a PCollection to a streaming PCollectionView backend implementation.
DataflowRunnerHooks: An instance of this class can be passed to the DataflowRunner to add user-defined hooks to be invoked at various times during pipeline execution.
DataflowRunnerInfo: Populates versioning and other information for DataflowRunner.
DataflowServiceException: Signals there was an error retrieving information about a job from the Cloud Dataflow Service.
PrimitiveParDoSingleFactory<InputT,OutputT>: A PTransformOverrideFactory that produces PrimitiveParDoSingleFactory.ParDoSingle instances from ParDo.SingleOutput instances.
PrimitiveParDoSingleFactory.ParDoSingle<InputT,OutputT>: A single-output primitive ParDo.
PrimitiveParDoSingleFactory.PayloadTranslator: A translator for PrimitiveParDoSingleFactory.ParDoSingle.
TestDataflowPipelineOptions: A set of options used to configure the TestPipeline.
TestDataflowRunner: TestDataflowRunner is a pipeline runner that wraps a DataflowRunner when running tests against the TestPipeline.
TransformTranslator<TransformT extends PTransform>: A TransformTranslator knows how to translate a particular subclass of PTransform for the Cloud Dataflow service.
TransformTranslator.StepTranslationContext: The interface for a TransformTranslator to build a Dataflow step.
TransformTranslator.TranslationContext: The interface provided to registered callbacks for interacting with the DataflowRunner, including reading and writing the values of PCollections and side inputs.
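
The descriptions of DataflowPipelineTranslator and TransformTranslator suggest a per-type translation scheme: each translator handles one transform subclass, and a pipeline-level translator dispatches to it. The sketch below illustrates that general pattern in plain Java. It is not Beam's implementation: every name in it (TranslatorRegistrySketch, Transform, ReadTransform, ParDoTransform, Translator, and the step strings) is hypothetical and stands in for the real classes only loosely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TranslatorRegistrySketch {

    /** Hypothetical stand-in for a pipeline transform; this is not Beam's PTransform. */
    interface Transform {}

    static final class ReadTransform implements Transform {
        final String source;
        ReadTransform(String source) { this.source = source; }
    }

    static final class ParDoTransform implements Transform {
        final String fnName;
        ParDoTransform(String fnName) { this.fnName = fnName; }
    }

    /** Hypothetical per-type translator: one transform subclass in, one step description out. */
    interface Translator<T extends Transform> {
        String translate(T transform);
    }

    // Maps each transform class to the translator registered for it.
    private final Map<Class<? extends Transform>, Translator<? extends Transform>> translators =
        new HashMap<>();

    /** Registers a translator for exactly one transform class. */
    <T extends Transform> void register(Class<T> type, Translator<T> translator) {
        translators.put(type, translator);
    }

    /** Translates each transform in order by looking up the translator for its runtime class. */
    @SuppressWarnings("unchecked")
    List<String> translate(List<Transform> pipeline) {
        List<String> steps = new ArrayList<>();
        for (Transform transform : pipeline) {
            // Safe by construction: register() pairs each class with a matching translator.
            Translator<Transform> translator =
                (Translator<Transform>) translators.get(transform.getClass());
            if (translator == null) {
                throw new IllegalStateException(
                    "No translator registered for " + transform.getClass().getSimpleName());
            }
            steps.add(translator.translate(transform));
        }
        return steps;
    }

    /** Builds a two-transform pipeline and returns its translated steps. */
    public static List<String> demo() {
        TranslatorRegistrySketch registry = new TranslatorRegistrySketch();
        registry.register(ReadTransform.class, t -> "ReadStep(" + t.source + ")");
        registry.register(ParDoTransform.class, t -> "ParDoStep(" + t.fnName + ")");
        List<Transform> pipeline =
            Arrays.asList(new ReadTransform("input.txt"), new ParDoTransform("FormatFn"));
        return registry.translate(pipeline);
    }

    public static void main(String[] args) {
        for (String step : demo()) {
            System.out.println(step);
        }
    }
}
```

Keying the registry on the transform's runtime class keeps each translator small and lets an unregistered transform fail fast with a clear error. Whether the real translator classes in this package are organized this way should be checked against their own Javadoc.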