org.apache.beam.runners.spark.io.SparkUnboundedSource

public class SparkUnboundedSource extends Object

A "composite" InputDStream implementation for UnboundedSources.

This read is a composite of the following steps:

Create a single-element (per-partition) stream, that contains the (partitioned) Source and an optional UnboundedSource.CheckpointMark to start from.
Read from within a stateful operation JavaPairDStream.mapWithState(StateSpec) using the StateSpecFunctions.mapSourceFunction(org.apache.beam.runners.core.construction.SerializablePipelineOptions, java.lang.String) mapping function, which manages the state of the CheckpointMark per partition.
Since the stateful operation is a map operation, the read iterator needs to be flattened, while reporting the properties of the read (such as number of records) to the tracker.

Constructor Details
- SparkUnboundedSource
  
  public SparkUnboundedSource()
Method Details
- read
  
  public static <T, CheckpointMarkT extends UnboundedSource.CheckpointMark> UnboundedDataset<T> read(org.apache.spark.streaming.api.java.JavaStreamingContext jssc, org.apache.beam.runners.core.construction.SerializablePipelineOptions rc, UnboundedSource<T,CheckpointMarkT> source, String stepName)

Class SparkUnboundedSource

Nested Class Summary