Class StateSpecFunctions
java.lang.Object
org.apache.beam.runners.spark.stateful.StateSpecFunctions
A class containing 
StateSpec mappingFunctions.- 
Constructor Summary
Constructors - 
Method Summary
Modifier and TypeMethodDescriptionstatic <T,CheckpointMarkT extends UnboundedSource.CheckpointMark> 
scala.Function3<Source<T>, scala.Option<CheckpointMarkT>, org.apache.spark.streaming.State<scala.Tuple2<byte[], Instant>>, scala.Tuple2<Iterable<byte[]>, SparkUnboundedSource.Metadata>> mapSourceFunction(org.apache.beam.runners.core.construction.SerializablePipelineOptions options, String stepName) AStateSpecfunction to support reading from anUnboundedSource. 
- 
Constructor Details
- 
StateSpecFunctions
public StateSpecFunctions() 
 - 
 - 
Method Details
- 
mapSourceFunction
public static <T,CheckpointMarkT extends UnboundedSource.CheckpointMark> scala.Function3<Source<T>,scala.Option<CheckpointMarkT>, mapSourceFunctionorg.apache.spark.streaming.State<scala.Tuple2<byte[], Instant>>, scala.Tuple2<Iterable<byte[]>, SparkUnboundedSource.Metadata>> (org.apache.beam.runners.core.construction.SerializablePipelineOptions options, String stepName) AStateSpecfunction to support reading from anUnboundedSource.This StateSpec function expects the following:
- Key: The (partitioned) Source to read from.
 - Value: An optional 
UnboundedSource.CheckpointMarkto start from. - State: A byte representation of the (previously) persisted CheckpointMark.
 
This stateful operation could be described as a flatMap over a single-element stream, which outputs all the elements read from the
UnboundedSourcefor this micro-batch. Since micro-batches are bounded, the provided UnboundedSource is wrapped by aMicrobatchSourcethat applies bounds in the form of duration and max records (per micro-batch).In order to avoid using Spark Guava's classes which pollute the classpath, we use the
StateSpec.function(scala.Function3)signature which employs scala's nativeOption, instead of theStateSpec.function(org.apache.spark.api.java.function.Function3)signature, which employs Guava'sOptional.See also SPARK-4819.
- Type Parameters:
 T- The type of the input stream elements.CheckpointMarkT- The type of theUnboundedSource.CheckpointMark.- Parameters:
 options- A serializableSerializablePipelineOptions.- Returns:
 - The appropriate 
StateSpecfunction. 
 
 -