Class OrderedProcessingHandler<EventT,KeyT,StateT extends MutableState<EventT,?>,ResultT>

java.lang.Object
org.apache.beam.sdk.extensions.ordered.OrderedProcessingHandler<EventT,KeyT,StateT,ResultT>
Type Parameters:
EventT - type of events to be processed
KeyT - type of keys which will be used to group the events
StateT - type of internal State which will be used for processing
ResultT - type of the result of the processing which will be output
All Implemented Interfaces:
Serializable
Direct Known Subclasses:
OrderedProcessingHandler.OrderedProcessingGlobalSequenceHandler

public abstract class OrderedProcessingHandler<EventT,KeyT,StateT extends MutableState<EventT,?>,ResultT> extends Object implements Serializable
Parent class for Ordered Processing configuration handlers.

There are two types of processing - when the sequence numbers are contiguous per key and these sequences per keys are independent of each other, and when there is a global sequence shared by all keys. In case of the global sequence processing the custom handler must extend from .

See Also:
  • Field Details

    • DEFAULT_MAX_ELEMENTS_TO_OUTPUT

      public static final int DEFAULT_MAX_ELEMENTS_TO_OUTPUT
      See Also:
  • Constructor Details

    • OrderedProcessingHandler

      public OrderedProcessingHandler(Class<EventT> eventTClass, Class<KeyT> keyTClass, Class<StateT> stateTClass, Class<ResultT> resultTClass)
      Provide concrete classes which will be used by the ordered processing transform.
      Parameters:
      eventTClass - class of the events
      keyTClass - class of the keys
      stateTClass - class of the state
      resultTClass - class of the results
  • Method Details

    • getEventExaminer

      public abstract @NonNull EventExaminer<EventT,StateT> getEventExaminer()
      Returns:
      the event examiner instance which will be used by the transform.
    • getEventCoder

      public @NonNull Coder<EventT> getEventCoder(Pipeline pipeline, Coder<KV<KeyT,KV<Long,EventT>>> inputCoder) throws CannotProvideCoderException
      Provide the event coder.

      The default implementation of the method will use the event coder from the input PCollection. If the input PCollection doesn't use KVCoder, it will attempt to get the coder from the pipeline's coder registry.

      Parameters:
      pipeline - of the transform
      inputCoder - input coder of the transform
      Returns:
      event coder
      Throws:
      CannotProvideCoderException - if the method can't determine the coder based on the above algorithm.
    • getStateCoder

      public @NonNull Coder<StateT> getStateCoder(Pipeline pipeline) throws CannotProvideCoderException
      Provide the state coder.

      The default implementation will attempt to get the coder from the pipeline's code registry.

      Parameters:
      pipeline - of the transform
      Returns:
      the state coder
      Throws:
      CannotProvideCoderException
    • getKeyCoder

      public @NonNull Coder<KeyT> getKeyCoder(Pipeline pipeline, Coder<KV<KeyT,KV<Long,EventT>>> inputCoder) throws CannotProvideCoderException
      Provide the key coder.

      The default implementation of the method will use the event coder from the input PCollection. If the input PCollection doesn't use KVCoder, it will attempt to get the coder from the pipeline's coder registry.

      Parameters:
      pipeline -
      inputCoder -
      Returns:
      Throws:
      CannotProvideCoderException - if the method can't determine the coder based on the above algorithm.
    • getResultCoder

      public @NonNull Coder<ResultT> getResultCoder(Pipeline pipeline) throws CannotProvideCoderException
      Provide the result coder.

      The default implementation will attempt to get the coder from the pipeline's code registry.

      Parameters:
      pipeline -
      Returns:
      result coder
      Throws:
      CannotProvideCoderException
    • getStatusUpdateFrequency

      public @Nullable Duration getStatusUpdateFrequency()
      Determines the frequency of emission of the OrderedProcessingStatus elements.

      Default is 5 seconds.

      Returns:
      the frequency of updates. If null is returned, no updates will be emitted on a scheduled basis.
    • setStatusUpdateFrequency

      public void setStatusUpdateFrequency(Duration statusUpdateFrequency)
      Changes the default status update frequency. Updates will be disabled if set to null.
      Parameters:
      statusUpdateFrequency -
    • isProduceStatusUpdateOnEveryEvent

      public boolean isProduceStatusUpdateOnEveryEvent()
      Indicates if the status update needs to be sent after each event's processing.

      Default is false.

      Returns:
      See Also:
    • setProduceStatusUpdateOnEveryEvent

      public void setProduceStatusUpdateOnEveryEvent(boolean value)
      Sets the indicator of whether the status notification needs to be produced on every event.
      Parameters:
      value -
    • getMaxOutputElementsPerBundle

      public int getMaxOutputElementsPerBundle()
      Returns the maximum number of elements which will be output per each bundle. The default is 10,000 elements.

      This is used to limit the amount of data produced for each bundle - many runners have limitations on how much data can be output from a single bundle. If many events arrive out of sequence and are buffered then at some point a single event can cause processing of a large number of buffered events.

      Returns:
    • setMaxOutputElementsPerBundle

      public void setMaxOutputElementsPerBundle(int maxOutputElementsPerBundle)
      Overrides the default value.
      Parameters:
      maxOutputElementsPerBundle -