java.lang.Object
org.apache.beam.runners.spark.structuredstreaming.translation.EvaluationContext

@Internal public final class EvaluationContext extends Object
The EvaluationContext is the result of a pipeline translation and can be used to evaluate / run the pipeline.

However, pipeline translation sometimes requires the early evaluation of parts of the pipeline, for example to materialize side inputs. The EvaluationContext does not re-evaluate datasets that were already evaluated during translation.
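
The behavior described above can be illustrated with a small, self-contained model. This is only a sketch and not Beam's implementation: the class EvaluationContextSketch, the LazyDataset stand-in, and the methods addLeaf and evaluateEarly are hypothetical names introduced for illustration, and no Spark types are used. It assumes the context simply remembers which datasets were evaluated during translation and skips them in the final run.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of "evaluate leaves, but never twice"; not the Beam class. */
public class EvaluationContextSketch {

  /** Stand-in for a lazily evaluated dataset; counts how often it is computed. */
  static final class LazyDataset {
    final String name;
    int evaluations = 0;

    LazyDataset(String name) {
      this.name = name;
    }

    void compute() {
      evaluations++;
    }
  }

  private final List<LazyDataset> leaves = new ArrayList<>();
  private final Set<LazyDataset> alreadyEvaluated = new HashSet<>();

  /** Registers a leaf dataset of the translated pipeline. */
  void addLeaf(LazyDataset ds) {
    leaves.add(ds);
  }

  /** Early evaluation during translation, e.g. to materialize a side input. */
  void evaluateEarly(LazyDataset ds) {
    ds.compute();
    alreadyEvaluated.add(ds);
  }

  /** Evaluates every leaf that has not been evaluated yet. */
  void evaluate() {
    for (LazyDataset ds : leaves) {
      // Set.add returns false if the dataset was already recorded as evaluated.
      if (alreadyEvaluated.add(ds)) {
        ds.compute();
      }
    }
  }

  public static void main(String[] args) {
    EvaluationContextSketch ctx = new EvaluationContextSketch();
    LazyDataset sideInput = new LazyDataset("sideInput");
    LazyDataset output = new LazyDataset("output");
    ctx.addLeaf(sideInput);
    ctx.addLeaf(output);

    ctx.evaluateEarly(sideInput); // happens during translation
    ctx.evaluate();               // final run: only "output" is computed here

    System.out.println(sideInput.name + " evaluations: " + sideInput.evaluations);
    System.out.println(output.name + " evaluations: " + output.evaluations);
  }
}
```

In this model each dataset is computed exactly once, whether it was needed early during translation or only in the final run.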

  • Method Summary

    Modifier and Type
    Method
    Description
    static <T extends @NonNull Object> T[]
    collect(String name, org.apache.spark.sql.Dataset<T> ds)
    Utility to mark the evaluation of Spark actions, both during pipeline translation (when evaluation is required) and during the final evaluation of the pipeline.
    void
    evaluate()
    Trigger evaluation of all leaf datasets.
    static <T> void
    evaluate(String name, org.apache.spark.sql.Dataset<T> ds)
    Utility to mark the evaluation of Spark actions, both during pipeline translation (when evaluation is required) and during the final evaluation of the pipeline.
    org.apache.spark.sql.SparkSession
    getSparkSession()

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • evaluate

      public void evaluate()
      Trigger evaluation of all leaf datasets.
    • evaluate

      public static <T> void evaluate(String name, org.apache.spark.sql.Dataset<T> ds)
      Utility to mark the evaluation of Spark actions, both during pipeline translation (when evaluation is required) and during the final evaluation of the pipeline.
    • collect

      public static <T extends @NonNull Object> T[] collect(String name, org.apache.spark.sql.Dataset<T> ds)
      Utility to mark the evaluation of Spark actions, both during pipeline translation (when evaluation is required) and during the final evaluation of the pipeline.
    • getSparkSession

      public org.apache.spark.sql.SparkSession getSparkSession()