Interface DataflowPipelineOptions

All Superinterfaces:
ApplicationNameOptions, BigQueryOptions, DataflowPipelineDebugOptions, DataflowPipelineWorkerPoolOptions, DataflowProfilingOptions, DataflowStreamingPipelineOptions, DataflowWorkerLoggingOptions, ExperimentalOptions, FileStagingOptions, GcpOptions, GcsOptions, GoogleApiDebugOptions, HasDisplayData, MemoryMonitorOptions, PipelineOptions, PubsubOptions, StreamingOptions
All Known Subinterfaces:
DataflowWorkerHarnessOptions, TestDataflowPipelineOptions

Options that can be used to configure the DataflowRunner.
  • Method Details

    • getProject

      String getProject()
      Description copied from interface: GcpOptions
      Project id to use when launching jobs.
      Specified by:
      getProject in interface GcpOptions
    • setProject

      void setProject(String value)
      Specified by:
      setProject in interface GcpOptions
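      Example (a minimal sketch; the project ID is a placeholder):

       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       // Jobs launched with these options run in this Google Cloud project.
       options.setProject("my-gcp-project"); // placeholder project ID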
    • getStagingLocation

      String getStagingLocation()
      GCS path for staging local files, e.g. gs://bucket/object.

      Must be a valid Cloud Storage URL, beginning with the prefix "gs://".

      If getStagingLocation() is not set, it will default to GcpOptions.getGcpTempLocation(). GcpOptions.getGcpTempLocation() must be a valid GCS path.

    • setStagingLocation

      void setStagingLocation(String value)
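      Example (a minimal sketch; the bucket name is a placeholder):

       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       // Local files (e.g. the pipeline's JARs) are staged under this prefix.
       options.setStagingLocation("gs://your-bucket/staging");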
    • isUpdate

      boolean isUpdate()
      Whether to update the currently running pipeline with the same name as this one.
    • setUpdate

      void setUpdate(boolean value)
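      Example (a minimal sketch; assumes a streaming job with the same job name is already running):

       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       options.setJobName("my-streaming-job"); // placeholder name of the running job
       // Submitting the pipeline now replaces the running job in place.
       options.setUpdate(true);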
    • getCreateFromSnapshot

      String getCreateFromSnapshot()
      If set, the snapshot from which the job should be created.
    • setCreateFromSnapshot

      void setCreateFromSnapshot(String value)
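      Example (a minimal sketch; the snapshot ID is a placeholder for one taken from an existing Dataflow job):

       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       // The new job starts from the state captured in this snapshot.
       options.setCreateFromSnapshot("SNAPSHOT_ID"); // placeholder snapshot ID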
    • getTemplateLocation

      String getTemplateLocation()
      Where the runner should generate a template file. Must be either a local path or a Cloud Storage path.
    • setTemplateLocation

      void setTemplateLocation(String value)
      Sets the Cloud Storage path where the Dataflow template will be stored. Required for creating Flex Templates or Classic Templates.

      Example:

      
       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       options.setTemplateLocation("gs://your-bucket/templates/my-template");
       
      Parameters:
      value - Cloud Storage path for storing the Dataflow template.
    • getDataflowServiceOptions

      List<String> getDataflowServiceOptions()
      Service options are set by the user and configure the service. This decouples service-side feature availability from the Apache Beam release cycle.
    • setDataflowServiceOptions

      void setDataflowServiceOptions(List<String> options)
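      Example (a minimal sketch; the service option name is illustrative and must match one currently supported by the Dataflow service):

       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       // Each entry is passed through to the service as-is.
       options.setDataflowServiceOptions(java.util.Arrays.asList("enable_google_cloud_profiler"));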
    • getServiceAccount

      String getServiceAccount()
      Run the job as a specific service account, instead of the default GCE robot.
    • setServiceAccount

      void setServiceAccount(String value)
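      Example (a minimal sketch; the service account email is a placeholder):

       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       // Workers run as this account instead of the default Compute Engine service account.
       options.setServiceAccount("worker-sa@my-project.iam.gserviceaccount.com");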
    • getRegion

      String getRegion()
      The Google Compute Engine region for creating Dataflow jobs.
    • setRegion

      void setRegion(String region)
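      Example (a minimal sketch):

       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       // The job is created in this Dataflow region.
       options.setRegion("us-central1");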
    • getDataflowEndpoint

      @String("") String getDataflowEndpoint()
      Dataflow endpoint to use.

      Defaults to the current version of the Google Cloud Dataflow API at the time the current SDK version was released.

      If the string contains "://", then this is treated as a URL, otherwise DataflowPipelineDebugOptions.getApiRootUrl() is used as the root URL.

      Specified by:
      getDataflowEndpoint in interface DataflowPipelineDebugOptions
    • setDataflowEndpoint

      void setDataflowEndpoint(String value)
      Specified by:
      setDataflowEndpoint in interface DataflowPipelineDebugOptions
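      Example (a minimal sketch illustrating the URL interpretation described above):

       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       // The value contains "://", so it is used as a complete URL rather than
       // being resolved against DataflowPipelineDebugOptions.getApiRootUrl().
       options.setDataflowEndpoint("https://dataflow.googleapis.com/");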
    • getLabels

      Map<String,String> getLabels()
      Labels that will be applied to the billing records for this job.
    • setLabels

      void setLabels(Map<String,String> labels)
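      Example (a minimal sketch; the label keys and values are illustrative):

       Map<String, String> labels = new java.util.HashMap<>();
       labels.put("team", "data-platform");
       labels.put("env", "prod");
       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       // The labels appear on the job's billing records.
       options.setLabels(labels);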
    • getPipelineUrl

      String getPipelineUrl()
      The URL of the staged portable pipeline.
    • setPipelineUrl

      void setPipelineUrl(String urlString)
    • getDataflowWorkerJar

      String getDataflowWorkerJar()
    • setDataflowWorkerJar

      void setDataflowWorkerJar(String dataflowWorkerJar)
    • getFlexRSGoal

      DataflowPipelineOptions.FlexResourceSchedulingGoal getFlexRSGoal()
      This option controls Flexible Resource Scheduling mode.
    • setFlexRSGoal

      void setFlexRSGoal(DataflowPipelineOptions.FlexResourceSchedulingGoal goal)
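      Example (a minimal sketch; assumes the FlexResourceSchedulingGoal enum defines a COST_OPTIMIZED constant):

       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       // Opt a batch job into FlexRS, trading scheduling delay for lower cost.
       options.setFlexRSGoal(DataflowPipelineOptions.FlexResourceSchedulingGoal.COST_OPTIMIZED);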
    • isHotKeyLoggingEnabled

      boolean isHotKeyLoggingEnabled()
      If enabled, the literal key will be logged to Cloud Logging when a hot key is detected.
    • setHotKeyLoggingEnabled

      void setHotKeyLoggingEnabled(boolean value)
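      Example (a minimal sketch; note that logged keys may contain sensitive data):

       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       // When Dataflow detects a hot key, the literal key is written to Cloud Logging.
       options.setHotKeyLoggingEnabled(true);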
    • getJdkAddOpenModules

      List<String> getJdkAddOpenModules()
      Open modules needed for reflection that accesses JDK internals with Java 9+.

      With JDK 16+, JDK internals are strongly encapsulated and can result in an InaccessibleObjectException being thrown if a tool or library uses reflection that accesses JDK internals. If you see these errors in your worker logs, you can pass in modules to open using the format module/package=target-module(,target-module)* to allow access to the library, e.g. java.base/java.lang=jamm.

      You may see warnings that jamm, a library used to more accurately size objects, is unable to make a private field accessible. To resolve the warning, open the specified module/package to jamm.

    • setJdkAddOpenModules

      void setJdkAddOpenModules(List<String> options)
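      Example (a minimal sketch; opens java.base/java.lang to jamm, matching the format described above):

       DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
       // Each entry uses the module/package=target-module format.
       options.setJdkAddOpenModules(java.util.Arrays.asList("java.base/java.lang=jamm"));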