public interface DataflowPipelineOptions extends PipelineOptions, GcpOptions, ApplicationNameOptions, DataflowPipelineDebugOptions, DataflowPipelineWorkerPoolOptions, BigQueryOptions, GcsOptions, StreamingOptions, DataflowWorkerLoggingOptions, DataflowStreamingPipelineOptions, DataflowProfilingOptions, PubsubOptions
DataflowRunner.| Modifier and Type | Interface and Description | 
|---|---|
| static class  | DataflowPipelineOptions.FlexResourceSchedulingGoalSet of available Flexible Resource Scheduling goals. | 
| static class  | DataflowPipelineOptions.StagingLocationFactoryReturns a default staging location under  GcpOptions.getGcpTempLocation(). | 
DataflowPipelineDebugOptions.DataflowClientFactory, DataflowPipelineDebugOptions.StagerFactory, DataflowPipelineDebugOptions.UnboundedReaderMaxReadTimeFactoryDataflowPipelineWorkerPoolOptions.AutoscalingAlgorithmTypeGcsOptions.ExecutorServiceFactory, GcsOptions.PathValidatorFactoryDataflowWorkerLoggingOptions.Level, DataflowWorkerLoggingOptions.WorkerLogLevelOverridesDataflowStreamingPipelineOptions.EnableWindmillServiceDirectPathFactory, DataflowStreamingPipelineOptions.GlobalConfigRefreshPeriodFactory, DataflowStreamingPipelineOptions.HarnessUpdateReportingPeriodFactory, DataflowStreamingPipelineOptions.LocalWindmillHostportFactory, DataflowStreamingPipelineOptions.MaxStackTraceDepthToReportFactory, DataflowStreamingPipelineOptions.PeriodicStatusPageDirectoryFactory, DataflowStreamingPipelineOptions.WindmillServiceStreamingRpcBatchLimitFactoryDataflowProfilingOptions.DataflowProfilingAgentConfigurationGcpOptions.DefaultProjectFactory, GcpOptions.EnableStreamingEngineFactory, GcpOptions.GcpOAuthScopesFactory, GcpOptions.GcpTempLocationFactory, GcpOptions.GcpUserCredentialsFactoryGoogleApiDebugOptions.GoogleApiTracerSTATE_CACHE_SIZE, STATE_SAMPLING_PERIOD_MILLISSTREAMING_ENGINE_EXPERIMENT, WINDMILL_SERVICE_EXPERIMENT| Modifier and Type | Method and Description | 
|---|---|
| java.lang.String | getCreateFromSnapshot()If set, the snapshot from which the job should be created. | 
| java.lang.String | getDataflowEndpoint()Dataflow endpoint to use. | 
| java.util.List<java.lang.String> | getDataflowServiceOptions()Service options are set by the user and configure the service. | 
| java.lang.String | getDataflowWorkerJar() | 
| DataflowPipelineOptions.FlexResourceSchedulingGoal | getFlexRSGoal()This option controls Flexible Resource Scheduling mode. | 
| java.util.List<java.lang.String> | getJdkAddOpenModules()Open modules needed for reflection that access JDK internals with Java 9+ | 
| java.util.Map<java.lang.String,java.lang.String> | getLabels()Labels that will be applied to the billing records for this job. | 
| java.lang.String | getPipelineUrl()The URL of the staged portable pipeline. | 
| java.lang.String | getProject()Project id to use when launching jobs. | 
| java.lang.String | getRegion()The Google Compute Engine region for
 creating Dataflow jobs. | 
| java.lang.String | getServiceAccount()Run the job as a specific service account, instead of the default GCE robot. | 
| java.lang.String | getStagingLocation()GCS path for staging local files, e.g. | 
| java.lang.String | getTemplateLocation()Where the runner should generate a template file. | 
| boolean | isHotKeyLoggingEnabled()If enabled then the literal key will be logged to Cloud Logging if a hot key is detected. | 
| boolean | isUpdate()Whether to update the currently running pipeline with the same name as this one. | 
| void | setCreateFromSnapshot(java.lang.String value) | 
| void | setDataflowEndpoint(java.lang.String value) | 
| void | setDataflowServiceOptions(java.util.List<java.lang.String> options) | 
| void | setDataflowWorkerJar(java.lang.String dataflowWorkerJar) | 
| void | setFlexRSGoal(DataflowPipelineOptions.FlexResourceSchedulingGoal goal) | 
| void | setHotKeyLoggingEnabled(boolean value) | 
| void | setJdkAddOpenModules(java.util.List<java.lang.String> options) | 
| void | setLabels(java.util.Map<java.lang.String,java.lang.String> labels) | 
| void | setPipelineUrl(java.lang.String urlString) | 
| void | setProject(java.lang.String value) | 
| void | setRegion(java.lang.String region) | 
| void | setServiceAccount(java.lang.String value) | 
| void | setStagingLocation(java.lang.String value) | 
| void | setTemplateLocation(java.lang.String value) | 
| void | setUpdate(boolean value) | 
getApiRootUrl, getDataflowClient, getDataflowJobFile, getDesiredNumUnboundedSourceSplits, getDumpHeapOnOOM, getJfrRecordingDurationSec, getNumberOfWorkerHarnessThreads, getReaderCacheTimeoutSec, getRecordJfrOnGcThrashing, getSaveHeapDumpsToGcsPath, getSdkHarnessContainerImageOverrides, getStager, getStagerClass, getTransformNameMapping, getUnboundedReaderMaxElements, getUnboundedReaderMaxReadTimeMs, getUnboundedReaderMaxReadTimeSec, getUnboundedReaderMaxWaitForElementsMs, getWorkerCacheMb, setApiRootUrl, setDataflowClient, setDataflowJobFile, setDesiredNumUnboundedSourceSplits, setDumpHeapOnOOM, setJfrRecordingDurationSec, setNumberOfWorkerHarnessThreads, setReaderCacheTimeoutSec, setRecordJfrOnGcThrashing, setSaveHeapDumpsToGcsPath, setSdkHarnessContainerImageOverrides, setStager, setStagerClass, setTransformNameMapping, setUnboundedReaderMaxElements, setUnboundedReaderMaxReadTimeMs, setUnboundedReaderMaxReadTimeSec, setUnboundedReaderMaxWaitForElementsMs, setWorkerCacheMbaddExperiment, getExperiments, getExperimentValue, hasExperiment, setExperimentsgetGCThrashingPercentagePerPeriod, setGCThrashingPercentagePerPeriodgetAutoscalingAlgorithm, getDiskSizeGb, getMaxNumWorkers, getMinCpuPlatform, getNetwork, getNumWorkers, getSdkContainerImage, getSubnetwork, getUsePublicIps, getWorkerDiskType, getWorkerHarnessContainerImage, getWorkerMachineType, setAutoscalingAlgorithm, setDiskSizeGb, setMaxNumWorkers, setMinCpuPlatform, setNetwork, setNumWorkers, setSdkContainerImage, setSubnetwork, setUsePublicIps, setWorkerDiskType, setWorkerHarnessContainerImage, setWorkerMachineTypegetFilesToStage, setFilesToStagegetBigQueryEndpoint, getBigQueryProject, getBqStreamingApiLoggingFrequencySec, getEnableStorageReadApiV2, getHTTPReadTimeout, getHTTPWriteTimeout, getInsertBundleParallelism, getJobLabelsMap, getMaxBufferingDurationMilliSec, getMaxConnectionPoolConnections, getMaxStreamingBatchSize, getMaxStreamingRowsToBatch, getMinConnectionPoolConnections, getNumStorageWriteApiStreamAppendClients, getNumStorageWriteApiStreams, getNumStreamingKeys, getStorageApiAppendThresholdBytes, getStorageApiAppendThresholdRecordCount, getStorageWriteApiMaxRequestSize, getStorageWriteApiMaxRetries, getStorageWriteApiTriggeringFrequencySec, getStorageWriteMaxInflightBytes, getStorageWriteMaxInflightRequests, getTempDatasetId, getUseStorageApiConnectionPool, getUseStorageWriteApi, getUseStorageWriteApiAtLeastOnce, setBigQueryEndpoint, setBigQueryProject, setBqStreamingApiLoggingFrequencySec, setEnableStorageReadApiV2, setHTTPReadTimeout, setHTTPWriteTimeout, setInsertBundleParallelism, setJobLabelsMap, setMaxBufferingDurationMilliSec, setMaxConnectionPoolConnections, setMaxStreamingBatchSize, setMaxStreamingRowsToBatch, setMinConnectionPoolConnections, setNumStorageWriteApiStreamAppendClients, setNumStorageWriteApiStreams, setNumStreamingKeys, setStorageApiAppendThresholdBytes, setStorageApiAppendThresholdRecordCount, setStorageWriteApiMaxRequestSize, setStorageWriteApiMaxRetries, setStorageWriteApiTriggeringFrequencySec, setStorageWriteMaxInflightBytes, setStorageWriteMaxInflightRequests, setTempDatasetId, setUseStorageApiConnectionPool, setUseStorageWriteApi, setUseStorageWriteApiAtLeastOncegetEnableBucketReadMetricCounter, getEnableBucketWriteMetricCounter, getExecutorService, getGcsEndpoint, getGcsHttpRequestReadTimeout, getGcsHttpRequestWriteTimeout, getGcsPerformanceMetrics, getGcsReadCounterPrefix, getGcsRewriteDataOpBatchLimit, getGcsUploadBufferSizeBytes, getGcsUtil, getGcsWriteCounterPrefix, getGoogleCloudStorageReadOptions, getPathValidator, getPathValidatorClass, setEnableBucketReadMetricCounter, setEnableBucketWriteMetricCounter, setExecutorService, setGcsEndpoint, setGcsHttpRequestReadTimeout, setGcsHttpRequestWriteTimeout, setGcsPerformanceMetrics, setGcsReadCounterPrefix, setGcsRewriteDataOpBatchLimit, setGcsUploadBufferSizeBytes, setGcsUtil, setGcsWriteCounterPrefix, setGoogleCloudStorageReadOptions, setPathValidator, setPathValidatorClassgetDefaultWorkerLogLevel, getWorkerLogLevelOverrides, getWorkerSystemErrMessageLevel, getWorkerSystemOutMessageLevel, setDefaultWorkerLogLevel, setWorkerLogLevelOverrides, setWorkerSystemErrMessageLevel, setWorkerSystemOutMessageLevelgetActiveWorkRefreshPeriodMillis, getChannelzShowOnlyWindmillServiceChannels, getGlobalConfigRefreshPeriod, getIsWindmillServiceDirectPathEnabled, getLocalWindmillHostport, getMaxBundlesFromWindmillOutstanding, getMaxBytesFromWindmillOutstanding, getMaxStackTraceDepthToReport, getOverrideWindmillBinary, getPeriodicStatusPageOutputDirectory, getPerWorkerMetricsUpdateReportingPeriodMillis, getStreamingSideInputCacheExpirationMillis, getStreamingSideInputCacheMb, getStuckCommitDurationMillis, getUseSeparateWindmillHeartbeatStreams, getUseWindmillIsolatedChannels, getWindmillGetDataStreamCount, getWindmillHarnessUpdateReportingPeriod, getWindmillMessagesBetweenIsReadyChecks, getWindmillServiceCommitThreads, getWindmillServiceEndpoint, getWindmillServicePort, getWindmillServiceRpcChannelAliveTimeoutSec, getWindmillServiceStreamingLogEveryNStreamFailures, getWindmillServiceStreamingRpcBatchLimit, getWindmillServiceStreamingRpcHealthCheckPeriodMs, getWindmillServiceStreamMaxBackoffMillis, setActiveWorkRefreshPeriodMillis, setChannelzShowOnlyWindmillServiceChannels, setGlobalConfigRefreshPeriod, setIsWindmillServiceDirectPathEnabled, setLocalWindmillHostport, setMaxBundlesFromWindmillOutstanding, setMaxBytesFromWindmillOutstanding, setMaxStackTraceDepthToReport, setOverrideWindmillBinary, setPeriodicStatusPageOutputDirectory, setPerWorkerMetricsUpdateReportingPeriodMillis, setStreamingSideInputCacheExpirationMillis, setStreamingSideInputCacheMb, setStuckCommitDurationMillis, setUseSeparateWindmillHeartbeatStreams, setUseWindmillIsolatedChannels, setWindmillGetDataStreamCount, setWindmillHarnessUpdateReportingPeriod, setWindmillMessagesBetweenIsReadyChecks, setWindmillServiceCommitThreads, setWindmillServiceEndpoint, setWindmillServicePort, setWindmillServiceRpcChannelAliveTimeoutSec, setWindmillServiceStreamingLogEveryNStreamFailures, setWindmillServiceStreamingRpcBatchLimit, setWindmillServiceStreamingRpcHealthCheckPeriodMs, setWindmillServiceStreamMaxBackoffMillisgetProfilingAgentConfiguration, getSaveProfilesToGcs, setProfilingAgentConfiguration, setSaveProfilesToGcsgetPubsubRootUrl, setPubsubRootUrl, targetForRootUrlgetCredentialFactoryClass, getDataflowKmsKey, getGcpCredential, getGcpOauthScopes, getGcpTempLocation, getImpersonateServiceAccount, getWorkerRegion, getWorkerZone, getZone, isEnableStreamingEngine, setCredentialFactoryClass, setDataflowKmsKey, setEnableStreamingEngine, setGcpCredential, setGcpOauthScopes, setGcpTempLocation, setImpersonateServiceAccount, setWorkerRegion, setWorkerZone, setZonegetGoogleApiTrace, setGoogleApiTrace@Validation.Required @Default.InstanceFactory(value=GcpOptions.DefaultProjectFactory.class) java.lang.String getProject()
GcpOptionsgetProject in interface GcpOptionsvoid setProject(java.lang.String value)
setProject in interface GcpOptions@Default.InstanceFactory(value=DataflowPipelineOptions.StagingLocationFactory.class) java.lang.String getStagingLocation()
Must be a valid Cloud Storage URL, beginning with the prefix "gs://"
If getStagingLocation() is not set, it will default to GcpOptions.getGcpTempLocation(). GcpOptions.getGcpTempLocation() must be a valid GCS
 path.
void setStagingLocation(java.lang.String value)
boolean isUpdate()
void setUpdate(boolean value)
java.lang.String getCreateFromSnapshot()
void setCreateFromSnapshot(java.lang.String value)
java.lang.String getTemplateLocation()
void setTemplateLocation(java.lang.String value)
java.util.List<java.lang.String> getDataflowServiceOptions()
void setDataflowServiceOptions(java.util.List<java.lang.String> options)
java.lang.String getServiceAccount()
void setServiceAccount(java.lang.String value)
@Default.InstanceFactory(value=DefaultGcpRegionFactory.class) java.lang.String getRegion()
void setRegion(java.lang.String region)
@Default.String(value="") java.lang.String getDataflowEndpoint()
Defaults to the current version of the Google Cloud Dataflow API, at the time the current SDK version was released.
If the string contains "://", then this is treated as a URL, otherwise DataflowPipelineDebugOptions.getApiRootUrl() is used as the root URL.
getDataflowEndpoint in interface DataflowPipelineDebugOptionsvoid setDataflowEndpoint(java.lang.String value)
setDataflowEndpoint in interface DataflowPipelineDebugOptionsjava.util.Map<java.lang.String,java.lang.String> getLabels()
void setLabels(java.util.Map<java.lang.String,java.lang.String> labels)
java.lang.String getPipelineUrl()
void setPipelineUrl(java.lang.String urlString)
java.lang.String getDataflowWorkerJar()
void setDataflowWorkerJar(java.lang.String dataflowWorkerJar)
@Default.Enum(value="UNSPECIFIED") DataflowPipelineOptions.FlexResourceSchedulingGoal getFlexRSGoal()
void setFlexRSGoal(DataflowPipelineOptions.FlexResourceSchedulingGoal goal)
boolean isHotKeyLoggingEnabled()
void setHotKeyLoggingEnabled(boolean value)
java.util.List<java.lang.String> getJdkAddOpenModules()
With JDK 16+, JDK internals are strongly encapsulated and can result in an InaccessibleObjectException being thrown if a tool or library uses reflection that access JDK internals. If you see these errors in your worker logs, you can pass in modules to open using the format module/package=target-module(,target-module)* to allow access to the library. E.g. java.base/java.lang=jamm
You may see warnings that jamm, a library used to more accurately size objects, is unable to make a private field accessible. To resolve the warning, open the specified module/package to jamm.
void setJdkAddOpenModules(java.util.List<java.lang.String> options)