public interface SparkPipelineOptions extends PipelineOptions, StreamingOptions, ApplicationNameOptions
A PipelineOptions interface that handles Spark execution-related configurations, such as the master address, batch interval, and other user-related knobs.

Modifier and Type | Interface and Description |
---|---|
static class | SparkPipelineOptions.TmpCheckpointDirFactory: Returns the default checkpoint directory of /tmp/${job.name}. |
Nested classes/interfaces inherited from interface PipelineOptions: PipelineOptions.AtomicLongFactory, PipelineOptions.CheckEnabled, PipelineOptions.DirectRunner, PipelineOptions.JobNameFactory, PipelineOptions.UserAgentFactory
Modifier and Type | Field and Description |
---|---|
static java.lang.String | DEFAULT_MASTER_URL |
Modifier and Type | Method and Description |
---|---|
java.lang.Long | getBatchIntervalMillis() |
java.lang.Long | getBundleSize() |
java.lang.String | getCheckpointDir() |
java.lang.Long | getCheckpointDurationMillis() |
java.lang.Boolean | getEnableSparkMetricSinks() |
java.util.List<java.lang.String> | getFilesToStage(): List of local files to make available to workers. |
java.lang.Long | getMaxRecordsPerBatch() |
java.lang.Long | getMinReadTimeMillis() |
java.lang.Double | getReadTimePercentage() |
java.lang.String | getSparkMaster() |
java.lang.String | getStorageLevel() |
boolean | getUsesProvidedSparkContext() |
boolean | isCacheDisabled() |
static void | prepareFilesToStageForRemoteClusterExecution(SparkPipelineOptions options): Local configurations work in the same JVM and have no problems with improperly formatted files on classpath (eg. |
void | setBatchIntervalMillis(java.lang.Long batchInterval) |
void | setBundleSize(java.lang.Long value) |
void | setCacheDisabled(boolean value) |
void | setCheckpointDir(java.lang.String checkpointDir) |
void | setCheckpointDurationMillis(java.lang.Long durationMillis) |
void | setEnableSparkMetricSinks(java.lang.Boolean enableSparkMetricSinks) |
void | setFilesToStage(java.util.List<java.lang.String> value) |
void | setMaxRecordsPerBatch(java.lang.Long maxRecordsPerBatch) |
void | setMinReadTimeMillis(java.lang.Long minReadTimeMillis) |
void | setReadTimePercentage(java.lang.Double readTimePercentage) |
void | setSparkMaster(java.lang.String master) |
void | setStorageLevel(java.lang.String storageLevel) |
void | setUsesProvidedSparkContext(boolean value) |
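Methods inherited from interface StreamingOptions: isStreaming, setStreaming
Methods inherited from interface ApplicationNameOptions: getAppName, setAppName
Methods inherited from interface PipelineOptions: as, getJobName, getOptionsId, getRunner, getStableUniqueNames, getTempLocation, getUserAgent, outputRuntimeOptions, setJobName, setOptionsId, setRunner, setStableUniqueNames, setTempLocation, setUserAgent
Methods inherited from interface HasDisplayData: populateDisplayData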
static final java.lang.String DEFAULT_MASTER_URL
@Default.String(value="local[4]") java.lang.String getSparkMaster()
void setSparkMaster(java.lang.String master)
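
The getter/setter pairs on this page are interface methods only; the value returned when nothing has been set comes from the @Default annotation on the getter. The following is a minimal, self-contained sketch of that pattern using a JDK dynamic proxy. The names OptionsProxySketch, DefaultString and MiniSparkOptions are hypothetical stand-ins invented for illustration and are not part of Beam; the real options machinery is considerably more involved.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

class OptionsProxySketch {
    // Hypothetical stand-in for an annotation such as @Default.String.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface DefaultString {
        String value();
    }

    // Hypothetical miniature of an options interface with one property.
    interface MiniSparkOptions {
        @DefaultString("local[4]")
        String getSparkMaster();

        void setSparkMaster(String master);
    }

    // Builds a proxy that stores values set through setters and falls back
    // to the annotated default when a getter is called before any setter.
    static MiniSparkOptions create() {
        Map<String, Object> values = new HashMap<>();
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (name.startsWith("set") && args != null && args.length == 1) {
                values.put(name.substring(3), args[0]);
                return null;
            }
            if (name.startsWith("get")) {
                String property = name.substring(3);
                if (values.containsKey(property)) {
                    return values.get(property);
                }
                DefaultString def = method.getAnnotation(DefaultString.class);
                return def == null ? null : def.value();
            }
            throw new UnsupportedOperationException(name);
        };
        return (MiniSparkOptions) Proxy.newProxyInstance(
            MiniSparkOptions.class.getClassLoader(),
            new Class<?>[] {MiniSparkOptions.class},
            handler);
    }

    public static void main(String[] args) {
        MiniSparkOptions options = create();
        System.out.println(options.getSparkMaster()); // default: local[4]
        options.setSparkMaster("local[2]");
        System.out.println(options.getSparkMaster()); // explicit: local[2]
    }
}
```

In this sketch an explicitly set value always wins over the annotated default, which is the behavior the signatures on this page suggest for getSparkMaster().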
@Default.Long(value=500L) java.lang.Long getBatchIntervalMillis()
void setBatchIntervalMillis(java.lang.Long batchInterval)
@Default.String(value="MEMORY_ONLY") java.lang.String getStorageLevel()
void setStorageLevel(java.lang.String storageLevel)
@Default.Long(value=200L) java.lang.Long getMinReadTimeMillis()
void setMinReadTimeMillis(java.lang.Long minReadTimeMillis)
@Default.Long(value=-1L) java.lang.Long getMaxRecordsPerBatch()
void setMaxRecordsPerBatch(java.lang.Long maxRecordsPerBatch)
@Default.Double(value=0.1) java.lang.Double getReadTimePercentage()
void setReadTimePercentage(java.lang.Double readTimePercentage)
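
This page lists only the defaults for getBatchIntervalMillis(), getMinReadTimeMillis() and getReadTimePercentage(), not how they interact. The sketch below assumes one plausible reading: the time spent reading in each micro-batch is the given fraction of the batch interval, but never less than the minimum read time. That rule, and the names ReadTimeSketch and readDurationMillis, are assumptions made for illustration and are not confirmed by this page.

```java
class ReadTimeSketch {
    // Assumed rule (not stated on this page): read for a fraction of the
    // batch interval, bounded below by the minimum read time.
    static long readDurationMillis(
            long batchIntervalMillis, double readTimePercentage, long minReadTimeMillis) {
        long proportional = Math.round(batchIntervalMillis * readTimePercentage);
        return Math.max(proportional, minReadTimeMillis);
    }

    public static void main(String[] args) {
        // Documented defaults: 500 ms interval, 0.1 fraction, 200 ms minimum.
        System.out.println(readDurationMillis(500L, 0.1, 200L));
        // A longer interval makes the proportional share the larger term.
        System.out.println(readDurationMillis(5000L, 0.1, 200L));
    }
}
```

Under this assumption the documented defaults would be dominated by the 200 ms minimum, since 10% of a 500 ms batch is only 50 ms.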
@Default.InstanceFactory(value=SparkPipelineOptions.TmpCheckpointDirFactory.class) java.lang.String getCheckpointDir()
void setCheckpointDir(java.lang.String checkpointDir)
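
The default for getCheckpointDir() is produced by SparkPipelineOptions.TmpCheckpointDirFactory, described above as /tmp/${job.name}. A minimal sketch of that documented behavior follows; CheckpointDirSketch and its method names are hypothetical, and the code mirrors only the description on this page, not the factory's actual implementation.

```java
class CheckpointDirSketch {
    // Mirrors the documented default of /tmp/${job.name}.
    static String defaultCheckpointDir(String jobName) {
        return "/tmp/" + jobName;
    }

    // An explicitly configured directory takes precedence; the default is
    // only computed when nothing was set.
    static String resolveCheckpointDir(String explicitDir, String jobName) {
        return explicitDir != null ? explicitDir : defaultCheckpointDir(jobName);
    }

    public static void main(String[] args) {
        System.out.println(resolveCheckpointDir(null, "my-job"));
        System.out.println(resolveCheckpointDir("/data/checkpoints", "my-job"));
    }
}
```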
@Default.Long(value=-1L) java.lang.Long getCheckpointDurationMillis()
void setCheckpointDurationMillis(java.lang.Long durationMillis)
@Default.Long(value=0L) java.lang.Long getBundleSize()
@Experimental void setBundleSize(java.lang.Long value)
@Default.Boolean(value=true) java.lang.Boolean getEnableSparkMetricSinks()
void setEnableSparkMetricSinks(java.lang.Boolean enableSparkMetricSinks)
@Default.Boolean(value=false) boolean getUsesProvidedSparkContext()
void setUsesProvidedSparkContext(boolean value)
java.util.List<java.lang.String> getFilesToStage()
Jars are placed on the worker's classpath.
The default value is the list of jars from the main program's classpath.
void setFilesToStage(java.util.List<java.lang.String> value)
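
Since the default for getFilesToStage() is described as the list of jars from the main program's classpath, it can help to see what that classpath contains. The sketch below simply splits the standard java.class.path system property into entries. It is an illustration only: ClasspathSketch is a hypothetical name, and the runner's own detection logic may differ from reading this property.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class ClasspathSketch {
    // Splits a classpath string on the platform path separator and drops
    // empty entries.
    static List<String> classpathEntries(String classpath) {
        List<String> entries = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Print every entry on the classpath of the running JVM.
        for (String entry : classpathEntries(System.getProperty("java.class.path"))) {
            System.out.println(entry);
        }
    }
}
```

A list built this way could be passed to setFilesToStage(java.util.List<java.lang.String> value) after removing entries that should not be shipped to workers.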
@Default.Boolean(value=false) boolean isCacheDisabled()
void setCacheDisabled(boolean value)
static void prepareFilesToStageForRemoteClusterExecution(SparkPipelineOptions options)