apache_beam.options.pipeline_options_validator module¶
Pipeline options validator.
For internal use only; no backwards-compatibility guarantees.
-
class
apache_beam.options.pipeline_options_validator.
PipelineOptionsValidator
(options, runner)[source]¶ Bases:
object
Validates PipelineOptions.
Goes through a list of known PipelineOptions subclasses and calls:
validate(validator)
if one is implemented. Collects the validation errors from every subclass and returns them as a single aggregated list.
-
OPTIONS
= [<class 'apache_beam.options.pipeline_options.DebugOptions'>, <class 'apache_beam.options.pipeline_options.GoogleCloudOptions'>, <class 'apache_beam.options.pipeline_options.PortableOptions'>, <class 'apache_beam.options.pipeline_options.SetupOptions'>, <class 'apache_beam.options.pipeline_options.StandardOptions'>, <class 'apache_beam.options.pipeline_options.TestOptions'>, <class 'apache_beam.options.pipeline_options.TypeOptions'>, <class 'apache_beam.options.pipeline_options.WorkerOptions'>]¶
-
REQUIRED_ENVIRONMENT_OPTIONS
= {'DOCKER': [], 'EXTERNAL': ['external_service_address'], 'LOOPBACK': [], 'PROCESS': ['process_command']}¶
-
OPTIONAL_ENVIRONMENT_OPTIONS
= {'DOCKER': ['docker_container_image'], 'EXTERNAL': [], 'LOOPBACK': [], 'PROCESS': ['process_variables']}¶
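The two environment tables can be read as a lookup keyed by environment type. As a rough standalone sketch (not Beam's own validation logic, which this page does not show), the hypothetical helper `check_environment_options` below reports required options that are missing and options that belong only to another environment type. The dictionaries and message templates are copied from the constants on this page; the helper name and the plain-dict input are assumptions made for illustration.

```python
REQUIRED_ENVIRONMENT_OPTIONS = {
    'DOCKER': [],
    'EXTERNAL': ['external_service_address'],
    'LOOPBACK': [],
    'PROCESS': ['process_command'],
}
OPTIONAL_ENVIRONMENT_OPTIONS = {
    'DOCKER': ['docker_container_image'],
    'EXTERNAL': [],
    'LOOPBACK': [],
    'PROCESS': ['process_variables'],
}
ERR_INVALID_ENVIRONMENT = (
    'Option %s is not compatible with environment type %s.')
ERR_MISSING_REQUIRED_ENVIRONMENT_OPTION = (
    'Option %s is required for environment type %s.')


def check_environment_options(environment_type, set_options):
    """Hypothetical helper; set_options maps option names to values."""
    required = REQUIRED_ENVIRONMENT_OPTIONS[environment_type]
    allowed = set(required) | set(
        OPTIONAL_ENVIRONMENT_OPTIONS[environment_type])

    # Every option name that appears in either table, for any type.
    known = set()
    for table in (REQUIRED_ENVIRONMENT_OPTIONS, OPTIONAL_ENVIRONMENT_OPTIONS):
        for names in table.values():
            known.update(names)

    errors = []
    for name in required:
        if set_options.get(name) is None:
            errors.append(
                ERR_MISSING_REQUIRED_ENVIRONMENT_OPTION
                % (name, environment_type))
    # Options that exist only for other environment types are rejected.
    for name in sorted(known - allowed):
        if set_options.get(name) is not None:
            errors.append(ERR_INVALID_ENVIRONMENT % (name, environment_type))
    return errors


# Placeholder image name, used only to trigger the incompatibility check.
environment_errors = check_environment_options(
    'PROCESS', {'docker_container_image': 'example/image:latest'})
```

With the PROCESS type and only a Docker image set, the sketch yields one "required" error for process_command and one "not compatible" error for docker_container_image.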
-
ERR_MISSING_OPTION
= 'Missing required option: %s.'¶
-
ERR_MISSING_GCS_PATH
= 'Missing GCS path option: %s.'¶
-
ERR_INVALID_GCS_PATH
= 'Invalid GCS path (%s), given for the option: %s.'¶
-
ERR_INVALID_GCS_BUCKET
= 'Invalid GCS bucket (%s), given for the option: %s. See https://developers.google.com/storage/docs/bucketnaming for more details.'¶
-
ERR_INVALID_GCS_OBJECT
= 'Invalid GCS object (%s), given for the option: %s.'¶
-
ERR_INVALID_JOB_NAME
= 'Invalid job_name (%s); the name must consist of only the characters [-a-z0-9], starting with a letter and ending with a letter or number'¶
-
ERR_INVALID_PROJECT_NUMBER
= 'Invalid Project ID (%s). Please make sure you specified the Project ID, not project number.'¶
-
ERR_INVALID_PROJECT_ID
= 'Invalid Project ID (%s). Please make sure you specified the Project ID, not project description.'¶
-
ERR_INVALID_ENDPOINT
= 'Invalid url (%s) for dataflow endpoint. Please provide a valid url.'¶
-
ERR_INVALID_NOT_POSITIVE
= 'Invalid value (%s) for option: %s. Value needs to be positive.'¶
-
ERR_INVALID_TEST_MATCHER_TYPE
= 'Invalid value (%s) for option: %s. Please extend your matcher object from hamcrest.core.base_matcher.BaseMatcher.'¶
-
ERR_INVALID_TEST_MATCHER_UNPICKLABLE
= 'Invalid value (%s) for option: %s. Please make sure the test matcher is unpicklable.'¶
-
ERR_INVALID_TRANSFORM_NAME_MAPPING
= 'Invalid transform name mapping format. Please make sure the mapping is string key-value pairs. Invalid pair: (%s:%s)'¶
-
ERR_INVALID_ENVIRONMENT
= 'Option %s is not compatible with environment type %s.'¶
-
ERR_ENVIRONMENT_CONFIG
= 'Option environment_config is incompatible with option(s) %s.'¶
-
ERR_MISSING_REQUIRED_ENVIRONMENT_OPTION
= 'Option %s is required for environment type %s.'¶
-
ERR_NUM_WORKERS_TOO_HIGH
= 'num_workers (%s) cannot exceed max_num_workers (%s)'¶
-
ERR_REPEATABLE_OPTIONS_NOT_SET_AS_LIST
= '(%s) is a string. Programmatically set PipelineOptions like (%s) options need to be specified as a list.'¶
-
GCS_URI
= '(?P<SCHEME>[^:]+)://(?P<BUCKET>[^/]+)(/(?P<OBJECT>.*))?'¶
-
GCS_BUCKET
= '^[a-z0-9][-_a-z0-9.]+[a-z0-9]$'¶
-
GCS_SCHEME
= 'gs'¶
-
JOB_PATTERN
= '[a-z]([-a-z0-9]*[a-z0-9])?'¶
-
PROJECT_ID_PATTERN
= '[a-z][-a-z0-9:.]+[a-z0-9]'¶
-
PROJECT_NUMBER_PATTERN
= '[0-9]*'¶
-
validate
()[source]¶ Calls validate on subclasses and returns a list of errors.
validate calls the validate method on each subclass that implements one, accumulates the returned lists of errors, and returns the aggregate list.
Returns: Aggregate list of errors after calling all possible validate methods.
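The aggregation described here can be sketched in isolation. This is a minimal illustration, not Beam's implementation: the three option classes and the `aggregate_errors` function are hypothetical stand-ins, and the only behavior taken from this page is that validate(validator) is called where it is implemented and the returned lists are combined.

```python
class HypotheticalOptionsA:
    def validate(self, validator):
        # Pretend one required option is missing.
        return ['Missing required option: project.']


class HypotheticalOptionsB:
    # No validate method is implemented, so this class is skipped.
    pass


class HypotheticalOptionsC:
    def validate(self, validator):
        # Nothing wrong here: an empty list means no errors.
        return []


def aggregate_errors(option_views, validator=None):
    """Hypothetical stand-in for the aggregation step."""
    errors = []
    for view in option_views:
        if callable(getattr(view, 'validate', None)):
            errors.extend(view.validate(validator))
    return errors


all_errors = aggregate_errors(
    [HypotheticalOptionsA(), HypotheticalOptionsB(), HypotheticalOptionsC()])
```

An empty result means every implemented validate method passed; callers would typically treat a non-empty list as a reason to stop before submitting the pipeline.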
-
is_full_string_match
(pattern, string)[source]¶ Returns True if the pattern matches the whole string.
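Whole-string matching matters because patterns such as JOB_PATTERN carry no anchors of their own. The sketch below uses the standard library's `re.fullmatch` to show the idea; the function name `full_string_match_sketch` is hypothetical, and this is not necessarily how the method is written in Beam. The patterns are copied from the constants on this page.

```python
import re

JOB_PATTERN = '[a-z]([-a-z0-9]*[a-z0-9])?'
PROJECT_NUMBER_PATTERN = '[0-9]*'


def full_string_match_sketch(pattern, string):
    """Returns True only if the pattern consumes the entire string."""
    return re.fullmatch(pattern, string) is not None


# A prefix-only match is not enough: 'job-' starts like a valid job name
# but ends with a hyphen, so the whole-string check rejects it.
valid_job = full_string_match_sketch(JOB_PATTERN, 'my-job-1')
trailing_hyphen = full_string_match_sketch(JOB_PATTERN, 'job-')
looks_like_number = full_string_match_sketch(PROJECT_NUMBER_PATTERN, '12345')
```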
-
validate_gcs_path
(view, arg_name)[source]¶ Validates a GCS path against gs://bucket/object URI format.
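To make the gs://bucket/object check concrete, here is a simplified standalone sketch built from the GCS_URI, GCS_BUCKET, and GCS_SCHEME constants listed on this page. The helper name `check_gcs_path` is hypothetical, it takes the path string directly rather than an options view, and it leaves out any checks on the object part, so treat it as an approximation rather than the method's actual logic.

```python
import re

GCS_URI = '(?P<SCHEME>[^:]+)://(?P<BUCKET>[^/]+)(/(?P<OBJECT>.*))?'
GCS_BUCKET = '^[a-z0-9][-_a-z0-9.]+[a-z0-9]$'
GCS_SCHEME = 'gs'
ERR_MISSING_GCS_PATH = 'Missing GCS path option: %s.'
ERR_INVALID_GCS_PATH = 'Invalid GCS path (%s), given for the option: %s.'
ERR_INVALID_GCS_BUCKET = (
    'Invalid GCS bucket (%s), given for the option: %s. See '
    'https://developers.google.com/storage/docs/bucketnaming '
    'for more details.')


def check_gcs_path(path, arg_name):
    """Hypothetical helper: returns a list of error strings."""
    if path is None:
        return [ERR_MISSING_GCS_PATH % arg_name]
    match = re.match(GCS_URI, path, re.DOTALL)
    # The URI must parse and must use the gs scheme.
    if match is None or match.group('SCHEME').lower() != GCS_SCHEME:
        return [ERR_INVALID_GCS_PATH % (path, arg_name)]
    bucket = match.group('BUCKET')
    # The bucket name must match the bucket pattern in full.
    if re.fullmatch(GCS_BUCKET, bucket) is None:
        return [ERR_INVALID_GCS_BUCKET % (bucket, arg_name)]
    return []


ok_errors = check_gcs_path('gs://my-bucket/tmp', 'temp_location')
scheme_errors = check_gcs_path('s3://my-bucket/tmp', 'temp_location')
bucket_errors = check_gcs_path('gs://My_Bucket/tmp', 'temp_location')
```

The bucket name in the last call fails because GCS_BUCKET only allows lowercase letters, digits, hyphens, underscores, and dots.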
-
validate_worker_region_zone
(view)[source]¶ Validates that the Dataflow worker region and zone arguments are consistent.
-
validate_optional_argument_positive
(view, arg_name)[source]¶ Validates that an optional argument (if set) has a positive value.
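The "optional but positive" rule can be illustrated with a small sketch. It assumes the view is any object exposing options as attributes (a `SimpleNamespace` stands in here) and that an unset option is None; the helper name `check_optional_positive` is hypothetical. The message template is ERR_INVALID_NOT_POSITIVE from this page.

```python
from types import SimpleNamespace

ERR_INVALID_NOT_POSITIVE = (
    'Invalid value (%s) for option: %s. Value needs to be positive.')


def check_optional_positive(view, arg_name):
    """Hypothetical helper: unset (None) passes, values <= 0 fail."""
    value = getattr(view, arg_name, None)
    if value is not None and int(value) <= 0:
        return [ERR_INVALID_NOT_POSITIVE % (value, arg_name)]
    return []


# Stand-in for an options view: one option set to zero, one left unset.
worker_view = SimpleNamespace(num_workers=0, max_num_workers=None)
zero_errors = check_optional_positive(worker_view, 'num_workers')
unset_errors = check_optional_positive(worker_view, 'max_num_workers')
```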
-
validate_test_matcher
(view, arg_name)[source]¶ Validates the on_success_matcher argument if it is set.
Validates that on_success_matcher can be unpickled and is an instance of hamcrest.core.base_matcher.BaseMatcher.
-
validate_repeatable_argument_passed_as_list
(view, arg_name)[source]¶ Validates that repeatable PipelineOptions like dataflow_service_options or experiments are specified as a list when set programmatically. This way, users do not inadvertently specify them as a string; the list form mirrors how repeatable options are passed when set via the command line.
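A short sketch of the list check follows. It is an assumption-laden illustration rather than Beam's code: the helper name `check_repeatable_is_list` is hypothetical, a `SimpleNamespace` stands in for the options view, and 'some_experiment' is a placeholder value. The message template is ERR_REPEATABLE_OPTIONS_NOT_SET_AS_LIST from this page.

```python
from types import SimpleNamespace

ERR_REPEATABLE_OPTIONS_NOT_SET_AS_LIST = (
    '(%s) is a string. Programmatically set PipelineOptions like (%s) '
    'options need to be specified as a list.')


def check_repeatable_is_list(view, arg_name):
    """Hypothetical helper: a set value that is not a list is an error."""
    value = getattr(view, arg_name, None)
    if value is not None and not isinstance(value, list):
        return [ERR_REPEATABLE_OPTIONS_NOT_SET_AS_LIST % (value, arg_name)]
    return []


string_errors = check_repeatable_is_list(
    SimpleNamespace(experiments='some_experiment'), 'experiments')
list_errors = check_repeatable_is_list(
    SimpleNamespace(experiments=['some_experiment']), 'experiments')
```

The check is worth having because a Python string is itself iterable: code that loops over a repeatable option would silently see individual characters instead of one option value.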
-