apache_beam.options.pipeline_options_validator module
Pipeline options validator.
For internal use only; no backwards-compatibility guarantees.
- 
class apache_beam.options.pipeline_options_validator.PipelineOptionsValidator(options, runner)
- Bases: object. Validates PipelineOptions. Goes through a list of known PipelineOptions subclasses and calls validate(validator) on each one that implements it, then aggregates the validation errors from all of them and returns the combined list.
OPTIONS = [<class 'apache_beam.options.pipeline_options.DebugOptions'>, <class 'apache_beam.options.pipeline_options.GoogleCloudOptions'>, <class 'apache_beam.options.pipeline_options.PortableOptions'>, <class 'apache_beam.options.pipeline_options.SetupOptions'>, <class 'apache_beam.options.pipeline_options.StandardOptions'>, <class 'apache_beam.options.pipeline_options.TestOptions'>, <class 'apache_beam.options.pipeline_options.TypeOptions'>, <class 'apache_beam.options.pipeline_options.WorkerOptions'>]
 - 
REQUIRED_ENVIRONMENT_OPTIONS = {'DOCKER': [], 'EXTERNAL': ['external_service_address'], 'LOOPBACK': [], 'PROCESS': ['process_command']}
 - 
OPTIONAL_ENVIRONMENT_OPTIONS = {'DOCKER': ['docker_container_image'], 'EXTERNAL': [], 'LOOPBACK': [], 'PROCESS': ['process_variables']}
 - 
ERR_MISSING_OPTION = 'Missing required option: %s.'
 - 
ERR_MISSING_GCS_PATH = 'Missing GCS path option: %s.'
 - 
ERR_INVALID_GCS_PATH = 'Invalid GCS path (%s), given for the option: %s.'
 - 
ERR_INVALID_GCS_BUCKET = 'Invalid GCS bucket (%s), given for the option: %s. See https://developers.google.com/storage/docs/bucketnaming for more details.'
 - 
ERR_INVALID_GCS_OBJECT = 'Invalid GCS object (%s), given for the option: %s.'
 - 
ERR_INVALID_JOB_NAME = 'Invalid job_name (%s); the name must consist of only the characters [-a-z0-9], starting with a letter and ending with a letter or number'
 - 
ERR_INVALID_PROJECT_NUMBER = 'Invalid Project ID (%s). Please make sure you specified the Project ID, not project number.'
 - 
ERR_INVALID_PROJECT_ID = 'Invalid Project ID (%s). Please make sure you specified the Project ID, not project description.'
 - 
ERR_INVALID_ENDPOINT = 'Invalid url (%s) for dataflow endpoint. Please provide a valid url.'
 - 
ERR_INVALID_NOT_POSITIVE = 'Invalid value (%s) for option: %s. Value needs to be positive.'
 - 
ERR_INVALID_TEST_MATCHER_TYPE = 'Invalid value (%s) for option: %s. Please extend your matcher object from hamcrest.core.base_matcher.BaseMatcher.'
 - 
ERR_INVALID_TEST_MATCHER_UNPICKLABLE = 'Invalid value (%s) for option: %s. Please make sure the test matcher is unpicklable.'
 - 
ERR_INVALID_TRANSFORM_NAME_MAPPING = 'Invalid transform name mapping format. Please make sure the mapping is string key-value pairs. Invalid pair: (%s:%s)'
 - 
ERR_INVALID_ENVIRONMENT = 'Option %s is not compatible with environment type %s.'
 - 
ERR_ENVIRONMENT_CONFIG = 'Option environment_config is incompatible with option(s) %s.'
 - 
ERR_MISSING_REQUIRED_ENVIRONMENT_OPTION = 'Option %s is required for environment type %s.'
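The required-options table and this message plausibly fit together as sketched below. This is an illustration under assumptions, not the validator's actual code: the helper name missing_environment_options and the plain dict of provided options are hypothetical, and only the two constants are copied from this page.

```python
# Constants copied from the class attributes listed on this page.
REQUIRED_ENVIRONMENT_OPTIONS = {
    'DOCKER': [],
    'EXTERNAL': ['external_service_address'],
    'LOOPBACK': [],
    'PROCESS': ['process_command'],
}
ERR_MISSING_REQUIRED_ENVIRONMENT_OPTION = (
    'Option %s is required for environment type %s.')


def missing_environment_options(environment_type, provided):
    """Hypothetical helper: report required options that are unset or empty.

    `provided` is a plain dict of option name -> value, used here as a
    stand-in for a real options view.
    """
    return [
        ERR_MISSING_REQUIRED_ENVIRONMENT_OPTION % (name, environment_type)
        for name in REQUIRED_ENVIRONMENT_OPTIONS[environment_type]
        if not provided.get(name)
    ]
```

For example, calling the helper with 'PROCESS' and an empty dict yields one error naming process_command, while 'DOCKER' yields none because it has no required options.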
 - 
ERR_NUM_WORKERS_TOO_HIGH = 'num_workers (%s) cannot exceed max_num_workers (%s)'
 - 
ERR_REPEATABLE_OPTIONS_NOT_SET_AS_LIST = '(%s) is a string. Programmatically set PipelineOptions like (%s) options need to be specified as a list.'
 - 
GCS_URI = '(?P<SCHEME>[^:]+)://(?P<BUCKET>[^/]+)(/(?P<OBJECT>.*))?'
 - 
GCS_BUCKET = '^[a-z0-9][-_a-z0-9.]+[a-z0-9]$'
 - 
GCS_SCHEME = 'gs'
 - 
JOB_PATTERN = '[a-z]([-a-z0-9]*[a-z0-9])?'
 - 
PROJECT_ID_PATTERN = '[a-z][-a-z0-9:.]+[a-z0-9]'
 - 
PROJECT_NUMBER_PATTERN = '[0-9]*'
 - 
validate()
- Calls validate on subclasses and returns a list of errors. Invokes the validate method of each subclass that defines one, accumulates the returned errors, and returns the aggregate list. - Returns: - Aggregate list of errors after calling all available validate methods. 
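The aggregation pattern described here can be sketched as follows. This is a minimal, self-contained illustration rather than Beam's implementation: JobNameOptions, NoValidationOptions, and SketchValidator are hypothetical names, and only the ERR_MISSING_OPTION message is copied from this page.

```python
class JobNameOptions:
    """Hypothetical options class that implements validate(validator)."""

    def __init__(self, job_name):
        self.job_name = job_name

    def validate(self, validator):
        # Return a list of error strings; an empty list means "valid".
        if not self.job_name:
            return [validator.ERR_MISSING_OPTION % 'job_name']
        return []


class NoValidationOptions:
    """Hypothetical options class without a validate method; it is skipped."""


class SketchValidator:
    """Stand-in for the validator: collects errors from every options view."""

    ERR_MISSING_OPTION = 'Missing required option: %s.'

    def __init__(self, option_views):
        self.option_views = option_views

    def validate(self):
        errors = []
        for view in self.option_views:
            # Only call validate(validator) where it is implemented.
            if hasattr(view, 'validate'):
                errors.extend(view.validate(self))
        return errors
```

Returning a flat list lets the caller report every problem at once instead of stopping at the first failure.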
 - 
is_full_string_match(pattern, string)
- Returns True if the pattern matches the whole string. 
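A whole-string match differs from a prefix match, which matters for patterns such as JOB_PATTERN that carry no anchors of their own. The sketch below shows one way to get this behavior with the standard re module; using re.fullmatch is an assumption made for illustration and may not be how the method is actually written.

```python
import re

# Copied from the JOB_PATTERN class attribute on this page.
JOB_PATTERN = '[a-z]([-a-z0-9]*[a-z0-9])?'


def is_full_string_match(pattern, string):
    """Return True only if `pattern` matches all of `string`."""
    return re.fullmatch(pattern, string) is not None


# re.match would accept 'my-job!' because the prefix 'my-job' matches;
# a full-string match rejects it.
prefix_only = re.match(JOB_PATTERN, 'my-job!') is not None
whole_string = is_full_string_match(JOB_PATTERN, 'my-job!')
```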
 - 
validate_gcs_path(view, arg_name)
- Validates a GCS path against gs://bucket/object URI format. 
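As a rough sketch of what checking a path against the gs://bucket/object format can involve, the function below combines the GCS_URI, GCS_BUCKET, and GCS_SCHEME attributes with the related error messages from this page. The function name check_gcs_path, its signature (a path string rather than an options view), and the exact order of checks are assumptions for illustration.

```python
import re

# Patterns and messages copied from the class attributes on this page.
GCS_URI = '(?P<SCHEME>[^:]+)://(?P<BUCKET>[^/]+)(/(?P<OBJECT>.*))?'
GCS_BUCKET = '^[a-z0-9][-_a-z0-9.]+[a-z0-9]$'
GCS_SCHEME = 'gs'
ERR_MISSING_GCS_PATH = 'Missing GCS path option: %s.'
ERR_INVALID_GCS_PATH = 'Invalid GCS path (%s), given for the option: %s.'
ERR_INVALID_GCS_BUCKET = (
    'Invalid GCS bucket (%s), given for the option: %s. See '
    'https://developers.google.com/storage/docs/bucketnaming '
    'for more details.')
ERR_INVALID_GCS_OBJECT = 'Invalid GCS object (%s), given for the option: %s.'


def check_gcs_path(path, arg_name):
    """Hypothetical helper: return a list of errors for a GCS path string."""
    if path is None:
        return [ERR_MISSING_GCS_PATH % arg_name]
    match = re.fullmatch(GCS_URI, path, re.DOTALL)
    # The URI must parse and use the 'gs' scheme.
    if match is None or match.group('SCHEME').lower() != GCS_SCHEME:
        return [ERR_INVALID_GCS_PATH % (path, arg_name)]
    # The bucket must satisfy the bucket naming pattern.
    if re.fullmatch(GCS_BUCKET, match.group('BUCKET')) is None:
        return [ERR_INVALID_GCS_BUCKET % (path, arg_name)]
    # A bare 'gs://bucket' has no object component at all.
    if match.group('OBJECT') is None:
        return [ERR_INVALID_GCS_OBJECT % (path, arg_name)]
    return []
```

Under these assumptions, 'gs://my-bucket/tmp/staging' passes, 's3://my-bucket/x' fails on the scheme, and 'gs://my-bucket' fails because it names no object.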
 - 
validate_worker_region_zone(view)
- Validates Dataflow worker region and zone arguments are consistent. 
 - 
validate_optional_argument_positive(view, arg_name)
- Validates that an optional argument (if set) has a positive value. 
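The "optional but positive" rule can be illustrated with a short sketch. The helper name check_optional_positive and the use of types.SimpleNamespace as a stand-in for an options view are assumptions; the message is the ERR_INVALID_NOT_POSITIVE attribute from this page.

```python
from types import SimpleNamespace

# Copied from the ERR_INVALID_NOT_POSITIVE class attribute on this page.
ERR_INVALID_NOT_POSITIVE = (
    'Invalid value (%s) for option: %s. Value needs to be positive.')


def check_optional_positive(view, arg_name):
    """Hypothetical helper: an unset (None) option is fine; a set one must be > 0."""
    value = getattr(view, arg_name, None)
    if value is not None and int(value) <= 0:
        return [ERR_INVALID_NOT_POSITIVE % (value, arg_name)]
    return []


# SimpleNamespace stands in for an options view in this sketch.
unset = check_optional_positive(SimpleNamespace(num_workers=None), 'num_workers')
zero = check_optional_positive(SimpleNamespace(num_workers=0), 'num_workers')
```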
 - 
validate_test_matcher(view, arg_name)
- Validates the on_success_matcher argument, if set. Checks that on_success_matcher can be unpickled and is an instance of hamcrest.core.base_matcher.BaseMatcher. 
 - 
validate_repeatable_argument_passed_as_list(view, arg_name)
- Validates that repeatable PipelineOptions such as dataflow_service_options or experiments are specified as a list when set programmatically. This keeps users from inadvertently specifying one as a string, mirroring the way it is set on the command line; repeatable options are passed as a list. 
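One reason a bare string is worth rejecting is that iterating over it yields single characters rather than option values. The sketch below illustrates such a check. The helper name check_repeatable_is_list, the SimpleNamespace stand-in for an options view, and the order of the two values substituted into the message are assumptions; the message text is the ERR_REPEATABLE_OPTIONS_NOT_SET_AS_LIST attribute from this page.

```python
from types import SimpleNamespace

# Copied from the ERR_REPEATABLE_OPTIONS_NOT_SET_AS_LIST attribute on this page.
ERR_REPEATABLE_OPTIONS_NOT_SET_AS_LIST = (
    '(%s) is a string. Programmatically set PipelineOptions like (%s) '
    'options need to be specified as a list.')


def check_repeatable_is_list(view, arg_name):
    """Hypothetical helper: flag a repeatable option that was set to a str."""
    value = getattr(view, arg_name, None)
    if isinstance(value, str):
        # Assumed substitution order: offending value first, option name second.
        return [ERR_REPEATABLE_OPTIONS_NOT_SET_AS_LIST % (value, arg_name)]
    return []


# Iterating a string gives characters, not options, hence the check.
as_string = list('v2')      # ['v', '2']
as_list = list(['v2'])      # ['v2']
```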
 