apache_beam.options.pipeline_options module¶
Pipeline options obtained from command line parsing.
- 
class apache_beam.options.pipeline_options.PipelineOptions(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.transforms.display.HasDisplayData- This class and subclasses are used as containers for command line options. - These classes are wrappers over the standard argparse Python module (see https://docs.python.org/3/library/argparse.html). To define one option or a group of options, create a subclass from PipelineOptions. - Example Usage: - class XyzOptions(PipelineOptions): @classmethod def _add_argparse_args(cls, parser): parser.add_argument('--abc', default='start') parser.add_argument('--xyz', default='end') - The arguments for the add_argument() method are exactly the ones described in the argparse public documentation. - Pipeline objects require an options object during initialization. This is obtained simply by initializing an options class as defined above. - Example Usage: - p = Pipeline(options=XyzOptions()) if p.options.xyz == 'end': raise ValueError('Option xyz has an invalid value.') - Instances of PipelineOptions or any of its subclass have access to values defined by other PipelineOption subclasses (see get_all_options()), and can be converted to an instance of another PipelineOptions subclass (see view_as()). All views share the underlying data structure that stores option key-value pairs. - By default the options classes will use command line arguments to initialize the options. - Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).
 - 
classmethod from_dictionary(options)[source]¶
- Returns a PipelineOptions from a dictionary of arguments. - Parameters: - options – Dictionary of argument value pairs. - Returns: - A PipelineOptions object representing the given arguments. 
 - 
get_all_options(drop_default=False, add_extra_args_fn=None, retain_unknown_options=False)[source]¶
- Returns a dictionary of all defined arguments. - Returns a dictionary of all defined arguments (arguments that are defined in any subclass of PipelineOptions) into a dictionary. - Parameters: - drop_default – If set to true, options that are equal to their default values, are not returned as part of the result dictionary.
- add_extra_args_fn – Callback to populate additional arguments, can be used by runner to supply otherwise unknown args.
- retain_unknown_options – If set to true, options not recognized by any known pipeline options class will still be included in the result. If set to false, they will be discarded.
 - Returns: - Dictionary of all args and values. 
 - 
view_as(cls)[source]¶
- Returns a view of current object as provided PipelineOption subclass. - Example Usage: - options = PipelineOptions(['--runner', 'Direct', '--streaming']) standard_options = options.view_as(StandardOptions) if standard_options.streaming: # ... start a streaming job ... - Note that options objects may have multiple views, and modifications of values in any view-object will apply to current object and other view-objects. - Parameters: - cls – PipelineOptions class or any of its subclasses. - Returns: - An instance of cls that is intitialized using options contained in current object. 
 
- 
class apache_beam.options.pipeline_options.StandardOptions(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.options.pipeline_options.PipelineOptions- Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).
 - 
DEFAULT_RUNNER= 'DirectRunner'¶
 - 
ALL_KNOWN_RUNNERS= ('apache_beam.runners.dataflow.dataflow_runner.DataflowRunner', 'apache_beam.runners.direct.direct_runner.BundleBasedDirectRunner', 'apache_beam.runners.direct.direct_runner.DirectRunner', 'apache_beam.runners.direct.direct_runner.SwitchingDirectRunner', 'apache_beam.runners.interactive.interactive_runner.InteractiveRunner', 'apache_beam.runners.portability.flink_runner.FlinkRunner', 'apache_beam.runners.portability.portable_runner.PortableRunner', 'apache_beam.runners.portability.spark_runner.SparkRunner', 'apache_beam.runners.test.TestDirectRunner', 'apache_beam.runners.test.TestDataflowRunner')¶
 - 
KNOWN_RUNNER_NAMES= ['DataflowRunner', 'BundleBasedDirectRunner', 'DirectRunner', 'SwitchingDirectRunner', 'InteractiveRunner', 'FlinkRunner', 'PortableRunner', 'SparkRunner', 'TestDirectRunner', 'TestDataflowRunner']¶
 
- 
class apache_beam.options.pipeline_options.TypeOptions(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.options.pipeline_options.PipelineOptions- Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).
 
- 
class apache_beam.options.pipeline_options.DirectOptions(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.options.pipeline_options.PipelineOptions- DirectRunner-specific execution options. - Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).
 
- 
class apache_beam.options.pipeline_options.GoogleCloudOptions(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.options.pipeline_options.PipelineOptions- Google Cloud Dataflow service execution options. - Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).
 - 
BIGQUERY_API_SERVICE= 'bigquery.googleapis.com'¶
 - 
COMPUTE_API_SERVICE= 'compute.googleapis.com'¶
 - 
STORAGE_API_SERVICE= 'storage.googleapis.com'¶
 - 
DATAFLOW_ENDPOINT= 'https://dataflow.googleapis.com'¶
 - 
OAUTH_SCOPES= ['https://www.googleapis.com/auth/bigquery', 'https://www.googleapis.com/auth/cloud-platform', 'https://www.googleapis.com/auth/devstorage.full_control', 'https://www.googleapis.com/auth/userinfo.email', 'https://www.googleapis.com/auth/datastore', 'https://www.googleapis.com/auth/spanner.admin', 'https://www.googleapis.com/auth/spanner.data']¶
 
- 
class apache_beam.options.pipeline_options.AzureOptions(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.options.pipeline_options.PipelineOptions- Azure Blob Storage options. - Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).
 
- 
class apache_beam.options.pipeline_options.HadoopFileSystemOptions(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.options.pipeline_options.PipelineOptions- HadoopFileSystemconnection options.- Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).
 
- 
class apache_beam.options.pipeline_options.WorkerOptions(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.options.pipeline_options.PipelineOptions- Worker pool configuration options. - Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).
 
- 
class apache_beam.options.pipeline_options.DebugOptions(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.options.pipeline_options.PipelineOptions- Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).
 
- 
class apache_beam.options.pipeline_options.ProfilingOptions(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.options.pipeline_options.PipelineOptions- Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).
 
- 
class apache_beam.options.pipeline_options.SetupOptions(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.options.pipeline_options.PipelineOptions- Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).
 
- 
class apache_beam.options.pipeline_options.TestOptions(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.options.pipeline_options.PipelineOptions- Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).
 
- 
class apache_beam.options.pipeline_options.S3Options(flags=None, **kwargs)[source]¶
- Bases: - apache_beam.options.pipeline_options.PipelineOptions- Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters: - flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments.
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips).