apache_beam.ml.gcp.cloud_dlp module
PTransforms that implement Google Cloud Data Loss Prevention functionality.
class apache_beam.ml.gcp.cloud_dlp.MaskDetectedDetails(project=None, deidentification_template_name=None, deidentification_config=None, inspection_template_name=None, inspection_config=None, timeout=None)
Bases: apache_beam.transforms.ptransform.PTransform
Scrubs sensitive information detected in text. The PTransform returns a PCollection of str.
Example usage:

    pipeline | MaskDetectedDetails(
        project='example-gcp-project',
        deidentification_config={
            'info_type_transformations': {
                'transformations': [{
                    'primitive_transformation': {
                        'character_mask_config': {
                            'masking_character': '#'
                        }
                    }
                }]
            }
        },
        inspection_config={'info_types': [{'name': 'EMAIL_ADDRESS'}]})
Initializes a MaskDetectedDetails transform.
Parameters:
- project – Optional. GCP project name in which inspection will be performed.
- deidentification_template_name (str) – Either this or deidentification_config is required. Name of the deidentification template to be used on detected sensitive information instances in text.
- deidentification_config (Union[dict, google.cloud.dlp_v2.types.DeidentifyConfig]) – Configuration for the de-identification of the content item. If both template name and config are supplied, config takes precedence.
- inspection_template_name (str) – Either this or inspection_config is required. Name of the inspection template to be used to detect sensitive data in text.
- inspection_config (Union[dict, google.cloud.dlp_v2.types.InspectConfig]) – Configuration for the inspector used to detect sensitive data in text. If both template name and config are supplied, config takes precedence.
- timeout (float) – Optional. The amount of time, in seconds, to wait for the request to complete.
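The nested configuration dictionaries above are easy to mistype, so one option is to build them with small helpers and pass the results to the transform. The following is a minimal sketch, not part of the Beam API: build_mask_config, build_inspect_config, and run are hypothetical names, and the sample input string is made up. Actually running the pipeline assumes Beam is installed with its GCP extras and that valid Google Cloud credentials are available.

```python
def build_mask_config(masking_character="#"):
    """Hypothetical helper: a de-identification config that masks findings."""
    return {
        "info_type_transformations": {
            "transformations": [{
                "primitive_transformation": {
                    "character_mask_config": {
                        "masking_character": masking_character
                    }
                }
            }]
        }
    }


def build_inspect_config(info_type_names):
    """Hypothetical helper: an inspection config for the given info types."""
    return {"info_types": [{"name": name} for name in info_type_names]}


def run(project):
    """Mask e-mail addresses in a small in-memory PCollection of str."""
    # Imported lazily so the helpers above can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.ml.gcp.cloud_dlp import MaskDetectedDetails

    with beam.Pipeline() as pipeline:
        _ = (
            pipeline
            | "Create" >> beam.Create(["Contact me at jane@example.com"])
            | "Mask" >> MaskDetectedDetails(
                project=project,
                deidentification_config=build_mask_config("#"),
                inspection_config=build_inspect_config(["EMAIL_ADDRESS"]))
            | "Print" >> beam.Map(print)
        )
```

Calling run('example-gcp-project') sends the text to the Cloud DLP service, so it is only meaningful in an environment that is authorized for that project.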
class apache_beam.ml.gcp.cloud_dlp.InspectForDetails(project=None, inspection_template_name=None, inspection_config=None, timeout=None)
Bases: apache_beam.transforms.ptransform.PTransform
Inspects input text for sensitive information. The PTransform returns a PCollection of List[google.cloud.dlp_v2.proto.dlp_pb2.Finding].
Example usage:

    pipeline | InspectForDetails(
        project='example-gcp-project',
        inspection_config={'info_types': [{'name': 'EMAIL_ADDRESS'}]})
Initializes an InspectForDetails transform.
Parameters:
- project – Optional. GCP project name in which inspection will be performed.
- inspection_template_name (str) – Either this or inspection_config is required. Name of the inspection template to be used to detect sensitive data in text.
- inspection_config (Union[dict, google.cloud.dlp_v2.types.InspectConfig]) – Configuration for the inspector used to detect sensitive data in text. If both template name and config are supplied, config takes precedence.
- timeout (float) – Optional. The amount of time, in seconds, to wait for the request to complete.
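Because each output element is a list of Finding messages, a downstream step usually reduces that list to something simpler. The sketch below counts findings per info type for each input element. It is an illustration under stated assumptions: count_info_types, fake_finding, and run are hypothetical names, fake_finding is only a local stand-in that mimics the info_type.name attribute of a DLP Finding, and running the pipeline assumes Beam with its GCP extras plus valid Google Cloud credentials.

```python
from collections import Counter
from types import SimpleNamespace


def count_info_types(findings):
    """Hypothetical helper: count findings per info type for one element."""
    return dict(Counter(finding.info_type.name for finding in findings))


def fake_finding(info_type_name):
    """Stand-in for a DLP Finding, for trying the helper without the API."""
    return SimpleNamespace(info_type=SimpleNamespace(name=info_type_name))


def run(project):
    """Inspect a small in-memory PCollection and print per-element counts."""
    # Imported lazily so the helpers above can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.ml.gcp.cloud_dlp import InspectForDetails

    with beam.Pipeline() as pipeline:
        _ = (
            pipeline
            | "Create" >> beam.Create(["Contact me at jane@example.com"])
            | "Inspect" >> InspectForDetails(
                project=project,
                inspection_config={"info_types": [{"name": "EMAIL_ADDRESS"}]})
            | "Count" >> beam.Map(count_info_types)
            | "Print" >> beam.Map(print)
        )
```

The helper can be checked locally with count_info_types([fake_finding('EMAIL_ADDRESS')]) before wiring it into a pipeline that calls the service.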