apache_beam.ml.gcp.visionml module
A connector for sending API requests to the GCP Vision API.
class apache_beam.ml.gcp.visionml.AnnotateImage(features, retry=None, timeout=120, max_batch_size=None, min_batch_size=None, client_options=None, context_side_input=None, metadata=None)
Bases: apache_beam.transforms.ptransform.PTransform
A PTransform for annotating images using the GCP Vision API. ref: https://cloud.google.com/vision/docs/
Batches elements together using the util.BatchElements PTransform and sends each batch of elements to the GCP Vision API. Each element is a Union[text_type, binary_type]: either a URI (e.g. a GCS URI) or binary_type base64-encoded image data. Accepts an AsDict side input that maps each image to an image context.
Parameters:
- features – (List[vision.types.Feature.enums.Feature]) Required. The Vision API features to detect.
- retry – (google.api_core.retry.Retry) Optional. A retry object used to retry requests. If None is specified (default), requests will not be retried.
- timeout – (float) Optional. The time in seconds to wait for the response from the Vision API. Default is 120.
- max_batch_size – (int) Optional. Maximum number of images to batch in the same request to the Vision API. Default is 5 (which is also the Vision API max). This parameter is primarily intended for testing.
- min_batch_size – (int) Optional. Minimum number of images to batch in the same request to the Vision API. Default is None. This parameter is primarily intended for testing.
- client_options – (Union[dict, google.api_core.client_options.ClientOptions]) Optional. Client options used to set user options on the client. API Endpoint should be set through client_options.
- context_side_input – (beam.pvalue.AsDict) Optional. An AsDict of a PCollection to be passed to the _ImageAnnotateFn as the image context mapping containing additional image context and/or feature-specific parameters. Example usage (Union[dict, vision.types.ImageContext()] marks the accepted value types):

    image_contexts = [
        ('gs://cloud-samples-data/vision/ocr/sign.jpg', Union[dict, vision.types.ImageContext()]),
        ('gs://cloud-samples-data/vision/ocr/sign.jpg', Union[dict, vision.types.ImageContext()]),
    ]
    context_side_input = (
        p
        | "Image contexts" >> beam.Create(image_contexts)
    )
    visionml.AnnotateImage(
        features,
        context_side_input=beam.pvalue.AsDict(context_side_input))

- metadata – (Optional[Sequence[Tuple[str, str]]]): Optional. Additional metadata that is provided to the method.
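
The side-input mapping described for context_side_input can be pictured as a plain dictionary lookup keyed by the image. The following is a minimal sketch in pure Python, not the transform's actual code: pair_with_context is a hypothetical helper, plain dicts stand in for vision.types.ImageContext, the language_hints value is illustrative, and gs://example-bucket/other.jpg is a made-up URI. It assumes an image without an entry in the mapping is simply sent without a context.

```python
from typing import Dict, List, Optional, Tuple, Union

ImageRef = Union[str, bytes]  # a URI string or raw image bytes


def pair_with_context(
    images: List[ImageRef],
    contexts: Dict[ImageRef, dict],
) -> List[Tuple[ImageRef, Optional[dict]]]:
    """Pair each image with its context, or None when the mapping has no entry."""
    return [(image, contexts.get(image)) for image in images]


# Hypothetical context mapping; a plain dict stands in for ImageContext.
contexts = {
    "gs://cloud-samples-data/vision/ocr/sign.jpg": {"language_hints": ["en"]},
}

pairs = pair_with_context(
    [
        "gs://cloud-samples-data/vision/ocr/sign.jpg",
        "gs://example-bucket/other.jpg",  # made-up URI with no context entry
    ],
    contexts,
)
```

Because the mapping is keyed by the image itself, every key in the side input should be unique.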
MAX_BATCH_SIZE = 5
MIN_BATCH_SIZE = 1
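
To illustrate what the batch-size limit means for a stream of images, here is a minimal pure-Python sketch of splitting elements into batches of at most five. It is a simplification under stated assumptions: fixed_size_batches is a hypothetical helper, the gs://example-bucket URIs are made up, and the real util.BatchElements transform may choose any batch size between the minimum and the maximum rather than always filling a batch.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

MAX_BATCH_SIZE = 5  # same value as AnnotateImage.MAX_BATCH_SIZE


def fixed_size_batches(
    elements: Iterable[T], max_batch_size: int = MAX_BATCH_SIZE
) -> Iterator[List[T]]:
    """Yield consecutive batches holding at most max_batch_size elements."""
    batch: List[T] = []
    for element in elements:
        batch.append(element)
        if len(batch) == max_batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


# Twelve made-up image URIs.
uris = [f"gs://example-bucket/image-{i}.jpg" for i in range(12)]
batches = list(fixed_size_batches(uris))
batch_sizes = [len(b) for b in batches]
```

Under this sketch each batch corresponds to one request, so twelve images would need three requests.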
class apache_beam.ml.gcp.visionml.AnnotateImageWithContext(features, retry=None, timeout=120, max_batch_size=None, min_batch_size=None, client_options=None, metadata=None)
Bases: apache_beam.ml.gcp.visionml.AnnotateImage
A PTransform for annotating images using the GCP Vision API. ref: https://cloud.google.com/vision/docs/
Batches elements together using the util.BatchElements PTransform and sends each batch of elements to the GCP Vision API.
Element is a tuple of:
(Union[text_type, binary_type], Optional[vision.types.ImageContext])
where the former is either a URI (e.g. a GCS URI) or binary_type base64-encoded image data.
Parameters:
- features – (List[vision.types.Feature.enums.Feature]) Required. The Vision API features to detect.
- retry – (google.api_core.retry.Retry) Optional. A retry object used to retry requests. If None is specified (default), requests will not be retried.
- timeout – (float) Optional. The time in seconds to wait for the response from the Vision API. Default is 120.
- max_batch_size – (int) Optional. Maximum number of images to batch in the same request to the Vision API. Default is 5 (which is also the Vision API max). This parameter is primarily intended for testing.
- min_batch_size – (int) Optional. Minimum number of images to batch in the same request to the Vision API. Default is None. This parameter is primarily intended for testing.
- client_options – (Union[dict, google.api_core.client_options.ClientOptions]) Optional. Client options used to set user options on the client. API Endpoint should be set through client_options.
- metadata – (Optional[Sequence[Tuple[str, str]]]): Optional. Additional metadata that is provided to the method.
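
The element shape accepted by AnnotateImageWithContext, a tuple of an image and an optional context, can be sketched as follows. This is an illustrative pure-Python model rather than the transform's implementation: describe_element is a hypothetical helper, and plain dicts stand in for the Vision API request and ImageContext types. It assumes a text element is treated as an image URI and a bytes element as image content.

```python
from typing import Optional, Tuple, Union

Element = Tuple[Union[str, bytes], Optional[dict]]


def describe_element(element: Element) -> dict:
    """Build a dict-shaped request from an (image, context) tuple."""
    image, context = element
    if isinstance(image, str):
        # Text is assumed to be a URI, e.g. a GCS URI.
        image_field = {"source": {"image_uri": image}}
    else:
        # Bytes are assumed to be the image data itself.
        image_field = {"content": image}
    request = {"image": image_field}
    if context is not None:
        request["image_context"] = context
    return request


uri_request = describe_element(
    ("gs://cloud-samples-data/vision/ocr/sign.jpg", None)
)
bytes_request = describe_element((b"raw-image-bytes", {"language_hints": ["en"]}))
```

Carrying the context inside each element avoids the separate side input that AnnotateImage uses, which is convenient when every image has its own context.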