apache_beam.ml.inference.vertex_ai_inference module

class apache_beam.ml.inference.vertex_ai_inference.VertexAIModelHandlerJSON(endpoint_id: str, project: str, location: str, experiment: Optional[str] = None, **kwargs)[source]

Bases: apache_beam.ml.inference.base.ModelHandler

Implementation of the ModelHandler interface for Vertex AI.

NOTE: This API and its implementation are under development and do not provide backward compatibility guarantees.

Unlike other ModelHandler implementations, this does not load the model being used onto the worker and instead makes remote queries to a Vertex AI endpoint. In that way it functions more like a mid-pipeline IO. At present this implementation only supports public endpoints with a maximum request size of 1.5 MB.

Parameters:
  • endpoint_id – the numerical ID of the Vertex AI endpoint to query.
  • project – the GCP project name where the endpoint is deployed.
  • location – the GCP location where the endpoint is deployed.
  • experiment (Optional) – experiment label to apply to the queries.
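As a rough illustration of the constructor arguments described above, the sketch below passes a handler to RunInference and includes a small helper for estimating whether a batch stays under the 1.5 MB request limit. The helper names (approx_request_bytes, run_example), the interpretation of 1.5 MB as 1,500,000 bytes, and the use of the serialized {"instances": ...} body as a size estimate are assumptions made for this example, not part of the module. Running run_example requires Beam with its GCP dependencies, valid credentials, and a deployed endpoint.

```python
import json

# Assumption: "1.5 MB" is treated here as 1,500,000 bytes; this reference
# does not state how the service measures request size.
MAX_REQUEST_BYTES = 1_500_000


def approx_request_bytes(instances):
    """Estimate the size of a JSON prediction request for these instances."""
    # Hypothetical helper: serializes a body of the form {"instances": [...]}.
    return len(json.dumps({"instances": list(instances)}).encode("utf-8"))


def run_example(endpoint_id, project, location, examples):
    """Send `examples` to a Vertex AI endpoint from a Beam pipeline."""
    # Imported lazily so the size helper above works without Beam installed.
    import apache_beam as beam
    from apache_beam.ml.inference.base import RunInference
    from apache_beam.ml.inference.vertex_ai_inference import (
        VertexAIModelHandlerJSON,
    )

    # No model is loaded on the worker; the handler queries the endpoint.
    handler = VertexAIModelHandlerJSON(
        endpoint_id=endpoint_id, project=project, location=location)

    with beam.Pipeline() as pipeline:
        _ = (
            pipeline
            | "CreateExamples" >> beam.Create(examples)
            | "Predict" >> RunInference(handler)
            | "Print" >> beam.Map(print))
```

Checking approx_request_bytes(examples) against MAX_REQUEST_BYTES before calling run_example gives only an approximate guard, since the handler decides how elements are grouped into requests.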

load_model() → google.cloud.aiplatform.models.Endpoint[source]

Loads the Endpoint object used to build and send prediction requests to Vertex AI.

get_request(batch: Sequence[Any], model: google.cloud.aiplatform.models.Endpoint, throttle_delay_secs: int, inference_args: Optional[Dict[str, Any]])[source]
run_inference(batch: Sequence[Any], model: google.cloud.aiplatform.models.Endpoint, inference_args: Optional[Dict[str, Any]] = None) → Iterable[apache_beam.ml.inference.base.PredictionResult][source]

Sends a prediction request containing a batch of inputs to a Vertex AI endpoint, then pairs each input with the corresponding prediction in the endpoint's response and returns the pairs as an iterable of PredictionResults.

Parameters:
  • batch – a sequence of any values to be passed to the Vertex AI endpoint. Should be encoded as the model expects.
  • model – an aiplatform.Endpoint object configured to access the desired model.
  • inference_args – any additional arguments to send as part of the prediction request.
Returns:

An iterable of PredictionResult objects.
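The input-to-prediction pairing that run_inference describes can be pictured with the following standalone sketch. It is not the module's implementation: SketchPredictionResult is a hypothetical stand-in for PredictionResult, and the predict callable stands in for the remote endpoint call so that the example runs without Beam or network access.

```python
from typing import Any, Callable, List, NamedTuple, Sequence


class SketchPredictionResult(NamedTuple):
    """Hypothetical stand-in for apache_beam's PredictionResult."""
    example: Any
    inference: Any


def pair_with_predictions(
    batch: Sequence[Any],
    predict: Callable[[List[Any]], Sequence[Any]],
) -> List[SketchPredictionResult]:
    """Send one request for the whole batch and pair inputs with outputs."""
    # One call for the whole batch, mirroring a single prediction request.
    predictions = predict(list(batch))
    if len(predictions) != len(batch):
        raise ValueError("expected one prediction per input")
    # Pairing relies on predictions coming back in input order.
    return [
        SketchPredictionResult(example, inference)
        for example, inference in zip(batch, predictions)
    ]


def fake_endpoint(instances: List[Any]) -> List[Any]:
    """Local stand-in for the endpoint: returns the sum of each instance."""
    return [sum(instance) for instance in instances]


results = pair_with_predictions([[1, 2], [3, 4]], fake_endpoint)
```

Here each element of results carries the original input in example and the matching output in inference.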