apache_beam.ml.rag.embeddings.vertex_ai module

RAG-specific embedding implementations using Vertex AI models.

class apache_beam.ml.rag.embeddings.vertex_ai.VertexAITextEmbeddings(model_name: str, *, title: str | None = None, task_type: str = 'RETRIEVAL_DOCUMENT', project: str | None = None, location: str | None = None, credentials: Credentials | None = None, **kwargs)

Bases: EmbeddingsManager

Generates text embeddings with Vertex AI text embedding models for semantic search and RAG pipelines.

Parameters:

model_name – Name of the Vertex AI text embedding model.
title – Optional title for the text content.
task_type – Task type for embeddings (default: RETRIEVAL_DOCUMENT).
project – GCP project ID.
location – GCP location.
credentials – Optional GCP credentials.
**kwargs – Additional arguments passed to EmbeddingsManager.
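
The constructor arguments above can be collected and applied roughly as follows. This is a minimal sketch, not the library's documented usage: the helper names text_embedding_kwargs and embed_items are hypothetical, the model name and project in the usage note are placeholders, and wiring the embedder through MLTransform.with_transform is an assumption based on how other EmbeddingsManager subclasses are typically applied in Beam.

```python
import tempfile
from typing import Any, Dict, Optional


def text_embedding_kwargs(
    model_name: str,
    *,
    task_type: str = "RETRIEVAL_DOCUMENT",
    title: Optional[str] = None,
    project: Optional[str] = None,
    location: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect constructor arguments, dropping unset ones."""
    candidates = {
        "model_name": model_name,
        "task_type": task_type,
        "title": title,
        "project": project,
        "location": location,
    }
    # Leave out None values so the class defaults documented above apply.
    return {key: value for key, value in candidates.items() if value is not None}


def embed_items(items, **embedding_kwargs):
    """Hypothetical helper: embed a PCollection of EmbeddableItem elements.

    Assumes `items` is a PCollection[EmbeddableItem] built elsewhere and that
    Vertex AI access is configured for the pipeline's environment.
    """
    # Imported lazily so text_embedding_kwargs works without Beam installed.
    from apache_beam.ml.rag.embeddings.vertex_ai import VertexAITextEmbeddings
    from apache_beam.ml.transforms.base import MLTransform

    embedder = VertexAITextEmbeddings(**embedding_kwargs)
    # Assumption: MLTransform needs an artifact location; a temp dir is used here.
    return items | "EmbedText" >> MLTransform(
        write_artifact_location=tempfile.mkdtemp()
    ).with_transform(embedder)
```

A call such as embed_items(items, **text_embedding_kwargs("text-embedding-005", project="my-project")) would then embed documents with the default RETRIEVAL_DOCUMENT task type. Keeping the argument handling separate from the Beam wiring makes the configuration easy to check without a live Vertex AI endpoint.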

get_model_handler()[source]: Returns model handler configured with RAG adapter.

get_ptransform_for_processing(**kwargs) → PTransform[PCollection[EmbeddableItem], PCollection[EmbeddableItem]]: Returns a PTransform that uses the RAG adapter.

class apache_beam.ml.rag.embeddings.vertex_ai.VertexAIImageEmbeddings(model_name: str, *, dimension: int | None = None, project: str | None = None, location: str | None = None, credentials: Credentials | None = None, **kwargs)

Bases: EmbeddingsManager

Vertex AI image embeddings for RAG pipelines.

Generates embeddings for images using Vertex AI multimodal embedding models.

Parameters:

model_name – Name of the Vertex AI model.
dimension – Embedding dimension. Must be one of 128, 256, 512, or 1408.
project – GCP project ID.
location – GCP location.
credentials – Optional GCP credentials.
**kwargs – Additional arguments passed to EmbeddingsManager.
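
Because dimension accepts only the four values listed above, it can help to check it before building a pipeline. The sketch below is illustrative only: check_image_dimension and make_image_embedder are hypothetical helpers, not part of Beam, and the class may perform its own validation with different behavior.

```python
from typing import Optional

# Values taken from the parameter description above.
ALLOWED_IMAGE_DIMENSIONS = (128, 256, 512, 1408)


def check_image_dimension(dimension: Optional[int]) -> Optional[int]:
    """Hypothetical pre-check: return the dimension if it is allowed.

    None is passed through unchanged so the class default applies.
    """
    if dimension is None:
        return None
    if dimension not in ALLOWED_IMAGE_DIMENSIONS:
        raise ValueError(
            f"dimension must be one of {ALLOWED_IMAGE_DIMENSIONS}, got {dimension}"
        )
    return dimension


def make_image_embedder(model_name: str, dimension: Optional[int] = None, **kwargs):
    """Hypothetical factory: validate the dimension, then build the embedder."""
    # Imported lazily so the check above works without Beam installed.
    from apache_beam.ml.rag.embeddings.vertex_ai import VertexAIImageEmbeddings

    return VertexAIImageEmbeddings(
        model_name, dimension=check_image_dimension(dimension), **kwargs
    )
```

For example, make_image_embedder("multimodalembedding@001", dimension=512) would pass the check, while dimension=300 would raise ValueError before any Beam or Vertex AI code runs. The model name here is a placeholder.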

get_model_handler()[source]: Returns model handler for image embedding.

get_ptransform_for_processing(**kwargs) → PTransform: Returns a PTransform for image embedding.