apache_beam.ml.rag.embeddings.vertex_ai module

RAG-specific embedding implementations using Vertex AI models.

class apache_beam.ml.rag.embeddings.vertex_ai.VertexAITextEmbeddings(model_name: str, *, title: str | None = None, task_type: str = 'RETRIEVAL_DOCUMENT', project: str | None = None, location: str | None = None, credentials: Credentials | None = None, **kwargs)

Bases: EmbeddingsManager

Generates text embeddings with Vertex AI models for semantic search and RAG pipelines.

Parameters:
  • model_name – Name of the Vertex AI text embedding model.

  • title – Optional title for the text content.

  • task_type – Task type for embeddings (default: 'RETRIEVAL_DOCUMENT').

  • project – GCP project ID.

  • location – GCP location.

  • credentials – Optional GCP credentials.

  • **kwargs – Additional arguments passed to EmbeddingsManager.
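
The keyword-only arguments above can be assembled as in the following minimal sketch. It assumes the `apache_beam[gcp]` extra and the Vertex AI SDK are installed and that default credentials are available. The model name `text-embedding-005`, the project ID `my-project`, and the location `us-central1` are placeholders, and `text_embedding_kwargs` and `make_text_embedder` are hypothetical helpers that are not part of this module. The Beam import is deferred so the argument set can be inspected without Beam installed.

```python
def text_embedding_kwargs(project, location,
                          task_type="RETRIEVAL_DOCUMENT", title=None):
    """Collect keyword-only arguments matching the documented signature."""
    kwargs = {"task_type": task_type, "project": project, "location": location}
    # title is optional, so it is only included when explicitly provided.
    if title is not None:
        kwargs["title"] = title
    return kwargs


def make_text_embedder(model_name, **kwargs):
    """Build the embeddings manager (hypothetical helper)."""
    # Deferred import: needs apache_beam[gcp] and the Vertex AI SDK.
    from apache_beam.ml.rag.embeddings.vertex_ai import VertexAITextEmbeddings
    return VertexAITextEmbeddings(model_name, **kwargs)


# Placeholder project and location; substitute real values.
doc_kwargs = text_embedding_kwargs("my-project", "us-central1")
query_kwargs = text_embedding_kwargs(
    "my-project", "us-central1", task_type="RETRIEVAL_QUERY")

# Requires network access and credentials, so it is left commented out:
# embedder = make_text_embedder("text-embedding-005", **doc_kwargs)
```

Using `RETRIEVAL_DOCUMENT` when indexing and `RETRIEVAL_QUERY` when embedding search queries is a common pairing for Vertex AI embedding models; whether it suits a given pipeline is an assumption to verify against the model's own documentation.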

get_model_handler()

Returns model handler configured with RAG adapter.

get_ptransform_for_processing(**kwargs) → PTransform[PCollection[EmbeddableItem], PCollection[EmbeddableItem]]

Returns PTransform that uses the RAG adapter.
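
This method is typically reached through `MLTransform` rather than called directly. The sketch below shows one way to wire an embeddings manager into a pipeline step, assuming `MLTransform` from `apache_beam.ml.transforms.base` with its `write_artifact_location` argument and `with_transform` method. `embedding_step` is a hypothetical helper, and the temporary artifact directory is an assumption suitable only for local experiments.

```python
import tempfile


def embedding_step(embedder, artifact_dir=None):
    """Wrap an embeddings manager in an MLTransform step (hypothetical helper).

    The Beam import is deferred so this module loads without Beam installed.
    """
    from apache_beam.ml.transforms.base import MLTransform

    # MLTransform needs somewhere to write artifacts; fall back to a
    # temporary directory when the caller does not supply one.
    location = artifact_dir or tempfile.mkdtemp()
    return MLTransform(write_artifact_location=location).with_transform(embedder)


# Hypothetical usage inside a pipeline, where `items` is a PCollection of
# EmbeddableItem and `embedder` is a VertexAITextEmbeddings instance:
#   embedded = items | "Embed" >> embedding_step(embedder)
```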

class apache_beam.ml.rag.embeddings.vertex_ai.VertexAIImageEmbeddings(model_name: str, *, dimension: int | None = None, project: str | None = None, location: str | None = None, credentials: Credentials | None = None, **kwargs)

Bases: EmbeddingsManager

Vertex AI image embeddings for RAG pipelines.

Generates embeddings for images using Vertex AI multimodal embedding models.

Parameters:
  • model_name – Name of the Vertex AI model.

  • dimension – Embedding dimension. Must be one of 128, 256, 512, or 1408.

  • project – GCP project ID.

  • location – GCP location.

  • credentials – Optional GCP credentials.

  • **kwargs – Additional arguments passed to EmbeddingsManager.
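
The constraint on `dimension` can be checked before a pipeline is constructed. The following is a minimal sketch of such a pre-check; `check_dimension` and `ALLOWED_DIMENSIONS` are hypothetical names, and this is not the library's own validation logic. `None` is passed through unchanged, on the assumption that it leaves the choice of dimension to the model.

```python
# Values taken from the parameter description above.
ALLOWED_DIMENSIONS = (128, 256, 512, 1408)


def check_dimension(dimension):
    """Return `dimension` unchanged if acceptable, else raise ValueError."""
    if dimension is not None and dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(
            f"dimension must be one of {ALLOWED_DIMENSIONS}, got {dimension!r}")
    return dimension


# Example: validate a user-supplied value before building the embedder.
dimension = check_dimension(512)
```

Failing early in driver code avoids discovering an invalid value only after workers have started.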

get_model_handler()

Returns model handler for image embedding.

get_ptransform_for_processing(**kwargs) → PTransform

Returns PTransform for image embedding.