apache_beam.ml.rag.embeddings.huggingface module

RAG-specific embedding implementations using HuggingFace models.

class apache_beam.ml.rag.embeddings.huggingface.HuggingfaceTextEmbeddings(model_name: str, *, max_seq_length: int | None = None, **kwargs)

Bases: EmbeddingsManager

HuggingFace text embeddings for RAG pipelines.

Parameters:
  • model_name – Name of the sentence-transformers model to use.

  • max_seq_length – Maximum sequence length for the model, in tokens. Longer inputs are truncated.

  • **kwargs – Additional arguments passed to EmbeddingsManager, including:

    • load_model_args: dict passed to the SentenceTransformer() constructor (e.g. device, cache_folder).

    • min_batch_size / max_batch_size: Control batching for inference.

    • large_model: If True, share the model across processes to reduce memory usage.

    • inference_args: dict passed to model.encode() (e.g. normalize_embeddings).
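A minimal sketch of how these parameters might be combined. The model name, sequence length, batch sizes, and the helper names text_embedding_kwargs and make_text_embedder are illustrative assumptions, not part of this module. The Beam import is deferred so that the argument dictionary can be inspected without Beam installed.

```python
def text_embedding_kwargs(device="cpu"):
    """Collect constructor arguments for HuggingfaceTextEmbeddings.

    Hypothetical helper: every value below is an illustrative choice.
    """
    return {
        "model_name": "sentence-transformers/all-MiniLM-L6-v2",
        "max_seq_length": 256,
        # Forwarded to the SentenceTransformer() constructor.
        "load_model_args": {"device": device},
        # Bounds on the batch sizes used for inference.
        "min_batch_size": 8,
        "max_batch_size": 64,
        # Forwarded to model.encode().
        "inference_args": {"normalize_embeddings": True},
    }


def make_text_embedder(**overrides):
    """Build the embeddings manager.

    Requires apache-beam and sentence-transformers to be installed.
    """
    # Imported lazily so this code loads even when Beam is not installed.
    from apache_beam.ml.rag.embeddings.huggingface import (
        HuggingfaceTextEmbeddings,
    )

    kwargs = {**text_embedding_kwargs(), **overrides}
    return HuggingfaceTextEmbeddings(**kwargs)
```

Calling make_text_embedder(max_batch_size=128) would override a single default while keeping the rest.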

get_model_handler()

Returns a model handler configured with the RAG adapter.

get_ptransform_for_processing(**kwargs) → PTransform[PCollection[EmbeddableItem], PCollection[EmbeddableItem]]

Returns a PTransform that uses the RAG adapter.
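In practice this method is usually not called directly. The sketch below assumes the embedder is applied through MLTransform, the pattern used for other EmbeddingsManager subclasses. The helper names make_artifact_dir and run_embedding_pipeline are hypothetical, and items is assumed to be an iterable of EmbeddableItem objects built elsewhere.

```python
import tempfile


def make_artifact_dir():
    """Create a scratch directory for MLTransform artifacts (illustrative)."""
    return tempfile.mkdtemp(prefix="rag_embeddings_")


def run_embedding_pipeline(items, embedder, artifact_location=None):
    """Embed `items` with `embedder` and print the results.

    `items` is assumed to be an iterable of EmbeddableItem objects and
    `embedder` an instance such as HuggingfaceTextEmbeddings.
    """
    # Imported lazily; requires apache-beam to be installed.
    import apache_beam as beam
    from apache_beam.ml.transforms.base import MLTransform

    artifact_location = artifact_location or make_artifact_dir()
    with beam.Pipeline() as pipeline:
        _ = (
            pipeline
            | "CreateItems" >> beam.Create(items)
            # MLTransform applies the PTransform that the embedder returns
            # from get_ptransform_for_processing().
            | "Embed" >> MLTransform(
                write_artifact_location=artifact_location
            ).with_transform(embedder)
            | "Print" >> beam.Map(print)
        )
```

Passing the embedder in as an argument keeps the pipeline code independent of the specific model chosen.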

class apache_beam.ml.rag.embeddings.huggingface.HuggingfaceImageEmbeddings(model_name: str, *, max_seq_length: int | None = None, **kwargs)

Bases: EmbeddingsManager

HuggingFace image embeddings for RAG pipelines.

Generates embeddings for images using sentence-transformers models that support image input (e.g. clip-ViT-B-32).

Parameters:
  • model_name – Name of the sentence-transformers model. Must be an image-text model. See https://www.sbert.net/docs/sentence_transformer/pretrained_models.html#image-text-models

  • max_seq_length – Maximum sequence length for the model if applicable.

  • **kwargs – Additional arguments passed to EmbeddingsManager, including:

    • load_model_args: dict passed to the SentenceTransformer() constructor (e.g. device, cache_folder, trust_remote_code).

    • min_batch_size / max_batch_size: Control batching for inference.

    • large_model: If True, share the model across processes to reduce memory usage.

    • inference_args: dict passed to model.encode() (e.g. normalize_embeddings).
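A minimal sketch of constructing the image embedder with the clip-ViT-B-32 model mentioned above. The helper names image_embedding_kwargs and make_image_embedder, the batch size, and the device are illustrative assumptions. How image content is attached to the input items is not shown here.

```python
def image_embedding_kwargs(cache_folder=None):
    """Collect constructor arguments for HuggingfaceImageEmbeddings.

    Hypothetical helper: the values are illustrative choices.
    """
    load_model_args = {"device": "cpu"}
    if cache_folder is not None:
        # Only set cache_folder when the caller asks for one.
        load_model_args["cache_folder"] = cache_folder
    return {
        "model_name": "clip-ViT-B-32",
        # Forwarded to the SentenceTransformer() constructor.
        "load_model_args": load_model_args,
        "max_batch_size": 16,
        # Forwarded to model.encode().
        "inference_args": {"normalize_embeddings": True},
    }


def make_image_embedder(**overrides):
    """Build the embeddings manager.

    Requires apache-beam and sentence-transformers to be installed.
    """
    # Imported lazily so this code loads even when Beam is not installed.
    from apache_beam.ml.rag.embeddings.huggingface import (
        HuggingfaceImageEmbeddings,
    )

    kwargs = {**image_embedding_kwargs(), **overrides}
    return HuggingfaceImageEmbeddings(**kwargs)
```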

get_model_handler()

Returns a model handler configured with the RAG adapter.

get_ptransform_for_processing(**kwargs) → PTransform

Returns a PTransform for image embedding.