apache_beam.ml.transforms.embeddings.huggingface module
class apache_beam.ml.transforms.embeddings.huggingface.SentenceTransformerEmbeddings(model_name: str, columns: List[str], max_seq_length: Optional[int] = None, **kwargs)
Bases: apache_beam.ml.transforms.base.EmbeddingsManager

Embedding config for sentence-transformers. This config can be used with MLTransform to embed text data. Models are loaded using the RunInference PTransform with the help of ModelHandler.

Parameters:
- model_name – Name of the model to use. The model should be hosted on HuggingFace Hub or compatible with sentence_transformers.
- columns – List of columns to be embedded.
- max_seq_length – Max sequence length to use for the model if applicable.
- min_batch_size – The minimum batch size to be used for inference.
- max_batch_size – The maximum batch size to be used for inference.
- large_model – Whether to share the model across processes.
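
The parameters above can be illustrated with a minimal sketch of passing this config to MLTransform. Several things here are assumptions rather than part of this reference: the model name `all-MiniLM-L6-v2`, the column name `text`, the `max_seq_length` value of 128, the helper names `make_rows` and `embed_sentences`, and the use of a temporary directory as the artifact location. Beam is imported inside the function, so defining the helpers does not itself require Beam or sentence-transformers to be installed.

```python
import tempfile

# Assumed values for illustration only; substitute your own model and column.
TEXT_COLUMN = "text"
MODEL_NAME = "all-MiniLM-L6-v2"


def make_rows(sentences):
    """Wrap each raw string in a dict keyed by the column to be embedded."""
    return [{TEXT_COLUMN: sentence} for sentence in sentences]


def embed_sentences(sentences):
    """Run a local pipeline that embeds TEXT_COLUMN and prints each result."""
    # Imported lazily so that make_rows stays usable without Beam installed.
    import apache_beam as beam
    from apache_beam.ml.transforms.base import MLTransform
    from apache_beam.ml.transforms.embeddings.huggingface import (
        SentenceTransformerEmbeddings,
    )

    embedding_config = SentenceTransformerEmbeddings(
        model_name=MODEL_NAME,
        columns=[TEXT_COLUMN],
        max_seq_length=128,  # assumed cap; leave as None to keep the model default
    )

    # MLTransform needs a location to record the transforms it applies.
    artifact_location = tempfile.mkdtemp()

    with beam.Pipeline() as pipeline:
        _ = (
            pipeline
            | "CreateRows" >> beam.Create(make_rows(sentences))
            | "Embed" >> MLTransform(
                write_artifact_location=artifact_location
            ).with_transform(embedding_config)
            | "Print" >> beam.Map(print)
        )
```

Calling `embed_sentences(["Hello world", "Apache Beam"])` would download the model on first use and print one element per input sentence. It requires both `apache_beam` and `sentence-transformers` to be installed.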