apache_beam.ml.transforms.embeddings.tensorflow_hub module¶
class apache_beam.ml.transforms.embeddings.tensorflow_hub.TensorflowHubTextEmbeddings(columns: List[str], hub_url: str, preprocessing_url: Optional[str] = None, **kwargs)[source]¶

Bases: apache_beam.ml.transforms.base.EmbeddingsManager
Embedding config for TensorFlow Hub models. This config can be used with MLTransform to embed text data. Models are loaded using the RunInference PTransform with the help of a ModelHandler.
Parameters:
- columns – The columns containing the text to be embedded.
- hub_url – The URL of the TensorFlow Hub model.
- preprocessing_url – Optional. The URL of a preprocessing model. If provided, the text is preprocessed with this model before it is passed to the main embedding model.
- min_batch_size – The minimum batch size to use for inference.
- max_batch_size – The maximum batch size to use for inference.
- large_model – Whether to share the model across processes.
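A minimal usage sketch follows. The hub URL, column name, and artifact location are illustrative placeholders, not values required by the API; any TensorFlow Hub text-embedding model URL and any writable artifact location should work.

```python
import tempfile

import apache_beam as beam
from apache_beam.ml.transforms.base import MLTransform
from apache_beam.ml.transforms.embeddings.tensorflow_hub import (
    TensorflowHubTextEmbeddings)

# Illustrative values: swap in the TF Hub model and artifact location you need.
HUB_URL = 'https://tfhub.dev/google/nnlm-en-dim128/2'
artifact_location = tempfile.mkdtemp()

embedding_config = TensorflowHubTextEmbeddings(
    columns=['text'], hub_url=HUB_URL)

with beam.Pipeline() as pipeline:
    _ = (
        pipeline
        # Elements are dicts keyed by the column names passed to the config.
        | 'CreateRows' >> beam.Create([{'text': 'hello world'}])
        | 'Embed' >> MLTransform(
            write_artifact_location=artifact_location).with_transform(
                embedding_config)
        | beam.Map(print))
```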
class apache_beam.ml.transforms.embeddings.tensorflow_hub.TensorflowHubImageEmbeddings(columns: List[str], hub_url: str, **kwargs)[source]¶

Bases: apache_beam.ml.transforms.base.EmbeddingsManager
Embedding config for TensorFlow Hub models. This config can be used with MLTransform to embed image data. Models are loaded using the RunInference PTransform with the help of a ModelHandler.
Parameters:
- columns – The columns containing the images to be embedded.
- hub_url – The URL of the TensorFlow Hub model.
- min_batch_size – The minimum batch size to use for inference.
- max_batch_size – The maximum batch size to use for inference.
- large_model – Whether to share the model across processes.
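The sketch below shows the same pattern for images. It assumes the image column holds decoded arrays in the shape the chosen model expects; the hub URL, loader function, and preprocessing details are illustrative assumptions, not part of this API.

```python
import tempfile

import apache_beam as beam
import numpy as np
from apache_beam.ml.transforms.base import MLTransform
from apache_beam.ml.transforms.embeddings.tensorflow_hub import (
    TensorflowHubImageEmbeddings)

# Illustrative values: any TF Hub image feature-vector model can be used.
HUB_URL = ('https://tfhub.dev/google/imagenet/'
           'mobilenet_v2_100_224/feature_vector/5')
artifact_location = tempfile.mkdtemp()

embedding_config = TensorflowHubImageEmbeddings(
    columns=['image'], hub_url=HUB_URL)


def load_image(path):
  # Hypothetical loader: real code would decode and resize the image to the
  # shape the model expects (here, 224x224x3 floats).
  return {'image': np.zeros((224, 224, 3), dtype=np.float32)}


with beam.Pipeline() as pipeline:
    _ = (
        pipeline
        | 'CreatePaths' >> beam.Create(['/tmp/example.jpg'])
        | 'LoadImages' >> beam.Map(load_image)
        | 'Embed' >> MLTransform(
            write_artifact_location=artifact_location).with_transform(
                embedding_config)
        | beam.Map(print))
```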