apache_beam.ml.inference.sklearn_inference module
class apache_beam.ml.inference.sklearn_inference.ModelFileType
Bases: enum.Enum
Defines how a model file is serialized. Options are pickle or joblib.
PICKLE = 1
JOBLIB = 2
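The two members name the library that wrote the model file, which in turn determines the loader that can read it back. A minimal sketch of that dispatch follows; it is not Beam's implementation. FileType is a local stand-in for ModelFileType, load_model is a hypothetical helper, and a plain dict stands in for a fitted estimator so that the example runs with the standard library alone.

```python
import enum
import pickle
import tempfile
from pathlib import Path


class FileType(enum.Enum):
    """Local stand-in mirroring the documented ModelFileType members."""
    PICKLE = 1
    JOBLIB = 2


def load_model(path, file_type=FileType.PICKLE):
    """Hypothetical helper: read a model file with the matching loader."""
    if file_type is FileType.PICKLE:
        with open(path, "rb") as f:
            return pickle.load(f)
    if file_type is FileType.JOBLIB:
        # joblib is third-party; it is imported lazily so that the
        # pickle path needs only the standard library.
        import joblib
        return joblib.load(path)
    raise ValueError(f"Unsupported file type: {file_type!r}")


# Assumption for illustration: a plain dict stands in for a fitted estimator.
toy_model = {"coef": [0.5, 2.0], "intercept": 1.0}

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"
    # Write the file with pickle, then read it back with the pickle loader.
    with open(model_path, "wb") as f:
        pickle.dump(toy_model, f)
    restored = load_model(model_path, FileType.PICKLE)
```

The file type passed to the loader has to match the library that wrote the file, which is why the handlers below take it as an explicit argument.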
class apache_beam.ml.inference.sklearn_inference.SklearnModelHandlerNumpy(model_uri: str, model_file_type: apache_beam.ml.inference.sklearn_inference.ModelFileType = <ModelFileType.PICKLE: 1>)
Bases: apache_beam.ml.inference.base.ModelHandler
Implementation of the ModelHandler interface for scikit-learn using numpy arrays as input.
Example Usage:
pcoll | RunInference(SklearnModelHandlerNumpy(model_uri="my_uri"))
Parameters:
- model_uri – The URI where the model file is saved.
- model_file_type – How the model file is serialized (ModelFileType.PICKLE or ModelFileType.JOBLIB). Defaults to ModelFileType.PICKLE.
run_inference(batch: Sequence[numpy.ndarray], model: sklearn.base.BaseEstimator, inference_args: Optional[Dict[str, Any]] = None) → Iterable[apache_beam.ml.inference.base.PredictionResult]
Runs inferences on a batch of numpy arrays.
Parameters:
- batch – A sequence of numpy arrays, each holding a single example.
- model – A scikit-learn model or pipeline. It must implement predict(X), where X is a numpy array.
- inference_args – Any additional arguments for an inference.
Returns: An Iterable of type PredictionResult.
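The contract described above (a batch of single examples in, one result per example out) can be sketched as follows. This is an illustration under stated assumptions, not Beam's source: plain Python lists stand in for numpy arrays, Result is a local stand-in for PredictionResult with assumed fields example and inference, and SumModel and run_inference_sketch are hypothetical names.

```python
from typing import Any, Dict, List, NamedTuple, Optional, Sequence


class Result(NamedTuple):
    """Local stand-in for PredictionResult (field names are an assumption)."""
    example: Any
    inference: Any


class SumModel:
    """Hypothetical model: predicts the sum of each example's features."""

    def predict(self, X: Sequence[Sequence[float]]) -> List[float]:
        return [sum(row) for row in X]


def run_inference_sketch(
    batch: Sequence[Sequence[float]],
    model: Any,
    inference_args: Optional[Dict[str, Any]] = None,
) -> List[Result]:
    # inference_args is accepted to mirror the documented signature;
    # this sketch does not use it.
    # Gather the single examples into one 2-D input so that predict
    # is called once for the whole batch.
    stacked = list(batch)
    predictions = model.predict(stacked)
    # Pair each input example with its prediction, preserving order.
    return [Result(example, inference)
            for example, inference in zip(batch, predictions)]


batch = [[1.0, 2.0], [3.0, 4.0]]
results = run_inference_sketch(batch, SumModel())
```

Each element of results pairs one input example with the prediction made for it, so downstream steps can read both without tracking positions in the batch.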
class apache_beam.ml.inference.sklearn_inference.SklearnModelHandlerPandas(model_uri: str, model_file_type: apache_beam.ml.inference.sklearn_inference.ModelFileType = <ModelFileType.PICKLE: 1>)
Bases: apache_beam.ml.inference.base.ModelHandler
Implementation of the ModelHandler interface for scikit-learn that supports pandas dataframes.
Example Usage:
pcoll | RunInference(SklearnModelHandlerPandas(model_uri="my_uri"))
NOTE: This API and its implementation are under development and do not provide backward compatibility guarantees.
Parameters:
- model_uri – The URI where the model file is saved.
- model_file_type – How the model file is serialized (ModelFileType.PICKLE or ModelFileType.JOBLIB). Defaults to ModelFileType.PICKLE.
run_inference(batch: Sequence[pandas.core.frame.DataFrame], model: sklearn.base.BaseEstimator, inference_args: Optional[Dict[str, Any]] = None) → Iterable[apache_beam.ml.inference.base.PredictionResult]
Runs inferences on a batch of pandas dataframes.
Parameters:
- batch – A sequence of pandas dataframes, each holding a single example.
- model – A scikit-learn model or pipeline. It must implement predict(X), where X is a pandas dataframe.
- inference_args – Any additional arguments for an inference.
Returns: An Iterable of type PredictionResult.
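For the dataframe variant, the requirement that each batch element be a single example can be sketched as a check that every table has exactly one row. The sketch below is hedged in several ways: a list of row dicts stands in for a pandas dataframe, raising ValueError on multi-row input is an assumption of this example, and Result, AreaModel, and run_inference_sketch are hypothetical names rather than Beam APIs.

```python
from typing import Any, Dict, List, NamedTuple, Sequence

# Stand-in for a dataframe: a list of rows, each row keyed by column name.
Frame = List[Dict[str, float]]


class Result(NamedTuple):
    """Local stand-in for PredictionResult (field names are an assumption)."""
    example: Any
    inference: Any


class AreaModel:
    """Hypothetical model: predicts width * height for each row."""

    def predict(self, X: Frame) -> List[float]:
        return [row["width"] * row["height"] for row in X]


def run_inference_sketch(batch: Sequence[Frame], model: Any) -> List[Result]:
    # Assumption: reject any batch element that is not a single example.
    for frame in batch:
        if len(frame) != 1:
            raise ValueError("Each batch element should hold exactly one row.")
    # Concatenate the single-row tables into one table for a single predict call.
    combined = [frame[0] for frame in batch]
    predictions = model.predict(combined)
    # Pair each original single-row table with its prediction.
    return [Result(frame, pred) for frame, pred in zip(batch, predictions)]


batch = [
    [{"width": 2.0, "height": 3.0}],
    [{"width": 4.0, "height": 0.5}],
]
results = run_inference_sketch(batch, AreaModel())
```

Keeping column names on every example lets a model or pipeline select features by name rather than by position, which is the main practical difference from the array-based handler.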