apache_beam.ml.anomaly.base module
Base classes for anomaly detection
- class apache_beam.ml.anomaly.base.AnomalyPrediction(model_id: str | None = None, score: float | None = None, label: int | None = None, threshold: float | None = None, info: str = '', source_predictions: Iterable[AnomalyPrediction] | None = None)
Bases: object
A dataclass for anomaly detection predictions.
- score: float | None = None
The outlier score resulting from applying the detector to the input data.
- source_predictions: Iterable[AnomalyPrediction] | None = None
If enabled, a list of AnomalyPrediction objects used to derive the aggregated prediction.
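As an illustration of how the fields fit together, the sketch below builds two per-model predictions and an aggregated one that keeps its inputs in source_predictions. It uses a local stand-in dataclass that copies the fields from the signature above so that it runs without Beam installed; the model ids, scores and thresholds are invented for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


# Local stand-in mirroring the documented fields; not the Beam class itself.
@dataclass
class AnomalyPrediction:
    model_id: Optional[str] = None
    score: Optional[float] = None
    label: Optional[int] = None
    threshold: Optional[float] = None
    info: str = ""
    source_predictions: Optional[Iterable["AnomalyPrediction"]] = None


# Two per-model predictions (ids, scores and thresholds are made up).
p1 = AnomalyPrediction(model_id="zscore", score=3.2, label=1, threshold=3.0)
p2 = AnomalyPrediction(model_id="iqr", score=0.4, label=0, threshold=1.5)

# An aggregated prediction that records the predictions it was derived from.
combined = AnomalyPrediction(
    model_id="ensemble",
    label=1,
    source_predictions=[p1, p2],
)

print(combined.label, [p.model_id for p in combined.source_predictions])
```

Fields that are not passed keep their defaults, so combined.score stays None here.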
- class apache_beam.ml.anomaly.base.AnomalyResult(example: Row, predictions: Iterable[AnomalyPrediction])
Bases: object
A dataclass for anomaly detection results.
- predictions: Iterable[AnomalyPrediction]
The iterable of AnomalyPrediction objects containing the predictions. Expect length 1 if it is a result for a non-ensemble detector or an ensemble detector with an aggregation strategy applied.
- class apache_beam.ml.anomaly.base.ThresholdFn(normal_label: int = 0, outlier_label: int = 1, missing_label: int = -2)
Bases: ABC
An abstract base class for threshold functions.
- Parameters:
normal_label – The integer label used to identify normal data. Defaults to 0.
outlier_label – The integer label used to identify outlier data. Defaults to 1.
missing_label – The integer label used when a score is missing because the model is not ready to score. Defaults to -2.
- abstract property threshold: float | None
Retrieves the current threshold value, or None if not set.
- abstract apply(score: float | None) → int | None
Applies the threshold function to a given score to classify it as normal or outlier.
- Parameters:
score – The outlier score generated from the detector (model).
- Returns:
The label assigned to the score, either self._normal_label or self._outlier_label.
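A concrete threshold function could look roughly like the sketch below. It defines a local stand-in for ThresholdFn with the documented constructor arguments and abstract members, so it runs without Beam installed. CutoffThreshold is a hypothetical name, and the handling of None and NaN scores is an assumption of this sketch rather than behaviour stated above.

```python
import math
from abc import ABC, abstractmethod
from typing import Optional


# Local stand-in mirroring the documented interface; not the Beam class.
class ThresholdFn(ABC):
    def __init__(self, normal_label: int = 0, outlier_label: int = 1,
                 missing_label: int = -2):
        self._normal_label = normal_label
        self._outlier_label = outlier_label
        self._missing_label = missing_label

    @property
    @abstractmethod
    def threshold(self) -> Optional[float]:
        ...

    @abstractmethod
    def apply(self, score: Optional[float]) -> Optional[int]:
        ...


class CutoffThreshold(ThresholdFn):  # hypothetical subclass
    """Labels a score as an outlier when it reaches a fixed cutoff."""

    def __init__(self, cutoff: float, **kwargs):
        super().__init__(**kwargs)
        self._cutoff = cutoff

    @property
    def threshold(self) -> Optional[float]:
        return self._cutoff

    def apply(self, score: Optional[float]) -> Optional[int]:
        # Assumption: None means no score was produced, so no label either.
        if score is None:
            return None
        # Assumption: NaN marks a model that is not ready to score.
        if math.isnan(score):
            return self._missing_label
        if score >= self._cutoff:
            return self._outlier_label
        return self._normal_label


fn = CutoffThreshold(cutoff=3.0)
labels = [fn.apply(s) for s in (0.5, 3.0, float("nan"), None)]
print(labels)
```

Keeping the labels configurable through the base constructor lets the same cutoff logic emit whatever integer codes a downstream consumer expects.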
- class apache_beam.ml.anomaly.base.AggregationFn
Bases: ABC
An abstract base class for aggregation functions.
- abstract apply(predictions: Iterable[AnomalyPrediction]) → AnomalyPrediction
Applies the aggregation function to an iterable of predictions, either on their outlier scores or labels.
- Parameters:
predictions – An Iterable of AnomalyPrediction objects to aggregate.
- Returns:
An AnomalyPrediction object containing the aggregated result.
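One way to read this contract is as a reduction from many predictions to one. The sketch below averages the available outlier scores and keeps the inputs as source_predictions. Both classes are local stand-ins that follow the documented signatures so the block runs without Beam; MeanScore and the "mean" model id are hypothetical, and skipping None scores is a choice made for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class AnomalyPrediction:  # stand-in with a subset of the documented fields
    model_id: Optional[str] = None
    score: Optional[float] = None
    label: Optional[int] = None
    source_predictions: Optional[List["AnomalyPrediction"]] = None


class AggregationFn(ABC):  # stand-in for the documented abstract base
    @abstractmethod
    def apply(
        self, predictions: Iterable[AnomalyPrediction]) -> AnomalyPrediction:
        ...


class MeanScore(AggregationFn):  # hypothetical score-based strategy
    def apply(
        self, predictions: Iterable[AnomalyPrediction]) -> AnomalyPrediction:
        preds = list(predictions)
        # Assumption: predictions without a score are left out of the mean.
        scores = [p.score for p in preds if p.score is not None]
        mean = sum(scores) / len(scores) if scores else None
        return AnomalyPrediction(
            model_id="mean", score=mean, source_predictions=preds)


inputs = [
    AnomalyPrediction(model_id="a", score=1.0),
    AnomalyPrediction(model_id="b", score=2.0),
    AnomalyPrediction(model_id="c", score=None),
]
aggregated = MeanScore().apply(inputs)
print(aggregated.score, len(aggregated.source_predictions))
```

A label-based strategy (for example a majority vote) would have the same shape, reading p.label instead of p.score.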
- class apache_beam.ml.anomaly.base.AnomalyDetector(model_id: str | None = None, features: Iterable[str] | None = None, target: str | None = None, threshold_criterion: ThresholdFn | None = None, **kwargs)
Bases: ABC
An abstract base class for anomaly detectors.
- Parameters:
model_id – The ID of the detector (model). Defaults to the value of the spec_type attribute, or ‘unknown’ if not set.
features – An Iterable of strings representing the names of the input features in the beam.Row.
target – The name of the target field in the beam.Row.
threshold_criterion – An optional ThresholdFn to apply to the outlier score and yield a label.
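The model_id default described above can be pictured with the following sketch. It is a simplified stand-in that only models the documented constructor arguments; it is deliberately not abstract so the 'unknown' fallback can be shown, and it treats spec_type as a plain class attribute, which is an assumption. ZScore and the field names are hypothetical.

```python
from typing import Iterable, Optional


# Simplified stand-in; the real base class is abstract.
class AnomalyDetector:
    def __init__(self,
                 model_id: Optional[str] = None,
                 features: Optional[Iterable[str]] = None,
                 target: Optional[str] = None,
                 threshold_criterion=None,
                 **kwargs):
        # Documented default: spec_type when available, otherwise 'unknown'.
        self._model_id = (
            model_id if model_id is not None
            else getattr(self, "spec_type", "unknown"))
        self._features = features
        self._target = target
        self._threshold_criterion = threshold_criterion


class ZScore(AnomalyDetector):  # hypothetical detector
    spec_type = "ZScore"  # assumption: modeled as a class attribute


explicit = ZScore(model_id="cpu-zscore", features=["cpu"])
from_spec = ZScore(features=["cpu"])
fallback = AnomalyDetector()

print(explicit._model_id, from_spec._model_id, fallback._model_id)
```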
- class apache_beam.ml.anomaly.base.EnsembleAnomalyDetector(sub_detectors: List[AnomalyDetector] | None = None, aggregation_strategy: AggregationFn | None = None, **kwargs)
Bases: AnomalyDetector
An abstract base class for an ensemble of anomaly (sub-)detectors.
- Parameters:
sub_detectors – A list of AnomalyDetector objects used in this ensemble model.
aggregation_strategy – An optional AggregationFn to apply to the predictions from all sub-detectors and yield an aggregated result.
model_id – Inherited from AnomalyDetector.
features – Inherited from AnomalyDetector.
target – Inherited from AnomalyDetector.
threshold_criterion – Inherited from AnomalyDetector.
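The effect of aggregation_strategy on the number of predictions per result can be sketched as follows. This is not the Beam implementation: the dataclasses are local stand-ins, the example is a plain dict rather than a beam.Row, sub-detectors are reduced to hypothetical (scoring function, cutoff) pairs, and run_ensemble and any_outlier are names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class AnomalyPrediction:  # stand-in with a subset of the documented fields
    model_id: Optional[str] = None
    score: Optional[float] = None
    label: Optional[int] = None
    source_predictions: Optional[List["AnomalyPrediction"]] = None


@dataclass
class AnomalyResult:  # stand-in; example is a dict instead of a beam.Row
    example: dict
    predictions: List[AnomalyPrediction]


def any_outlier(preds: List[AnomalyPrediction]) -> AnomalyPrediction:
    # Hypothetical strategy: flag the example if any sub-detector did.
    label = 1 if any(p.label == 1 for p in preds) else 0
    return AnomalyPrediction(
        model_id="ensemble", label=label, source_predictions=preds)


SubDetector = Tuple[Callable[[dict], float], float]  # (score_fn, cutoff)


def run_ensemble(example: dict,
                 sub_detectors: Dict[str, SubDetector],
                 aggregate=None) -> AnomalyResult:
    preds = []
    for name, (score_fn, cutoff) in sub_detectors.items():
        score = score_fn(example)
        preds.append(AnomalyPrediction(name, score, int(score >= cutoff)))
    # With a strategy, the per-detector predictions collapse into one.
    if aggregate is not None:
        preds = [aggregate(preds)]
    return AnomalyResult(example=example, predictions=preds)


subs = {
    "high_cpu": (lambda row: row["cpu"], 0.9),
    "high_mem": (lambda row: row["mem"], 0.8),
}
row = {"cpu": 0.95, "mem": 0.40}

raw = run_ensemble(row, subs)
agg = run_ensemble(row, subs, aggregate=any_outlier)
print(len(raw.predictions), len(agg.predictions), agg.predictions[0].label)
```

Without a strategy the result carries one prediction per sub-detector; with one, it carries a single aggregated prediction whose source_predictions still hold the individual outputs.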