apache_beam.ml.anomaly.transforms module
- class apache_beam.ml.anomaly.transforms.RunScoreAndLearn(detector: AnomalyDetector)[source]
Bases: PTransform[PCollection[Tuple[KeyT, Tuple[TempKeyT, Row]]], PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]]]
Applies the _ScoreAndLearnDoFn to a PCollection of data.
This PTransform scores and learns from data points using an anomaly detection model.
- Parameters:
detector – The anomaly detection model to use.
- expand(input: PCollection[Tuple[KeyT, Tuple[TempKeyT, Row]]]) PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]] [source]
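For orientation, a minimal sketch of applying RunScoreAndLearn directly. The (key, (temp_key, Row)) element shape follows the signature above; the ZScore detector comes from the AnomalyDetection example later on this page, but its import path is an assumption, and most pipelines reach this transform through the AnomalyDetection composite instead.
```python
import apache_beam as beam

from apache_beam.ml.anomaly.transforms import RunScoreAndLearn
# Assumed import path for the ZScore detector; adjust for your Beam version.
from apache_beam.ml.anomaly.detectors.zscore import ZScore

with beam.Pipeline() as p:
    scored = (
        p
        # Elements use the (key, (temp_key, Row)) shape from the signature above.
        | beam.Create([(0, (0, beam.Row(x1=1.0))),
                       (0, (1, beam.Row(x1=25.0)))])
        # Each element is scored by the current model, then used to update it.
        | RunScoreAndLearn(ZScore(features=["x1"])))
```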
- class apache_beam.ml.anomaly.transforms.RunThresholdCriterion(threshold_criterion: ThresholdFn)[source]
Bases: PTransform[PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]], PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]]]
Applies a threshold criterion to anomaly detection results.
This PTransform applies a ThresholdFn to the anomaly scores in AnomalyResult objects, updating the prediction labels. It handles both stateful and stateless ThresholdFn implementations.
- Parameters:
threshold_criterion – The ThresholdFn to apply.
- expand(input: PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]]) PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]] [source]
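As a sketch of how thresholding composes with scoring (mirroring what RunOneDetector does internally), the snippet below chains RunThresholdCriterion after RunScoreAndLearn. FixedThreshold, its import path, and its cutoff value are assumptions; any ThresholdFn implementation can be passed.
```python
import apache_beam as beam

from apache_beam.ml.anomaly.transforms import RunScoreAndLearn, RunThresholdCriterion
from apache_beam.ml.anomaly.detectors.zscore import ZScore    # assumed path
from apache_beam.ml.anomaly.thresholds import FixedThreshold  # assumed path and args

with beam.Pipeline() as p:
    labeled = (
        p
        | beam.Create([(0, (0, beam.Row(x1=1.0))),
                       (0, (1, beam.Row(x1=25.0)))])
        | RunScoreAndLearn(ZScore(features=["x1"]))
        # Convert raw anomaly scores into prediction labels.
        | RunThresholdCriterion(FixedThreshold(2.0)))
```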
- class apache_beam.ml.anomaly.transforms.RunAggregationStrategy(aggregation_strategy: AggregationFn | None, agg_model_id: str)[source]
Bases: PTransform[PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]], PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]]]
Applies an aggregation strategy to grouped anomaly detection results.
This PTransform aggregates anomaly predictions from multiple models or data points using an AggregationFn. It handles both custom and simple aggregation strategies.
- Parameters:
aggregation_strategy – The AggregationFn to use.
agg_model_id – The model ID for aggregation.
- expand(input: PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]]) PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]] [source]
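Because this transform expects results that are already grouped per (key, temp_key) (as produced inside RunEnsembleDetector), only the call shape is sketched below; AnyVote's import path and the agg_model_id string are assumptions.
```python
import apache_beam as beam

from apache_beam.ml.anomaly.transforms import RunAggregationStrategy
from apache_beam.ml.anomaly.aggregations import AnyVote  # assumed path


def aggregate_results(grouped: beam.PCollection) -> beam.PCollection:
    # `grouped` holds (key, (temp_key, AnomalyResult)) elements whose
    # AnomalyResult carries one prediction per sub-detector.
    return grouped | RunAggregationStrategy(
        aggregation_strategy=AnyVote(), agg_model_id="ensemble")
```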
- class apache_beam.ml.anomaly.transforms.RunOneDetector(detector)[source]
Bases: PTransform[PCollection[Tuple[KeyT, Tuple[TempKeyT, Row]]], PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]]]
Runs a single anomaly detector on a PCollection of data.
This PTransform applies a single AnomalyDetector to the input data, including scoring, learning, and thresholding.
- Parameters:
detector – The AnomalyDetector to run.
- expand(input: PCollection[Tuple[KeyT, Tuple[TempKeyT, Row]]]) PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]] [source]
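A minimal sketch of running one detector end to end (scoring, learning, and thresholding in a single transform); the ZScore import path is an assumption.
```python
import apache_beam as beam

from apache_beam.ml.anomaly.transforms import RunOneDetector
from apache_beam.ml.anomaly.detectors.zscore import ZScore  # assumed path

with beam.Pipeline() as p:
    results = (
        p
        | beam.Create([(0, (0, beam.Row(x1=1.0))),
                       (0, (1, beam.Row(x1=25.0)))])
        # Scores and learns, then applies the detector's own threshold criterion.
        | RunOneDetector(ZScore(features=["x1"])))
```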
- class apache_beam.ml.anomaly.transforms.RunOfflineDetector(offline_detector: OfflineDetector)[source]
Bases: PTransform[PCollection[Tuple[KeyT, Tuple[TempKeyT, Row]]], PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]]]
Runs an offline anomaly detector on a PCollection of data.
This PTransform applies an OfflineDetector to the input data, handling custom input/output conversion and inference.
- Parameters:
offline_detector – The OfflineDetector to run.
- unnest_and_convert(nested: Tuple[Tuple[Any, Any], dict[str, List]]) Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]] [source]
Unnests and converts the model output to AnomalyResult.
- Parameters:
nested – A tuple containing the combined key (original key, temp key) and a dictionary of input and output from RunInference.
- Returns:
A tuple containing the original key and AnomalyResult.
- expand(input: PCollection[Tuple[KeyT, Tuple[TempKeyT, Row]]]) PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]] [source]
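Constructing an OfflineDetector requires a pre-trained model wrapped in a RunInference model handler, so only the call shape is sketched below; the OfflineDetector import path is an assumption.
```python
import apache_beam as beam

from apache_beam.ml.anomaly.transforms import RunOfflineDetector
from apache_beam.ml.anomaly.base import OfflineDetector  # assumed path


def run_offline(keyed_rows: beam.PCollection,
                offline_detector: OfflineDetector) -> beam.PCollection:
    # `keyed_rows` holds (key, (temp_key, Row)) elements; the offline detector
    # wraps a pre-trained model and delegates scoring to RunInference.
    return keyed_rows | RunOfflineDetector(offline_detector)
```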
- class apache_beam.ml.anomaly.transforms.RunEnsembleDetector(ensemble_detector: EnsembleAnomalyDetector)[source]
Bases: PTransform[PCollection[Tuple[KeyT, Tuple[TempKeyT, Row]]], PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]]]
Runs an ensemble of anomaly detectors on a PCollection of data.
This PTransform applies an EnsembleAnomalyDetector to the input data, running each sub-detector and aggregating the results.
- Parameters:
ensemble_detector – The EnsembleAnomalyDetector to run.
- expand(input: PCollection[Tuple[KeyT, Tuple[TempKeyT, Row]]]) PCollection[Tuple[KeyT, Tuple[TempKeyT, AnomalyResult]]] [source]
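A sketch of running an ensemble directly, reusing the sub-detectors and AnyVote aggregation from the AnomalyDetection example below; all import paths are assumptions.
```python
import apache_beam as beam

from apache_beam.ml.anomaly.transforms import RunEnsembleDetector
from apache_beam.ml.anomaly.base import EnsembleAnomalyDetector  # assumed path
from apache_beam.ml.anomaly.detectors.zscore import ZScore       # assumed path
from apache_beam.ml.anomaly.detectors.iqr import IQR             # assumed path
from apache_beam.ml.anomaly.aggregations import AnyVote          # assumed path

with beam.Pipeline() as p:
    results = (
        p
        | beam.Create([(0, (0, beam.Row(x1=1.0, x2=2.0))),
                       (0, (1, beam.Row(x1=25.0, x2=3.0)))])
        # Runs each sub-detector, then aggregates their predictions with AnyVote.
        | RunEnsembleDetector(
            EnsembleAnomalyDetector(
                [ZScore(features=["x1"]), IQR(features=["x2"])],
                aggregation_strategy=AnyVote())))
```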
- class apache_beam.ml.anomaly.transforms.AnomalyDetection(detector: AnomalyDetector)[source]
Bases: PTransform[PCollection[Tuple[KeyT, Row]], PCollection[Tuple[KeyT, AnomalyResult]]]
Performs anomaly detection on a PCollection of data.
This PTransform applies an AnomalyDetector or EnsembleAnomalyDetector to the input data and returns a PCollection of AnomalyResult objects.
Examples:
  # Run a single anomaly detector
  p | AnomalyDetection(ZScore(features=["x1"]))

  # Run an ensemble anomaly detector
  sub_detectors = [ZScore(features=["x1"]), IQR(features=["x2"])]
  p | AnomalyDetection(
      EnsembleAnomalyDetector(sub_detectors, aggregation_strategy=AnyVote()))
- Parameters:
detector – The AnomalyDetector or EnsembleAnomalyDetector to use.
- expand(input: PCollection[Tuple[KeyT, Row]]) PCollection[Tuple[KeyT, AnomalyResult]] [source]
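Expanding the example above into a runnable pipeline sketch; the ZScore import path is an assumption, and the WithKeys step supplies the (key, Row) shape expected by expand.
```python
import apache_beam as beam

from apache_beam.ml.anomaly.transforms import AnomalyDetection
from apache_beam.ml.anomaly.detectors.zscore import ZScore  # assumed path

with beam.Pipeline() as p:
    _ = (
        p
        | beam.Create([beam.Row(x1=1.0), beam.Row(x1=1.2), beam.Row(x1=42.0)])
        # AnomalyDetection expects keyed input: (key, Row).
        | beam.WithKeys(0)
        | AnomalyDetection(ZScore(features=["x1"]))
        # Each output element is (key, AnomalyResult).
        | beam.Map(print))
```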