Optimization

Optimize a model's parameters by maximizing its performance on a given metric.

ecgan.evaluation.optimization.optimize_svm(metric, errors, labels, kernel=SklearnSVMKernels.RBF)[source]

Optimize metric via Support Vector Machines (SVMs).

Parameters
  • metric (MetricType) -- The metric that has to be optimized.

  • errors (List[Tensor]) -- The errors used to train the SVM.

  • labels (Tensor) -- Real input labels.

  • kernel (SklearnSVMKernels) -- Kernel used in SVM.

Return type

Tuple[Tensor, SVC]

Returns

Label predictions from the SVM and the fitted classifier.
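A minimal sketch of what such an optimization can look like, using plain NumPy arrays and scikit-learn's `SVC` in place of ecgan's internal types. The helper name `fit_error_svm` and its exact signature are assumptions for illustration, not the ecgan API:

```python
import numpy as np
from sklearn.svm import SVC

def fit_error_svm(errors, labels, kernel="rbf"):
    """Fit an SVM on per-sample error features; return predictions and classifier."""
    # Stack the individual error series into an (n_samples, n_errors) feature matrix.
    features = np.stack(errors, axis=1)
    clf = SVC(kernel=kernel)
    clf.fit(features, labels)
    return clf.predict(features), clf

# Synthetic, well-separated errors: low errors for normal data, high for anomalies.
rng = np.random.default_rng(0)
normal = rng.normal(0.1, 0.05, size=80)
anomalous = rng.normal(0.9, 0.05, size=20)
errors = [np.concatenate([normal, anomalous])]
labels = np.concatenate([np.zeros(80), np.ones(20)])

preds, clf = fit_error_svm(errors, labels)
print((preds == labels).mean())  # training accuracy on this separable data
```

The returned classifier can then be reused on held-out data, mirroring the `query_svm` function below.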

ecgan.evaluation.optimization.query_svm(clf, errors, labels)[source]

Query an already trained SVM (usually fitted during training/validation) on test data.

Parameters
  • clf (SVC) -- Trained classifier (SVM).

  • errors (List[Tensor]) -- The errors used to test the SVM.

  • labels (Tensor) -- Real input labels.

Return type

Tensor

Returns

Label predictions from classifier.
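Querying a previously fitted classifier on new error tensors can be sketched as follows; `query_error_svm` is a hypothetical stand-in for `query_svm`, working on NumPy arrays rather than torch Tensors:

```python
import numpy as np
from sklearn.svm import SVC

def query_error_svm(clf, errors):
    """Predict labels for test errors with an already trained classifier."""
    features = np.stack(errors, axis=1)
    return clf.predict(features)

# Fit on clearly separated 1D training errors, then query new points.
train_errors = [np.array([0.05, 0.1, 0.9, 0.95])]
train_labels = np.array([0, 0, 1, 1])
clf = SVC(kernel="rbf").fit(np.stack(train_errors, axis=1), train_labels)

test_errors = [np.array([0.08, 0.92])]
print(query_error_svm(clf, test_errors))  # -> [0 1]
```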

ecgan.evaluation.optimization.optimize_metric(metric, errors, taus, params, ground_truth_labels)[source]

Optimize the given metric by weighting multiple errors using a grid-search approach.

To achieve this, a weighted (anomaly) score is created. If there is only one parameter, the aggregated score is compared against a given threshold \(\tau\); any datum whose score exceeds \(\tau\) is labeled as an anomaly.

If there are multiple error components \(e_i\), all combinations of error weights (params) whose sum is at most 1 are used to calculate an error score. For two errors (e.g. AnoGAN), a datum is flagged as anomalous if \(\lambda_1 \cdot e_1 + (1-\lambda_1) \cdot e_2 \geq \tau\).

This generalizes to \(n\) errors \(e_i\) with weights \(\lambda_i\):

\[\lambda_1 \cdot e_1 + \lambda_2 \cdot e_2 + \dots + \lambda_{n-1} \cdot e_{n-1} + \left(1 - \sum_{i=1}^{n-1}{\lambda_i}\right) \cdot e_n \geq \tau\]

Note

While \(\tau\) can take arbitrary values, the weighting factors have to add up to 1! To avoid overwhelming error components, you might want to normalize the errors.

Parameters
  • errors (List[Tensor]) -- List of error Tensors.

  • metric (MetricType) -- The type of the metric that should be optimized.

  • taus (List[float]) -- Search range for optimizing the threshold tau.

  • params (List[List[float]]) -- Ranges of weighting parameters (n-1 weight ranges for n error tensors). Only combinations whose weights sum to at most 1 are considered; you do not need to enforce this yourself.

  • ground_truth_labels (Tensor) -- The real labels.

Return type

ndarray

Returns

An array of the 10 best scores for the specified metric given the parameterization. The shape will be [scores, taus, params].
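The two-error case described above can be sketched as a plain grid search. The helper names `grid_search_two_errors` and `f_score` are hypothetical illustrations (with the F-score as the optimized metric), not the ecgan implementation:

```python
import numpy as np

def f_score(y_true, y_pred):
    """Binary F1 score without external dependencies."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def grid_search_two_errors(e1, e2, labels, taus, lambdas):
    """Search all (lambda, tau) pairs; keep the best metric score."""
    best = (0.0, None, None)
    for lam in lambdas:
        # Weighted anomaly score: lambda * e1 + (1 - lambda) * e2.
        score_vec = lam * e1 + (1 - lam) * e2
        for tau in taus:
            preds = (score_vec >= tau).astype(int)
            f = f_score(labels, preds)
            if f > best[0]:
                best = (f, tau, lam)
    return best

labels = np.array([0, 0, 0, 1, 1])
e1 = np.array([0.1, 0.2, 0.15, 0.8, 0.9])   # discriminative error
e2 = np.array([0.5, 0.4, 0.6, 0.5, 0.55])   # mostly uninformative error
best_f, best_tau, best_lam = grid_search_two_errors(
    e1, e2, labels, taus=np.linspace(0, 1, 21), lambdas=np.linspace(0, 1, 11))
print(best_f)  # -> 1.0
```

Note that unnormalized error magnitudes would skew this search toward the larger component, which is why the note above recommends normalizing the errors.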

ecgan.evaluation.optimization.get_weighted_error(errors, params)[source]

Calculate the weighted error.

Given n errors and n-1 parameters \(\lambda_i, i\in\{1,...n-1\}\), the calculation is based on the formula:

\[\lambda_1 \cdot e_1 + \lambda_2 \cdot e_2 + \dots + \lambda_{n-1} \cdot e_{n-1} + \left(1-\sum_i{\lambda_i}\right) \cdot e_n\]
Return type

Tensor
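A minimal NumPy version of this formula, assuming n error tensors and n-1 explicit weights with the last weight implied as \(1 - \sum_i \lambda_i\) (the name `weighted_error` is a hypothetical stand-in):

```python
import numpy as np

def weighted_error(errors, params):
    """Combine n error tensors with n-1 weights; the last weight is implied."""
    assert len(params) == len(errors) - 1
    total = sum(lam * e for lam, e in zip(params, errors))
    return total + (1 - sum(params)) * errors[-1]

e = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
print(weighted_error(e, [0.5, 0.3]))  # 0.5*e1 + 0.3*e2 + 0.2*e3 ≈ [2.4, 3.4]
```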

ecgan.evaluation.optimization.label_score_get_best(ground_truth_labels, weighted_errors, tau, metric_classifier, best_weights, weights)[source]

Generate labels based on an absolute threshold and calculate the metric.

Check whether the resulting metric score improves on the current best results and return the updated array of best weights.

Return type

ndarray
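One way such an update step might look, assuming (hypothetically) that each row of `best_weights` stores `[score, tau, *weights]` and the worst row is replaced when a better score appears; `update_best` and `accuracy` are illustrative names, not the ecgan API:

```python
import numpy as np

def accuracy(y, p):
    return float(np.mean(y == p))

def update_best(ground_truth, weighted_errors, tau, score_fn, best_weights, weights):
    """Threshold the weighted errors, score them, and keep the result if it beats the worst stored entry."""
    preds = (weighted_errors >= tau).astype(int)
    score = score_fn(ground_truth, preds)
    worst = np.argmin(best_weights[:, 0])
    if score > best_weights[worst, 0]:
        best_weights[worst] = [score, tau, *weights]
    return best_weights

best = np.zeros((3, 3))  # three slots of [score, tau, weight]
y = np.array([0, 0, 1, 1])
we = np.array([0.1, 0.2, 0.7, 0.9])
best = update_best(y, we, tau=0.5, score_fn=accuracy, best_weights=best, weights=[1.0])
print(best[np.argmax(best[:, 0])])  # best entry: score 1.0 at tau=0.5
```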

ecgan.evaluation.optimization.retrieve_labels_from_weights(errors, tau, weighting_params)[source]

Retrieve labels from a given pair of weighting parameters.

Optimize anomaly detection via grid-search.

Return type

Tensor
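Turning weighted errors into labels amounts to thresholding the combined score at \(\tau\). A hedged NumPy sketch (the name `labels_from_weights` is assumed, and the implied-last-weight convention mirrors the formula above):

```python
import numpy as np

def labels_from_weights(errors, tau, weights):
    """Weight the error tensors, then label scores >= tau as anomalies (1)."""
    score = sum(w * e for w, e in zip(weights, errors))
    score += (1 - sum(weights)) * errors[-1]
    return (score >= tau).astype(int)

errors = [np.array([0.2, 0.7]), np.array([0.1, 0.9])]
print(labels_from_weights(errors, tau=0.5, weights=[0.5]))  # -> [0 1]
```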

ecgan.evaluation.optimization.optimize_tau_single_error(true_labels, error, tau_range, metric=MetricType.FSCORE)[source]

Optimize threshold given a metric for a single error.

Parameters
  • true_labels (Tensor) -- Real labels of the data.

  • error (Tensor) -- 1D tensor of errors computed with any measure that can be used to form a thresholdable error score.

  • tau_range (List) -- Range of tau values to grid search.

  • metric (MetricType) -- Metric to optimize on.

Return type

float

Returns

Highest score achieved for the given metric.
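For a single error tensor, the search degenerates to scanning thresholds. A minimal sketch with the F-score as metric (the helper `best_tau_fscore` is a hypothetical illustration of the behavior, not the ecgan code):

```python
import numpy as np

def best_tau_fscore(true_labels, error, tau_range):
    """Return the best F-score over all thresholds in tau_range."""
    def f_score(y, p):
        tp = np.sum((p == 1) & (y == 1))
        fp = np.sum((p == 1) & (y == 0))
        fn = np.sum((p == 0) & (y == 1))
        return 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return max(f_score(true_labels, (error >= t).astype(int)) for t in tau_range)

y = np.array([0, 0, 1, 1])
err = np.array([0.1, 0.3, 0.7, 0.9])
print(best_tau_fscore(y, err, np.linspace(0, 1, 11)))  # -> 1.0
```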