pyhealth.metrics.multilabel

pyhealth.metrics.multilabel.multilabel_metrics_fn(y_true, y_prob, metrics=None, threshold=0.3, y_predset=None)

Computes metrics for multilabel classification.

Users can specify which metrics to compute by passing a list of metric names. The accepted metric names are:

roc_auc_micro: area under the receiver operating characteristic curve, micro averaged

roc_auc_macro: area under the receiver operating characteristic curve, macro averaged

roc_auc_weighted: area under the receiver operating characteristic curve, weighted averaged

roc_auc_samples: area under the receiver operating characteristic curve, samples averaged

pr_auc_micro: area under the precision recall curve, micro averaged

pr_auc_macro: area under the precision recall curve, macro averaged

pr_auc_weighted: area under the precision recall curve, weighted averaged

pr_auc_samples: area under the precision recall curve, samples averaged

accuracy: accuracy score

f1_micro: f1 score, micro averaged

f1_macro: f1 score, macro averaged

f1_weighted: f1 score, weighted averaged

f1_samples: f1 score, samples averaged

precision_micro: precision score, micro averaged

precision_macro: precision score, macro averaged

precision_weighted: precision score, weighted averaged

precision_samples: precision score, samples averaged

recall_micro: recall score, micro averaged

recall_macro: recall score, macro averaged

recall_weighted: recall score, weighted averaged

recall_samples: recall score, samples averaged

jaccard_micro: Jaccard similarity coefficient score, micro averaged

jaccard_macro: Jaccard similarity coefficient score, macro averaged

jaccard_weighted: Jaccard similarity coefficient score, weighted averaged

jaccard_samples: Jaccard similarity coefficient score, samples averaged

ddi: drug-drug interaction score (specifically for drug-related tasks, such as drug recommendation)

hamming_loss: Hamming loss

cwECE: classwise expected calibration error (ECE) with 20 equal-width bins. See pyhealth.metrics.calibration.ece_classwise().

cwECE_adapt: classwise adaptive ECE with 20 equal-size bins. See pyhealth.metrics.calibration.ece_classwise().
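
As a rough illustration of what a classwise ECE measures, the sketch below bins each label's predicted probabilities, compares the mean prediction with the observed positive rate in every bin, and averages the size-weighted gaps over labels. The helper name classwise_ece_sketch and its arguments are hypothetical; this is a generic formulation under those assumptions, not the code in pyhealth.metrics.calibration.ece_classwise(), which may differ in its details.

```python
import numpy as np


def classwise_ece_sketch(y_true, y_prob, n_bins=20, adaptive=False):
    """Hypothetical helper: mean over labels of a binned calibration error.

    adaptive=False uses equal-width bins on [0, 1];
    adaptive=True uses equal-size bins (near-equal sample counts).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    n_samples, n_labels = y_prob.shape
    per_label = []
    for k in range(n_labels):
        p, y = y_prob[:, k], y_true[:, k]
        if adaptive:
            # Sort by predicted probability, then split into near-equal chunks.
            order = np.argsort(p, kind="stable")
            groups = np.array_split(order, n_bins)
        else:
            # Map each probability to a bin index; p == 1.0 goes to the last bin.
            idx = np.minimum((p * n_bins).astype(int), n_bins - 1)
            groups = [np.flatnonzero(idx == b) for b in range(n_bins)]
        ece_k = 0.0
        for g in groups:
            if g.size == 0:
                continue  # empty bins contribute nothing
            # Gap between observed positive rate and mean prediction in the bin,
            # weighted by the fraction of samples that fall in the bin.
            ece_k += g.size / n_samples * abs(y[g].mean() - p[g].mean())
        per_label.append(ece_k)
    return float(np.mean(per_label))
```

A perfectly calibrated input gives a value of 0, and larger values indicate a wider gap between predicted probabilities and observed frequencies.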

The following prediction-set metrics are also accepted, but are ignored if y_predset is None:

fp: Number of false positives.
tp: Number of true positives.
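
To make the prediction-set counts concrete, here is a minimal sketch that assumes y_predset is a binary indicator array with the same (n_samples, n_labels) shape as y_true. That layout, the helper name predset_counts_sketch, and the returned keys are all assumptions for illustration. Whether the library reports totals or per-sample averages is not stated here, so the sketch returns both.

```python
import numpy as np


def predset_counts_sketch(y_true, y_predset):
    """Hypothetical helper: count true/false positives in prediction sets.

    Assumes both inputs are 0/1 arrays of shape (n_samples, n_labels).
    """
    y_true = np.asarray(y_true).astype(bool)
    y_predset = np.asarray(y_predset).astype(bool)
    # A label in the set is a true positive if it is also a true label,
    # and a false positive otherwise.
    tp_per_sample = (y_true & y_predset).sum(axis=1)
    fp_per_sample = (~y_true & y_predset).sum(axis=1)
    return {
        "tp_total": int(tp_per_sample.sum()),
        "fp_total": int(fp_per_sample.sum()),
        "tp_mean_per_sample": float(tp_per_sample.mean()),
        "fp_mean_per_sample": float(fp_per_sample.mean()),
    }
```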

If no metrics are specified, pr_auc_samples is computed by default.

This function calls sklearn.metrics functions to compute the metrics. For more information on the metrics, please refer to the documentation of the corresponding sklearn.metrics functions.
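
The naming scheme above maps naturally onto the average argument of the sklearn.metrics functions. The sketch below calls sklearn directly for a few of the listed metrics. It assumes that probabilities are binarized with a greater-than-or-equal comparison against the threshold, which is an assumption about the library's behavior rather than something stated here, and it is not the function's actual implementation.

```python
import numpy as np
from sklearn.metrics import (
    average_precision_score,
    f1_score,
    hamming_loss,
    jaccard_score,
)

y_true = np.array([[0, 1, 1], [1, 0, 1]])
y_prob = np.array([[0.1, 0.9, 0.8], [0.05, 0.95, 0.6]])

# Threshold-based metrics need hard 0/1 predictions.
# Assumption: a probability equal to the threshold counts as positive.
threshold = 0.3
y_pred = (y_prob >= threshold).astype(int)

scores = {
    # Ranking-based metrics work on the probabilities themselves.
    "pr_auc_samples": average_precision_score(y_true, y_prob, average="samples"),
    # "micro" pools every (sample, label) decision before scoring.
    "f1_micro": f1_score(y_true, y_pred, average="micro"),
    # "samples" scores each sample's label set, then averages over samples.
    "f1_samples": f1_score(y_true, y_pred, average="samples"),
    "jaccard_samples": jaccard_score(y_true, y_pred, average="samples"),
    # Fraction of individual label decisions that are wrong.
    "hamming_loss": hamming_loss(y_true, y_pred),
}
```

The same pattern extends to the macro and weighted variants by changing the average argument.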

Parameters:

y_true (ndarray) – True target values of shape (n_samples, n_labels).
y_prob (ndarray) – Predicted probabilities of shape (n_samples, n_labels).
metrics (Optional[List[str]]) – List of metrics to compute. Default is ["pr_auc_samples"].
threshold (float) – Threshold to binarize the predicted probabilities. Default is 0.3.
y_predset (Optional[ndarray]) – Prediction sets, used only for the fp and tp metrics. Default is None.

Return type:

Dict[str, float]

Returns:

Dictionary of metrics whose keys are the metric names and whose values are the metric values.

Examples

>>> import numpy as np
>>> from pyhealth.metrics import multilabel_metrics_fn
>>> y_true = np.array([[0, 1, 1], [1, 0, 1]])
>>> y_prob = np.array([[0.1, 0.9, 0.8], [0.05, 0.95, 0.6]])
>>> multilabel_metrics_fn(y_true, y_prob, metrics=["accuracy"])
{'accuracy': 0.5}