pyhealth.metrics.fairness
- pyhealth.metrics.fairness.fairness_metrics_fn(y_true, y_prob, sensitive_attributes, favorable_outcome=1, metrics=None, threshold=0.5)[source]
Computes fairness metrics for binary classification.
The user can specify which metrics to compute by passing a list of metric names. The accepted metric names are:
disparate_impact
statistical_parity_difference
If no metrics are specified, disparate_impact and statistical_parity_difference are computed by default.
- Parameters:
y_true (ndarray) – True target values of shape (n_samples,).
y_prob (ndarray) – Predicted probabilities of shape (n_samples,).
sensitive_attributes (ndarray) – Sensitive attributes of shape (n_samples,), where 1 is the protected group and 0 is the unprotected group.
favorable_outcome (int) – Label value which is considered favorable (i.e. “positive”).
metrics (Optional[List[str]]) – List of metrics to compute. Default is [“disparate_impact”, “statistical_parity_difference”].
threshold (float) – Threshold for binary classification. Default is 0.5.
- Return type:
Dict[str, float]
- Returns:
Dictionary of metrics whose keys are the metric names and values are the metric values.
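As a minimal NumPy sketch of what this function computes (not pyhealth's actual implementation; the example data and the `>=` threshold comparison are assumptions), the two default metrics follow from thresholding the probabilities and comparing favorable-outcome rates between the two groups:

```python
import numpy as np

# Hypothetical example data: 1 marks the protected group.
y_prob = np.array([0.9, 0.2, 0.3, 0.7, 0.8, 0.6])
sensitive_attributes = np.array([1, 1, 1, 0, 0, 0])
threshold = 0.5

# Threshold probabilities into binary predictions.
y_pred = (y_prob >= threshold).astype(int)

# P(y_pred = favorable | protected) and P(y_pred = favorable | unprotected)
p_protected = y_pred[sensitive_attributes == 1].mean()
p_unprotected = y_pred[sensitive_attributes == 0].mean()

metrics = {
    "disparate_impact": p_protected / p_unprotected,
    "statistical_parity_difference": p_protected - p_unprotected,
}
print(metrics)
```

Here the protected group receives the favorable outcome one time in three while the unprotected group always receives it, giving a disparate impact of 1/3 and a statistical parity difference of -2/3.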
- pyhealth.metrics.fairness_utils.disparate_impact(sensitive_attributes, y_pred, favorable_outcome=1, allow_zero_division=False, epsilon=1e-08)[source]
Computes the disparate impact between the protected and unprotected group.
disparate_impact = P(y_pred = favorable_outcome | P) / P(y_pred = favorable_outcome | U)
- Parameters:
sensitive_attributes (ndarray) – Sensitive attributes of shape (n_samples,), where 1 is the protected group and 0 is the unprotected group.
y_pred (ndarray) – Predicted target values of shape (n_samples,).
favorable_outcome (int) – Label value which is considered favorable (i.e. “positive”).
allow_zero_division (bool) – If True, use epsilon instead of 0 in the denominator if the denominator is 0. Otherwise, raise a ValueError.
epsilon (float) – Small value substituted for a zero denominator when allow_zero_division is True.
- Return type:
float
- Returns:
The disparate impact between the protected and unprotected group.
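The formula above can be sketched directly in NumPy. This is an illustrative re-implementation under the documented contract, not pyhealth's source; the example arrays are made up:

```python
import numpy as np

def disparate_impact_sketch(sensitive_attributes, y_pred, favorable_outcome=1,
                            allow_zero_division=False, epsilon=1e-8):
    # P(y_pred = favorable_outcome | protected group)
    p_protected = np.mean(y_pred[sensitive_attributes == 1] == favorable_outcome)
    # P(y_pred = favorable_outcome | unprotected group)
    p_unprotected = np.mean(y_pred[sensitive_attributes == 0] == favorable_outcome)
    if p_unprotected == 0:
        if not allow_zero_division:
            raise ValueError("P(y_pred = favorable_outcome | U) is zero")
        p_unprotected = epsilon  # substitute epsilon, per allow_zero_division
    return p_protected / p_unprotected

y_pred = np.array([1, 0, 1, 1, 1, 0])
sensitive = np.array([1, 1, 1, 0, 0, 0])
di = disparate_impact_sketch(sensitive, y_pred)
print(di)
```

Both groups receive the favorable outcome at the same rate (2/3), so the disparate impact is 1.0; a value below 1 would indicate the protected group is favored less often.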
- pyhealth.metrics.fairness_utils.statistical_parity_difference(sensitive_attributes, y_pred, favorable_outcome=1)[source]
Computes the statistical parity difference between the protected and unprotected group.
statistical_parity_difference = P(y_pred = favorable_outcome | P) - P(y_pred = favorable_outcome | U)
- Parameters:
sensitive_attributes (ndarray) – Sensitive attributes of shape (n_samples,), where 1 is the protected group and 0 is the unprotected group.
y_pred (ndarray) – Predicted target values of shape (n_samples,).
favorable_outcome (int) – Label value which is considered favorable (i.e. “positive”).
- Return type:
float
- Returns:
The statistical parity difference between the protected and unprotected group.
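A corresponding NumPy sketch of the difference formula (again an illustration of the documented behavior, not pyhealth's source; the data is hypothetical):

```python
import numpy as np

def statistical_parity_difference_sketch(sensitive_attributes, y_pred,
                                         favorable_outcome=1):
    # P(y_pred = favorable_outcome | protected) - P(y_pred = favorable_outcome | unprotected)
    p_protected = np.mean(y_pred[sensitive_attributes == 1] == favorable_outcome)
    p_unprotected = np.mean(y_pred[sensitive_attributes == 0] == favorable_outcome)
    return p_protected - p_unprotected

y_pred = np.array([1, 0, 0, 1, 1, 1])
sensitive = np.array([1, 1, 1, 0, 0, 0])
spd = statistical_parity_difference_sketch(sensitive, y_pred)
print(spd)
```

Unlike disparate impact, this metric is a difference rather than a ratio, so 0 (not 1) indicates parity; here the protected group's favorable rate is 1/3 versus 1 for the unprotected group, giving -2/3.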
- pyhealth.metrics.fairness_utils.sensitive_attributes_from_patient_ids(dataset, patient_ids, sensitive_attribute, protected_group)[source]
Returns the desired sensitive attribute array from patient_ids.
- Parameters:
dataset (BaseEHRDataset) – Dataset object.
patient_ids – Patient IDs to extract the sensitive attribute for.
sensitive_attribute (str) – Sensitive attribute to extract.
protected_group (str) – Value of the protected group.
- Return type:
ndarray
- Returns:
Sensitive attribute array of shape (n_samples,).
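Conceptually, this helper maps each patient's attribute value to 1 (protected) or 0 (unprotected). The sketch below stands in a plain dict for the dataset's patient metadata; the mapping, field name, and values are all hypothetical, and this is not pyhealth's implementation:

```python
import numpy as np

# Hypothetical stand-in for dataset patient metadata: patient_id -> attribute value.
attribute_by_patient = {"p1": "F", "p2": "M", "p3": "F", "p4": "M"}

def sensitive_attributes_sketch(attribute_by_patient, patient_ids, protected_group):
    # 1 marks the protected group, 0 the unprotected group.
    return np.array([1 if attribute_by_patient[pid] == protected_group else 0
                     for pid in patient_ids])

arr = sensitive_attributes_sketch(attribute_by_patient, ["p1", "p2", "p3"], "F")
print(arr)
```

The resulting array lines up index-for-index with predictions for the same patient_ids, so it can be passed directly as sensitive_attributes to the metrics above.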