pyhealth.tasks.dka
- class pyhealth.tasks.dka.DKAPredictionMIMIC4(padding=0)
Bases: BaseTask
Task for predicting Diabetic Ketoacidosis (DKA) in the general patient population.
This task creates PATIENT-LEVEL samples from ALL patients in the dataset, predicting whether they will develop DKA. Features are collected from admissions BEFORE the first DKA event to prevent data leakage.
- Target Population:
ALL patients in the dataset (no cohort filtering; the only exclusion is the one listed under Data Leakage Prevention)
Large pool of negative samples (patients without DKA)
- Label Definition:
Positive (1): Patient has any DKA diagnosis code (ICD-9 or ICD-10)
Negative (0): Patient has no DKA diagnosis codes
- Data Leakage Prevention:
Admissions are sorted chronologically
For DKA-positive patients: Only data from admissions BEFORE the first DKA admission is included (nothing from the DKA admission itself or any later admission)
For DKA-negative patients: All admissions are included
Patients whose first admission has DKA are excluded (no pre-DKA data)
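The leakage-prevention rules above can be sketched in plain Python. This is a minimal illustration, not the PyHealth implementation: the `Admission` dataclass, its `has_dka` flag, and the `select_pre_dka_admissions` helper are hypothetical names introduced here, and the sketch assumes each admission already carries a flag saying whether it has a DKA diagnosis code.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Admission:
    """Hypothetical stand-in for one hospital admission (not a PyHealth type)."""
    admit_time: datetime
    icd_codes: List[str]
    has_dka: bool  # assumed to be precomputed from the admission's diagnosis codes


def select_pre_dka_admissions(
    admissions: List[Admission],
) -> Optional[Tuple[List[Admission], int]]:
    """Return (admissions usable as features, label), or None if the patient is excluded."""
    # Sort chronologically so "before the first DKA admission" is well defined.
    ordered = sorted(admissions, key=lambda a: a.admit_time)

    # Index of the first admission that has a DKA diagnosis, if any.
    first_dka = next((i for i, a in enumerate(ordered) if a.has_dka), None)

    if first_dka is None:
        # DKA-negative patient: keep every admission, label 0.
        return ordered, 0
    if first_dka == 0:
        # The first admission already has DKA: no pre-DKA data, so exclude.
        return None
    # DKA-positive patient: keep only admissions strictly before the first DKA one.
    return ordered[:first_dka], 1
```

Slicing with `ordered[:first_dka]` drops the DKA admission and everything after it in one step, which matches the rule that no data from the DKA admission or later may reach the features.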
- Features:
icd_codes: Combined diagnosis and procedure ICD codes (StageNet format)
labs: 10-dimensional vectors, one dimension per lab category (see LAB_CATEGORY_ORDER)
- Parameters:
padding (int) – Additional padding for the StageNet processor. Default: 0.
Example
>>> from pyhealth.datasets import MIMIC4Dataset
>>> from pyhealth.tasks import DKAPredictionMIMIC4
>>>
>>> dataset = MIMIC4Dataset(
...     root="/path/to/mimic4",
...     tables=["diagnoses_icd", "procedures_icd", "labevents", "admissions"],
... )
>>> task = DKAPredictionMIMIC4()
>>> samples = dataset.set_task(task)
- LAB_CATEGORIES: ClassVar[Dict[str, List[str]]] = {'Anion Gap': ['50868', '52500'], 'Bicarbonate': ['50803', '50804'], 'Calcium': ['50808', '51624'], 'Chloride': ['50806', '52434', '50902', '52535'], 'Glucose': ['50809', '52027', '50931', '52569'], 'Magnesium': ['50960'], 'Osmolality': ['52031', '50964', '51701'], 'Phosphate': ['50970'], 'Potassium': ['50822', '52452', '50971', '52610'], 'Sodium': ['50824', '52455', '50983', '52623']}
- LAB_CATEGORY_ORDER: ClassVar[List[str]] = ['Sodium', 'Potassium', 'Chloride', 'Bicarbonate', 'Glucose', 'Calcium', 'Magnesium', 'Anion Gap', 'Osmolality', 'Phosphate']
- LABITEMS: ClassVar[List[str]] = ['50824', '52455', '50983', '52623', '50822', '52452', '50971', '52610', '50806', '52434', '50902', '52535', '50803', '50804', '50809', '52027', '50931', '52569', '50808', '51624', '50960', '50868', '52500', '52031', '50964', '51701', '50970']
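To illustrate how the three constants relate, the sketch below rebuilds the flat item list from the category mapping and maps a set of lab observations onto a 10-slot vector ordered by `LAB_CATEGORY_ORDER`. It is an assumption-laden illustration rather than the library's code: `ITEM_TO_INDEX` and `build_lab_vector` are hypothetical names, and both the choice of `None` for missing categories and the "first value wins" rule for categories with several item IDs are assumptions, since the documentation above does not state how the task resolves them.

```python
from typing import Dict, List, Optional

# Constants copied from the class attributes documented above.
LAB_CATEGORIES: Dict[str, List[str]] = {
    "Anion Gap": ["50868", "52500"],
    "Bicarbonate": ["50803", "50804"],
    "Calcium": ["50808", "51624"],
    "Chloride": ["50806", "52434", "50902", "52535"],
    "Glucose": ["50809", "52027", "50931", "52569"],
    "Magnesium": ["50960"],
    "Osmolality": ["52031", "50964", "51701"],
    "Phosphate": ["50970"],
    "Potassium": ["50822", "52452", "50971", "52610"],
    "Sodium": ["50824", "52455", "50983", "52623"],
}
LAB_CATEGORY_ORDER: List[str] = [
    "Sodium", "Potassium", "Chloride", "Bicarbonate", "Glucose",
    "Calcium", "Magnesium", "Anion Gap", "Osmolality", "Phosphate",
]

# Flattening the categories in LAB_CATEGORY_ORDER reproduces the LABITEMS list.
LABITEMS: List[str] = [
    item for name in LAB_CATEGORY_ORDER for item in LAB_CATEGORIES[name]
]

# Hypothetical lookup: lab item ID -> position of its category in the vector.
ITEM_TO_INDEX: Dict[str, int] = {
    item: idx
    for idx, name in enumerate(LAB_CATEGORY_ORDER)
    for item in LAB_CATEGORIES[name]
}


def build_lab_vector(observations: Dict[str, float]) -> List[Optional[float]]:
    """Map {itemid: value} onto one slot per category in LAB_CATEGORY_ORDER."""
    vector: List[Optional[float]] = [None] * len(LAB_CATEGORY_ORDER)
    for itemid, value in observations.items():
        idx = ITEM_TO_INDEX.get(itemid)
        # Skip item IDs outside LABITEMS; keep the first value seen per category
        # (an assumption made for this sketch).
        if idx is not None and vector[idx] is None:
            vector[idx] = value
    return vector
```

For example, `build_lab_vector({"50983": 140.0, "50931": 95.0})` fills the Sodium and Glucose slots and leaves the other eight categories as `None`.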
- pre_filter(df)
- Return type:
LazyFrame