pyhealth.tasks.mortality_prediction
- class pyhealth.tasks.mortality_prediction.MortalityPredictionMIMIC3(code_mapping=None)
Bases: BaseTask
Task for predicting mortality using the MIMIC-III dataset with text data.
This task predicts whether the patient will die during the next hospital visit, based on clinical information from the current visit.
Examples
>>> from pyhealth.datasets import MIMIC3Dataset
>>> from pyhealth.tasks import MortalityPredictionMIMIC3
>>> dataset = MIMIC3Dataset(
...     root="/path/to/mimic-iii/1.4",
...     tables=["diagnoses_icd", "procedures_icd", "prescriptions"],
... )
>>> task = MortalityPredictionMIMIC3()
>>> samples = dataset.set_task(task)
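The next-visit labeling described for this task can be sketched in plain Python. This is a minimal illustration and not PyHealth's implementation: the function name `next_visit_mortality_samples`, the dictionary-based visit records, and the `died` key are all assumptions made for the example.

```python
from typing import Any, Dict, List


def next_visit_mortality_samples(visits: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Label each visit with the mortality outcome of the following visit.

    `visits` is a hypothetical, chronologically ordered list of dicts with the
    keys "visit_id", "conditions", "procedures", "drugs", and "died"
    (1 if the patient died during that visit, otherwise 0).
    """
    samples = []
    # zip(visits, visits[1:]) pairs every visit with its successor, so the
    # last visit (which has no successor) never produces a sample.
    for current, following in zip(visits, visits[1:]):
        samples.append(
            {
                "visit_id": current["visit_id"],
                "conditions": current["conditions"],
                "procedures": current["procedures"],
                "drugs": current["drugs"],
                # The label comes from the NEXT visit, not the current one.
                "mortality": following["died"],
            }
        )
    return samples


# Illustrative data only; the codes are placeholders.
visits = [
    {"visit_id": "v1", "conditions": ["4019"], "procedures": ["3893"], "drugs": ["A01"], "died": 0},
    {"visit_id": "v2", "conditions": ["42731"], "procedures": [], "drugs": ["B01"], "died": 1},
]
samples = next_visit_mortality_samples(visits)
```

With two visits, only the first yields a sample, and its label reflects the outcome of the second.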
- input_schema: Dict[str, str] = {'conditions': 'sequence', 'drugs': 'sequence', 'procedures': 'sequence'}
- pre_filter(df)
- Return type:
LazyFrame
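To make the `input_schema` above concrete, the sketch below checks that a sample dictionary carries every feature named in the schema. It assumes that `'sequence'` means an ordered list of medical codes; the helper `matches_schema` is hypothetical and is not part of PyHealth.

```python
from typing import Any, Dict

# Copied from the documented input_schema attribute.
INPUT_SCHEMA: Dict[str, str] = {
    "conditions": "sequence",
    "drugs": "sequence",
    "procedures": "sequence",
}


def matches_schema(sample: Dict[str, Any], schema: Dict[str, str]) -> bool:
    """Return True if every schema feature is present in `sample` and every
    'sequence' feature is a list or tuple (an assumption of this sketch)."""
    for key, kind in schema.items():
        if key not in sample:
            return False
        if kind == "sequence" and not isinstance(sample[key], (list, tuple)):
            return False
    return True


# Illustrative sample; the codes are placeholders.
sample = {"conditions": ["4019"], "drugs": ["A01"], "procedures": ["3893"], "mortality": 0}
ok = matches_schema(sample, INPUT_SCHEMA)
```

Extra keys such as the label are ignored by this check, since the schema only describes input features.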
- class pyhealth.tasks.mortality_prediction.MortalityPredictionMIMIC4(code_mapping=None)
Bases: BaseTask
Task for predicting mortality using MIMIC-IV EHR data only.
Examples
>>> from pyhealth.datasets import MIMIC4EHRDataset
>>> from pyhealth.tasks import MortalityPredictionMIMIC4
>>> dataset = MIMIC4EHRDataset(
...     root="/path/to/mimic-iv/2.2",
...     tables=["diagnoses_icd", "procedures_icd", "prescriptions"],
... )
>>> task = MortalityPredictionMIMIC4()
>>> samples = dataset.set_task(task)
- input_schema: Dict[str, str] = {'conditions': 'sequence', 'drugs': 'sequence', 'procedures': 'sequence'}
- pre_filter(df)
- Return type:
LazyFrame
- class pyhealth.tasks.mortality_prediction.MortalityPredictionEICU(code_mapping=None)
Bases: BaseTask
Task for predicting mortality using the eICU dataset.
This task predicts whether the patient will die during the next hospital visit, based on clinical information from the current visit.
Features key-value pairs:
- using diagnosis table (ICD9CM and ICD10CM) as condition codes
- using physicalexam table as procedure codes
- using medication table as drug codes
Examples
>>> from pyhealth.datasets import eICUDataset
>>> from pyhealth.tasks import MortalityPredictionEICU
>>> dataset = eICUDataset(
...     root="/path/to/eicu-crd/2.0",
...     tables=["diagnosis", "medication", "physicalExam"],
... )
>>> task = MortalityPredictionEICU()
>>> samples = dataset.set_task(task)
- input_schema: Dict[str, str] = {'conditions': 'sequence', 'drugs': 'sequence', 'procedures': 'sequence'}
- pre_filter(df)
- Return type:
LazyFrame
- class pyhealth.tasks.mortality_prediction.MortalityPredictionEICU2(code_mapping=None)
Bases: BaseTask
Task for predicting mortality using the eICU dataset with alternative coding.
This task predicts whether the patient will die during the next hospital visit, based on clinical information from the current visit.
Similar to MortalityPredictionEICU, but with a different code mapping:
- using admissiondx table and diagnosisstring under diagnosis table as condition codes
- using treatment table as procedure codes
Examples
>>> from pyhealth.datasets import eICUDataset
>>> from pyhealth.tasks import MortalityPredictionEICU2
>>> dataset = eICUDataset(
...     root="/path/to/eicu-crd/2.0",
...     tables=["diagnosis", "treatment", "admissionDx"],
... )
>>> task = MortalityPredictionEICU2()
>>> samples = dataset.set_task(task)
- pre_filter(df)
- Return type:
LazyFrame
- class pyhealth.tasks.mortality_prediction.MortalityPredictionOMOP(code_mapping=None)
Bases: BaseTask
Task for predicting mortality using an OMOP CDM dataset.
This task predicts whether a patient has a death record (binary mortality prediction) based on clinical information from each visit. Unlike visit-specific mortality tasks, this provides a patient-level mortality indicator (whether the patient died at any point).
The task processes visits sequentially and extracts clinical codes (conditions, procedures, drugs) for each visit. Clinical events are linked to visits via visit_occurrence_id, following OMOP CDM conventions.
- Features:
Uses OMOP CDM standard tables (condition_occurrence, procedure_occurrence, drug_exposure)
Links clinical events to visits via visit_occurrence_id
Uses OMOP concept_ids as medical codes
Binary mortality label (1 if patient has death record, 0 otherwise)
- Task Schema:
- Input:
conditions: sequence of condition_concept_id codes
procedures: sequence of procedure_concept_id codes
drugs: sequence of drug_concept_id codes
- Output:
mortality: binary label (0: no death record, 1: death record)
- Parameters:
patient (Patient) – A Patient object containing OMOP CDM data.
- Returns:
A list of samples, where each sample contains:
- visit_id: The visit_occurrence_id
- patient_id: The person_id
- conditions: List of condition_concept_id codes
- procedures: List of procedure_concept_id codes
- drugs: List of drug_concept_id codes
- mortality: Binary label (0 or 1)
- Return type:
Examples
>>> from pyhealth.datasets import OMOPDataset
>>> from pyhealth.tasks import MortalityPredictionOMOP
>>>
>>> # Load OMOP dataset
>>> dataset = OMOPDataset(
...     root="/path/to/omop/data",
...     tables=["condition_occurrence", "procedure_occurrence",
...             "drug_exposure"],
... )
>>>
>>> # Create mortality prediction task
>>> task = MortalityPredictionOMOP()
>>> sample_dataset = dataset.set_task(task=task)
>>>
>>> # Access samples
>>> print(f"Generated {len(sample_dataset)} samples")
>>> sample = sample_dataset.samples[0]
>>> print(f"Conditions: {sample['conditions']}")
>>> print(f"Mortality: {sample['mortality']}")
Note
Visits without any clinical codes (conditions, procedures, or drugs) are excluded
The last visit is excluded as there is no “next visit” to predict for
Clinical events are filtered by visit_occurrence_id, not by timestamp ranges, following OMOP best practices
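The behavior described in these notes can be sketched in plain Python. This is an illustration under stated assumptions, not PyHealth's implementation: the function `omop_mortality_samples`, the tuple-based event representation, and the `has_death_record` flag are hypothetical, and "without any clinical codes" is read here as "all three code lists are empty".

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

# Hypothetical event record: (visit_occurrence_id, feature key, concept_id),
# where the feature key is "conditions", "procedures", or "drugs".
Event = Tuple[int, str, int]


def omop_mortality_samples(
    person_id: int,
    visit_ids: List[int],
    events: List[Event],
    has_death_record: bool,
) -> List[Dict[str, Any]]:
    """Build per-visit samples that share one patient-level mortality label."""
    label = 1 if has_death_record else 0

    # Group events by visit_occurrence_id rather than by timestamp range.
    by_visit: Dict[int, Dict[str, List[int]]] = defaultdict(
        lambda: {"conditions": [], "procedures": [], "drugs": []}
    )
    for visit_id, feature, concept_id in events:
        by_visit[visit_id][feature].append(concept_id)

    samples = []
    # visit_ids is assumed to be in chronological order; the last visit is excluded.
    for visit_id in visit_ids[:-1]:
        codes = by_visit.get(visit_id)
        # Exclude visits that have no codes in any of the three lists.
        if codes is None or not any(codes.values()):
            continue
        samples.append(
            {"visit_id": visit_id, "patient_id": person_id, **codes, "mortality": label}
        )
    return samples


# Illustrative data only; the concept_ids are placeholders.
events = [(1, "conditions", 320128), (1, "drugs", 1503297), (3, "conditions", 201826)]
samples = omop_mortality_samples(
    person_id=42, visit_ids=[1, 2, 3], events=events, has_death_record=True
)
```

In this example, visit 2 is dropped because it has no codes and visit 3 is dropped because it is the last visit, so only visit 1 produces a sample.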
- input_schema: Dict[str, str] = {'conditions': 'sequence', 'drugs': 'sequence', 'procedures': 'sequence'}
- pre_filter(df)
- Return type:
LazyFrame