pyhealth.tasks.mortality_prediction

class pyhealth.tasks.mortality_prediction.MortalityPredictionMIMIC3(code_mapping=None)

Bases: BaseTask

Task for predicting mortality using the MIMIC-III dataset with text data.

This task predicts whether the patient will die during the next hospital visit, based on clinical information from the current visit.
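The next-visit framing can be sketched in plain Python, independent of PyHealth. This is only an illustration of the idea: the helper name `next_visit_mortality_labels` and the per-visit `died` flag are hypothetical, and the library's actual implementation may differ.

```python
from typing import Dict, List


def next_visit_mortality_labels(visits: List[Dict]) -> List[Dict]:
    """Pair each visit's features with the outcome of the following visit.

    `visits` must be in chronological order. `died` is a hypothetical
    per-visit flag (1 if the patient died during that visit).
    """
    samples = []
    # zip stops at the shorter sequence, so the final visit (which has
    # no successor) produces no sample.
    for current, following in zip(visits, visits[1:]):
        samples.append(
            {
                "visit_id": current["visit_id"],
                "conditions": current["conditions"],
                "mortality": int(following["died"]),
            }
        )
    return samples


# Illustrative data only; the codes are placeholders.
visits = [
    {"visit_id": "v1", "conditions": ["4019"], "died": 0},
    {"visit_id": "v2", "conditions": ["42731"], "died": 0},
    {"visit_id": "v3", "conditions": ["0389"], "died": 1},
]
labels = next_visit_mortality_labels(visits)
```

Three visits yield two samples here, because the label for each sample comes from the visit after it.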

Examples

>>> from pyhealth.datasets import MIMIC3Dataset
>>> from pyhealth.tasks import MortalityPredictionMIMIC3
>>> dataset = MIMIC3Dataset(
...     root="/path/to/mimic-iii/1.4",
...     tables=["diagnoses_icd", "procedures_icd", "prescriptions"],
... )
>>> task = MortalityPredictionMIMIC3()
>>> samples = dataset.set_task(task)
task_name: str = 'MortalityPredictionMIMIC3'
input_schema: Dict[str, str] = {'conditions': 'sequence', 'drugs': 'sequence', 'procedures': 'sequence'}
output_schema: Dict[str, str] = {'mortality': 'binary'}
pre_filter(df)
Return type:

LazyFrame
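As a rough illustration of what the schemas above imply for a single sample, the following sketch checks a plain dict against them. The helper `matches_schema` is hypothetical and not part of PyHealth, and the interpretation of `'sequence'` as a Python list and `'binary'` as 0 or 1 is an assumption made for this example.

```python
from typing import Any, Dict

# Copied from the class attributes documented above.
INPUT_SCHEMA = {"conditions": "sequence", "drugs": "sequence", "procedures": "sequence"}
OUTPUT_SCHEMA = {"mortality": "binary"}


def matches_schema(sample: Dict[str, Any]) -> bool:
    """Return True if every schema key is present with a plausible value."""
    for key, kind in {**INPUT_SCHEMA, **OUTPUT_SCHEMA}.items():
        if key not in sample:
            return False
        value = sample[key]
        # Assumption: a 'sequence' feature is a list of codes.
        if kind == "sequence" and not isinstance(value, list):
            return False
        # Assumption: a 'binary' label is the integer 0 or 1.
        if kind == "binary" and value not in (0, 1):
            return False
    return True


# Placeholder codes, for illustration only.
sample = {
    "conditions": ["4019", "25000"],
    "procedures": ["3893"],
    "drugs": ["A10B"],
    "mortality": 0,
}
incomplete = {"conditions": ["4019"], "mortality": 1}
```

Here `matches_schema(sample)` is true, while `matches_schema(incomplete)` is false because the `drugs` and `procedures` keys are missing.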

class pyhealth.tasks.mortality_prediction.MortalityPredictionMIMIC4(code_mapping=None)

Bases: BaseTask

Task for predicting mortality using MIMIC-IV EHR data only.

Examples

>>> from pyhealth.datasets import MIMIC4EHRDataset
>>> from pyhealth.tasks import MortalityPredictionMIMIC4
>>> dataset = MIMIC4EHRDataset(
...     root="/path/to/mimic-iv/2.2",
...     tables=["diagnoses_icd", "procedures_icd", "prescriptions"],
... )
>>> task = MortalityPredictionMIMIC4()
>>> samples = dataset.set_task(task)
task_name: str = 'MortalityPredictionMIMIC4'
input_schema: Dict[str, str] = {'conditions': 'sequence', 'drugs': 'sequence', 'procedures': 'sequence'}
output_schema: Dict[str, str] = {'mortality': 'binary'}
pre_filter(df)
Return type:

LazyFrame

class pyhealth.tasks.mortality_prediction.MortalityPredictionEICU(code_mapping=None)

Bases: BaseTask

Task for predicting mortality using the eICU dataset.

This task predicts whether the patient will die during the next hospital visit, based on clinical information from the current visit.

Features key-value pairs:
  • conditions: codes from the diagnosis table (ICD9CM and ICD10CM)

  • procedures: codes from the physicalexam table

  • drugs: codes from the medication table

Examples

>>> from pyhealth.datasets import eICUDataset
>>> from pyhealth.tasks import MortalityPredictionEICU
>>> dataset = eICUDataset(
...     root="/path/to/eicu-crd/2.0",
...     tables=["diagnosis", "medication", "physicalExam"],
... )
>>> task = MortalityPredictionEICU()
>>> samples = dataset.set_task(task)
task_name: str = 'MortalityPredictionEICU'
input_schema: Dict[str, str] = {'conditions': 'sequence', 'drugs': 'sequence', 'procedures': 'sequence'}
output_schema: Dict[str, str] = {'mortality': 'binary'}
pre_filter(df)
Return type:

LazyFrame

class pyhealth.tasks.mortality_prediction.MortalityPredictionEICU2(code_mapping=None)

Bases: BaseTask

Task for predicting mortality using the eICU dataset with alternative coding.

This task predicts whether the patient will die during the next hospital visit, based on clinical information from the current visit.

Similar to MortalityPredictionEICU, but with a different code mapping:
  • conditions: codes from the admissiondx table, plus diagnosisstring from the diagnosis table

  • procedures: codes from the treatment table
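The difference between the two eICU variants can be summarized as a mapping from feature key to source tables. The dictionaries below are transcribed from the two task descriptions and use the table names from the usage examples; the names `EICU_SOURCES`, `EICU2_SOURCES`, and `required_tables` are hypothetical and do not exist in PyHealth.

```python
from typing import Dict, List

# Feature key -> eICU tables it is drawn from (MortalityPredictionEICU).
EICU_SOURCES: Dict[str, List[str]] = {
    "conditions": ["diagnosis"],
    "procedures": ["physicalExam"],
    "drugs": ["medication"],
}

# Same layout for MortalityPredictionEICU2; note there is no "drugs" key.
EICU2_SOURCES: Dict[str, List[str]] = {
    "conditions": ["admissionDx", "diagnosis"],
    "procedures": ["treatment"],
}


def required_tables(sources: Dict[str, List[str]]) -> List[str]:
    """Return the sorted, de-duplicated list of tables a variant reads."""
    return sorted({table for tables in sources.values() for table in tables})


eicu_tables = required_tables(EICU_SOURCES)
eicu2_tables = required_tables(EICU2_SOURCES)
```

Read this way, the second variant needs no medication table, which is consistent with its input schema listing only conditions and procedures.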

Examples

>>> from pyhealth.datasets import eICUDataset
>>> from pyhealth.tasks import MortalityPredictionEICU2
>>> dataset = eICUDataset(
...     root="/path/to/eicu-crd/2.0",
...     tables=["diagnosis", "treatment", "admissionDx"],
... )
>>> task = MortalityPredictionEICU2()
>>> samples = dataset.set_task(task)
task_name: str = 'MortalityPredictionEICU2'
input_schema: Dict[str, str] = {'conditions': 'sequence', 'procedures': 'sequence'}
output_schema: Dict[str, str] = {'mortality': 'binary'}
pre_filter(df)
Return type:

LazyFrame

class pyhealth.tasks.mortality_prediction.MortalityPredictionOMOP(code_mapping=None)

Bases: BaseTask

Task for predicting mortality using an OMOP CDM dataset.

This task predicts whether a patient has a death record (binary mortality prediction) based on clinical information from each visit. Unlike visit-specific mortality tasks, this provides a patient-level mortality indicator (whether the patient died at any point).

The task processes visits sequentially and extracts clinical codes (conditions, procedures, drugs) for each visit. Clinical events are linked to visits via visit_occurrence_id, following OMOP CDM conventions.
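Linking events to visits by `visit_occurrence_id` amounts to a group-by on that key. The sketch below shows the idea on plain dicts. This row representation is an assumption for illustration, since PyHealth's own event objects are not described on this page, and the helper `group_codes_by_visit` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_codes_by_visit(events: List[Dict], code_field: str) -> Dict[int, List[int]]:
    """Collect the values of `code_field` under each visit_occurrence_id."""
    grouped = defaultdict(list)
    for event in events:
        visit_id = event.get("visit_occurrence_id")
        if visit_id is None:
            # Event is not attached to any visit; skip it in this sketch.
            continue
        grouped[visit_id].append(event[code_field])
    return dict(grouped)


# Illustrative condition_occurrence rows; the concept ids are placeholders.
condition_rows = [
    {"visit_occurrence_id": 10, "condition_concept_id": 320128},
    {"visit_occurrence_id": 11, "condition_concept_id": 201826},
    {"visit_occurrence_id": 10, "condition_concept_id": 4329847},
    {"visit_occurrence_id": None, "condition_concept_id": 999},
]
conditions_by_visit = group_codes_by_visit(condition_rows, "condition_concept_id")
```

The same helper could be applied to procedure and drug rows by passing `procedure_concept_id` or `drug_concept_id` as `code_field`.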

Features:
  • Uses OMOP CDM standard tables (condition_occurrence, procedure_occurrence, drug_exposure)

  • Links clinical events to visits via visit_occurrence_id

  • Uses OMOP concept_ids as medical codes

  • Binary mortality label (1 if patient has death record, 0 otherwise)

Task Schema:
Input:
  • conditions: sequence of condition_concept_id codes

  • procedures: sequence of procedure_concept_id codes

  • drugs: sequence of drug_concept_id codes

Output:
  • mortality: binary label (0: no death record, 1: death record)

Parameters:

patient (Patient) – A Patient object containing OMOP CDM data.

Returns:

A list of samples, where each sample contains:
  • visit_id: The visit_occurrence_id

  • patient_id: The person_id

  • conditions: List of condition_concept_id codes

  • procedures: List of procedure_concept_id codes

  • drugs: List of drug_concept_id codes

  • mortality: Binary label (0 or 1)

Return type:

List[Dict[str, Any]]
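To make the return structure concrete, here is a minimal sketch that assembles samples with the keys listed above and applies one patient-level label to every visit. The function `make_samples` and its arguments are hypothetical, and the input visit dicts are an assumed representation rather than PyHealth's actual data model.

```python
from typing import Any, Dict, List


def make_samples(
    person_id: int, visits: List[Dict[str, Any]], has_death_record: bool
) -> List[Dict[str, Any]]:
    """Build one sample per visit, all sharing the patient-level label."""
    label = 1 if has_death_record else 0
    return [
        {
            "visit_id": visit["visit_occurrence_id"],
            "patient_id": person_id,
            "conditions": visit["conditions"],
            "procedures": visit["procedures"],
            "drugs": visit["drugs"],
            "mortality": label,
        }
        for visit in visits
    ]


# Placeholder concept ids, for illustration only.
visits = [
    {"visit_occurrence_id": 10, "conditions": [320128], "procedures": [], "drugs": [1503297]},
    {"visit_occurrence_id": 11, "conditions": [201826], "procedures": [4301351], "drugs": []},
]
samples = make_samples(person_id=7, visits=visits, has_death_record=True)
```

Because the label is patient-level, both samples carry `mortality == 1` even though they come from different visits.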

Examples

>>> from pyhealth.datasets import OMOPDataset
>>> from pyhealth.tasks import MortalityPredictionOMOP
>>>
>>> # Load OMOP dataset
>>> dataset = OMOPDataset(
...     root="/path/to/omop/data",
...     tables=["condition_occurrence", "procedure_occurrence",
...             "drug_exposure"],
... )
>>>
>>> # Create mortality prediction task
>>> task = MortalityPredictionOMOP()
>>> sample_dataset = dataset.set_task(task=task)
>>>
>>> # Access samples
>>> print(f"Generated {len(sample_dataset)} samples")
>>> sample = sample_dataset.samples[0]
>>> print(f"Conditions: {sample['conditions']}")
>>> print(f"Mortality: {sample['mortality']}")

Note

  • Visits without any clinical codes (conditions, procedures, or drugs) are excluded

  • The last visit is excluded as there is no “next visit” to predict for

  • Clinical events are filtered by visit_occurrence_id, not by timestamp ranges, following OMOP best practices
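The first two exclusion rules in the note can be sketched as a small filter over a patient's chronologically ordered visits. The helper `eligible_visits` is hypothetical, and the order in which the two rules are applied (drop the last visit first, then drop empty visits) is an assumption; the library may apply them differently.

```python
from typing import Any, Dict, List


def eligible_visits(visits: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Apply the two exclusion rules to chronologically ordered visits."""
    # Rule: the last visit is excluded (assumed to be applied first).
    candidates = visits[:-1]
    # Rule: visits without any conditions, procedures, or drugs are excluded.
    return [
        visit
        for visit in candidates
        if visit["conditions"] or visit["procedures"] or visit["drugs"]
    ]


# Placeholder concept ids, for illustration only.
visits = [
    {"visit_occurrence_id": 1, "conditions": [320128], "procedures": [], "drugs": []},
    {"visit_occurrence_id": 2, "conditions": [], "procedures": [], "drugs": []},
    {"visit_occurrence_id": 3, "conditions": [], "procedures": [], "drugs": [1503297]},
    {"visit_occurrence_id": 4, "conditions": [201826], "procedures": [], "drugs": []},
]
kept_ids = [v["visit_occurrence_id"] for v in eligible_visits(visits)]
```

In this example visit 2 is dropped for having no codes and visit 4 is dropped as the last visit, leaving visits 1 and 3.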

task_name: str = 'MortalityPredictionOMOP'
input_schema: Dict[str, str] = {'conditions': 'sequence', 'drugs': 'sequence', 'procedures': 'sequence'}
output_schema: Dict[str, str] = {'mortality': 'binary'}
pre_filter(df)
Return type:

LazyFrame