pyhealth.tasks.MutationPathogenicityPrediction

class pyhealth.tasks.MutationPathogenicityPrediction(code_mapping=None)

Bases: BaseTask

Task for predicting mutation pathogenicity using COSMIC data.

Each somatic mutation is labeled as pathogenic or neutral according to its FATHMM prediction in the COSMIC database. The model inputs are mutation features from COSMIC: gene name, mutation description, and primary site.

task_name

The name of the task.

Type: str

input_schema

The input schema specifying required inputs.

Type: Dict[str, str]

output_schema

The output schema specifying outputs.

Type: Dict[str, str]

VALID_FATHMM_PREDICTIONS

Valid FATHMM prediction values.

Type: tuple

Note

Only mutations with valid FATHMM predictions (PATHOGENIC or NEUTRAL) are included in the output samples.
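
The filtering rule in this note can be illustrated with a minimal sketch in plain Python. This is not the PyHealth implementation: the record layout, the `keep_valid` helper, and the label encoding (PATHOGENIC as 1, NEUTRAL as 0) are assumptions made for illustration. Only the two valid values come from `VALID_FATHMM_PREDICTIONS`.

```python
# Valid values, as listed in VALID_FATHMM_PREDICTIONS.
VALID_FATHMM_PREDICTIONS = ("PATHOGENIC", "NEUTRAL")

# Assumed encoding of the binary label; the documentation does not state it.
LABELS = {"PATHOGENIC": 1, "NEUTRAL": 0}


def keep_valid(records):
    """Hypothetical helper: keep only records with a valid FATHMM prediction."""
    kept = []
    for rec in records:
        pred = rec.get("fathmm_prediction")
        if pred in VALID_FATHMM_PREDICTIONS:
            # Copy the record and replace the string label with its binary code.
            kept.append({**rec, "fathmm_prediction": LABELS[pred]})
    return kept


# Made-up records for illustration only; they are not COSMIC data.
records = [
    {"gene_name": "GENE_A", "fathmm_prediction": "PATHOGENIC"},
    {"gene_name": "GENE_B", "fathmm_prediction": "NEUTRAL"},
    {"gene_name": "GENE_C", "fathmm_prediction": ""},
    {"gene_name": "GENE_D", "fathmm_prediction": None},
]
samples = keep_valid(records)
```

In this sketch, the records for `GENE_C` and `GENE_D` are dropped because their prediction is empty or missing, so `samples` holds only the first two records.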

Examples

>>> from pyhealth.datasets import COSMICDataset
>>> from pyhealth.tasks import MutationPathogenicityPrediction
>>> dataset = COSMICDataset(root="/path/to/cosmic")
>>> task = MutationPathogenicityPrediction()
>>> samples = dataset.set_task(task)
task_name: str = 'MutationPathogenicityPrediction'
input_schema: Dict[str, str] = {'gene_name': 'text', 'mutation_description': 'text', 'primary_site': 'text'}
output_schema: Dict[str, str] = {'fathmm_prediction': 'binary'}
VALID_FATHMM_PREDICTIONS: tuple = ('PATHOGENIC', 'NEUTRAL')
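
As a rough illustration of how the two schemas describe a sample, the sketch below checks that a dictionary carries every key named in `input_schema` and `output_schema`. The `missing_keys` helper and the sample values are hypothetical, and the 0/1 label is an assumed encoding of the binary output. PyHealth may structure its samples differently.

```python
# Schemas copied from the class attributes documented above.
INPUT_SCHEMA = {
    "gene_name": "text",
    "mutation_description": "text",
    "primary_site": "text",
}
OUTPUT_SCHEMA = {"fathmm_prediction": "binary"}


def missing_keys(sample):
    """Hypothetical helper: list schema keys that a sample does not provide."""
    expected = set(INPUT_SCHEMA) | set(OUTPUT_SCHEMA)
    return sorted(expected - set(sample))


# Placeholder values for illustration only; they are not COSMIC data.
sample = {
    "gene_name": "GENE_A",
    "mutation_description": "placeholder description",
    "primary_site": "placeholder site",
    "fathmm_prediction": 1,  # assumed encoding: 1 = PATHOGENIC
}
complete = missing_keys(sample)
incomplete = missing_keys({"gene_name": "GENE_A"})
```

Here `complete` is an empty list, while `incomplete` names the three keys the second dictionary lacks.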
pre_filter(df)

Return type: LazyFrame