pyhealth.tasks.VariantClassificationClinVar
- class pyhealth.tasks.VariantClassificationClinVar(code_mapping=None)
Bases: BaseTask
Task for classifying variant clinical significance using ClinVar data.
This task predicts the clinical significance of genetic variants (e.g., Pathogenic, Benign, Uncertain significance) based on variant features from the ClinVar database.
- CLINICAL_SIGNIFICANCE_CATEGORIES
Mapping of lowercase raw clinical significance values to standardized clinical significance labels.
Note
Variants with conflicting interpretations or non-standard clinical significance values are excluded from the output samples.
Examples
>>> from pyhealth.datasets import ClinVarDataset
>>> from pyhealth.tasks import VariantClassificationClinVar
>>> dataset = ClinVarDataset(root="/path/to/clinvar")
>>> task = VariantClassificationClinVar()
>>> samples = dataset.set_task(task)
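The label normalization and exclusion rule described in the Note can be illustrated with a minimal standalone sketch. It copies the documented CLINICAL_SIGNIFICANCE_CATEGORIES mapping and applies it through a hypothetical helper, `normalize_significance`, which is not part of PyHealth. Case-insensitive matching after trimming whitespace is an assumption, and the task's actual implementation may differ.

```python
from typing import Dict, List, Optional

# Copied from the CLINICAL_SIGNIFICANCE_CATEGORIES attribute documented on this page.
CLINICAL_SIGNIFICANCE_CATEGORIES: Dict[str, str] = {
    "benign": "Benign",
    "likely benign": "Likely benign",
    "likely pathogenic": "Likely pathogenic",
    "pathogenic": "Pathogenic",
    "uncertain significance": "Uncertain significance",
    "vus": "Uncertain significance",
}


def normalize_significance(raw: Optional[str]) -> Optional[str]:
    """Hypothetical helper: return the standardized label, or None to exclude."""
    if raw is None:
        return None
    # Assumption: raw values are trimmed and matched case-insensitively.
    return CLINICAL_SIGNIFICANCE_CATEGORIES.get(raw.strip().lower())


raw_values: List[str] = [
    "Pathogenic",
    "VUS",
    "Conflicting interpretations of pathogenicity",  # not in the mapping
    "drug response",  # non-standard value, not in the mapping
]

# Keep only the values that map to a standardized label.
labels = [normalize_significance(value) for value in raw_values]
kept = [label for label in labels if label is not None]
print(kept)  # ['Pathogenic', 'Uncertain significance']
```

Returning None for unmapped values keeps the exclusion decision in one place: any value absent from the mapping, including conflicting interpretations, is dropped rather than assigned a default label.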
- input_schema: Dict[str, str] = {'chromosome': 'text', 'gene_symbol': 'text', 'variant_type': 'text'}
- CLINICAL_SIGNIFICANCE_CATEGORIES: Dict[str, str] = {'benign': 'Benign', 'likely benign': 'Likely benign', 'likely pathogenic': 'Likely pathogenic', 'pathogenic': 'Pathogenic', 'uncertain significance': 'Uncertain significance', 'vus': 'Uncertain significance'}
- pre_filter(df)
- Return type: LazyFrame