pyhealth.datasets.TUABDataset
Dataset is available at https://isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml
The TUAB dataset (or Temple University Hospital EEG Abnormal Corpus) is a collection of EEG data acquired at the Temple University Hospital.
The dataset contains both normal and abnormal EEG readings.
- class pyhealth.datasets.TUABDataset(root, dataset_name=None, dev=False, refresh_cache=False, **kwargs)
Bases: BaseSignalDataset
Base EEG dataset for the TUH Abnormal EEG Corpus.
Files are named in the form aaaaamye_s001_t000.edf. This includes the subject identifier (“aaaaamye”), the session number (“s001”) and a token number (“t000”). EEGs are split into a series of files starting with *t000.edf, *t001.edf, …
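The naming convention above can be illustrated with a small parser. This is a minimal sketch, not part of PyHealth: the helper `parse_tuab_filename` and the `TuabFileName` tuple are hypothetical names, and the pattern assumes that subject identifiers contain only lowercase letters and digits and that session and token fields are `s` or `t` followed by digits, as in the example file name.

```python
import os
import re
from typing import NamedTuple


class TuabFileName(NamedTuple):
    """Components of a TUAB EDF file name (hypothetical helper type)."""
    subject_id: str
    session: str
    token: str


# Assumed pattern: <subject>_s<digits>_t<digits>.edf
_TUAB_NAME = re.compile(
    r"^(?P<subject>[a-z0-9]+)_(?P<session>s\d+)_(?P<token>t\d+)\.edf$"
)


def parse_tuab_filename(path: str) -> TuabFileName:
    """Split a file name such as 'aaaaamye_s001_t000.edf' into its parts."""
    name = os.path.basename(path)
    match = _TUAB_NAME.match(name)
    if match is None:
        raise ValueError(f"not a TUAB-style file name: {name!r}")
    return TuabFileName(match["subject"], match["session"], match["token"])


parts = parse_tuab_filename("aaaaamye_s001_t000.edf")
print(parts.subject_id, parts.session, parts.token)
```

Grouping files by `subject_id` and `session` and sorting by `token` would recover the order in which a recording was split, under the same assumptions.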
- Parameters:
  - root (str) – root directory of the raw data. You can choose to use the path to the Cassette portion or the Telemetry portion.
  - dev (bool) – whether to enable dev mode (only use a small subset of the data). Default is False.
  - refresh_cache (bool) – whether to refresh the cache; if true, the dataset will be processed from scratch and the cache will be updated. Default is False.
- task
Optional[str], name of the task (e.g., “EEG_abnormal”). Default is None.
- samples
Optional[List[Dict]], a list of samples; each sample is a dict with patient_id, record_id, and other task-specific attributes as keys. Default is None.
- patient_to_index
Optional[Dict[str, List[int]]], a dict mapping each patient_id to a list of sample indices. Default is None.
- visit_to_index
Optional[Dict[str, List[int]]], a dict mapping each visit_id to a list of sample indices. Default is None.
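To make the shape of these attributes concrete, the sketch below builds a patient_to_index-style mapping from a list of sample dicts. It is an illustration only and not PyHealth's implementation: the helper `build_index` and the sample values are hypothetical, and only the `patient_id` and `record_id` keys come from the description above.

```python
from collections import defaultdict
from typing import Dict, List


def build_index(samples: List[dict], key: str) -> Dict[str, List[int]]:
    """Map each distinct value of `key` to the positions of the samples holding it."""
    index: Dict[str, List[int]] = defaultdict(list)
    for position, sample in enumerate(samples):
        index[sample[key]].append(position)
    return dict(index)


# Hypothetical samples; real samples also carry task-specific attributes.
samples = [
    {"patient_id": "p0", "record_id": "r0"},
    {"patient_id": "p0", "record_id": "r1"},
    {"patient_id": "p1", "record_id": "r0"},
]

patient_to_index = build_index(samples, "patient_id")
print(patient_to_index)
```

A mapping of this shape makes it possible to look up all samples for one patient without scanning the full list, which is useful for patient-level train/test splits.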
Examples
>>> from pyhealth.datasets import TUABDataset
>>> dataset = TUABDataset(
...     root="/srv/local/data/TUH/tuh_eeg_abnormal/v3.0.0/edf/",
... )
>>> dataset.stat()
>>> dataset.info()