pyhealth.models.Deepr
This page documents the standalone, callable DeeprLayer and the complete Deepr model.
- class pyhealth.models.DeeprLayer(feature_size=100, window=1, hidden_size=3)
Bases: Module
Deepr layer.
Paper: P. Nguyen, T. Tran, N. Wickramasinghe and S. Venkatesh, “Deepr: A Convolutional Net for Medical Records,” in IEEE Journal of Biomedical and Health Informatics, vol. 21, no. 1, pp. 22-30, Jan. 2017, doi: 10.1109/JBHI.2016.2633963.
This layer is used in the Deepr model.
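- Parameters:
feature_size (int) – size of the last dimension of the input (the input size). Default is 100.
window (int) – size of the sliding window used by the layer; it does not affect the output shape. Default is 1.
hidden_size (int) – size of the summarized output vector. Default is 3.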
Examples
>>> import torch
>>> from pyhealth.models import DeeprLayer
>>> input = torch.randn(3, 128, 5)  # [batch size, sequence len, input_size]
>>> layer = DeeprLayer(5, window=4, hidden_size=7)  # window does not impact the output shape
>>> outputs = layer(input)
>>> outputs.shape
torch.Size([3, 7])
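The example above notes that window does not change the output shape. The pure-Python sketch below illustrates why, under stated assumptions: it assumes a zero-padded 1D convolution with kernel width 2 * window + 1, followed by ReLU and max-pooling over time, which is the general pattern of a convolutional summarizer. The function name conv_maxpool_summary and the kernel-width choice are hypothetical and are not taken from PyHealth's implementation.

```python
import random


def conv_maxpool_summary(seq, weights, bias, window):
    """Summarize a [seq_len, input_size] sequence into a [hidden_size] vector.

    weights has shape [hidden_size, 2 * window + 1, input_size] (assumed layout).
    """
    seq_len = len(seq)
    input_size = len(seq[0])
    zero = [0.0] * input_size
    # Zero-pad `window` steps on both sides so every position has full context.
    padded = [zero] * window + seq + [zero] * window
    summary = []
    for kernel, b in zip(weights, bias):
        best = float("-inf")
        for t in range(seq_len):
            # Weighted sum over the 2 * window + 1 steps centred on position t.
            act = b
            for k in range(2 * window + 1):
                act += sum(w * v for w, v in zip(kernel[k], padded[t + k]))
            act = max(act, 0.0)    # ReLU
            best = max(best, act)  # max-pool over the time dimension
        summary.append(best)
    return summary


random.seed(0)
input_size, hidden_size, seq_len = 5, 7, 12
seq = [[random.gauss(0, 1) for _ in range(input_size)] for _ in range(seq_len)]

for window in (1, 4):
    weights = [
        [[random.gauss(0, 1) for _ in range(input_size)]
         for _ in range(2 * window + 1)]
        for _ in range(hidden_size)
    ]
    bias = [0.0] * hidden_size
    out = conv_maxpool_summary(seq, weights, bias, window)
    print(window, len(out))  # the length is hidden_size for both windows
```

Because the pooling step collapses the whole time axis into one value per filter, the summary length depends only on hidden_size, not on window or on the sequence length.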
- forward(x, mask=None)
Forward propagation.
- Parameters:
x (Tensor) – a Tensor of shape [batch size, sequence len, input size].
mask (Optional[Tensor]) – an optional tensor of shape [batch size, sequence len], where 1 indicates valid and 0 indicates invalid.
- Returns:
c, a Tensor of shape [batch size, hidden_size] representing the summarized vector.
- Return type:
Tensor
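The mask argument marks which positions of a padded batch are real. As a minimal sketch of how such a mask could be produced from variable-length sequences, the helper below pads each sequence to a common length and emits 1 for valid positions and 0 for padding. The helper name pad_and_mask and the sample data are hypothetical; they are not part of PyHealth.

```python
def pad_and_mask(sequences, pad_value=0):
    """Pad variable-length sequences and build a matching validity mask."""
    max_len = max(len(s) for s in sequences)
    padded, mask = [], []
    for s in sequences:
        n_pad = max_len - len(s)
        padded.append(list(s) + [pad_value] * n_pad)
        # 1 marks a valid position, 0 marks padding.
        mask.append([1] * len(s) + [0] * n_pad)
    return padded, mask


visits = [[3, 7, 2], [5], [4, 1]]  # three sequences of different lengths
padded, mask = pad_and_mask(visits)
print(padded)  # [[3, 7, 2], [5, 0, 0], [4, 1, 0]]
print(mask)    # [[1, 1, 1], [1, 0, 0], [1, 1, 0]]
```

The resulting nested list has shape [batch size, sequence len]; converting it with torch.tensor(mask) would give a tensor in the layout that forward expects for mask.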
- class pyhealth.models.Deepr(dataset, feature_keys, label_key, mode, embedding_dim=128, hidden_dim=128, **kwargs)
Bases: BaseModel
Deepr model.
Paper: P. Nguyen, T. Tran, N. Wickramasinghe and S. Venkatesh, “Deepr: A Convolutional Net for Medical Records,” in IEEE Journal of Biomedical and Health Informatics, vol. 21, no. 1, pp. 22-30, Jan. 2017, doi: 10.1109/JBHI.2016.2633963.
Note
We use separate Deepr layers for different feature_keys.
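To illustrate the note above, the sketch below keeps one encoder per feature key and joins the per-feature summaries into a single patient representation. It is only a schematic: the mean-pooling encoders stand in for real Deepr layers, and the concatenation step, the names make_mean_encoder and encode_patient, and the sample values are all assumptions made for illustration.

```python
def make_mean_encoder(scale):
    """Stand-in for one per-feature layer: scaled mean over the sequence."""
    def encode(seq):
        n = len(seq)
        return [scale * sum(step[i] for step in seq) / n
                for i in range(len(seq[0]))]
    return encode


feature_keys = ["conditions", "procedures"]
# One independent encoder per feature key (no parameters are shared).
encoders = {key: make_mean_encoder(scale=i + 1)
            for i, key in enumerate(feature_keys)}


def encode_patient(sample):
    """Encode each feature with its own encoder and concatenate the results."""
    patient_emb = []
    for key in feature_keys:
        patient_emb.extend(encoders[key](sample[key]))
    return patient_emb


sample = {
    "conditions": [[1.0, 3.0], [3.0, 5.0]],
    "procedures": [[2.0, 4.0]],
}
print(encode_patient(sample))  # [2.0, 4.0, 4.0, 8.0]
```

Keeping a separate layer per key lets each feature type learn its own filters, at the cost of more parameters as feature_keys grows.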
- Parameters:
dataset (BaseEHRDataset) – the dataset to train the model. It is used to query certain information such as the set of all tokens.
feature_keys (List[str]) – list of keys in samples to use as features, e.g. ["conditions", "procedures"].
label_key (str) – key in samples to use as label (e.g., "drugs").
mode (str) – one of "binary", "multiclass", or "multilabel".
embedding_dim (int) – the embedding dimension. Default is 128.
hidden_dim (int) – the hidden dimension. Default is 128.
**kwargs – other parameters for the Deepr layer.
Examples
>>> from pyhealth.datasets import SampleEHRDataset
>>> samples = [
...     {
...         "patient_id": "patient-0",
...         "visit_id": "visit-0",
...         "list_codes": ["505800458", "50580045810", "50580045811"],  # NDC
...         "list_vectors": [[1.0, 2.55, 3.4], [4.1, 5.5, 6.0]],
...         "list_list_codes": [["A05B", "A05C", "A06A"], ["A11D", "A11E"]],  # ATC-4
...         "list_list_vectors": [
...             [[1.8, 2.25, 3.41], [4.50, 5.9, 6.0]],
...             [[7.7, 8.5, 9.4]],
...         ],
...         "label": 1,
...     },
...     {
...         "patient_id": "patient-0",
...         "visit_id": "visit-1",
...         "list_codes": [
...             "55154191800",
...             "551541928",
...             "55154192800",
...             "705182798",
...             "70518279800",
...         ],
...         "list_vectors": [[1.4, 3.2, 3.5], [4.1, 5.9, 1.7], [4.5, 5.9, 1.7]],
...         "list_list_codes": [["A04A", "B035", "C129"]],
...         "list_list_vectors": [
...             [[1.0, 2.8, 3.3], [4.9, 5.0, 6.6], [7.7, 8.4, 1.3], [7.7, 8.4, 1.3]],
...         ],
...         "label": 0,
...     },
... ]
>>> dataset = SampleEHRDataset(samples=samples, dataset_name="test")
>>>
>>> from pyhealth.models import Deepr
>>> model = Deepr(
...     dataset=dataset,
...     feature_keys=[
...         "list_list_codes",
...         "list_list_vectors",
...     ],
...     label_key="label",
...     mode="binary",
... )
>>>
>>> from pyhealth.datasets import get_dataloader
>>> train_loader = get_dataloader(dataset, batch_size=2, shuffle=True)
>>> data_batch = next(iter(train_loader))
>>>
>>> ret = model(**data_batch)
>>> print(ret)
{
    'loss': tensor(0.8908, device='cuda:0', grad_fn=<BinaryCrossEntropyWithLogitsBackward0>),
    'y_prob': tensor([[0.2295],
                      [0.2665]], device='cuda:0', grad_fn=<SigmoidBackward0>),
    'y_true': tensor([[1.],
                      [0.]], device='cuda:0'),
    'logit': tensor([[-1.2110],
                     [-1.0126]], device='cuda:0', grad_fn=<AddmmBackward0>)
}
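The grad_fn names in the printed output suggest how the returned fields relate in binary mode: y_prob looks like the sigmoid of logit, and loss looks like the mean binary cross-entropy between y_prob and y_true. The sketch below checks that reading with plain Python, using the logit and y_true values copied from the output above. It is a consistency check under that assumption, not a description of PyHealth's internals.

```python
import math

# Values copied from the printed example output.
logits = [-1.2110, -1.0126]
y_true = [1.0, 0.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Probability of the positive class for each sample.
y_prob = [sigmoid(z) for z in logits]

# Per-sample binary cross-entropy, averaged over the batch.
losses = [
    -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    for y, p in zip(y_true, y_prob)
]
loss = sum(losses) / len(losses)

print([round(p, 4) for p in y_prob])  # compare with 'y_prob' above
print(round(loss, 3))                 # compare with 'loss' above
```

Both samples receive a probability below 0.5, so with a 0.5 threshold the model would predict the negative class for each, which is wrong for the first sample (y_true of 1) and explains why its term dominates the loss.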