pyhealth.models.Deepr#

The separately callable DeeprLayer and the complete Deepr model.

class pyhealth.models.DeeprLayer(feature_size=100, window=1, hidden_size=3)[source]#

Bases: Module

Deepr layer.

Paper: P. Nguyen, T. Tran, N. Wickramasinghe and S. Venkatesh, “Deepr: A Convolutional Net for Medical Records,” in IEEE Journal of Biomedical and Health Informatics, vol. 21, no. 1, pp. 22-30, Jan. 2017, doi: 10.1109/JBHI.2016.2633963.

This layer is used in the Deepr model.

Parameters:
  • feature_size (int) – embedding dim of codes (m in the original paper).

  • window (int) – sliding window size (d in the original paper).

  • hidden_size (int) – number of conv filters (motif size, p, in the original paper).

Examples

>>> import torch
>>> from pyhealth.models import DeeprLayer
>>> input = torch.randn(3, 128, 5)  # [batch size, sequence len, input_size]
>>> layer = DeeprLayer(5, window=4, hidden_size=7) # window does not impact the output shape
>>> outputs = layer(input)
>>> outputs.shape
torch.Size([3, 7])

forward(x, mask=None)[source]#

Forward propagation.

Parameters:
  • x (Tensor) – a Tensor of shape [batch size, sequence len, input size].

  • mask (Optional[Tensor]) – an optional tensor of shape [batch size, sequence len], where 1 indicates valid and 0 indicates invalid.

Returns:

a Tensor of shape [batch size, hidden_size] representing the summarized vector.

Return type:

Tensor
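
For illustration, a doctest-style sketch of calling forward with a mask; the mask pattern below is made up purely to show how padded timesteps can be marked invalid:

>>> import torch
>>> from pyhealth.models import DeeprLayer
>>> layer = DeeprLayer(feature_size=5, window=1, hidden_size=7)
>>> x = torch.randn(3, 128, 5)   # [batch size, sequence len, input size]
>>> mask = torch.ones(3, 128)    # 1 = valid timestep, 0 = invalid
>>> mask[:, 100:] = 0            # pretend the trailing steps are padding
>>> layer(x, mask).shape
torch.Size([3, 7])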

training: bool#

class pyhealth.models.Deepr(dataset, feature_keys, label_key, mode, embedding_dim=128, hidden_dim=128, **kwargs)[source]#

Bases: BaseModel

Deepr model.

Paper: P. Nguyen, T. Tran, N. Wickramasinghe and S. Venkatesh, “Deepr: A Convolutional Net for Medical Records,” in IEEE Journal of Biomedical and Health Informatics, vol. 21, no. 1, pp. 22-30, Jan. 2017, doi: 10.1109/JBHI.2016.2633963.

Note

We use separate Deepr layers for different feature_keys.
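
A minimal sketch of what this per-feature-key layout amounts to; the variable names and hyperparameters below are illustrative assumptions, not the model's actual internals:

>>> import torch.nn as nn
>>> from pyhealth.models import DeeprLayer
>>> feature_keys = ["conditions", "procedures"]
>>> # one DeeprLayer per feature key, so each feature type learns its own conv motifs
>>> per_key_layers = nn.ModuleDict({
...     key: DeeprLayer(feature_size=128, hidden_size=128)
...     for key in feature_keys
... })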

Parameters:
  • dataset (BaseEHRDataset) – the dataset to train the model. It is used to query certain information such as the set of all tokens.

  • feature_keys (List[str]) – list of keys in samples to use as features, e.g. [“conditions”, “procedures”].

  • label_key (str) – key in samples to use as label (e.g., “drugs”).

  • mode (str) – one of “binary”, “multiclass”, or “multilabel”.

  • embedding_dim (int) – the embedding dimension. Default is 128.

  • hidden_dim (int) – the hidden dimension. Default is 128.

  • **kwargs – other parameters for the Deepr layer (see the sketch just below).
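
For example, extra keyword arguments such as window are assumed to be forwarded to the internal Deepr layers; the snippet below is a sketch only and reuses the dataset built in the Examples that follow:

>>> model = Deepr(
...     dataset=dataset,
...     feature_keys=["list_list_codes"],
...     label_key="label",
...     mode="binary",
...     window=2,  # assumed to be passed through **kwargs to each DeeprLayer
... )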

Examples

>>> from pyhealth.datasets import SampleEHRDataset
>>> samples = [
...         {
...             "patient_id": "patient-0",
...             "visit_id": "visit-0",
...             "list_codes": ["505800458", "50580045810", "50580045811"],  # NDC
...             "list_vectors": [[1.0, 2.55, 3.4], [4.1, 5.5, 6.0]],
...             "list_list_codes": [["A05B", "A05C", "A06A"], ["A11D", "A11E"]],  # ATC-4
...             "list_list_vectors": [
...                 [[1.8, 2.25, 3.41], [4.50, 5.9, 6.0]],
...                 [[7.7, 8.5, 9.4]],
...             ],
...             "label": 1,
...         },
...         {
...             "patient_id": "patient-0",
...             "visit_id": "visit-1",
...             "list_codes": [
...                 "55154191800",
...                 "551541928",
...                 "55154192800",
...                 "705182798",
...                 "70518279800",
...             ],
...             "list_vectors": [[1.4, 3.2, 3.5], [4.1, 5.9, 1.7], [4.5, 5.9, 1.7]],
...             "list_list_codes": [["A04A", "B035", "C129"]],
...             "list_list_vectors": [
...                 [[1.0, 2.8, 3.3], [4.9, 5.0, 6.6], [7.7, 8.4, 1.3], [7.7, 8.4, 1.3]],
...             ],
...             "label": 0,
...         },
...     ]
>>> dataset = SampleEHRDataset(samples=samples, dataset_name="test")
>>>
>>> from pyhealth.models import Deepr
>>> model = Deepr(
...         dataset=dataset,
...         feature_keys=[
...             "list_list_codes",
...             "list_list_vectors",
...         ],
...         label_key="label",
...         mode="binary",
...     )
>>>
>>> from pyhealth.datasets import get_dataloader
>>> train_loader = get_dataloader(dataset, batch_size=2, shuffle=True)
>>> data_batch = next(iter(train_loader))
>>>
>>> ret = model(**data_batch)
>>> print(ret)
{
    'loss': tensor(0.8908, device='cuda:0', grad_fn=<BinaryCrossEntropyWithLogitsBackward0>),
    'y_prob': tensor([[0.2295],
                [0.2665]], device='cuda:0', grad_fn=<SigmoidBackward0>),
    'y_true': tensor([[1.],
                [0.]], device='cuda:0'),
    'logit': tensor([[-1.2110],
                [-1.0126]], device='cuda:0', grad_fn=<AddmmBackward0>)
}
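
Beyond a single forward pass, the model plugs into PyHealth's standard training loop. A hedged sketch with pyhealth.trainer.Trainer, reusing the toy loader above for both training and validation purely for illustration:

>>> from pyhealth.trainer import Trainer
>>> trainer = Trainer(model=model)
>>> trainer.train(
...     train_dataloader=train_loader,
...     val_dataloader=train_loader,  # toy reuse for illustration only
...     epochs=1,
... )
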
forward(**kwargs)[source]#

Forward propagation.

Return type:

Dict[str, Tensor]

training: bool#