pyhealth.models.SafeDrug#

The separate callable SafeDrugLayer and the complete SafeDrug model.

class pyhealth.models.SafeDrugLayer(hidden_size, mask_H, ddi_adj, num_fingerprints, molecule_set, average_projection, kp=0.05, target_ddi=0.08)[source]#

Bases: Module

SafeDrug layer.

Paper: Chaoqi Yang et al. SafeDrug: Dual Molecular Graph Encoders for Recommending Effective and Safe Drug Combinations. IJCAI 2021.

This layer is used in the SafeDrug model, but it can also be used as a standalone layer. It is slightly modified from the original implementation to make it compatible with the package. The original code can be found at https://github.com/ycq091044/SafeDrug/blob/main/src/models.py.

Parameters:
  • hidden_size (int) – hidden feature size.

  • mask_H (Tensor) – the mask matrix H of shape [num_drugs, num_substructures].

  • ddi_adj (Tensor) – an adjacency tensor of shape [num_drugs, num_drugs].

  • num_fingerprints (int) – total number of different fingerprints.

  • molecule_set (List[Tuple]) – a list of length num_molecules, where each element is a molecule tuple (A, B, C):
      - A <torch.tensor>: fingerprints of atoms in the molecule
      - B <torch.tensor>: adjacency matrix of the molecule
      - C <int>: molecular_size

  • average_projection (Tensor) – a tensor of shape [num_drugs, num_molecules] representing the average projection for aggregating multiple molecules of the same drug into one vector.

  • kp (float) – correcting factor for the proportional signal. Default is 0.05.

  • target_ddi (float) – DDI acceptance rate. Default is 0.08.
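The interplay of kp and target_ddi can be illustrated with the controllable loss weight described in the SafeDrug paper: while the DDI rate of the current predictions stays at or below target_ddi, the prediction loss gets full weight; above it, the weight shrinks in proportion to the excess divided by kp. The following is a minimal sketch under that assumption. ddi_loss_weight is a hypothetical helper, not part of the pyhealth API, and the layer's internal computation may differ in detail.

```python
def ddi_loss_weight(
    current_ddi_rate: float, target_ddi: float = 0.08, kp: float = 0.05
) -> float:
    """Weight (beta) on the prediction loss; (1 - beta) would go to the DDI loss.

    Hypothetical helper following the SafeDrug paper, not a pyhealth function.
    """
    if current_ddi_rate <= target_ddi:
        # DDI rate is acceptable: optimize prediction accuracy only.
        return 1.0
    # DDI rate is too high: shift weight toward the DDI loss, clipped at 0.
    return max(0.0, 1.0 - (current_ddi_rate - target_ddi) / kp)


# Weights for a DDI rate below, slightly above, and far above the target.
weights = [ddi_loss_weight(rate) for rate in (0.05, 0.10, 0.20)]
print(weights)
```

Under this formulation, a smaller kp makes the weight fall off faster once the DDI rate exceeds target_ddi.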

pad(matrices, pad_value)[source]#

Pads the list of matrices.

Padding with a pad_value (e.g., 0) for batch processing. For example, given a list of matrices [A, B, C], we obtain a new matrix [A00, 0B0, 00C], where 0 is the zero (i.e., pad value) matrix.
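The block-diagonal layout described above can be sketched as follows. This is an illustration only: it uses plain nested lists in place of tensors to stay dependency-free, and pad_block_diagonal is a hypothetical name, not the method's actual implementation. It assumes every input matrix is non-empty.

```python
from typing import List

Matrix = List[List[float]]


def pad_block_diagonal(matrices: List[Matrix], pad_value: float) -> Matrix:
    """Place each matrix on the diagonal of one large matrix filled with pad_value."""
    total_rows = sum(len(m) for m in matrices)
    total_cols = sum(len(m[0]) for m in matrices)
    out = [[pad_value] * total_cols for _ in range(total_rows)]
    row, col = 0, 0
    for m in matrices:
        # Copy the current matrix into its diagonal block.
        for i, values in enumerate(m):
            for j, value in enumerate(values):
                out[row + i][col + j] = value
        # Move the offset to the start of the next diagonal block.
        row += len(m)
        col += len(m[0])
    return out


a = [[1.0, 1.0], [1.0, 1.0]]  # 2 x 2 matrix
b = [[2.0]]                   # 1 x 1 matrix
padded = pad_block_diagonal([a, b], 0.0)
for line in padded:
    print(line)
```

Because each molecule occupies its own diagonal block, entries of different molecules never share a row or column, so a batch of adjacency matrices can be processed as one matrix.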

calculate_loss(logits, y_prob, labels)[source]#
Return type:

Tensor

forward(patient_emb, drugs, mask=None)[source]#

Forward propagation.

Parameters:
  • patient_emb (tensor) – a tensor of shape [patient, visit, input_size].

  • drugs (tensor) – a multihot tensor of shape [patient, num_labels].

  • mask (Optional[tensor]) – an optional tensor of shape [patient, visit] where 1 indicates valid visits and 0 indicates invalid visits.

Returns:
  • loss – a scalar tensor representing the loss.

  • y_prob – a tensor of shape [patient, num_labels] representing the probability of each drug.

training: bool#
class pyhealth.models.SafeDrug(dataset, embedding_dim=128, hidden_dim=128, num_layers=1, dropout=0.5, **kwargs)[source]#

Bases: BaseModel

SafeDrug model.

Paper: Chaoqi Yang et al. SafeDrug: Dual Molecular Graph Encoders for Recommending Effective and Safe Drug Combinations. IJCAI 2021.

Note

This model supports only medication prediction, taking conditions and procedures as feature_keys and drugs as label_key. It operates only at the visit level.

Note

This model only accepts ATC level 3 as medication codes.

Parameters:
  • dataset (SampleEHRDataset) – the dataset to train the model. It is used to query certain information such as the set of all tokens.

  • embedding_dim (int) – the embedding dimension. Default is 128.

  • hidden_dim (int) – the hidden dimension. Default is 128.

  • num_layers (int) – the number of layers used in RNN. Default is 1.

  • dropout (float) – the dropout rate. Default is 0.5.

  • **kwargs – other parameters for the SafeDrug layer.

generate_ddi_adj()[source]#

Generates the DDI graph adjacency matrix.

Return type:

tensor
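As a rough illustration of what a DDI adjacency matrix looks like, the sketch below builds a symmetric 0/1 matrix from a drug vocabulary and a list of interacting pairs. It uses nested lists rather than a tensor, and build_ddi_adjacency, the vocabulary, and the pairs are all hypothetical; the drug pairs are placeholders and not real interaction data. How the method obtains its DDI pairs is not shown here.

```python
from typing import Dict, List, Tuple


def build_ddi_adjacency(
    drug_to_index: Dict[str, int], ddi_pairs: List[Tuple[str, str]]
) -> List[List[int]]:
    """Return a symmetric [num_drugs, num_drugs] matrix with 1 for interacting pairs."""
    n = len(drug_to_index)
    adj = [[0] * n for _ in range(n)]
    for drug_a, drug_b in ddi_pairs:
        # Skip pairs that involve a drug outside the vocabulary.
        if drug_a in drug_to_index and drug_b in drug_to_index:
            i, j = drug_to_index[drug_a], drug_to_index[drug_b]
            adj[i][j] = 1
            adj[j][i] = 1  # interactions are undirected
    return adj


# Placeholder ATC level 3 codes and placeholder pairs (not real DDI data).
vocab = {"A01A": 0, "B01A": 1, "C01B": 2}
pairs = [("A01A", "B01A"), ("B01A", "Z99Z")]
ddi_adj = build_ddi_adjacency(vocab, pairs)
for line in ddi_adj:
    print(line)
```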

generate_smiles_list()[source]#

Generates the list of SMILES strings.

Return type:

List[List[str]]

generate_mask_H()[source]#

Generates the molecular segmentation mask H.

Return type:

tensor

generate_molecule_info(radius=1)[source]#

Generates the molecule information.

forward(conditions, procedures, drugs, **kwargs)[source]#

Forward propagation.

Parameters:
  • conditions (List[List[List[str]]]) – a nested list in three levels [patient, visit, condition].

  • procedures (List[List[List[str]]]) – a nested list in three levels [patient, visit, procedure].

  • drugs (List[List[str]]) – a nested list in two levels [patient, drug].
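The nesting of these three arguments can be illustrated with a small hand-written batch. This is a sketch of the input structure only: all codes are illustrative placeholders, check_batch is a hypothetical helper, and the check that conditions and procedures cover the same visits is an assumption rather than documented behavior.

```python
from typing import List

# Batch of two patients; every code below is an illustrative placeholder.
conditions: List[List[List[str]]] = [
    [["4280", "4019"], ["5849"]],  # patient 0: two visits
    [["25000"]],                   # patient 1: one visit
]
procedures: List[List[List[str]]] = [
    [["3893"], ["9671", "3891"]],  # patient 0: two visits
    [["8856"]],                    # patient 1: one visit
]
drugs: List[List[str]] = [
    ["A01A", "B01A"],              # patient 0: ATC level 3 codes
    ["C01B"],                      # patient 1
]


def check_batch(conditions, procedures, drugs) -> bool:
    """Check that all inputs cover the same patients and, per patient, the same visits."""
    same_patients = len(conditions) == len(procedures) == len(drugs)
    same_visits = all(len(c) == len(p) for c, p in zip(conditions, procedures))
    return same_patients and same_visits


print(check_batch(conditions, procedures, drugs))
```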

Returns:

A dictionary with the following keys:
  • loss – a scalar tensor representing the loss.

  • y_prob – a tensor of shape [patient, visit, num_labels] representing the probability of each drug.

  • y_true – a tensor of shape [patient, visit, num_labels] representing the ground truth of each drug.

training: bool#