pyhealth.models.MoleRec
The separate callable MoleRecLayer and the complete MoleRec model.
- class pyhealth.models.MoleRecLayer(hidden_size, coef=2.5, target_ddi=0.08, GNN_layers=4, dropout=0.5, multiloss_weight=0.05, **kwargs)
Bases: Module
MoleRec layer.
Paper: Nianzu Yang et al. MoleRec: Combinatorial Drug Recommendation with Substructure-Aware Molecular Representation Learning. WWW 2023.
This layer is used in the MoleRec model, but it can also be used as a standalone layer.
- Parameters:
hidden_size (int) – hidden feature size.
coef (float) – coefficient of the DDI loss weight annealing. A larger coefficient means a higher penalty on drug-drug interactions. Default is 2.5.
target_ddi (float) – DDI acceptance rate. Default is 0.08.
GNN_layers (int) – the number of GNN layers encoding molecules and substructures. Default is 4.
dropout (float) – the dropout ratio of the model. Default is 0.5.
multiloss_weight (float) – the weight of multilabel_margin_loss for multilabel classification. The value should lie in [0, 1]. Default is 0.05.
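The interaction between coef and target_ddi can be illustrated with a minimal sketch. It assumes an annealing schedule of the kind used in related drug recommendation models: the DDI loss only receives weight once the current DDI rate exceeds target_ddi, and that weight grows with coef. The function name combine_losses and the exact formula are assumptions for illustration; this page does not specify the schedule.

```python
import math


def combine_losses(loss_cls, loss_ddi, cur_ddi_rate, coef=2.5, target_ddi=0.08):
    """Blend a classification loss with a DDI loss (assumed schedule)."""
    # At or below the accepted DDI rate, only the classification loss is used.
    if cur_ddi_rate <= target_ddi:
        return loss_cls
    # Above the target, beta shrinks as the DDI rate or coef grows,
    # which moves weight from the classification loss to the DDI loss.
    beta = min(math.exp(coef * (1.0 - cur_ddi_rate / target_ddi)), 1.0)
    return beta * loss_cls + (1.0 - beta) * loss_ddi
```

Under this assumed schedule, doubling coef at a fixed DDI rate above the target shifts the combined loss further toward the DDI term, which matches the description of coef above.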
- forward(patient_emb, drugs, average_projection, ddi_adj, substructure_mask, substructure_graph, molecule_graph, mask=None, drug_indexes=None)
Forward propagation.
- Parameters:
patient_emb (Tensor) – a tensor of shape [patient, visit, num_substructures], representing the relation between each patient visit and each substructure.
drugs (Tensor) – a multihot tensor of shape [patient, num_labels].
mask (Optional[Tensor]) – an optional tensor of shape [patient, visit], where 1 indicates valid visits and 0 indicates invalid visits.
substructure_mask (Tensor) – a tensor of shape [num_drugs, num_substructures], representing whether a substructure shows up in one of the molecules of each drug.
average_projection (Tensor) – a tensor of shape [num_drugs, num_molecules], representing the average projection for aggregating multiple molecules of the same drug into one vector.
substructure_graph (Union[StaticParaDict, Dict[str, Union[int, Tensor]]]) – a dictionary representing a graph batch of all substructures, where each graph is extracted via the ‘smiles2graph’ API of the ogb library.
molecule_graph (Union[StaticParaDict, Dict[str, Union[int, Tensor]]]) – a dictionary of the same form as substructure_graph, representing the graph batch of all molecules.
ddi_adj (Tensor) – an adjacency tensor for drug-drug interactions of shape [num_drugs, num_drugs].
drug_indexes (Optional[Tensor]) – the index version of drugs (ground truth) of shape [patient, num_labels], padded with -1.
- Returns:
loss: a scalar tensor representing the loss.
y_prob: a tensor of shape [patient, num_labels] representing the probability of each drug.
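Two of the forward inputs have shapes that are easier to see in a small example. The sketch below builds an average_projection matrix and a -1-padded drug_indexes array with plain Python lists; in practice they would be wrapped as tensors. The helper names and the drug-to-molecule mapping are hypothetical, and whether PyHealth constructs these inputs in exactly this way is an assumption based only on the shapes described above.

```python
def build_average_projection(drug_to_molecules, num_molecules):
    """Return a [num_drugs, num_molecules] matrix that averages each drug's molecules."""
    projection = []
    for molecule_ids in drug_to_molecules:
        row = [0.0] * num_molecules
        for m in molecule_ids:
            # Each molecule of a drug gets an equal share of the weight.
            row[m] = 1.0 / len(molecule_ids)
        projection.append(row)
    return projection


def multihot_to_indexes(multihot):
    """Convert [patient, num_labels] multihot rows to index rows padded with -1."""
    indexes = []
    for row in multihot:
        active = [i for i, v in enumerate(row) if v == 1]
        # Pad to num_labels so every row keeps the same length.
        indexes.append(active + [-1] * (len(row) - len(active)))
    return indexes


# Hypothetical setup: drug 0 has molecules 0 and 1, drug 1 has molecule 2.
average_projection = build_average_projection([[0, 1], [2]], num_molecules=3)
drug_indexes = multihot_to_indexes([[0, 1, 1, 0], [1, 0, 0, 0]])
```

Multiplying average_projection by a [num_molecules, hidden] embedding matrix then yields one averaged vector per drug.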
- class pyhealth.models.MoleRec(dataset, embedding_dim=64, hidden_dim=64, num_rnn_layers=1, num_gnn_layers=4, dropout=0.5, **kwargs)
Bases: BaseModel
MoleRec model.
Paper: Nianzu Yang et al. MoleRec: Combinatorial Drug Recommendation with Substructure-Aware Molecular Representation Learning. WWW 2023.
Note
This model is only for medication prediction which takes conditions and procedures as feature_keys, and drugs as label_key. It only operates on the visit level.
Note
This model only accepts ATC level 3 as medication codes.
- Parameters:
dataset (SampleEHRDataset) – the dataset used to train the model. It is used to query certain information such as the set of all tokens.
embedding_dim (int) – the embedding dimension. Default is 64.
hidden_dim (int) – the hidden dimension. Default is 64.
num_rnn_layers (int) – the number of layers used in the RNN. Default is 1.
num_gnn_layers (int) – the number of layers used in the GNN. Default is 4.
dropout (float) – the dropout rate. Default is 0.5.
**kwargs – other parameters for the MoleRec layer.
- forward(conditions, procedures, drugs, **kwargs)
Forward propagation.
- Parameters:
- Returns:
A dictionary with the following keys:
loss: a scalar tensor representing the loss.
y_prob: a tensor of shape [patient, visit, num_labels] representing the probability of each drug.
y_true: a tensor of shape [patient, visit, num_labels] representing the ground truth of each drug.
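As a rough illustration of how the returned y_prob might be consumed, the sketch below turns one row of probabilities into a recommended drug set and measures how many recommended pairs interact. The 0.5 threshold, the helper names, and the pairwise definition of the DDI rate are assumptions for illustration and are not taken from this page.

```python
def select_drugs(y_prob_row, threshold=0.5):
    """Return the label indexes whose probability reaches the threshold."""
    return [i for i, p in enumerate(y_prob_row) if p >= threshold]


def ddi_rate(drug_ids, ddi_adj):
    """Fraction of unordered drug pairs in drug_ids flagged in ddi_adj."""
    pairs = [(a, b) for k, a in enumerate(drug_ids) for b in drug_ids[k + 1:]]
    if not pairs:
        return 0.0
    return sum(1 for a, b in pairs if ddi_adj[a][b]) / len(pairs)


# Hypothetical probabilities for one visit over four drug labels.
y_prob_row = [0.9, 0.2, 0.7, 0.6]
# Hypothetical symmetric adjacency: only drugs 0 and 2 interact.
ddi_adj = [
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
recommended = select_drugs(y_prob_row)
rate = ddi_rate(recommended, ddi_adj)
```

A rate computed this way can be compared against target_ddi to judge whether a recommendation stays within the accepted interaction level.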