pyhealth.processors.TupleTimeTextProcessor

Processor for tuple time-based text data with temporal information.

class pyhealth.processors.TupleTimeTextProcessor(type_tag='note', tokenizer_model=None, max_length=128, padding=True, truncation=True)

Bases: TemporalFeatureProcessor

Processes (text, time_diff) tuples for multimodal temporal fusion.

Converts paired text and temporal data into a format suitable for models that need to distinguish between different modality types automatically.

If tokenizer_model is provided, the text is tokenized with a HuggingFace AutoTokenizer, and process() returns token tensors (input_ids, attention_mask, and token_type_ids where the tokenizer supports them) in place of the raw strings.

process(value)

Process a tuple of texts and time differences.

Parameters:

value (Tuple[List[str], List[float]]) – Tuple containing:

List[str]: Text entries (clinical notes, observations, etc.)
List[float]: Time differences corresponding to each text entry

Returns:

If tokenizer_model is None, a tuple containing:

List[str]: Original text entries (unmodified)
torch.Tensor: 1D float tensor of time differences [shape: (N,)]
str: Type tag for modality routing

If tokenizer_model is provided, a tuple containing:

torch.Tensor: input_ids [shape: (N, max_length)]
torch.Tensor: attention_mask [shape: (N, max_length)]
torch.Tensor: token_type_ids [shape: (N, max_length)] (if supported by tokenizer)
torch.Tensor: 1D float tensor of time differences [shape: (N,)]
str: Type tag
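Because the result of process() has a different length depending on whether a tokenizer is configured, downstream code may want to name the fields explicitly. The following is a minimal sketch, assuming only the tuple layouts documented above; unpack_processed is a hypothetical helper, not part of pyhealth.

```python
from typing import Any, Dict, Sequence


def unpack_processed(output: Sequence[Any]) -> Dict[str, Any]:
    """Name the fields of a process() result based on its length.

    Hypothetical helper: relies only on the documented tuple layouts.
    """
    if len(output) == 3:
        # No tokenizer: (texts, time, type_tag)
        texts, time, type_tag = output
        return {"texts": texts, "time": time, "type_tag": type_tag}
    if len(output) == 4:
        # Tokenizer without token_type_ids
        input_ids, attention_mask, time, type_tag = output
        return {
            "input_ids": input_ids,
            "attention_mask": attention_mask,
            "time": time,
            "type_tag": type_tag,
        }
    if len(output) == 5:
        # Tokenizer that also emits token_type_ids
        input_ids, attention_mask, token_type_ids, time, type_tag = output
        return {
            "input_ids": input_ids,
            "attention_mask": attention_mask,
            "token_type_ids": token_type_ids,
            "time": time,
            "type_tag": type_tag,
        }
    raise ValueError(f"unexpected output length: {len(output)}")
```

The type tag is always the last element and the time tensor the second to last, so code that only needs those two can index from the end instead.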

size()

Return the size of the processor vocabulary (not applicable for this processor).

is_token()

Returns True if the processor outputs discrete tokens (when tokenizer is used).

Return type: bool

schema()

Returns the schema of the processed feature.

Return type: tuple[str, ...]

dim()

Number of dimensions for each output tensor.

Return type: tuple[int, ...]

modality()

Returns the TEXT modality, since this processor handles clinical text.

Return type: ModalityType

value_dim()

Tokenizer vocabulary size (used with transformer encoder). Returns 0 if no tokenizer is loaded.

Return type: int

process_temporal(value)

Return dict output for UnifiedMultimodalEmbeddingModel.

Requires tokenizer_model to be set (raw strings are not litdata-serialisable and cannot be embedded without tokenisation).

Returns: {"value": LongTensor (N, L), "mask": LongTensor (N, L), "time": FloatTensor (N,)}
Return type: dict
Raises: ValueError – If processor was created without a tokenizer.
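The dict returned by process_temporal has three entries whose leading dimensions must agree. A quick consistency check can catch shape mismatches before the batch reaches the model. This is a sketch under the assumption that each entry exposes a .shape attribute (as torch tensors do); check_temporal_output is a hypothetical helper, and the stand-in objects below replace real tensors so the snippet runs without torch.

```python
from types import SimpleNamespace
from typing import Any, Mapping


def check_temporal_output(out: Mapping[str, Any]) -> int:
    """Validate the documented keys and shapes; return N (number of entries).

    Hypothetical helper, not part of pyhealth.
    """
    missing = {"value", "mask", "time"} - set(out)
    if missing:
        raise KeyError(f"missing keys: {sorted(missing)}")
    n, length = tuple(out["value"].shape)          # (N, L)
    if tuple(out["mask"].shape) != (n, length):    # mask must match value
        raise ValueError("mask shape differs from value shape")
    if tuple(out["time"].shape) != (n,):           # one time diff per entry
        raise ValueError("time must have shape (N,)")
    return n


# Stand-ins with a .shape attribute, used here instead of real tensors.
example = {
    "value": SimpleNamespace(shape=(3, 128)),
    "mask": SimpleNamespace(shape=(3, 128)),
    "time": SimpleNamespace(shape=(3,)),
}
num_entries = check_temporal_output(example)
```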

fit(samples, field)

Fit the processor to the samples.

Parameters: samples (Iterable[Dict[str, Any]]) – Iterable of sample dictionaries.
Return type: None

load(path)

Optional: Load processor state from disk.

Parameters: path (str) – File path to load processor state from.
Return type: None

save(path)

Optional: Save processor state to disk.

Parameters: path (str) – File path to save processor state.
Return type: None

spatial()

Whether each dimension (axis) of the value tensor is spatial (i.e. corresponds to a spatial axis like time, height, width, etc.) or not. This is used to determine how to apply augmentations and other transformations that should only be applied to spatial dimensions.

E.g. for CNN or RNN features, this would help determine which dimensions to apply spatial augmentations to, and which dimensions to treat as channels or features.

Returns: Tuple of booleans corresponding to whether each axis of the value tensor is spatial or not.
Return type: tuple[bool, ...]

Overview

TupleTimeTextProcessor handles clinical text paired with temporal information (time differences), enabling automatic modality routing in multimodal fusion pipelines.

Input/Output

Input: Tuple[List[str], List[float]] (texts, time differences)
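Output (tokenizer_model is None): Tuple[List[str], torch.Tensor, str] (texts, 1D time tensor, type tag)
Output (tokenizer_model provided): token tensors (input_ids, attention_mask, and token_type_ids if supported), followed by the 1D time tensor and type tag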
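Since each time difference corresponds to one text entry, the two input lists are expected to have the same length. The sketch below builds the input tuple and checks that pairing up front. Whether the processor itself performs this validation is not stated here, so make_value is a hypothetical convenience helper rather than library behaviour.

```python
from typing import List, Sequence, Tuple


def make_value(
    texts: Sequence[str], time_diffs: Sequence[float]
) -> Tuple[List[str], List[float]]:
    """Build the (texts, time_diffs) tuple, enforcing one time diff per text.

    Hypothetical helper, not part of pyhealth.
    """
    if len(texts) != len(time_diffs):
        raise ValueError(
            f"got {len(texts)} texts but {len(time_diffs)} time differences"
        )
    # Normalise to plain lists of str and float
    return list(texts), [float(t) for t in time_diffs]


value = make_value(["Admitted.", "Improved."], [0, 24])
```

The resulting value can then be passed to processor.process(value).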

Use Case

The type_tag parameter enables automatic modality routing without hardcoding feature names in multimodal pipelines:

type_tag="note" routes to text encoder
type_tag="image" routes to vision encoder
type_tag="ehr" routes to EHR encoder
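One way to picture tag-based routing is a dispatch table keyed by type tag. This is an illustrative sketch only: the encoder functions and the route helper are hypothetical stand-ins, and the actual routing mechanism inside pyhealth models may differ.

```python
from typing import Any, Callable, Dict


# Stand-in encoders (hypothetical); real ones would be neural modules.
def text_encoder(payload: Any) -> str:
    return f"text:{payload}"


def vision_encoder(payload: Any) -> str:
    return f"vision:{payload}"


def ehr_encoder(payload: Any) -> str:
    return f"ehr:{payload}"


# Dispatch table mapping each type tag to its encoder
ENCODERS: Dict[str, Callable[[Any], str]] = {
    "note": text_encoder,
    "image": vision_encoder,
    "ehr": ehr_encoder,
}


def route(type_tag: str, payload: Any) -> str:
    """Send the payload to the encoder registered for its type tag."""
    try:
        encoder = ENCODERS[type_tag]
    except KeyError:
        raise ValueError(f"no encoder registered for type_tag={type_tag!r}") from None
    return encoder(payload)
```

With this layout, supporting a new modality means adding one entry to the table; the calling code never refers to feature names.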

Example Usage

from pyhealth.processors import TupleTimeTextProcessor

# Initialize processor with modality tag
processor = TupleTimeTextProcessor(type_tag="clinical_note")

# Patient notes with time differences (hours since admission)
texts = [
    "Patient admitted with chest pain.",
    "Follow-up: symptoms improved.",
    "Discharge: stable condition."
]
time_diffs = [0.0, 24.0, 72.0]

# Process tuple
processed_texts, time_tensor, modality_tag = processor.process((texts, time_diffs))

print(time_tensor)      # tensor([ 0., 24., 72.])
print(modality_tag)     # clinical_note
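The time differences in the example are hours since admission. If the raw data carries timestamps instead, they can be converted first. The sketch below uses only the standard library; hours_since and the timestamps are illustrative assumptions, not part of pyhealth.

```python
from datetime import datetime
from typing import List, Sequence


def hours_since(admission: datetime, note_times: Sequence[datetime]) -> List[float]:
    """Return each note's offset from admission, in hours."""
    return [(t - admission).total_seconds() / 3600.0 for t in note_times]


# Illustrative timestamps (hypothetical data)
admission = datetime(2024, 1, 1, 8, 0)
note_times = [
    datetime(2024, 1, 1, 8, 0),   # at admission
    datetime(2024, 1, 2, 8, 0),   # one day later
    datetime(2024, 1, 4, 8, 0),   # three days later
]

time_diffs = hours_since(admission, note_times)
print(time_diffs)  # [0.0, 24.0, 72.0]
```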

Multimodal Fusion

Use different type tags for automatic routing in multimodal models:

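from pyhealth.processors import TupleTimeTextProcessor

# Illustrative inputs (hypothetical data)
notes = ["Patient admitted with chest pain.", "Follow-up: symptoms improved."]
note_time_diffs = [0.0, 24.0]
events = ["Troponin ordered.", "Aspirin administered."]
event_time_diffs = [1.0, 2.5]

# Different modalities with different type tags
note_processor = TupleTimeTextProcessor(type_tag="note")
ehr_processor = TupleTimeTextProcessor(type_tag="ehr")

# Process different data types (no tokenizer, so each call returns a 3-tuple)
note_texts, note_times, note_tag = note_processor.process((notes, note_time_diffs))
ehr_texts, ehr_times, ehr_tag = ehr_processor.process((events, event_time_diffs))

# Tags enable automatic routing to appropriate encoders
# note_tag="note" -> TextEmbedding encoder
# ehr_tag="ehr" -> EHR encoder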