pyhealth.processors.TemporalFeatureProcessor

Abstract base class for processors whose features are paired with timestamps.

class pyhealth.processors.TemporalFeatureProcessor

Bases: FeatureProcessor

Contract — every subclass must implement:

  • modality() -> ModalityType — what kind of data this processor handles.

  • value_dim() -> int — size of the raw value vector before any learned embedding (e.g. vocab_size for codes, n_features for numerics).

  • process(value) -> dict[str, torch.Tensor] — must return a dict with at least the keys "value" and "time", and optionally "mask".

Backward compatibility: the existing FeatureProcessor API (is_token, schema, dim, spatial) stays on the parent class and continues to work for all non-temporal processors. Subclasses of TemporalFeatureProcessor should still implement those methods if they need to remain compatible with the existing EmbeddingModel / MultimodalRNN pipeline. The new modality() / value_dim() API is additive and is used only by UnifiedMultimodalEmbeddingModel.
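A minimal sketch of a subclass that satisfies the contract above. To keep it runnable without PyHealth or PyTorch installed, it makes several assumptions: `TemporalProcessorSketch` is a hypothetical stand-in for `TemporalFeatureProcessor`, the modality is a plain string rather than a `ModalityType` member, Python lists stand in for `torch.Tensor`, and the input is assumed to be a list of (timestamp, code) pairs. A real subclass would inherit from the PyHealth class and return tensors.

```python
from abc import ABC, abstractmethod


class TemporalProcessorSketch(ABC):
    """Stand-in for TemporalFeatureProcessor (hypothetical, not the PyHealth class)."""

    @abstractmethod
    def modality(self) -> str: ...

    @abstractmethod
    def value_dim(self) -> int: ...

    @abstractmethod
    def process(self, value) -> dict: ...


class ToyCodeProcessor(TemporalProcessorSketch):
    """Maps timestamped codes to vocabulary indices (illustrative only)."""

    def __init__(self, vocab):
        # Index 0 is reserved for unknown codes in this sketch.
        self.index = {code: i + 1 for i, code in enumerate(vocab)}

    def modality(self) -> str:
        return "code"  # the real API returns a ModalityType member

    def value_dim(self) -> int:
        return len(self.index) + 1  # vocabulary size, including the unknown slot

    def process(self, value) -> dict:
        # `value` is assumed to be a list of (timestamp, code) pairs.
        events = sorted(value, key=lambda pair: pair[0])
        return {
            "value": [self.index.get(code, 0) for _, code in events],
            "time": [float(t) for t, _ in events],
            "mask": [True] * len(events),
        }


proc = ToyCodeProcessor(vocab=["I10", "E11", "J45"])
out = proc.process([(2.0, "E11"), (0.5, "I10"), (3.5, "Z99")])
```

Here `out` carries the two required keys plus the optional "mask", with events ordered by timestamp; the unknown code "Z99" falls back to index 0.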

Why dict output?

Concern       | Tuple (current)          | Dict (this class)
--------------+--------------------------+------------------------------
Collation     | Custom per arity         | Generic: stack/pad per key
litdata       | List[str] breaks         | All values tensors/scalars ✓
Schema        | Positional, fragile      | Named keys, self-documenting
Extensibility | Adding field = new arity | Adding key = backward-compat
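The "Collation" and "Extensibility" points can be illustrated with a small per-key padding routine. This is only a sketch, not PyHealth's collate function: the name `collate` and the `pad_value` parameter are hypothetical, and plain Python lists stand in for tensors so the example runs without PyTorch.

```python
def collate(batch, pad_value=0):
    """Right-pad every key to the longest sequence in the batch."""
    out = {}
    for key in batch[0].keys():
        longest = max(len(sample[key]) for sample in batch)
        out[key] = [
            list(sample[key]) + [pad_value] * (longest - len(sample[key]))
            for sample in batch
        ]
    return out


batch = [
    {"value": [1, 2, 0], "time": [0.5, 2.0, 3.5]},
    {"value": [3], "time": [1.0]},
]
collated = collate(batch)
```

Because the loop iterates over whatever keys are present, a processor that later adds a "mask" key would be collated by the same function without any change, which is harder to achieve with positional tuples of varying arity.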

abstract modality()

Return the modality type of the data this processor handles.

Return type:

ModalityType

abstract value_dim()

Dimensionality of the raw value vector before learned embedding.

  • For codes: vocab_size (used with nn.Embedding)

  • For images: C * H * W (used with CNN encoder)

  • For numerics: n_features (used with nn.Linear)

  • For text: vocab_size (used with transformer encoder)
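As a worked example of the arithmetic, the sketch below computes the image case and lists it next to the other modalities. All sizes (a 5000-entry code vocabulary, 3 x 224 x 224 images, 12 numeric features) and the helper name `image_value_dim` are made up for illustration.

```python
def image_value_dim(channels: int, height: int, width: int) -> int:
    """Flattened size of one image: C * H * W."""
    return channels * height * width


# Hypothetical sizes, one entry per modality.
value_dims = {
    "codes": 5000,                           # vocab_size
    "images": image_value_dim(3, 224, 224),  # C * H * W
    "numerics": 12,                          # n_features
}
```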

Return type:

int

abstract process(value)

Process raw input and return a dict of tensors.

Required keys:

  • "value": main feature tensor.

  • "time": 1-D float32 tensor, one timestamp per event.

Optional keys:

  • "mask": validity / attention mask for "value".

Return type:

dict[str, Tensor]
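The key contract can be checked mechanically, for instance in a unit test for a new subclass. The helper below is a hypothetical sketch (`check_output` is not part of PyHealth) that uses nested lists in place of tensors; it also assumes that the first axis of "value" indexes events, which the documentation implies but does not state outright.

```python
def check_output(out: dict) -> None:
    """Raise if `out` violates the documented key contract."""
    missing = {"value", "time"} - set(out)
    if missing:
        raise KeyError(f"missing required keys: {sorted(missing)}")
    # "time" must be flat: no nested sequences.
    if any(isinstance(t, (list, tuple)) for t in out["time"]):
        raise ValueError('"time" must be 1-D')
    # Assumption: the first axis of "value" indexes events.
    if len(out["time"]) != len(out["value"]):
        raise ValueError('"time" needs one timestamp per event')


good = {"value": [[0.1, 0.2], [0.3, 0.4]], "time": [0.0, 1.5]}
check_output(good)  # passes silently

try:
    check_output({"value": [1, 2]})
    error = None
except KeyError as exc:
    error = str(exc)
```

Extra keys are deliberately not rejected, since the contract only requires "value" and "time" to be present.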

schema()

Standardised schema: at minimum ('value', 'time').

Return type:

tuple[str, ...]

dim()

Number of dimensions (Tensor.dim()) for each output tensor, in the same order as the output tuple.

Return type:

tuple[int, ...]

Returns:

Tuple of integers corresponding to the number of dimensions of each output tensor.

fit(samples, field)

Fit the processor to the samples.

Parameters:

samples (Iterable[Dict[str, Any]]) – Iterable of sample dictionaries.

Return type:

None

is_token()

Returns whether the output of the processor (in particular, the value tensor) represents discrete token indices (True) or continuous values (False). This determines whether downstream code applies token-based transformations (e.g. nn.Embedding) or value-based transformations (e.g. nn.Linear).

Return type:

bool

Returns:

True if the output of the processor represents discrete token indices, False otherwise.

load(path)

Optional: Load processor state from disk.

Parameters:

path (str) – File path to load processor state from.

Return type:

None

save(path)

Optional: Save processor state to disk.

Parameters:

path (str) – File path to save processor state.

Return type:

None

spatial()

Whether each dimension (axis) of the value tensor is spatial (i.e. corresponds to a spatial axis like time, height, width, etc.) or not. This is used to determine how to apply augmentations and other transformations that should only be applied to spatial dimensions.

E.g. for CNN or RNN features, this would help determine which dimensions to apply spatial augmentations to, and which dimensions to treat as channels or features.

Return type:

tuple[bool, ...]

Returns:

Tuple of booleans corresponding to whether each axis of the value tensor is spatial or not.
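To make the return values of dim() and spatial() concrete, the sketch below derives them for a toy sample whose "value" has shape (events, features). Nested lists stand in for tensors, `ndim` is a hypothetical helper mirroring `Tensor.dim()`, and the choice of which axis counts as spatial is an assumption about what a numeric time-series processor would report.

```python
def ndim(nested) -> int:
    """Nesting depth of a non-empty nested list, analogous to Tensor.dim()."""
    depth = 0
    while isinstance(nested, (list, tuple)):
        depth += 1
        nested = nested[0]
    return depth


sample = {
    "value": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],  # shape (2 events, 3 features)
    "time": [0.0, 1.5],                           # shape (2,)
}
schema = ("value", "time")

# What dim() would report, in schema order.
dims = tuple(ndim(sample[key]) for key in schema)

# What spatial() would report for "value": axis 0 runs over time, axis 1 is features.
spatial = (True, False)
```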