Transformers for multivariate time series

Transformer neural networks are a deep learning architecture that uses attention to model sequence data; unlike recurrent neural networks, they do not require the input to be processed sequentially. Given the recent successes of transformers in machine translation, I was curious to experiment with them for modeling multivariate time series, and more specifically multivariate longitudinal clinical patient data.

For this purpose, I have now implemented transformers as part of a method that I previously published (paper, code). The method addresses the problem of clustering multivariate time series with potentially many missing values. It uses a variational autoencoder with a Gaussian mixture prior, extended with LSTMs (or GRUs) for modeling multivariate time series, and with implicit imputation and loss re-weighting for directly handling the (potentially many) missing values. Transformers are now available in the package as a third option alongside LSTMs and GRUs. Beyond variational autoencoders with Gaussian mixture priors, the code can also train ordinary variational LSTM/GRU/Transformer autoencoders (multivariate Gaussian prior) and ordinary LSTM/GRU/Transformer autoencoders (no prior). I will probably report on some experiments comparing LSTMs and transformers for multivariate longitudinal clinical patient data in the near future.
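To make the transformer-plus-missing-values idea concrete, below is a minimal sketch in PyTorch: a transformer encoder over a multivariate series, with the reconstruction loss re-weighted by an observation mask so that missing entries do not drive the gradients. This is an illustration under my own assumptions, not the API of the published package; the names (TSTransformerAutoencoder, masked_mse) and all architectural choices are hypothetical, and the variational/Gaussian-mixture machinery is omitted for brevity.

```python
# Hypothetical sketch, not the published package's API: a plain (non-variational)
# transformer autoencoder for multivariate time series with masked reconstruction.
import torch
import torch.nn as nn

class TSTransformerAutoencoder(nn.Module):
    """Encode T time steps of D variables with self-attention, then
    reconstruct the input per time step from the latent codes."""
    def __init__(self, n_vars: int, d_model: int = 64, n_heads: int = 4,
                 n_layers: int = 2, d_latent: int = 16, max_len: int = 512):
        super().__init__()
        self.input_proj = nn.Linear(n_vars, d_model)
        # Learned positional embeddings supply the ordering information that
        # an LSTM/GRU would otherwise get from sequential processing.
        self.pos_emb = nn.Parameter(torch.zeros(1, max_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # Per-step latent codes; for clustering one would typically pool these
        # into a single series-level code, kept per-step here for brevity.
        self.to_latent = nn.Linear(d_model, d_latent)
        self.decoder = nn.Linear(d_latent, n_vars)

    def forward(self, x):  # x: (batch, T, n_vars), missing values zero-imputed
        h = self.input_proj(x) + self.pos_emb[:, : x.size(1)]
        h = self.encoder(h)
        z = self.to_latent(h)       # (batch, T, d_latent)
        return self.decoder(z)      # (batch, T, n_vars)

def masked_mse(x_hat, x, observed):
    """Loss re-weighting: average squared error over observed entries only,
    so the implicitly imputed missing values do not contribute."""
    se = (x_hat - x) ** 2 * observed
    return se.sum() / observed.sum().clamp(min=1.0)

# Toy usage: 8 series, 50 time steps, 10 variables, ~30% missing at random.
x = torch.randn(8, 50, 10)
observed = (torch.rand_like(x) > 0.3).float()   # 1 = observed, 0 = missing
model = TSTransformerAutoencoder(n_vars=10)
loss = masked_mse(model(x * observed), x, observed)
```

Zero-imputing the inputs while masking the loss is one simple way to realize "implicit imputation": the network sees placeholder values but is never penalized on them, so reconstructions at missing positions are driven entirely by the observed context.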
