This project merges inverted variate-token Transformers with locality-biased attention masks to examine how different encoding layouts interact with causal decay patterns in forecasting tasks.
Data
Weather data from southern Jena, Germany, 2020-2021 (weather_data)
Traffic data from the Bay Area, California, USA, 2025/03/01-04 (traffic_data)
Mathematical Framework
Variate-Token Encoding
Given a multivariate time series $X \in \mathbb{R}^{T \times D}$, where $T$ is the sequence length and $D$ is the number of variates, the framework explores two encoding perspectives:
Standard Layout: Each token represents a time step across all variates
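The difference between the two perspectives can be sketched in NumPy. The standard layout follows the definition above. The inverted (variate-token) layout is assumed here to be the transpose, so that each token holds one variate's full series; the function names are hypothetical and not taken from the repository.

```python
import numpy as np


def standard_tokens(X: np.ndarray) -> np.ndarray:
    """Standard layout: T tokens of dimension D, one token per time step."""
    return X  # shape (T, D)


def inverted_tokens(X: np.ndarray) -> np.ndarray:
    """Inverted layout (assumed): D tokens of dimension T, one per variate."""
    return X.T  # shape (D, T)


# Toy series with T = 96 time steps and D = 7 variates.
T, D = 96, 7
X = np.arange(T * D, dtype=np.float64).reshape(T, D)

std = standard_tokens(X)  # attention would mix information across time steps
inv = inverted_tokens(X)  # attention would mix information across variates
```

In this sketch the data is identical in both cases; only the axis that plays the role of the token sequence changes.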
The two views are combined using a weight $\gamma \in [0,1]$ that balances the contribution of each view, allowing the model to leverage complementary patterns captured by the different encoding schemes.
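As a rough illustration of how $\gamma$ could balance the two views, the sketch below assumes a convex combination of the forecasts produced by each layout. Which view is multiplied by $\gamma$ and which by $1 - \gamma$ is an assumption, and `fuse_views` is a hypothetical name.

```python
import numpy as np


def fuse_views(y_std: np.ndarray, y_inv: np.ndarray, gamma: float) -> np.ndarray:
    """Blend two forecasts of equal shape with weight gamma in [0, 1].

    Assumption: gamma weights the standard-layout view and (1 - gamma)
    weights the inverted-layout view.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return gamma * y_std + (1.0 - gamma) * y_inv


# Toy forecasts over 4 horizon steps and 3 variates.
y_std = np.ones((4, 3))
y_inv = np.zeros((4, 3))
fused = fuse_views(y_std, y_inv, gamma=0.25)
```

With $\gamma = 1$ the result reduces to the first view alone, and with $\gamma = 0$ to the second.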
The weight $w_h = \frac{1}{1 + \beta \cdot h}$ down-weights distant predictions, where $h$ indexes the prediction step and $\beta$ controls the decay rate.
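The decay weight $w_h$ can be computed directly from its definition. The sketch below applies it to a per-step squared error; the choice of squared error, the horizon index starting at $h = 1$, and the normalization by the sum of weights are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def horizon_weights(H: int, beta: float) -> np.ndarray:
    """Return w_h = 1 / (1 + beta * h) for h = 1..H (start index assumed)."""
    h = np.arange(1, H + 1)
    return 1.0 / (1.0 + beta * h)


def weighted_mse(y_true: np.ndarray, y_pred: np.ndarray, beta: float) -> float:
    """Horizon-weighted mean squared error for arrays of shape (H, D)."""
    w = horizon_weights(y_true.shape[0], beta)
    per_step = np.mean((y_true - y_pred) ** 2, axis=1)  # average over variates
    return float(np.sum(w * per_step) / np.sum(w))  # normalized weighted mean


w = horizon_weights(3, beta=1.0)  # weights for h = 1, 2, 3
```

Setting $\beta = 0$ gives every step the same weight, so the loss reduces to an ordinary mean squared error; larger $\beta$ shifts emphasis toward near-term predictions.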
About
A research framework merging inverted variate-token Transformers with locality-biased attention masks. Examines how contrasting encoding layouts and causal decay interact in time-series forecasting tasks.