Quang-Huy Tran
Network Science Lab
Dept. of Artificial Intelligence
The Catholic University of Korea
E-mail: huytran1126@gmail.com
2024-07-22
WaveForM: Graph Enhanced Wavelet
Learning for Long Sequence Forecasting
of Multivariate Time Series
Fuhao Yang et al.
AAAI-2023: The Thirty-Seventh AAAI Conference on Artificial Intelligence
OUTLINE
• MOTIVATION
• METHODOLOGY
• EXPERIMENT & RESULT
• CONCLUSION
MOTIVATION
Overview and Limitation
• Multivariate time series (MTS), i.e., multiple interconnected streams of data, are pervasive in real-world applications:
o Traffic flows recorded by sensors, weather observations from weather stations, etc.
o Forecasting from historical MTS observations enables meaningful and accurate application-wide predictions.
• Challenges:
o Existing work still overlooks long sequence forecasting (LSF) of MTS, which uses a given length of MTS to predict a longer future sequence.
 Long-term MTS are often composed of more entangled temporal patterns than short-term ones; overlooking this leads to unreliable discovery of temporal dependencies.
o Transformer-based models have proven effective in modeling sequential data,
 but they suffer from high computational cost in LSF.
INTRODUCTION
Contribution
• Propose a DWT-based end-to-end framework:
o Transforms MTS into the wavelet domain for MTS long sequence prediction tasks.
o Capable of fully exploiting the inherent features of MTS in both the frequency and time domains.
• Propose a global graph constructor to extract global information on the interrelationships among variables in the wavelet domain:
o Preventing training from overfitting.
METHODOLOGY
Problem Definition
• Given an MTS $X = \{x_1, x_2, \ldots, x_N\} \in \mathbb{R}^{N \times T}$:
o $X$: an N-variate time series.
o $x_i \in \mathbb{R}^{T}$: the time series of the i-th variable, which consists of sequential recordings at T timestamps.
• Problem: given
o an observation window H for the historical time series and a forecasting window P for the prediction.
o Long sequence MTS forecasting: the forecasting window P is long relative to short-term settings.
o Goal is to learn a mapping function $f_\Theta: \mathbb{R}^{N \times H} \rightarrow \mathbb{R}^{N \times P}$ that predicts the next P timesteps from the previous H, where $\Theta$ is the learnable parameter set.
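As a shape-level illustration of the mapping $f_\Theta$ (the sizes below are placeholders, not values from the paper; the linear map merely stands in for the full WaveForM pipeline):

```python
import torch

# Illustrative sizes: N variables, observation window H, forecast window P.
N, H, P = 321, 192, 336

history = torch.randn(N, H)        # observed MTS window, one row per variable
f_theta = torch.nn.Linear(H, P)    # placeholder for f_theta: R^{N x H} -> R^{N x P}
forecast = f_theta(history)        # predicted next P timesteps per variable
print(forecast.shape)              # torch.Size([321, 336])
```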
METHODOLOGY
Main Architecture
• WaveForM is a multi-resolution analysis (MRA) model based on the discrete wavelet transform; it forecasts MTS in the wavelet domain.
• Three main components: a discrete wavelet transform (DWT) module, a global graph constructor (GGC), and graph-enhanced prediction (GP) modules.
METHODOLOGY
Discrete Wavelet Transform Module and Its Inverse Version
• DWT module:
o Transforms an input MTS into its corresponding multi-scale frequency representations with the DWT.
o Decomposes input signals into a set of wavelets that capture the frequency and time features of the original signals.
o Each DWT level uses a high-pass filter h and a low-pass filter g to decompose a time series signal x into different resolutions:

$x_h^{(l)}[n] = \sum_{s=0}^{T_{l-1}-1} h[2n - s]\, x_g^{(l-1)}[s], \qquad x_g^{(l)}[n] = \sum_{s=0}^{T_{l-1}-1} g[2n - s]\, x_g^{(l-1)}[s]$

where $x^{(l)}$ is the $l$-th decomposition with $x_g^{(0)} = x$, $T_{l-1}$ represents the length of x after decomposing (l − 1) times, and s represents the scale index.
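A minimal sketch of this multi-level decomposition using PyWavelets (the 'db4' wavelet and the level count are assumptions for illustration; the paper's filter choice may differ):

```python
import numpy as np
import pywt  # PyWavelets; a stand-in for the paper's DWT module

H = 192                                   # observation window (illustrative)
x = np.sin(np.linspace(0, 20 * np.pi, H)) + 0.1 * np.random.randn(H)

# l levels of decomposition: returns [approx_l, detail_l, ..., detail_1],
# i.e., one low-pass output and l high-pass outputs at halving resolutions.
l = 3
coeffs = pywt.wavedec(x, wavelet='db4', level=l)
for c in coeffs:
    print(len(c))   # lengths shrink roughly as H/2, H/4, ..., H/2^l
```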
METHODOLOGY
Discrete Wavelet Transform Module and Its Inverse Version
• DWT module:
o After l layers of decomposition, each $x_i$ yields a set of coefficients $\{x_h^{(1)}, x_h^{(2)}, \ldots, x_h^{(l)}, x_g^{(l)}\}$.
o Let $W$ represent the layered wavelet coefficients of all variables, where H is the length of the input MTS and N is the number of variables.
o After the following graph-enhanced modules output the i-th variable's coefficients for the future P time steps, the Inverse Discrete Wavelet Transform (IDWT) reconstructs the corresponding sequence in the time domain:

$x_g^{(l-1)}[n] = \sum_{s} \tilde{g}[n - 2s]\, x_g^{(l)}[s] + \sum_{s} \tilde{h}[n - 2s]\, x_h^{(l)}[s]$

where $\tilde{h}$ and $\tilde{g}$ are the synthesis versions of h and g.
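The property the IDWT module relies on, checked with PyWavelets: analysis followed by synthesis reconstructs the signal, so coefficients predicted in the wavelet domain can be mapped back to the time domain.

```python
import numpy as np
import pywt

x = np.random.randn(192)
coeffs = pywt.wavedec(x, 'db4', level=3)   # analysis filters h, g
x_rec = pywt.waverec(coeffs, 'db4')        # synthesis filters h~, g~
print(np.allclose(x, x_rec[:len(x)]))      # True: perfect reconstruction
```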
METHODOLOGY
Global Graph Constructor (GGC)
• After obtaining the wavelet coefficients at different scales:
o The model forecasts the coefficient changes over time in the wavelet domain.
o Assumes the variables share the same basic interaction structure at different resolutions.
o Using a global graph rather than learning graphs in each GP module also avoids overfitting and saves memory.
o Applies graph structure learning to learn two embedding representations for each node after assigning each node/variable an integer index.
 Variable representations obtained from two different layers: $M_1 = \tanh(\alpha\, E_1 \Theta_1)$, $M_2 = \tanh(\alpha\, E_2 \Theta_2)$
 Adjacency matrix: $A = \mathrm{ReLU}\big(\tanh\big(\alpha\,(M_1 M_2^\top - M_2 M_1^\top)\big)\big)$
where $E_1, E_2$ are the node embeddings, $\Theta_1, \Theta_2$ are learnable weights, and $\alpha$ is the hyper-parameter controlling the saturation rate of the activation function.
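A sketch of this MTGNN-style graph structure learning in PyTorch (the embedding dimension and α below are assumed values, not the paper's settings):

```python
import torch
import torch.nn as nn

class GlobalGraphConstructor(nn.Module):
    """Learnable global adjacency, following the formulation above."""
    def __init__(self, num_nodes: int, dim: int = 40, alpha: float = 3.0):
        super().__init__()
        self.emb1 = nn.Embedding(num_nodes, dim)   # E1
        self.emb2 = nn.Embedding(num_nodes, dim)   # E2
        self.lin1 = nn.Linear(dim, dim)            # Theta1
        self.lin2 = nn.Linear(dim, dim)            # Theta2
        self.alpha = alpha

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        m1 = torch.tanh(self.alpha * self.lin1(self.emb1(idx)))
        m2 = torch.tanh(self.alpha * self.lin2(self.emb2(idx)))
        # The antisymmetric score makes A directional; ReLU keeps positive edges.
        return torch.relu(torch.tanh(self.alpha * (m1 @ m2.T - m2 @ m1.T)))

A = GlobalGraphConstructor(num_nodes=8)(torch.arange(8))
print(A.shape)  # torch.Size([8, 8])
```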
METHODOLOGY
Graph-Enhanced Prediction Modules
• Given the learnable adjacency matrix:
o Build Graph-enhanced Prediction (GP) modules to exploit the graphical information for predictions.
• Dilated Convolution Component:
o The input passes through stacked 1D dilated convolutions that filter the wavelet coefficients to incorporate wavelet information, gated as

$Z = \tanh(\Theta_f \star W) \odot \sigma(\Theta_g \star W)$

where $\star$ is the dilated convolution operator and $\sigma$ is the sigmoid function.
o Utilizes multiple dilated convolution filters with different kernel sizes to capture the respective features of the wavelet coefficients at each level of resolution.
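A sketch of one gated dilated convolution block (channel count, kernel size, and dilation are placeholder values):

```python
import torch
import torch.nn as nn

class GatedDilatedConv(nn.Module):
    """Gated dilated 1D convolution, as in the gating equation above."""
    def __init__(self, channels: int, kernel_size: int = 2, dilation: int = 2):
        super().__init__()
        self.filt = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.gate = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, w: torch.Tensor) -> torch.Tensor:  # w: (batch, C, L)
        return torch.tanh(self.filt(w)) * torch.sigmoid(self.gate(w))

z = GatedDilatedConv(channels=32)(torch.randn(4, 32, 96))
print(z.shape)  # torch.Size([4, 32, 94]): dilation shortens the sequence
```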
METHODOLOGY
Graph-Enhanced Prediction Modules
• Graph Convolution Component:
o Aggregates each node's information with its neighbors' information to capture global dependencies among different variables.
o To mitigate the over-smoothing of GCNs, utilizes a MixHop layer to capture complex relationships of neighbors at various hops.
o Given the dilated convolution component's output Z, the K-layer MixHop graph convolution proceeds as

$H^{(k)} = \beta H_{in} + (1-\beta)\, \tilde{A} H^{(k-1)}, \qquad H_{out} = \sum_{k=0}^{K} H^{(k)} \Theta^{(k)}$

where $H^{(0)} = H_{in} = Z$, $H^{(k-1)}$ is the representation output from the previous layer, $\tilde{A}$ is the normalized adjacency matrix, and $\beta$ is a hyperparameter that controls the proportion of information maintained from the previous representation.
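A sketch of the mix-hop propagation step (K, β, and the hop-concatenation readout are assumptions; the paper may combine hops differently):

```python
import torch

def mixhop_propagation(z, adj, K=2, beta=0.05):
    """Mix-hop propagation: z is (N, d) node features, adj is (N, N)."""
    # Row-normalize A + I so each node averages over itself and its neighbors.
    a_tilde = adj + torch.eye(adj.size(0))
    a_tilde = a_tilde / a_tilde.sum(dim=1, keepdim=True)

    h, hops = z, [z]
    for _ in range(K):
        h = beta * z + (1 - beta) * a_tilde @ h   # retain original information
        hops.append(h)
    # Readout over hops; concatenation keeps each hop's features separate.
    return torch.cat(hops, dim=-1)

out = mixhop_propagation(torch.randn(8, 16), torch.rand(8, 8))
print(out.shape)  # torch.Size([8, 48])
```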
METHODOLOGY
Graph-Enhanced Prediction Modules
• Skip Connection and Output:
o Improves representational capability by preserving the original information.
o Given the wavelet coefficients $W$, initialize two factors:

$F^{(0)} = \Theta_{1\times 1} \star W, \qquad S^{(0)} = \Theta_{1\times L} \star W$

where $\Theta_{1\times 1}$ is a 1×1 convolution kernel in the GP module and $\Theta_{1\times L}$ is a 1×L convolution kernel for the skip-connection layer.
o The input $F^{(0)}$ passes through K stacked GP modules: $F^{(k)} = \mathrm{GP}_k(F^{(k-1)}, A)$.
o The skip-output and the output factor representation of the current GP module are combined as

$S^{(k)} = \lambda\, S^{(k-1)} + (1-\lambda)\, \Theta_{skip} \star F^{(k)}$

where $\lambda$ is a hyperparameter to control the balance.
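A shape-level sketch of the skip accumulation across stacked GP modules (the GP blocks are stand-in 1×1 convolutions and λ is an assumed balance value; the layers are instantiated inline only to trace shapes, not to train):

```python
import torch
import torch.nn as nn

K, C, L, lam = 3, 32, 96, 0.5
w = torch.randn(4, C, L)                          # wavelet coefficients W

f = nn.Conv1d(C, C, kernel_size=1)(w)             # F^(0): 1x1 conv, (4, C, L)
s = nn.Conv1d(C, C, kernel_size=L)(w)             # S^(0): 1xL conv, (4, C, 1)
for k in range(K):
    f = nn.Conv1d(C, C, kernel_size=1)(f)         # placeholder for GP_k(F, A)
    # Each module's skip layer maps F^(k) down to the skip-output's shape.
    s = lam * s + (1 - lam) * nn.Conv1d(C, C, kernel_size=f.size(-1))(f)
print(s.shape)                                    # torch.Size([4, 32, 1])
```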
EXPERIMENT AND RESULT
EXPERIMENT SETTINGS
• Datasets:
o Electricity, Traffic, Weather, and Solar-Energy.
• Baselines:
o Deep learning: LSTM, Transformer, Informer [1], and Autoformer [2].
o STGNN: GraphWaveNet [3] and MTGNN [4].
[1] Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H., & Zhang, W. (2021, May). Informer: Beyond efficient transformer for long sequence time-series forecasting. In Proceedings of the AAAI conference on artificial intelligence (Vol. 35, No. 12, pp. 11106-11115).
[2] Wu, H., Xu, J., Wang, J., & Long, M. (2021). Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Advances in neural information processing systems, 34, 22419-22430.
[3] Wu, Z., Pan, S., Long, G., Jiang, J., & Zhang, C. (2019). Graph wavenet for deep spatial-temporal graph modeling. arXiv preprint arXiv:1906.00121.
[4] Wu, Z., Pan, S., Long, G., Jiang, J., Chang, X., & Zhang, C. (2020, August). Connecting the dots: Multivariate time series forecasting with graph neural networks. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 753-763).
• Metrics:
o Mean absolute error (MAE) and mean squared error (MSE).
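For reference, with ground truth $y$ and prediction $\hat{y}$ over N variables and P forecast steps, the standard definitions are:

$\mathrm{MAE} = \frac{1}{NP}\sum_{i=1}^{N}\sum_{t=1}^{P}\bigl|\hat{y}_{i,t} - y_{i,t}\bigr|, \qquad \mathrm{MSE} = \frac{1}{NP}\sum_{i=1}^{N}\sum_{t=1}^{P}\bigl(\hat{y}_{i,t} - y_{i,t}\bigr)^{2}$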
EXPERIMENT AND RESULT
RESULT – Overall Performance
EXPERIMENT AND RESULT
RESULT – Ablation Study
CONCLUSION
Summarization
• Proposed WaveForM, a novel framework for long sequence multivariate time series forecasting:
o Uses the DWT to transform the time-domain series into wavelet-domain coefficients at multiple resolutions.
o Applies a graph convolution module to model the relationships among the multiple variables.
• The transformed coefficients in the wavelet domain describe the input series at multiple resolutions:
o allowing the model to learn fine-grained, complex patterns.