TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Time Series Forecasting	Electricity (192)	PatchTST/64	MSE	0.147	# 1
Time Series Forecasting	Electricity (336)	PatchTST/64	MSE	0.163	# 2
Time Series Forecasting	Electricity (720)	PatchTST/64	MSE	0.197	# 3
Time Series Forecasting	Electricity (96)	PatchTST/64	MSE	0.129	# 1
Time Series Forecasting	ETTh1 (192) Multivariate	PatchTST/64	MSE	0.413	# 8
Time Series Forecasting	ETTh1 (192) Multivariate	PatchTST/64	MAE	0.429	# 1
Time Series Forecasting	ETTh1 (192) Univariate	PatchTST/64	MSE	0.074	# 5
Time Series Forecasting	ETTh1 (192) Univariate	PatchTST/64	MAE	0.215	# 1
Time Series Forecasting	ETTh1 (336) Multivariate	PatchTST/64	MSE	0.422	# 3
Time Series Forecasting	ETTh1 (336) Multivariate	PatchTST/64	MAE	0.44	# 7
Time Series Forecasting	ETTh1 (336) Univariate	PatchTST/64	MSE	0.076	# 2
Time Series Forecasting	ETTh1 (336) Univariate	PatchTST/64	MAE	0.22	# 8
Time Series Forecasting	ETTh1 (720) Multivariate	PatchTST/64	MSE	0.447	# 4
Time Series Forecasting	ETTh1 (720) Multivariate	PatchTST/64	MAE	0.468	# 7
Time Series Forecasting	ETTh1 (720) Univariate	PatchTST/64	MSE	0.087	# 4
Time Series Forecasting	ETTh1 (720) Univariate	PatchTST/64	MAE	0.236	# 9
Time Series Forecasting	ETTh1 (96) Multivariate	PatchTST/64	MSE	0.37	# 3
Time Series Forecasting	ETTh1 (96) Multivariate	PatchTST/64	MAE	0.4	# 1
Time Series Forecasting	ETTh1 (96) Univariate	PatchTST/64	MSE	0.059	# 6
Time Series Forecasting	ETTh1 (96) Univariate	PatchTST/64	MAE	0.189	# 1
Time Series Forecasting	ETTh2 (192) Multivariate	PatchTST/64	MSE	0.341	# 5
Time Series Forecasting	ETTh2 (192) Multivariate	PatchTST/64	MAE	0.382	# 3
Time Series Forecasting	ETTh2 (192) Univariate	PatchTST/64	MSE	0.171	# 4
Time Series Forecasting	ETTh2 (192) Univariate	PatchTST/64	MAE	0.329	# 2
Time Series Forecasting	ETTh2 (336) Multivariate	PatchTST/64	MSE	0.329	# 3
Time Series Forecasting	ETTh2 (336) Multivariate	PatchTST/64	MAE	0.384	# 9
Time Series Forecasting	ETTh2 (336) Univariate	PatchTST/64	MSE	0.171	# 4
Time Series Forecasting	ETTh2 (336) Univariate	PatchTST/64	MAE	0.336	# 7
Time Series Forecasting	ETTh2 (720) Multivariate	PatchTST/64	MSE	0.379	# 1
Time Series Forecasting	ETTh2 (720) Multivariate	PatchTST/64	MAE	0.422	# 11
Time Series Forecasting	ETTh2 (720) Univariate	PatchTST/64	MSE	0.223	# 5
Time Series Forecasting	ETTh2 (720) Univariate	PatchTST/64	MAE	0.38	# 5
Time Series Forecasting	ETTh2 (96) Multivariate	PatchTST/64	MSE	0.274	# 5
Time Series Forecasting	ETTh2 (96) Multivariate	PatchTST/64	MAE	0.337	# 4
Time Series Forecasting	ETTh2 (96) Univariate	PatchTST/64	MSE	0.131	# 5
Time Series Forecasting	ETTh2 (96) Univariate	PatchTST/64	MAE	0.284	# 1
Time Series Forecasting	Weather (192)	PatchTST/64	MSE	0.194	# 4
Time Series Forecasting	Weather (336)	PatchTST/64	MSE	0.245	# 3
Time Series Forecasting	Weather (720)	PatchTST/64	MSE	0.314	# 2
Time Series Forecasting	Weather (96)	PatchTST/64	MSE	0.149	# 4


A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

27 Nov 2022 · Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, Jayant Kalagnanam

We propose an efficient design of Transformer-based models for multivariate time series forecasting and self-supervised representation learning. It is based on two key components: (i) segmentation of time series into subseries-level patches, which serve as input tokens to the Transformer; (ii) channel-independence, where each channel contains a single univariate time series and all series share the same embedding and Transformer weights. The patching design naturally has a three-fold benefit: local semantic information is retained in the embedding; computation and memory usage of the attention maps are quadratically reduced given the same look-back window; and the model can attend to a longer history. Our channel-independent patch time series Transformer (PatchTST) can significantly improve long-term forecasting accuracy compared with SOTA Transformer-based models. We also apply our model to self-supervised pre-training tasks and attain excellent fine-tuning performance, which outperforms supervised training on large datasets. Transferring masked pre-trained representations from one dataset to others also produces SOTA forecasting accuracy. Code is available at: https://github.com/yuqinie98/PatchTST.
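
The two components named in the abstract, patching and channel-independence, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the look-back window of 512, the patch length of 16, and the stride of 8 are all illustrative assumptions, and any padding the official code may apply (which would change the patch count) is omitted.

```python
from typing import List


def make_patches(series: List[float], patch_len: int, stride: int) -> List[List[float]]:
    """Split one univariate series into (possibly overlapping) patches."""
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    if len(series) < patch_len:
        raise ValueError("series is shorter than one patch")
    last_start = len(series) - patch_len
    return [series[s:s + patch_len] for s in range(0, last_start + 1, stride)]


def channel_independent_patches(
    channels: List[List[float]], patch_len: int, stride: int
) -> List[List[List[float]]]:
    """Patch every channel on its own; input layout is [channel][time].

    A shared embedding and Transformer backbone (not shown here) would then
    process each channel's patches with the same weights.
    """
    return [make_patches(channel, patch_len, stride) for channel in channels]


def attention_entries(num_tokens: int) -> int:
    """Size of one self-attention map over num_tokens tokens."""
    return num_tokens * num_tokens


# Illustrative settings (assumptions, not values taken from this page).
look_back, patch_len, stride = 512, 16, 8
channels = [[float(t + 100 * c) for t in range(look_back)] for c in range(3)]

tokens = channel_independent_patches(channels, patch_len, stride)
num_patches = len(tokens[0])

print("patches per channel:", num_patches)
print("attention entries, one token per time step:", attention_entries(look_back))
print("attention entries, one token per patch:", attention_entries(num_patches))
```

With these settings each channel yields (512 - 16) // 8 + 1 = 63 patches, so an attention map shrinks from 512 × 512 = 262,144 entries to 63 × 63 = 3,969, a reduction of roughly the stride squared. That is the sense in which the attention cost falls quadratically for a fixed look-back window.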

Code

yuqinie98/patchtst official

1,251

timeseriesAI/tsai

4,729

thuml/iTransformer

783

WenjieDu/PyPOTS

688

etna-team/etna

Tasks

Multivariate Time Series Forecasting

Representation Learning

Time Series

Time Series Analysis

Time Series Forecasting

Datasets

Weather Electricity Consuming Load

Results from the Paper

Ranked #1 on Time Series Forecasting on Electricity (192)

Methods

Absolute Position Encodings • Adam • BPE • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
