28 dataset results for Time Series Forecasting

ETT (Electricity Transformer Temperature)

The Electricity Transformer Temperature (ETT) is a crucial indicator in the long-term deployment of electric power. This dataset consists of two years of data from two separate counties in China. To explore granularity in the long sequence time-series forecasting (LSTF) problem, different subsets are created: {ETTh1, ETTh2} at the 1-hour level and ETTm1 at the 15-minute level. Each data point consists of the target value "oil temperature" and 6 power load features. The train/val/test split is 12/4/4 months.
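As a rough illustration of the 12/4/4-month split, the sketch below converts it into row-index boundaries. It assumes 30-day months and a chronologically ordered table, neither of which is stated in the description; `split_bounds` and its parameters are hypothetical names.

```python
def split_bounds(steps_per_hour, days_per_month=30, months=(12, 4, 4)):
    """Return (start, end) row ranges for train/val/test.

    Assumption: every month is treated as `days_per_month` days long.
    """
    steps_per_month = days_per_month * 24 * steps_per_hour
    bounds = []
    start = 0
    for month_count in months:
        end = start + month_count * steps_per_month
        bounds.append((start, end))
        start = end
    return bounds


# 1-hour level (ETTh1, ETTh2): one row per hour.
hourly = split_bounds(steps_per_hour=1)
# 15-minute level (ETTm1): four rows per hour.
quarter_hourly = split_bounds(steps_per_hour=4)

print(hourly)          # [(0, 8640), (8640, 11520), (11520, 14400)]
print(quarter_hourly)  # [(0, 34560), (34560, 46080), (46080, 57600)]
```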

170 PAPERS • 1 BENCHMARK

The M4 dataset is a collection of 100,000 time series used for the fourth edition of the Makridakis forecasting Competition. The M4 dataset consists of time series of yearly, quarterly, monthly and other (weekly, daily and hourly) data, which are divided into training and test sets. The minimum numbers of observations in the training sets are 13 for yearly, 16 for quarterly, 42 for monthly, 80 for weekly, 93 for daily and 700 for hourly series. The participants were asked to produce the following numbers of forecasts beyond the available data that they had been given: six for yearly, eight for quarterly, 18 for monthly, 13 for weekly, 14 for daily and 48 for hourly series.
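The horizons and minimum training lengths quoted for M4 can be collected into lookup tables, as in the minimal sketch below. The `holdout_split` helper is hypothetical: it holds out the last horizon-many points of a series for local validation, which is an assumption made for illustration and not the competition's own procedure.

```python
# Forecast horizons and minimum training lengths as listed in the description.
M4_HORIZON = {
    "yearly": 6, "quarterly": 8, "monthly": 18,
    "weekly": 13, "daily": 14, "hourly": 48,
}
M4_MIN_TRAIN = {
    "yearly": 13, "quarterly": 16, "monthly": 42,
    "weekly": 80, "daily": 93, "hourly": 700,
}


def holdout_split(values, frequency):
    """Split a series into (history, holdout) using the M4 horizon.

    Hypothetical helper: raises if the remaining history would be shorter
    than the minimum training length quoted for that frequency.
    """
    horizon = M4_HORIZON[frequency]
    if len(values) - horizon < M4_MIN_TRAIN[frequency]:
        raise ValueError("series too short for frequency " + frequency)
    return values[:-horizon], values[-horizon:]


history, holdout = holdout_split(list(range(30)), "yearly")
print(len(history), len(holdout))  # 24 6
```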

87 PAPERS • NO BENCHMARKS YET

Appliances Energy (Appliances Energy Prediction)

This dataset details the energy consumption of appliances in a low-energy building over 4.5 months. Data was collected at 10-minute intervals.

28 PAPERS • NO BENCHMARKS YET

PeMSD7

PeMSD7 is traffic data from District 7 of California, consisting of traffic speed readings from 228 sensors recorded from May to June 2012 (weekdays only) at 5-minute intervals. The dataset is popular for benchmarking traffic forecasting models.

20 PAPERS • 2 BENCHMARKS

Weather2K

A multivariate spatio-temporal benchmark dataset for meteorological forecasting based on real-time observation data from ground weather stations.

8 PAPERS • 16 BENCHMARKS

EarthNet2021 (EarthNet2021: Earth Surface Forecasting)

Satellite images are snapshots of the Earth surface. We propose to forecast them. We frame Earth surface forecasting as the task of predicting satellite imagery conditioned on future weather. EarthNet2021 is a large dataset suitable for training deep neural networks on the task. It contains Sentinel-2 satellite imagery at 20 m resolution, matching topography and mesoscale (1.28 km) meteorological variables packaged into 32,000 samples. Additionally, we frame EarthNet2021 as a challenge allowing for model intercomparison. Resulting forecasts will greatly improve (>50×) over the spatial resolution found in numerical models. This allows localized impacts from extreme weather to be predicted, thus supporting downstream applications such as crop yield prediction, forest health assessments or biodiversity monitoring. Find data, code, and how to participate at www.earthnet.tech.

7 PAPERS • 2 BENCHMARKS

Weather (Max-Planck-Institut Weather Dataset for Long-term Time Series Forecasting)

Weather is recorded every 10 minutes throughout 2020 and contains 21 meteorological indicators, such as air temperature and humidity. The dataset in CSV format can be downloaded at https://drive.google.com/file/d/1Tc7GeVN7DLEl-RAs-JVwG9yFMf--S8dy/view?usp=share_link.

7 PAPERS • 5 BENCHMARKS

Electricity Consuming Load (UCI Electricity Consuming Load)

This dataset contains the electricity consumption of 370 points/clients.

6 PAPERS • 4 BENCHMARKS

The tourism forecasting competition

The data we use include 366 monthly series, 427 quarterly series and 518 yearly series. They were supplied by both tourism bodies (such as Tourism Australia, the Hong Kong Tourism Board and Tourism New Zealand) and various academics, who had used them in previous tourism forecasting studies (please refer to the acknowledgements and details of the data sources and availability).

4 PAPERS • NO BENCHMARKS YET

Air Quality Index (Air Quality Index prediction problem)

The AQI dataset was collected from 12 observing stations around Beijing from 2013 to 2017. The data is available from the University of California, Irvine (UCI) Machine Learning Repository.

3 PAPERS • NO BENCHMARKS YET

ExtMarker (3D motion of chest external markers)

Three-dimensional positions of external markers placed on the chest and abdomen of healthy individuals breathing during intervals from 73 s to 222 s. The markers move because of respiratory motion, and their position is sampled at approximately 10 Hz. Markers are metallic objects used during external beam radiotherapy to track and predict the motion of tumors due to breathing for accurate dose delivery.

3 PAPERS • 1 BENCHMARK

VISUELLE2.0

Visuelle 2.0 is a dataset containing real data for 5355 clothing products of the Italian fast-fashion retail company Nuna Lie. Specifically, Visuelle 2.0 provides data from 6 fashion seasons (partitioned into Autumn-Winter and Spring-Summer) from 2017-2019, right before the Covid-19 pandemic. Each product is accompanied by an HD image, textual tags and more. The time series data are disaggregated at the shop level, and include the sales, inventory stock, max-normalized prices (for the sake of confidentiality) and discounts. Exogenous time series data is also provided, in the form of Google Trends based on the textual tags and multivariate weather conditions of the stores’ locations. Finally, we also provide purchase data for 667K customers whose identity has been anonymized, to capture personal preferences. With these data, Visuelle 2.0 makes it possible to address several problems which characterize the activity of a fast fashion company: new product demand forecasting, short-observation new pr

3 PAPERS • 2 BENCHMARKS

Hotel Sales (Time Series)

The dataset contains the hotel demand and revenue of 8 major tourist destinations in the US (e.g., Los Angeles, Orlando ...). It includes sales, daily occupancy, demand, and revenue for upper-middle-class hotels.

2 PAPERS • NO BENCHMARKS YET

Hurricane (Time Series Hurricane)

Hurricane is a new spatio-temporal benchmark dataset suited for forecasting during extreme events and anomalies. The data comes from the Florida Department of Revenue, which provides monthly sales revenue (2003-2020) for the tourism industry in all 67 counties of Florida, which are prone to annual hurricanes. Furthermore, we aligned and joined the raw time series with the history of hurricane categories over time for each county. More precisely, the hurricane category indicates the maximum sustained wind speed, which can result in catastrophic damage (Oceanic 2022).

2 PAPERS • 1 BENCHMARK

Lorenz Dataset

The Lorenz dataset contains 100,000 time series of length 24. The data has 5 modes and is obtained from the Lorenz equations with 5 different seed values.
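To give a feel for what a length-24 Lorenz series looks like, here is a minimal sketch that integrates the Lorenz system with forward Euler steps. It is not the dataset authors' generation procedure: the classical parameters (sigma = 10, rho = 28, beta = 8/3), the step size, the use of the x-coordinate, and the interpretation of "seed" as a random-number seed for the initial state are all assumptions made here.

```python
import random


def lorenz_series(seed, length=24, dt=0.01,
                  sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Return `length` x-values of a Lorenz trajectory (forward Euler).

    Assumption: the seed only picks a random initial state in [-1, 1]^3.
    """
    rng = random.Random(seed)
    x = rng.uniform(-1.0, 1.0)
    y = rng.uniform(-1.0, 1.0)
    z = rng.uniform(-1.0, 1.0)
    xs = []
    for _ in range(length):
        # Lorenz equations evaluated at the current state.
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        # One explicit Euler step.
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        xs.append(x)
    return xs


# One illustrative series per seed, mirroring the "5 different seed values".
series_by_seed = [lorenz_series(seed) for seed in range(5)]
print(len(series_by_seed), len(series_by_seed[0]))  # 5 24
```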

2 PAPERS • 1 BENCHMARK

Multivariate-Mobility-Paris

The original dataset, provided by Orange telecom in France, contains anonymized and aggregated human mobility data. The Multivariate-Mobility-Paris dataset comprises information from 2020-08-24 to 2020-11-04 (72 days during the COVID-19 pandemic), with a time granularity of 30 minutes and a spatial granularity of 6 coarse regions in Paris, France. In other words, it is a multivariate time series dataset.
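The dates and granularities in the description imply an approximate array shape, worked out in the sketch below. It assumes the end date is exclusive (which is what makes the span exactly 72 days), that no time steps are missing, and that there is one value per region per step; the constant names are hypothetical.

```python
from datetime import date

START = date(2020, 8, 24)
END = date(2020, 11, 4)          # assumed exclusive
STEPS_PER_DAY = 24 * 60 // 30    # 30-minute granularity
NUM_REGIONS = 6                  # coarse regions in Paris

num_days = (END - START).days
num_steps = num_days * STEPS_PER_DAY
expected_shape = (num_steps, NUM_REGIONS)

print(num_days)        # 72
print(expected_shape)  # (3456, 6)
```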

2 PAPERS • NO BENCHMARKS YET

StockEmotions

This repository contains a financial-domain-focused dataset for financial sentiment/emotion classification and stock market time series prediction. It is based on our paper, "StockEmotions: Discover Investor Emotions for Financial Sentiment Analysis and Multivariate Time Series", accepted at the AAAI 2023 Bridge (AI for Financial Services).

2 PAPERS • NO BENCHMARKS YET

US Economy (Spending's, Population,)

State-level data for the US economy. The changes in the number of employees based on one million employees active in the US during the COVID-19 pandemic are gathered from Homebase (Bartik et al. 2020). We further enriched the data with the state-level policies as an indication of extreme events (e.g., the state’s business closure order).

2 PAPERS • 1 BENCHMARK

A probabilistic forecast methodology for volatile electricity prices in the Australian National Electricity Market

Dataset for A probabilistic forecast methodology for volatile electricity prices in the Australian National Electricity Market

1 PAPER • NO BENCHMARKS YET

Beijing Traffic

The Beijing Traffic Dataset collects traffic speeds at 5-minute granularity for 3126 roadway segments in Beijing between 2022/05/12 and 2022/07/25.

1 PAPER • 1 BENCHMARK

Box-Jenkins (Box-Jenkins Gas Furnace Problem)

Box-Jenkins gas furnace, a well-known time series forecasting problem

1 PAPER • NO BENCHMARKS YET

Korea Composite Stock Price Index

The data contains the following attributes for Korea Stock Price Index (KOSPI) for January 2000 to December 2016:

1. Date (YYYY.M(M).D(D))
2. Opening Price for the date, PX_OPEN
3. Highest Price for the date, PX_HIGH
4. Lowest Price for the date, PX_LOW
5. Closing Price for the date, PX_LAST
6. Total volume traded on the date, PX_VOLUME
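The `YYYY.M(M).D(D)` date format means months and days are not zero-padded, so a small parser helps when loading the rows. The sketch below is one way to do it; the `KospiRow` class, the helper names, the field order and the sample values are all hypothetical placeholders, not taken from the dataset.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class KospiRow:
    day: date
    px_open: float
    px_high: float
    px_low: float
    px_last: float
    px_volume: float


def parse_kospi_date(text):
    """Parse 'YYYY.M(M).D(D)', e.g. '2000.1.4' or '2016.12.29'."""
    year, month, day = (int(part) for part in text.strip().split("."))
    return date(year, month, day)


def parse_row(fields):
    """Build a row from six strings, assumed to be in the attribute order above."""
    numbers = [float(value) for value in fields[1:6]]
    return KospiRow(parse_kospi_date(fields[0]), *numbers)


# Placeholder values for illustration only, not real index data.
row = parse_row(["2000.1.4", "100.0", "110.0", "95.0", "105.0", "1000"])
print(row.day.isoformat(), row.px_last)  # 2000-01-04 105.0
```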

1 PAPER • 1 BENCHMARK

MLO-Cn2 (Mauna Loa Seeing Study)

The Mauna Loa Seeing Study was performed by the EOL/Integrated Surface Flux System team, capturing surface meteorology and flux products at the Mauna Loa Observatory in Hawaii.

1 PAPER • 2 BENCHMARKS

Narvik Road Dataset (DIT4BEARs Smart Road Dataset)

DIT4BEARs Internship Project (at UiT-The Arctic University of Norway) Dataset

1 PAPER • NO BENCHMARKS YET

SupplyGraph (SupplyGraph: A Benchmark Dataset for Supply Chain Planning using Graph Neural Networks)

Graph Neural Networks (GNNs) have gained traction across different domains such as transportation, bio-informatics, language processing, and computer vision. However, there is a noticeable absence of research on applying GNNs to supply chain networks. Supply chain networks are inherently graph-like in structure, making them prime candidates for applying GNN methodologies. This opens up a world of possibilities for optimizing, predicting, and solving even the most complex supply chain problems. A major setback in this approach lies in the absence of real-world benchmark datasets to facilitate the research and resolution of supply chain problems using GNNs. To address the issue, we present a real-world benchmark dataset for temporal tasks, obtained from one of the leading FMCG companies in Bangladesh, focusing on supply chain planning for production purposes. The dataset includes temporal data as node features to enable sales predictions, production planning, and the identification of fact

1 PAPER • NO BENCHMARKS YET

USNA-Cn2 (long-term) (United States Naval Academy Long-term Scintillation Study)

The USNA long-term scintillation study is a continuing effort to characterize and measure optical turbulence in the near-maritime boundary layer.

1 PAPER • 1 BENCHMARK

USNA-Cn2 (short-duration) (United States Naval Academy Short-duration Optical Turbulence Dataset)

The USNA long-term scintillation study is a continuing effort to characterize and measure optical turbulence in the near-maritime boundary layer.

1 PAPER • 2 BENCHMARKS

PJM(AEP)

PJM Hourly Energy Consumption Data. PJM Interconnection LLC (PJM) is a regional transmission organization (RTO) in the United States. It is part of the Eastern Interconnection grid, operating an electric transmission system serving all or parts of Delaware, Illinois, Indiana, Kentucky, Maryland, Michigan, New Jersey, North Carolina, Ohio, Pennsylvania, Tennessee, Virginia, West Virginia, and the District of Columbia.

0 PAPER • NO BENCHMARKS YET
