Trend-driven retail industries such as fashion launch a substantial number of new products every season. In such a scenario, an accurate demand forecast for these newly launched products is vital for efficient downstream supply chain planning, such as assortment planning and stock allocation. While classical time-series forecasting algorithms can forecast sales for existing products, new products have no historical time-series data on which to base a forecast. In this paper, we propose and empirically evaluate several novel attention-based multi-modal encoder-decoder models that forecast the sales of a new product purely from product images, any available product attributes, and external factors such as holidays, events, weather, and discounts. We experimentally validate our approaches on a large fashion dataset and report improvements in accuracy and model interpretability over existing k-nearest-neighbor-based baseline approaches.
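
To make the k-nearest-neighbor baseline mentioned in the abstract concrete, the following is a minimal sketch of one common way such a baseline can be built: the forecast for a new product is the average sales curve of the k most similar past products. It is an illustration only, not the baseline implementation used in the paper. The function names `cosine_similarity` and `knn_forecast`, the use of cosine similarity, and the toy embeddings are all assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_forecast(new_embedding, past_embeddings, past_sales, k=2):
    """Forecast a new product's sales curve (hypothetical helper).

    new_embedding   -- feature vector of the new product (assumed to come
                       from its image and/or attributes)
    past_embeddings -- feature vectors of previously launched products
    past_sales      -- one sales curve per past product, all the same length
    k               -- number of neighbors to average
    """
    # Rank past products from most to least similar to the new product.
    ranked = sorted(
        range(len(past_embeddings)),
        key=lambda i: cosine_similarity(new_embedding, past_embeddings[i]),
        reverse=True,
    )
    nearest = ranked[:k]
    horizon = len(past_sales[0])
    # Average the neighbors' sales at each time step.
    return [sum(past_sales[i][t] for i in nearest) / k for t in range(horizon)]


# Toy data: three past products with two-step sales curves.
past_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
past_sales = [[10, 20], [20, 40], [100, 100]]
forecast = knn_forecast([1.0, 0.0], past_embeddings, past_sales, k=2)
```

In this toy example the two nearest neighbors are the first two past products, so the forecast is the element-wise mean of their sales curves.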


Results from the Paper

 Ranked #1 on New Product Sales Forecasting on VISUELLE2.0 (using extra training data)

Task | Dataset | Model | Metric Name | Metric Value | Global Rank | Uses Extra Training Data
New Product Sales Forecasting | VISUELLE | Explainable Cross-Attention Multimodal RNN | MAE | 32.1 | #5 |
New Product Sales Forecasting | VISUELLE2.0 | Explainable Cross-Attention Multimodal RNN | MAE | 0.99 | #1 |
Short-observation new product sales forecasting | VISUELLE2.0 | Explainable Cross-Attention Multimodal RNN | 10 steps MAE | 0.94 | #1 |
Short-observation new product sales forecasting | VISUELLE2.0 | Explainable Cross-Attention Multimodal RNN | 1 step MAE | 0.96 | #1 |
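
The metric reported in the results above is MAE (mean absolute error), the average absolute difference between forecast and actual sales. As a minimal sketch of the standard definition, assuming plain Python lists of equal length (the function name and the sample numbers are illustrative and are not taken from the paper or its datasets):

```python
def mean_absolute_error(actual, predicted):
    """Mean absolute error between two equal-length sequences of numbers."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    # Average of |actual - predicted| over all time steps.
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values only: absolute errors are 2, 2 and 3, so MAE = 7 / 3.
mae = mean_absolute_error([10, 20, 30], [12, 18, 33])
```

Lower values are better, and MAE is expressed in the same units as the sales series it is computed on, so values are only comparable between results that use the same dataset and the same scaling.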

