An Attentive Inductive Bias for Sequential Recommendation beyond the Self-Attention
Sequential recommendation (SR) models based on Transformers have achieved remarkable success. The self-attention mechanism of Transformers for computer vision and natural language processing suffers from the oversmoothing problem, i.e., the hidden representations of tokens become similar to one another. We show, for the first time, that the same problem occurs in the SR domain. Our investigation reveals the low-pass filtering nature of self-attention in SR, which causes oversmoothing. To address this, we propose a novel method called $\textbf{B}$eyond $\textbf{S}$elf-$\textbf{A}$ttention for Sequential $\textbf{Rec}$ommendation (BSARec), which leverages the Fourier transform to i) inject an inductive bias by considering fine-grained sequential patterns and ii) integrate low- and high-frequency information to mitigate oversmoothing. Our discovery shows significant advancements in the SR domain and is expected to bridge the gap for existing Transformer-based SR models. We test our proposed approach through extensive experiments on 6 benchmark datasets. The experimental results demonstrate that our model outperforms 7 baseline methods in terms of recommendation performance. Our code is available at https://github.com/yehjin-shin/BSARec.
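To make the low-pass/high-pass idea in the abstract concrete, here is a minimal sketch of splitting a sequence into low- and high-frequency parts with a Fourier transform and then recombining them with a weight on the high-frequency part. It is an illustration under stated assumptions, not the BSARec implementation: the function names, the cutoff `c`, and the weight `beta` are hypothetical, the sequence is a single scalar feature over time, and a naive O(n^2) DFT from the standard library is used to stay dependency-free (in practice a library FFT would be used).

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns a list of complex numbers."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_frequencies(x, c):
    """Split x into (low, high) parts.

    Assumption: 'low' keeps the c lowest frequency indices (counted
    symmetrically, so the result stays real); 'high' is the remainder.
    """
    n = len(x)
    spectrum = dft(x)
    low_spectrum = [spectrum[k] if min(k, n - k) < c else 0 for k in range(n)]
    low = [value.real for value in idft(low_spectrum)]
    high = [original - smooth for original, smooth in zip(x, low)]
    return low, high


def rescale_high(x, c, beta):
    """Recombine as low + beta * high (beta is a hypothetical weight)."""
    low, high = split_frequencies(x, c)
    return [l + beta * h for l, h in zip(low, high)]


if __name__ == "__main__":
    sequence = [1.0, 3.0, 1.0, 3.0]
    print(rescale_high(sequence, c=1, beta=0.0))  # only the low-frequency part
    print(rescale_high(sequence, c=1, beta=1.0))  # original sequence recovered
```

With `c=1` only the constant (mean) component survives the low-pass step, so `beta=0.0` flattens every position to the sequence mean, which is a toy picture of oversmoothing, while `beta=1.0` restores the original sequence.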
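Task | Dataset | Model | Metric Name | Metric Value | Global Rank
---|---|---|---|---|---
Sequential Recommendation | Amazon-Beauty | BSARec | HR@5 | 0.0736 | #1
Sequential Recommendation | Amazon-Beauty | BSARec | HR@10 | 0.1008 | #1
Sequential Recommendation | Amazon-Beauty | BSARec | NDCG@5 | 0.0523 | #1
Sequential Recommendation | Amazon-Beauty | BSARec | NDCG@10 | 0.0611 | #1
Sequential Recommendation | Amazon-Beauty | BSARec | HR@20 | 0.1373 | #1
Sequential Recommendation | Amazon-Beauty | BSARec | NDCG@20 | 0.0703 | #1
Sequential Recommendation | Amazon-Sports | BSARec | HR@5 | 0.0426 | #1
Sequential Recommendation | Amazon-Sports | BSARec | HR@10 | 0.0612 | #1
Sequential Recommendation | Amazon-Sports | BSARec | HR@20 | 0.0858 | #1
Sequential Recommendation | Amazon-Toys | BSARec | HR@5 | 0.0805 | #1
Sequential Recommendation | LastFM | BSARec | HR@5 | 0.0523 | #1
Sequential Recommendation | LastFM | BSARec | HR@10 | 0.0807 | #1
Sequential Recommendation | LastFM | BSARec | HR@20 | 0.1174 | #1
Sequential Recommendation | LastFM | BSARec | NDCG@5 | 0.0344 | #1
Sequential Recommendation | LastFM | BSARec | NDCG@10 | 0.0435 | #1
Sequential Recommendation | LastFM | BSARec | NDCG@20 | 0.0526 | #1
Sequential Recommendation | LastFM | BSARec | HR@5 (99 Neg. Samples) | 0.3752 | #1
Sequential Recommendation | LastFM | BSARec | HR@10 (99 Neg. Samples) | 0.5028 | #1
Sequential Recommendation | LastFM | BSARec | NDCG@5 (99 Neg. Samples) | 0.2634 | #1
Sequential Recommendation | LastFM | BSARec | NDCG@10 (99 Neg. Samples) | 0.3045 | #1
Sequential Recommendation | LastFM | BSARec | MRR (99 Neg. Samples) | 0.2636 | #1
Sequential Recommendation | MovieLens 1M | BSARec | HR@5 | 0.1944 | #2
Sequential Recommendation | MovieLens 1M | BSARec | NDCG@5 | 0.1306 | #2
Sequential Recommendation | MovieLens 1M | BSARec | HR@10 | 0.2757 | #2
Sequential Recommendation | MovieLens 1M | BSARec | NDCG@10 | 0.1568 | #2
Sequential Recommendation | MovieLens 1M | BSARec | HR@20 | 0.3884 | #2
Sequential Recommendation | MovieLens 1M | BSARec | NDCG@20 | 0.1851 | #2
Sequential Recommendation | MovieLens 1M | BSARec | HR@5 (99 Neg. Samples) | 0.7023 | #1
Sequential Recommendation | MovieLens 1M | BSARec | HR@10 (99 Neg. Samples) | 0.7978 | #1
Sequential Recommendation | MovieLens 1M | BSARec | NDCG@5 (99 Neg. Samples) | 0.5646 | #1
Sequential Recommendation | MovieLens 1M | BSARec | NDCG@10 (99 Neg. Samples) | 0.5955 | #1
Sequential Recommendation | MovieLens 1M | BSARec | MRR (99 Neg. Samples) | 0.5406 | #1
Sequential Recommendation | Yelp | BSARec | HR@5 | 0.0275 | #1
Sequential Recommendation | Yelp | BSARec | NDCG@5 | 0.0170 | #1
Sequential Recommendation | Yelp | BSARec | HR@10 | 0.0465 | #1
Sequential Recommendation | Yelp | BSARec | HR@20 | 0.0746 | #1
Sequential Recommendation | Yelp | BSARec | NDCG@10 | 0.0231 | #1
Sequential Recommendation | Yelp | BSARec | NDCG@20 | 0.0302 | #1
Sequential Recommendation | Yelp | BSARec | HR@5 (99 Neg. Samples) | 0.6447 | #1
Sequential Recommendation | Yelp | BSARec | HR@10 (99 Neg. Samples) | 0.7848 | #1
Sequential Recommendation | Yelp | BSARec | NDCG@5 (99 Neg. Samples) | 0.4824 | #1
Sequential Recommendation | Yelp | BSARec | NDCG@10 (99 Neg. Samples) | 0.5280 | #1
Sequential Recommendation | Yelp | BSARec | MRR (99 Neg. Samples) | 0.4587 | #1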
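
For readers unfamiliar with the metric names in the table, the following sketch shows how HR@k, NDCG@k, and MRR are commonly computed. It assumes each user has exactly one held-out target item (a leave-one-out setup, which this page does not state explicitly), and all function names are hypothetical. For the "99 Neg. Samples" variants, the score dictionary would hold the target plus 99 sampled negative items rather than the full catalog.

```python
import math


def rank_of_target(scores, target):
    """1-based rank of `target` when items are sorted by descending score.

    Ties are broken in favor of the target (an optimistic convention,
    chosen here only for simplicity).
    """
    target_score = scores[target]
    better = sum(1 for item, s in scores.items() if item != target and s > target_score)
    return better + 1


def hit_rate_at_k(rank, k):
    """1.0 if the target appears in the top k, else 0.0."""
    return 1.0 if rank <= k else 0.0


def ndcg_at_k(rank, k):
    """NDCG@k with a single relevant item: the ideal DCG is 1, so only the
    target's own discounted gain remains."""
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0


def reciprocal_rank(rank):
    """Per-user contribution to MRR."""
    return 1.0 / rank


def mean(values):
    values = list(values)
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical ranks of the held-out item for three users.
    ranks = [1, 3, 11]
    print("HR@10  :", mean(hit_rate_at_k(r, 10) for r in ranks))
    print("NDCG@10:", mean(ndcg_at_k(r, 10) for r in ranks))
    print("MRR    :", mean(reciprocal_rank(r) for r in ranks))
```

Each reported value is the average of the per-user quantity over all evaluated users, so a user whose target falls outside the top k contributes zero to HR@k and NDCG@k but still contributes a small amount to MRR.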