BERT Goes Shopping: Comparing Distributional Models for Product Representations

Word embeddings (e.g., word2vec) have been applied successfully to eCommerce products through~\textit{prod2vec}. Inspired by the recent performance improvements on several NLP tasks brought by contextualized embeddings, we propose to transfer BERT-like architectures to eCommerce: our model -- ~\textit{Prod2BERT} -- is trained to generate representations of products through masked session modeling. Through extensive experiments over multiple shops, different tasks, and a range of design choices, we systematically compare the accuracy of~\textit{Prod2BERT} and~\textit{prod2vec} embeddings: while~\textit{Prod2BERT} is found to be superior in several scenarios, we highlight the importance of resources and hyperparameters in the best performing models. Finally, we provide guidelines to practitioners for training embeddings under a variety of computational and data constraints.

Results in Papers With Code
(↓ scroll down to see all results)