no code implementations • SemEval (NAACL) 2022 • Xuange Cui, Wei Xiong, Songlin Wang
This paper presents our contribution to SemEval-2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding. We explore the impact of three different pre-trained multilingual language models in SubTaskA. To enhance model generalization and robustness, we use the exponential moving average (EMA) method and an adversarial attack strategy. In SubTaskB, we add an effective cross-attention module for modeling the relationship between two sentences. We jointly train the model with a contrastive learning objective and employ momentum contrast to enlarge the number of negative pairs. Additionally, we use the alignment and uniformity properties to measure the quality of sentence embeddings. Our approach obtained competitive results in both subtasks.
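The alignment and uniformity properties mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions (alignment as the mean squared distance between L2-normalized embeddings of positive pairs, uniformity as the log of the mean Gaussian potential over all distinct pairs); the abstract does not state the exact formulation or hyperparameters, so the function names and the default `t=2.0` are assumptions for illustration only.

```python
import math
from itertools import combinations


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment(positive_pairs):
    """Mean squared distance between normalized positive pairs (lower is better)."""
    total = sum(sq_dist(l2_normalize(a), l2_normalize(b)) for a, b in positive_pairs)
    return total / len(positive_pairs)


def uniformity(embeddings, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs (lower is more uniform).

    t=2.0 is an assumed default, not a value taken from the paper.
    """
    normed = [l2_normalize(e) for e in embeddings]
    potentials = [math.exp(-t * sq_dist(a, b)) for a, b in combinations(normed, 2)]
    return math.log(sum(potentials) / len(potentials))
```

Under these assumptions, lower values of both quantities indicate better embeddings: positive pairs sit close together, while the embeddings as a whole spread out over the unit sphere.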
no code implementations • 20 Mar 2023 • Binbin Wang, Mingming Li, Zhixiong Zeng, Jingwei Zhuo, Songlin Wang, Sulong Xu, Bo Long, Weipeng Yan
Retrieving relevant items that match users' queries from a billion-scale corpus forms the core of industrial e-commerce search systems, in which embedding-based retrieval (EBR) methods prevail.
1 code implementation • 31 Jan 2023 • Xuange Cui, Wei Xiong, Songlin Wang
In this paper, we propose a robust multilingual model to improve the quality of search results.
1 code implementation • 12 Aug 2022 • Yiming Qiu, Chenyu Zhao, Han Zhang, Jingwei Zhuo, TianHao Li, Xiaowei Zhang, Songlin Wang, Sulong Xu, Bo Long, Wen-Yun Yang
BERT-style models, pre-trained on a general corpus (e.g., Wikipedia) and fine-tuned on a task-specific corpus, have recently emerged as breakthrough techniques in many NLP tasks, such as question answering, text classification, and sequence labeling.
no code implementations • 1 Jul 2021 • Xinlin Xia, Shang Wang, Han Zhang, Songlin Wang, Sulong Xu, Yun Xiao, Bo Long, Wen-Yun Yang
Graph convolution networks (GCNs), which have recently become the new state-of-the-art method for graph node classification, recommendation, and other applications, have not yet been successfully applied to industrial-scale search engines.
1 code implementation • 9 May 2021 • Han Zhang, Hongwei Shen, Yiming Qiu, Yunjiang Jiang, Songlin Wang, Sulong Xu, Yun Xiao, Bo Long, Wen-Yun Yang
An embedding index that enables fast approximate nearest neighbor (ANN) search serves as an indispensable component of state-of-the-art deep retrieval systems.
no code implementations • 24 Mar 2021 • Rui Li, Yunjiang Jiang, WenYun Yang, Guoyu Tang, Songlin Wang, Chaoyi Ma, wei he, Xi Xiong, Yun Xiao, Eric Yihong Zhao
We introduce deep learning models to the two most important stages in product search at JD.com, one of the largest e-commerce platforms in the world.
no code implementations • 1 Mar 2021 • Yiming Qiu, Kang Zhang, Han Zhang, Songlin Wang, Sulong Xu, Yun Xiao, Bo Long, Wen-Yun Yang
Online A/B experiments show that it improves core e-commerce business metrics significantly.
no code implementations • 3 Jun 2020 • Han Zhang, Songlin Wang, Kang Zhang, Zhiling Tang, Yunjiang Jiang, Yun Xiao, Weipeng Yan, Wen-Yun Yang
Two critical challenges remain in today's e-commerce search: how to retrieve items that are semantically relevant but do not exactly match the query terms, and how to retrieve items that are more personalized to different users for the same search query.