1 code implementation • NAACL 2019 • Guangxiang Zhao, Jingjing Xu, Qi Zeng, Xuancheng Ren
This task requires the system to identify the multiple styles of a piece of music based on its online reviews.
no code implementations • 25 Sep 2019 • Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Xu Sun
Extensive experimental results on a series of natural language processing tasks, including neural machine translation, image captioning, and language modeling, all demonstrate the advantages of Sparse Transformer in model performance.
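The explicit-selection idea behind a sparse Transformer can be illustrated with a minimal top-k attention sketch. This is a simplified, single-head, unbatched NumPy version under assumed shapes; the function name and the details are illustrative, not the paper's code:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def topk_sparse_attention(Q, K, V, k=2):
    """Attention in which each query keeps only its k largest scores;
    all other scores are masked to -inf before the softmax, so the
    attention distribution is explicitly sparse."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])            # (n_q, n_k)
    thresh = np.sort(scores, axis=-1)[:, -k][:, None]  # k-th largest per row
    masked = np.where(scores >= thresh, scores, -np.inf)
    weights = softmax(masked, axis=-1)                 # at most k non-zeros per row
    return weights @ V, weights

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 4)), rng.normal(size=(5, 4)), rng.normal(size=(5, 4))
out, w = topk_sparse_attention(Q, K, V, k=2)
```

Each row of `w` then places all probability mass on its k selected keys, which is the concentration effect the snippet describes.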
1 code implementation • NeurIPS 2019 • Jingjing Xu, Xu Sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin
Unlike them, we find that the derivatives of the mean and variance are more important than forward normalization by re-centering and re-scaling backward gradients.
Ranked #5 on Machine Translation on IWSLT2015 English-Vietnamese
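The forward re-centering and re-scaling that this analysis starts from can be sketched in a few lines. This is a minimal NumPy forward pass only; the snippet's finding concerns the backward gradients through the mean and variance, which plain forward code does not show:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Forward layer normalization: re-center by the mean and re-scale
    by the standard deviation along the last axis. The cited analysis
    argues that the gradients flowing back through mu and sigma, which
    re-center and re-scale the incoming gradients, matter more than
    this forward normalization itself."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

x = np.random.default_rng(1).normal(loc=3.0, scale=2.0, size=(2, 8))
y = layer_norm(x)
```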
2 code implementations • 17 Nov 2019 • Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo
In this work, we explore parallel multi-scale representation learning on sequence data, striving to capture both long-range and short-range language structures.
Ranked #8 on Machine Translation on WMT2014 English-French
2 code implementations • 25 Dec 2019 • Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun
Self-attention based Transformer has demonstrated the state-of-the-art performances in a number of natural language processing tasks.
no code implementations • 16 May 2020 • Fenglin Liu, Xuancheng Ren, Guangxiang Zhao, Chenyu You, Xuewei Ma, Xian Wu, Xu Sun
While it is common practice to draw information from only the last encoder layer, recent work has proposed to use representations from different encoder layers for diversified levels of information.
no code implementations • 1 Jan 2021 • Guangxiang Zhao, Lei Li, Xuancheng Ren, Xu Sun, Bin He
We find in practice that the high-likelihood area contains correct predictions for tail classes and it plays a vital role in learning imbalanced class distributions.
1 code implementation • ACL 2021 • Shuhuai Ren, Junyang Lin, Guangxiang Zhao, Rui Men, An Yang, Jingren Zhou, Xu Sun, Hongxia Yang
To bridge the semantic gap between the two modalities, previous studies mainly focus on word-region alignment at the object level, lacking the matching between the linguistic relation among the words and the visual relation among the regions.
Ranked #4 on Image-to-Text Retrieval on MS COCO
1 code implementation • NeurIPS 2021 • Deli Chen, Yankai Lin, Guangxiang Zhao, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun
The class imbalance problem, as an important issue in learning node representations, has drawn increasing attention from the community.
1 code implementation • 13 Oct 2021 • Guangxiang Zhao, Wenkai Yang, Xuancheng Ren, Lei Li, Yunfang Wu, Xu Sun
The conventional wisdom behind learning deep classification models is to focus on badly classified examples and ignore well-classified examples that are far from the decision boundary.
no code implementations • 14 Dec 2021 • Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun
As many fine-tuned pre-trained language models (PLMs) with promising performance are generously released, investigating better ways to reuse these models is vital, as it can greatly reduce the retraining computational cost and the potential environmental side effects.
1 code implementation • 4 Jun 2022 • Shuhuai Ren, Lei Li, Xuancheng Ren, Guangxiang Zhao, Xu Sun
However, evaluating the openness of CLIP-like models is challenging, as the models are open to arbitrary vocabulary in theory, but their accuracy varies in practice.
1 code implementation • 11 Oct 2022 • Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun
We then design a Model Uncertainty-aware Knowledge Integration (MUKI) framework to recover the golden supervision for the student.
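As an illustration only: the exact MUKI formulation differs, and `integrate_teachers` with its entropy-based weighting is an assumption made for this sketch. An uncertainty-aware integration of two teachers' predictions might look like:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def entropy(p):
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def integrate_teachers(logits_a, logits_b):
    """Hypothetical uncertainty-weighted integration: each teacher's
    soft labels are weighted per example by how certain the teacher
    is (lower predictive entropy -> larger weight)."""
    pa, pb = softmax(logits_a), softmax(logits_b)
    ca, cb = np.exp(-entropy(pa)), np.exp(-entropy(pb))  # certainty scores
    wa = ca / (ca + cb)                                  # normalized weight for teacher A
    return wa[:, None] * pa + (1 - wa)[:, None] * pb

# a confident teacher A vs. a near-uniform teacher B
soft_labels = integrate_teachers(np.array([[8.0, 0.0, 0.0]]),
                                 np.array([[0.0, 0.1, 0.0]]))
```

The confident teacher dominates the integrated supervision, which is the intuition behind weighting teachers by their uncertainty.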
no code implementations • 25 Jan 2023 • Wenkai Yang, Yankai Lin, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun
Federated Learning has become a widely used framework that allows learning a global model on decentralized local datasets while protecting local data privacy.
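The aggregation step of standard federated averaging (FedAvg), which this setting builds on, can be sketched as follows. This is a generic sketch of the classic algorithm, not this paper's method:

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """FedAvg server step: average client model parameters weighted by
    local dataset size. Only parameters are exchanged, so raw training
    data never leaves the clients."""
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()
    return sum(c * w for c, w in zip(coeffs, client_weights))

# two clients, each holding one parameter vector
w1 = np.array([1.0, 3.0])
w2 = np.array([3.0, 5.0])
global_w = fed_avg([w1, w2], client_sizes=[10, 30])  # client 2 counts 3x as much
```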