1 code implementation • 15 Dec 2021 • Mingfei Gao, Le Xue, Chetan Ramaiah, Chen Xing, Ran Xu, Caiming Xiong
Unlike previous methods that only address a fixed set of field items, our method predicts the target value for an arbitrary query based on an understanding of the layout and semantics of the form.
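As a rough illustration of this query-conditioned formulation, here is an extractive-QA-style sketch, not the paper's actual architecture; the gating, head, and dimensions below are all assumptions:

```python
import torch
import torch.nn as nn

class QueryValueScorer(nn.Module):
    """Hypothetical sketch: fuse a query representation with per-token
    form features (which could carry layout information) and score each
    token as the start/end of the value span for that query."""

    def __init__(self, dim: int = 256):
        super().__init__()
        self.query_proj = nn.Linear(dim, dim)
        self.span_head = nn.Linear(dim, 2)  # start / end logits per token

    def forward(self, token_feats, query_feat):   # (T, dim), (dim,)
        fused = token_feats * torch.tanh(self.query_proj(query_feat))
        start, end = self.span_head(fused).unbind(-1)
        return start.argmax().item(), end.argmax().item()

scorer = QueryValueScorer()
print(scorer(torch.randn(50, 256), torch.randn(256)))  # e.g. (17, 42)
```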
1 code implementation • 18 Nov 2021 • Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, Ran Xu, Wenhao Liu, Caiming Xiong
To enlarge the set of base classes, we propose a method to automatically generate pseudo bounding-box annotations of diverse objects from large-scale image-caption pairs.
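A toy sketch of the general recipe for turning a localization map for a caption noun into one pseudo box; the thresholding heuristic below is an assumption, and the paper's exact localization procedure may differ:

```python
import numpy as np

def heatmap_to_box(heatmap: np.ndarray, threshold: float = 0.5):
    """Convert a per-pixel activation map for one caption noun into a
    single pseudo bounding box by thresholding and taking the tight
    extent of the activated region (a common heuristic)."""
    mask = heatmap >= threshold * heatmap.max()
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None  # the noun could not be localized in this image
    # (x_min, y_min, x_max, y_max) in pixel coordinates
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Usage: a fake 2D activation map peaked where "dog" is mentioned
heatmap = np.zeros((224, 224))
heatmap[60:120, 80:160] = 1.0
print(heatmap_to_box(heatmap))  # (80, 60, 159, 119)
```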
no code implementations • 11 Oct 2021 • Zahra Fatemi, Chen Xing, Wenhao Liu, Caiming Xiong
However, given the limited size of the gender-neutral data and its potential distributional mismatch with the original pre-training data, catastrophic forgetting would occur during the second-phase pre-training.
no code implementations • ICLR 2021 • Qiyu Wu, Chen Xing, Yatao Li, Guolin Ke, Di He, Tie-Yan Liu
In this paper, we focus on improving the efficiency of language pre-training methods through providing better data utilization.
no code implementations • 16 Dec 2020 • Chen Xing, Wenhao Liu, Caiming Xiong
According to recent studies and our empirical observations, one possible reason is that some easy-to-fit patterns in the training data, such as frequently co-occurring word combinations, dominate and harm pre-training, making it hard for the model to fit more complex information.
no code implementations • 4 Aug 2020 • Qiyu Wu, Chen Xing, Yatao Li, Guolin Ke, Di He, Tie-Yan Liu
In this paper, we focus on improving the efficiency of language pre-training methods through providing better data utilization.
6 code implementations • ICML 2020 • Ruibin Xiong, Yunchang Yang, Di He, Kai Zheng, Shuxin Zheng, Chen Xing, Huishuai Zhang, Yanyan Lan, Li-Wei Wang, Tie-Yan Liu
This motivates us to remove the warm-up stage for the training of Pre-LN Transformers.
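For reference, a minimal Pre-LN encoder block in PyTorch; the defining change is that LayerNorm is applied inside each residual branch, before the sublayer, rather than after the residual addition (dimensions are illustrative):

```python
import torch
import torch.nn as nn

class PreLNTransformerBlock(nn.Module):
    """Pre-LN encoder block: LayerNorm sits *inside* the residual
    branch, before attention and the feed-forward sublayer. This is
    the layout whose well-behaved gradients at initialization let
    training proceed without a learning-rate warm-up stage."""

    def __init__(self, d_model: int = 512, n_heads: int = 8, d_ff: int = 2048):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, x):
        h = self.ln1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # residual around attention
        x = x + self.ffn(self.ln2(x))                      # residual around FFN
        return x

x = torch.randn(2, 16, 512)              # (batch, sequence, model dim)
print(PreLNTransformerBlock()(x).shape)  # torch.Size([2, 16, 512])
```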
no code implementations • ICLR 2020 • Chen Xing, Sercan Arik, Zizhao Zhang, Tomas Pfister
To avoid inferring the distance for every test sample, we propose to train a confidence model jointly with the classification model.
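A hypothetical sketch of the joint setup; the paper's distance-based training signal is not reproduced here, and the architecture below only illustrates reading confidence out in the same forward pass as the prediction:

```python
import torch
import torch.nn as nn

class ClassifierWithConfidence(nn.Module):
    """Sketch: a classification head and a scalar confidence head share
    one backbone and are trained jointly, so confidence is available at
    test time without any per-sample distance computation."""

    def __init__(self, in_dim: int = 784, hidden: int = 256, n_classes: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.classifier = nn.Linear(hidden, n_classes)
        self.confidence = nn.Sequential(nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, x):
        h = self.backbone(x)
        return self.classifier(h), self.confidence(h)  # logits, confidence in (0, 1)

model = ClassifierWithConfidence()
logits, conf = model(torch.randn(4, 784))
print(logits.shape, conf.shape)  # torch.Size([4, 10]) torch.Size([4, 1])
```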
no code implementations • ICLR 2019 • Chen Xing, Devansh Arpit, Christos Tsirigotis, Yoshua Bengio
The non-convex loss landscape of deep neural networks (DNNs) gives rise to the intuition that, over the course of training, stochastic optimization algorithms explore different regions of the loss surface by entering and escaping many local minima due to the noise induced by mini-batches.
1 code implementation • NeurIPS 2019 • Chen Xing, Negar Rostamzadeh, Boris N. Oreshkin, Pedro O. Pinheiro
Through a series of experiments, we show that by this adaptive combination of the two modalities, our model outperforms current uni-modality few-shot learning methods and modality-alignment methods by a large margin on all benchmarks and few-shot scenarios tested.
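A minimal sketch of such an adaptive combination, assuming both modalities have been projected to a shared dimension; the gate network below is illustrative:

```python
import torch
import torch.nn as nn

class AdaptiveModalityMixer(nn.Module):
    """Sketch of an adaptive convex combination of a visual class
    prototype and a semantic (word-embedding) class representation:
    a learned gate predicts, per class, how much weight each modality
    receives."""

    def __init__(self, dim: int = 300):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, visual_proto, semantic_emb):
        lam = torch.sigmoid(self.gate(semantic_emb))  # mixing coefficient in (0, 1)
        return lam * visual_proto + (1.0 - lam) * semantic_emb

mixer = AdaptiveModalityMixer()
proto = mixer(torch.randn(5, 300), torch.randn(5, 300))  # 5-way episode
print(proto.shape)  # torch.Size([5, 300])
```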
no code implementations • 3 Oct 2018 • Andrey Ignatov, Radu Timofte, Thang Van Vu, Tung Minh Luu, Trung X. Pham, Cao Van Nguyen, Yongwoo Kim, Jae-Seok Choi, Munchurl Kim, Jie Huang, Jiewen Ran, Chen Xing, Xingguang Zhou, Pengfei Zhu, Mingrui Geng, Yawei Li, Eirikur Agustsson, Shuhang Gu, Luc van Gool, Etienne de Stoutz, Nikolay Kobyshev, Kehui Nie, Yan Zhao, Gen Li, Tong Tong, Qinquan Gao, Liu Hanwen, Pablo Navarrete Michelini, Zhu Dan, Hu Fengshuo, Zheng Hui, Xiumei Wang, Lirui Deng, Rang Meng, Jinghui Qin, Yukai Shi, Wushao Wen, Liang Lin, Ruicheng Feng, Shixiang Wu, Chao Dong, Yu Qiao, Subeesh Vasu, Nimisha Thekke Madam, Praveen Kandula, A. N. Rajagopalan, Jie Liu, Cheolkon Jung
This paper reviews the first challenge on efficient perceptual image enhancement with the focus on deploying deep learning models on smartphones.
no code implementations • 24 Feb 2018 • Chen Xing, Devansh Arpit, Christos Tsirigotis, Yoshua Bengio
Based on this and other metrics, we deduce that for most of the training update steps, SGD moves in valley-like regions of the loss surface by jumping from one valley wall to another at a height above the valley floor.
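A small diagnostic in that spirit, sketched under the assumption that flattened parameter snapshots are logged at each step (the paper's exact metrics may differ): the cosine similarity between consecutive update directions, where strongly negative values are consistent with bouncing back and forth across a valley.

```python
import torch

def update_cosines(param_trajectory):
    """Given a list of flattened parameter vectors logged step by step,
    return the cosine similarity between each pair of consecutive SGD
    update directions."""
    updates = [b - a for a, b in zip(param_trajectory, param_trajectory[1:])]
    return [torch.cosine_similarity(u, v, dim=0).item()
            for u, v in zip(updates, updates[1:])]

traj = [torch.randn(100) for _ in range(5)]  # e.g. snapshots of model params
print(update_cosines(traj))
```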
no code implementations • CL 2019 • Yu Wu, Wei Wu, Chen Xing, Can Xu, Zhoujun Li, Ming Zhou
The task requires matching a response candidate with a conversation context, whose challenges include how to recognize important parts of the context, and how to model the relationships among utterances in the context.
1 code implementation • 25 Jan 2017 • Chen Xing, Wei Wu, Yu Wu, Ming Zhou, YaLou Huang, Wei-Ying Ma
With word-level attention, the hidden vectors of a word-level encoder are synthesized into utterance vectors and fed to an utterance-level encoder to construct hidden representations of the context.
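A minimal sketch of that hierarchy in PyTorch; sizes are illustrative, and the scoring below is the simplest possible (the paper's attention is richer, e.g. conditioned on the decoder state):

```python
import torch
import torch.nn as nn

class HierarchicalContextEncoder(nn.Module):
    """A word-level GRU encodes each utterance, attention pools its
    hidden states into one utterance vector, and an utterance-level GRU
    runs over those vectors to represent the whole context."""

    def __init__(self, vocab: int = 10000, emb: int = 128, hid: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.word_rnn = nn.GRU(emb, hid, batch_first=True)
        self.attn = nn.Linear(hid, 1)  # unconditioned scoring, for brevity
        self.utt_rnn = nn.GRU(hid, hid, batch_first=True)

    def forward(self, context):                      # (n_utterances, n_words) ids
        h, _ = self.word_rnn(self.embed(context))    # (U, W, hid)
        w = torch.softmax(self.attn(h), dim=1)       # word-level attention weights
        utt_vecs = (w * h).sum(dim=1)                # one vector per utterance
        out, _ = self.utt_rnn(utt_vecs.unsqueeze(0)) # utterance-level encoding
        return out.squeeze(0)                        # (U, hid)

ctx = torch.randint(0, 10000, (3, 12))  # 3 utterances, 12 tokens each
print(HierarchicalContextEncoder()(ctx).shape)  # torch.Size([3, 256])
```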
3 code implementations • ACL 2017 • Yu Wu, Wei Wu, Chen Xing, Ming Zhou, Zhoujun Li
Existing work either concatenates utterances in the context or eventually matches a response with a single highly abstract context vector, which may lose the relationships among utterances or important contextual information.
Ranked #7 on Conversational Response Selection on RRS
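A heavily simplified sketch of matching a response against each utterance and accumulating the results in chronological order; the paper's convolutions over word-word similarity matrices are reduced here to a single bilinear match per utterance:

```python
import torch
import torch.nn as nn

class SequentialMatcher(nn.Module):
    """The response is matched against every utterance in the context,
    and a GRU accumulates the per-utterance matching vectors in order,
    preserving the relationships among utterances."""

    def __init__(self, dim: int = 128):
        super().__init__()
        self.bilinear = nn.Bilinear(dim, dim, dim)  # per-utterance match vector
        self.accum = nn.GRU(dim, dim, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, utt_vecs, resp_vec):                # (U, dim), (dim,)
        resp = resp_vec.expand_as(utt_vecs)
        match = torch.tanh(self.bilinear(utt_vecs, resp))  # (U, dim)
        _, last = self.accum(match.unsqueeze(0))           # final hidden state
        return self.score(last.squeeze())                  # matching score

m = SequentialMatcher()
print(m(torch.randn(4, 128), torch.randn(128)).shape)  # torch.Size([1])
```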
no code implementations • COLING 2016 • Chaozhuo Li, Yu Wu, Wei Wu, Chen Xing, Zhoujun Li, Ming Zhou
While automatic response generation for building chatbot systems has drawn a lot of attention recently, there is limited understanding of when we need to consider the linguistic context of an input text in the generation process.
1 code implementation • 21 Jun 2016 • Chen Xing, Wei Wu, Yu Wu, Jie Liu, YaLou Huang, Ming Zhou, Wei-Ying Ma
We consider incorporating topic information into the sequence-to-sequence framework to generate informative and interesting responses for chatbots.
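A hedged sketch of one decoder step under this idea, assuming topic words come from a pre-trained topic model and that everything lives in one shared hidden size; the projections and shapes are illustrative, not the paper's exact parameterization:

```python
import torch
import torch.nn as nn

def joint_attention_context(dec_state, msg_states, topic_embs,
                            msg_proj, topic_proj):
    """One decoder step's joint attention: attend over the message's
    hidden states and over topic-word embeddings (e.g. from an
    LDA-style topic model), then concatenate the two context vectors."""
    msg_scores = (msg_proj(msg_states) * dec_state).sum(-1)      # (T_msg,)
    topic_scores = (topic_proj(topic_embs) * dec_state).sum(-1)  # (K,)
    msg_ctx = torch.softmax(msg_scores, 0) @ msg_states
    topic_ctx = torch.softmax(topic_scores, 0) @ topic_embs
    return torch.cat([msg_ctx, topic_ctx], dim=-1)  # fed to the decoder GRU

hid = 256
msg_proj, topic_proj = nn.Linear(hid, hid), nn.Linear(hid, hid)
ctx = joint_attention_context(torch.randn(hid), torch.randn(10, hid),
                              torch.randn(20, hid), msg_proj, topic_proj)
print(ctx.shape)  # torch.Size([512])
```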