no code implementations • EMNLP 2020 • Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming Zhou
In this paper, we introduce XGLUE, a new benchmark dataset to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora, and evaluate their performance across a diverse set of cross-lingual tasks.
no code implementations • NAACL (ACL) 2022 • Haonan Li, Yameng Huang, Yeyun Gong, Jian Jiao, Ruofei Zhang, Timothy Baldwin, Nan Duan
Pre-trained language models (PLMs) have dramatically improved performance for many natural language processing (NLP) tasks in domains such as finance and healthcare.
1 code implementation • Findings (NAACL) 2022 • Wanjun Zhong, Siyuan Wang, Duyu Tang, Zenan Xu, Daya Guo, Yining Chen, Jiahai Wang, Jian Yin, Ming Zhou, Nan Duan
In this paper, we study the challenge of analytical reasoning of text and collect a new dataset consisting of questions from the Law School Admission Test from 1991 to 2016.
no code implementations • Findings (EMNLP) 2021 • Haonan Li, Yeyun Gong, Jian Jiao, Ruofei Zhang, Timothy Baldwin, Nan Duan
Pre-trained language models have led to substantial gains over a broad range of natural language processing (NLP) tasks, but have been shown to have limitations for natural language generation tasks with high-quality requirements on the output, such as commonsense generation and ad keyword generation.
1 code implementation • 14 Mar 2025 • Haoyang Huang, Guoqing Ma, Nan Duan, Xing Chen, Changyi Wan, Ranchen Ming, Tianyu Wang, Bo wang, Zhiying Lu, Aojie Li, Xianfang Zeng, Xinhao Zhang, Gang Yu, Yuhe Yin, Qiling Wu, Wen Sun, Kang An, Xin Han, Deshan Sun, Wei Ji, Bizhu Huang, Brian Li, Chenfei Wu, Guanzhe Huang, Huixin Xiong, Jiaxin He, Jianchang Wu, Jianlong Yuan, Jie Wu, Jiashuai Liu, Junjing Guo, Kaijun Tan, Liangyu Chen, Qiaohui Chen, Ran Sun, Shanshan Yuan, Shengming Yin, Sitong Liu, Wei Chen, Yaqi Dai, Yuchu Luo, Zheng Ge, Zhisheng Guan, Xiaoniu Song, Yu Zhou, Binxing Jiao, Jiansheng Chen, Jing Li, Shuchang Zhou, Xiangyu Zhang, Yi Xiu, Yibo Zhu, Heung-Yeung Shum, Daxin Jiang
We present Step-Video-TI2V, a state-of-the-art text-driven image-to-video generation model with 30B parameters, capable of generating videos up to 102 frames based on both text and image inputs.
3 code implementations • 14 Feb 2025 • Guoqing Ma, Haoyang Huang, Kun Yan, Liangyu Chen, Nan Duan, Shengming Yin, Changyi Wan, Ranchen Ming, Xiaoniu Song, Xing Chen, Yu Zhou, Deshan Sun, Deyu Zhou, Jian Zhou, Kaijun Tan, Kang An, Mei Chen, Wei Ji, Qiling Wu, Wen Sun, Xin Han, Yanan Wei, Zheng Ge, Aojie Li, Bin Wang, Bizhu Huang, Bo wang, Brian Li, Changxing Miao, Chen Xu, Chenfei Wu, Chenguang Yu, Dapeng Shi, Dingyuan Hu, Enle Liu, Gang Yu, Ge Yang, Guanzhe Huang, Gulin Yan, Haiyang Feng, Hao Nie, Haonan Jia, Hanpeng Hu, Hanqi Chen, Haolong Yan, Heng Wang, Hongcheng Guo, Huilin Xiong, Huixin Xiong, Jiahao Gong, Jianchang Wu, Jiaoren Wu, Jie Wu, Jie Yang, Jiashuai Liu, Jiashuo Li, Jingyang Zhang, Junjing Guo, Junzhe Lin, Kaixiang Li, Lei Liu, Lei Xia, Liang Zhao, Liguo Tan, Liwen Huang, Liying Shi, Ming Li, Mingliang Li, Muhua Cheng, Na Wang, Qiaohui Chen, Qinglin He, Qiuyan Liang, Quan Sun, Ran Sun, Rui Wang, Shaoliang Pang, Shiliang Yang, Sitong Liu, SiQi Liu, Shuli Gao, Tiancheng Cao, Tianyu Wang, Weipeng Ming, Wenqing He, Xu Zhao, Xuelin Zhang, Xianfang Zeng, Xiaojia Liu, Xuan Yang, Yaqi Dai, Yanbo Yu, Yang Li, Yineng Deng, Yingming Wang, Yilei Wang, Yuanwei Lu, Yu Chen, Yu Luo, Yuchu Luo, Yuhe Yin, Yuheng Feng, Yuxiang Yang, Zecheng Tang, Zekai Zhang, Zidong Yang, Binxing Jiao, Jiansheng Chen, Jing Li, Shuchang Zhou, Xiangyu Zhang, Xinhao Zhang, Yibo Zhu, Heung-Yeung Shum, Daxin Jiang
We present Step-Video-T2V, a state-of-the-art text-to-video pre-trained model with 30B parameters and the ability to generate videos up to 204 frames in length.
no code implementations • 11 Feb 2025 • Xin Tan, Yuetao Chen, Yimin Jiang, Xing Chen, Kun Yan, Nan Duan, Yibo Zhu, Daxin Jiang, Hong Xu
Diffusion Transformers (DiTs) have shown remarkable performance in generating high-quality videos.
no code implementations • 21 Jan 2025 • Deyu Zhou, Quan Sun, Yuang Peng, Kun Yan, Runpei Dong, Duomin Wang, Zheng Ge, Nan Duan, Xiangyu Zhang, Lionel M. Ni, Heung-Yeung Shum
We introduce MAGI, a hybrid video generation framework that combines masked modeling for intra-frame generation with causal modeling for next-frame generation.
1 code implementation • 21 Oct 2024 • Shaonan Wu, Shuai Lu, Yeyun Gong, Nan Duan, Ping Wei
Experimental results demonstrate the effectiveness of our approach, achieving a 5% absolute performance improvement on the LeanDojo benchmark.
no code implementations • 21 Oct 2024 • Tianyu Chen, Shuai Lu, Shan Lu, Yeyun Gong, Chenyuan Yang, Xuheng Li, Md Rakib Hossain Misu, Hao Yu, Nan Duan, Peng Cheng, Fan Yang, Shuvendu K Lahiri, Tao Xie, Lidong Zhou
SAFE also re-purposes the large number of synthesized incorrect proofs to train the self-debugging capability of the fine-tuned models, empowering them to fix incorrect proofs based on the verifier's feedback.
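The generate-verify-repair loop described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the SAFE implementation: `generate` stands in for the fine-tuned model and `verify` for a proof checker that returns a pass/fail flag plus an error message.

```python
# Hypothetical sketch of a verify-and-repair loop in the spirit of SAFE.
def prove_with_self_debugging(spec, generate, verify, max_rounds=3):
    proof = generate(f"Prove the following specification:\n{spec}")
    for _ in range(max_rounds):
        ok, error = verify(spec, proof)
        if ok:
            return proof
        # Feed the verifier's feedback back to the model to repair the proof.
        proof = generate(
            f"Specification:\n{spec}\n"
            f"Incorrect proof:\n{proof}\n"
            f"Verifier error:\n{error}\n"
            "Fix the proof."
        )
    return None  # no verified proof within the repair budget
```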
1 code implementation • 20 Sep 2024 • JunJie Huang, Daya Guo, Chenglong Wang, Jiazhen Gu, Shuai Lu, Jeevana Priya Inala, Cong Yan, Jianfeng Gao, Nan Duan, Michael R. Lyu
With CoCoMine, we construct CoCoNote, a dataset containing 58,221 examples for Contextualized Data-wrangling Code generation in Notebooks.
no code implementations • 21 Aug 2024 • Minheng Ni, Chenfei Wu, Huaying Yuan, Zhengyuan Yang, Ming Gong, Lijuan Wang, Zicheng Liu, WangMeng Zuo, Nan Duan
With the advancement of generative models, the synthesis of different sensory elements such as music, visuals, and speech has achieved significant realism.
3 code implementations • 11 Apr 2024 • Zhenghao Lin, Zhibin Gou, Yeyun Gong, Xiao Liu, Yelong Shen, Ruochen Xu, Chen Lin, Yujiu Yang, Jian Jiao, Nan Duan, Weizhu Chen
Unlike traditional LMs that learn to predict every next token in a corpus, Rho-1 employs Selective Language Modeling (SLM), which selectively trains on useful tokens that align with the desired distribution.
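A minimal PyTorch sketch of the selective-loss idea: score each token by how much the training model's loss exceeds a reference model's loss, and back-propagate only through the highest-scoring fraction. The scoring rule and `keep_ratio` here are illustrative; the paper's exact recipe may differ.

```python
import torch
import torch.nn.functional as F

def selective_lm_loss(logits, ref_logits, targets, keep_ratio=0.6):
    # Per-token cross-entropy for the training model and a frozen reference model.
    ce = F.cross_entropy(logits.flatten(0, 1), targets.flatten(), reduction="none")
    ref_ce = F.cross_entropy(ref_logits.flatten(0, 1), targets.flatten(), reduction="none")
    excess = ce - ref_ce                # high = useful token the model has not yet learned
    k = max(1, int(keep_ratio * excess.numel()))
    _, idx = torch.topk(excess, k)      # keep only the most informative tokens
    return ce[idx].mean()               # train on the selected tokens only
```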
1 code implementation • 3 Apr 2024 • Gabriela Ben Melech Stan, Estelle Aflalo, Raanan Yehezkel Rohekar, Anahita Bhiwandiwalla, Shao-Yen Tseng, Matthew Lyle Olson, Yaniv Gurwicz, Chenfei Wu, Nan Duan, Vasudev Lal
In this work, we present a novel interactive application aimed towards understanding the internal mechanisms of large vision-language models.
no code implementations • 1 Apr 2024 • Xinzhe Ni, Yeyun Gong, Zhibin Gou, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen
Additionally, we showcase the use of QaDS in creating efficient fine-tuning mixtures with various selection ratios, and analyze the quality of a wide range of open-source datasets, which can perform as a reference for future works on mathematical reasoning tasks.
1 code implementation • 6 Mar 2024 • Zekai Zhang, Yiduo Guo, Yaobo Liang, Dongyan Zhao, Nan Duan
The growing dependence on Large Language Models (LLMs) for finishing user instructions necessitates a comprehensive understanding of their robustness to complex task completion in real-world situations.
no code implementations • 4 Mar 2024 • Yiming Huang, Xiao Liu, Yeyun Gong, Zhibin Gou, Yelong Shen, Nan Duan, Weizhu Chen
Large language models (LLMs) have shown great potential in complex reasoning tasks, yet their performance is often hampered by the scarcity of high-quality and reasoning-focused training datasets.
Ranked #51 on Math Word Problem Solving on MATH
no code implementations • 16 Feb 2024 • Jun Cen, Chenfei Wu, Xiao Liu, Shengming Yin, Yixuan Pei, Jinglong Yang, Qifeng Chen, Nan Duan, JianGuo Zhang
Large Language Models (LLMs) and Large Multi-modality Models (LMMs) have demonstrated remarkable decision-making capabilities on a variety of tasks.
no code implementations • 30 Jan 2024 • Zecheng Tang, Chenfei Wu, Zekai Zhang, Mingheng Ni, Shengming Yin, Yu Liu, Zhengyuan Yang, Lijuan Wang, Zicheng Liu, Juntao Li, Nan Duan
To leverage LLMs for visual synthesis, traditional methods convert raster image information into discrete grid tokens through specialized visual modules, which disrupts the model's ability to capture the true semantic representation of visual scenes.
no code implementations • 22 Dec 2023 • Kun Yan, Lei Ji, Zeyu Wang, Yuntao Wang, Nan Duan, Shuai Ma
In this paper, we introduce gaze information, feasibly collected by AR or VR devices, as a proxy for human attention to guide VLMs and propose a novel approach, Voila-A, for gaze alignment to enhance the interpretability and effectiveness of these models in real-world applications.
no code implementations • 4 Dec 2023 • Yiming Huang, Zhenghao Lin, Xiao Liu, Yeyun Gong, Shuai Lu, Fangyu Lei, Yaobo Liang, Yelong Shen, Chen Lin, Nan Duan, Weizhu Chen
Large language models (LLMs) have demonstrated impressive reasoning capabilities, yet there is ongoing debate about these abilities and, more recently, about the potential data contamination problem.
1 code implementation • 3 Nov 2023 • Yiduo Guo, Zekai Zhang, Yaobo Liang, Dongyan Zhao, Nan Duan
Recent evaluations of Large Language Models (LLMs) have centered around testing their zero-shot/few-shot capabilities for basic natural language tasks and their ability to translate instructions into tool APIs.
no code implementations • 12 Oct 2023 • Wang You, Wenshan Wu, Yaobo Liang, Shaoguang Mao, Chenfei Wu, Maosong Cao, Yuzhe Cai, Yiduo Guo, Yan Xia, Furu Wei, Nan Duan
In this paper, we propose a new framework called Evaluation-guided Iterative Plan Extraction for long-form narrative text generation (EIPE-text), which extracts plans from the corpus of narratives and utilizes the extracted plans to construct a better planner.
1 code implementation • 29 Sep 2023 • Baizhou Huang, Shuai Lu, Weizhu Chen, Xiaojun Wan, Nan Duan
We propose the Multi-Perspective Self-Consistency (MPSC) framework incorporating both inter- and intra-consistency across outputs from multiple perspectives.
1 code implementation • 29 Sep 2023 • Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen
Large language models have made significant progress in various language tasks, yet they still struggle with complex mathematics.
Ranked #27 on Math Word Problem Solving on MATH (using extra training data)
1 code implementation • 18 Sep 2023 • Zecheng Tang, Chenfei Wu, Juntao Li, Nan Duan
Graphic layout generation, a growing research field, plays a significant role in user engagement and information perception.
1 code implementation • 26 Aug 2023 • Minheng Ni, Chenfei Wu, Xiaodong Wang, Shengming Yin, Lijuan Wang, Zicheng Liu, Nan Duan
In this work, we formalize a new task, Open-vocabulary Responsible Visual Synthesis (ORES), where the synthesis model is able to avoid forbidden visual concepts while allowing users to input any desired content.
1 code implementation • 19 Aug 2023 • Dan Qiao, Chenfei Wu, Yaobo Liang, Juntao Li, Nan Duan
In this paper, we propose GameEval, a novel approach to evaluating LLMs through goal-driven conversational games, overcoming the limitations of previous methods.
1 code implementation • 16 Aug 2023 • Shengming Yin, Chenfei Wu, Jian Liang, Jie Shi, Houqiang Li, Gong Ming, Nan Duan
Our experiments validate the effectiveness of DragNUWA, demonstrating its superior performance in fine-grained control in video generation.
1 code implementation • 27 Jun 2023 • Zhijian Hou, Lei Ji, Difei Gao, Wanjun Zhong, Kun Yan, Chao Li, Wing-Kwong Chan, Chong-Wah Ngo, Nan Duan, Mike Zheng Shou
Motivated by this, we leverage a two-stage pre-training strategy to train egocentric feature extractors and the grounding model on video narrations, and further fine-tune the model on annotated data.
1 code implementation • 27 Jun 2023 • Ryo Sekizawa, Nan Duan, Shuai Lu, Hitomi Yanaka
Code search is the task of finding program code that semantically matches a given natural language query.
1 code implementation • 26 Jun 2023 • Daya Guo, Canwen Xu, Nan Duan, Jian Yin, Julian McAuley
In this paper, we introduce a new task for code completion that focuses on handling long code input and propose a sparse Transformer model, called LongCoder, to address this task.
1 code implementation • 15 Jun 2023 • Haonan Li, Yixuan Zhang, Fajri Koto, Yifei Yang, Hai Zhao, Yeyun Gong, Nan Duan, Timothy Baldwin
As the capabilities of large language models (LLMs) continue to advance, evaluating their performance becomes increasingly crucial and challenging.
1 code implementation • 31 May 2023 • Xiao Xu, Bei Li, Chenfei Wu, Shao-Yen Tseng, Anahita Bhiwandiwalla, Shachar Rosenman, Vasudev Lal, Wanxiang Che, Nan Duan
With only 4M VLP data, ManagerTower achieves superior performance on various downstream VL tasks, especially 79.15% accuracy on VQAv2 Test-Std, 86.56% IR@1 and 95.64% TR@1 on Flickr30K.
1 code implementation • 24 May 2023 • Hao Sun, Xiao Liu, Yeyun Gong, Yan Zhang, Daxin Jiang, Linjun Yang, Nan Duan
With the advance of large language models (LLMs), research on LLM applications has become increasingly popular, and the idea of constructing pipelines that accomplish complex tasks by stacking LLM API calls has become practical.
1 code implementation • 24 May 2023 • Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen
In this paper, we show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner.
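A hedged sketch of the Iter-RetGen loop: each round retrieves with a query enriched by the previous generation, then regenerates with the new evidence. `retrieve` and `generate` are hypothetical stand-ins for a dense retriever and an LLM call.

```python
def iter_retgen(question, retrieve, generate, iterations=2):
    answer = ""
    for _ in range(iterations):
        # The previous generation enriches the retrieval query.
        docs = retrieve((question + " " + answer).strip())
        context = "\n".join(docs)
        answer = generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
    return answer
```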
no code implementations • 23 May 2023 • Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
Furthermore, to better align the query to the frozen modules, we propose a trainable scheme for our pipeline.
1 code implementation • 22 May 2023 • Yaobo Liang, Quanzhi Zhu, Junhe Zhao, Nan Duan
There are two primary approaches to addressing cross-lingual transfer: multilingual pre-training, which implicitly aligns the hidden representations of various languages, and translate-test, which explicitly translates different languages into an intermediate language, such as English.
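The contrast between the two strategies can be made concrete with a toy sketch; `translate_to_english`, `english_model`, and `multilingual_model` are hypothetical stand-ins.

```python
def translate_test(text, translate_to_english, english_model):
    # Explicit alignment: map every input language into English at inference time.
    return english_model(translate_to_english(text))

def multilingual_transfer(text, multilingual_model):
    # Implicit alignment: a single encoder whose pre-training aligned the
    # hidden representations of many languages.
    return multilingual_model(text)
```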
1 code implementation • 19 May 2023 • Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen
Unlike these models, humans typically utilize external tools to cross-check and refine their initial content, like using a search engine for fact-checking, or a code interpreter for debugging.
2 code implementations • NeurIPS 2023 • Tong Wu, Zhihao Fan, Xiao Liu, Yeyun Gong, Yelong Shen, Jian Jiao, Hai-Tao Zheng, Juntao Li, Zhongyu Wei, Jian Guo, Nan Duan, Weizhu Chen
Diffusion models have gained significant attention in the realm of image generation due to their exceptional performance.
1 code implementation • 11 May 2023 • Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
Based on the remarkable achievements of pre-trained language models in abstractive summarization, the copying mechanism has proved helpful by improving the factuality, stability, and overall performance.
1 code implementation • 8 May 2023 • Chenxiao Liu, Shuai Lu, Weizhu Chen, Daxin Jiang, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan, Nan Duan
Code execution is a fundamental aspect of programming language semantics that reflects the exact behavior of the code.
1 code implementation • 23 Apr 2023 • Jiashuo Sun, Yi Luo, Yeyun Gong, Chen Lin, Yelong Shen, Jian Guo, Nan Duan
By utilizing iterative bootstrapping, our approach enables LLMs to autonomously rectify errors, resulting in more precise and comprehensive reasoning chains.
1 code implementation • 20 Apr 2023 • Yiduo Guo, Yaobo Liang, Chenfei Wu, Wenshan Wu, Dongyan Zhao, Nan Duan
To obtain it, we propose the Learning to Plan method, which involves two phases: (1) In the first learning task plan phase, it iteratively updates the task plan with new step-by-step solutions and behavioral instructions, which are obtained by prompting LLMs to derive from training error feedback.
2 code implementations • 17 Apr 2023 • Yuzhe Cai, Shaoguang Mao, Wenshan Wu, Zehua Wang, Yaobo Liang, Tao Ge, Chenfei Wu, Wang You, Ting Song, Yan Xia, Jonathan Tien, Nan Duan, Furu Wei
By introducing this framework, we aim to bridge the gap between humans and LLMs, enabling more effective and efficient utilization of LLMs for complex tasks.
3 code implementations • 13 Apr 2023 • Wanjun Zhong, Ruixiang Cui, Yiduo Guo, Yaobo Liang, Shuai Lu, Yanlin Wang, Amin Saied, Weizhu Chen, Nan Duan
Impressively, GPT-4 surpasses average human performance on SAT, LSAT, and math competitions, attaining a 95% accuracy rate on the SAT Math test and 92.5% accuracy on the English test of the Chinese national college entrance exam.
5 code implementations • 3 Apr 2023 • Canwen Xu, Daya Guo, Nan Duan, Julian McAuley
Furthermore, we propose a new technique called Self-Distill with Feedback, to further improve the performance of the Baize models with feedback from ChatGPT.
2 code implementations • 29 Mar 2023 • Xingwei He, Zhenghao Lin, Yeyun Gong, A-Long Jin, Hang Zhang, Chen Lin, Jian Jiao, Siu Ming Yiu, Nan Duan, Weizhu Chen
Many natural language processing (NLP) tasks rely on labeled data to train machine learning models with high performance.
no code implementations • 29 Mar 2023 • Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan
On the other hand, there are also many existing models and systems (symbolic-based or neural-based) that can do some domain-specific tasks very well.
no code implementations • 22 Mar 2023 • Shengming Yin, Chenfei Wu, Huan Yang, JianFeng Wang, Xiaodong Wang, Minheng Ni, Zhengyuan Yang, Linjie Li, Shuguang Liu, Fan Yang, Jianlong Fu, Gong Ming, Lijuan Wang, Zicheng Liu, Houqiang Li, Nan Duan
In this paper, we propose NUWA-XL, a novel Diffusion over Diffusion architecture for eXtremely Long video generation.
3 code implementations • 8 Mar 2023 • Chenfei Wu, Shengming Yin, Weizhen Qi, Xiaodong Wang, Zecheng Tang, Nan Duan
To this end, we build a system called Visual ChatGPT, incorporating different Visual Foundation Models, to enable the user to interact with ChatGPT by 1) sending and receiving not only language but also images, and 2) providing complex visual questions or visual editing instructions that require the collaboration of multiple AI models over multiple steps.
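The core mechanism is a prompt manager that lets the chat model decide which Visual Foundation Model to invoke. A deliberately simplified sketch, with `chat` and the tool functions as hypothetical stand-ins rather than the actual Visual ChatGPT code:

```python
# Hypothetical tool registry; a real system wires in captioning, editing,
# detection, and other Visual Foundation Models here.
TOOLS = {
    "caption_image": lambda image_path: "a caption",   # e.g. a captioning model
    "edit_image":    lambda image_path: "edited.png",  # e.g. a diffusion editor
}

def visual_chat(user_msg, image_path, chat):
    decision = chat(
        f"User request: {user_msg}\nAvailable tools: {', '.join(TOOLS)}\n"
        "Reply with one tool name, or 'none'."
    ).strip()
    if decision in TOOLS:
        result = TOOLS[decision](image_path)
        return chat(f"Tool {decision} returned: {result}. Now answer the user.")
    return chat(user_msg)
```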
no code implementations • 21 Feb 2023 • Xiaodong Wang, Chenfei Wu, Shengming Yin, Minheng Ni, JianFeng Wang, Linjie Li, Zhengyuan Yang, Fan Yang, Lijuan Wang, Zicheng Liu, Yuejian Fang, Nan Duan
3D photography renders a static image into a video with appealing 3D visual effects.
Ranked #1 on Image Outpainting on MSCOCO
1 code implementation • 3 Feb 2023 • Shunyu Zhang, Yaobo Liang, Ming Gong, Daxin Jiang, Nan Duan
Specifically, we propose a multilingual PLM called masked sentence model (MSM), which consists of a sentence encoder to generate the sentence representations, and a document encoder applied to a sequence of sentence vectors from a document.
no code implementations • 1 Feb 2023 • Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen
However, the quality of the prompts depends on the demonstrations given to the models, and creating many of them by hand is costly.
1 code implementation • 22 Dec 2022 • Zhenghao Lin, Yeyun Gong, Yelong Shen, Tong Wu, Zhihao Fan, Chen Lin, Nan Duan, Weizhu Chen
In this paper, we introduce a novel dIffusion language modEl pre-training framework for text generation, which we call GENIE.
1 code implementation • 18 Dec 2022 • Xingwei He, Yeyun Gong, A-Long Jin, Hang Zhang, Anlei Dong, Jian Jiao, Siu Ming Yiu, Nan Duan
The dual-encoder has become the de facto architecture for dense retrieval.
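A minimal sketch of why the dual-encoder dominates dense retrieval: queries and documents are embedded independently, so document vectors can be pre-computed offline and ranked with one matrix multiply. The encoders themselves are assumed to exist elsewhere.

```python
import torch

def rank_documents(query_vec, doc_matrix, top_k=5):
    # query_vec: (d,) embedding from the query encoder.
    # doc_matrix: (N, d) pre-computed embeddings from the document encoder.
    scores = doc_matrix @ query_vec            # dot-product relevance scores
    return torch.topk(scores, top_k).indices   # indices of the best documents
```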
1 code implementation • 15 Dec 2022 • Kun Zhou, Xiao Liu, Yeyun Gong, Wayne Xin Zhao, Daxin Jiang, Nan Duan, Ji-Rong Wen
Pre-trained Transformers (e.g., BERT) have been commonly used in existing dense retrieval methods for parameter initialization, and recent studies are exploring more effective pre-training tasks for further improving the quality of dense vectors.
1 code implementation • 10 Dec 2022 • Hao Sun, Xiao Liu, Yeyun Gong, Anlei Dong, Jingwen Lu, Yan Zhang, Linjun Yang, Rangan Majumder, Nan Duan
Knowledge distillation is often used to transfer knowledge from a strong teacher model to a relatively weak student model.
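For context, a sketch of the standard soft-target distillation loss (Hinton-style); this is the generic technique, not necessarily the specific scheme of the paper above.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_student = F.log_softmax(student_logits / t, dim=-1)
    # KL divergence, scaled by t^2 to keep gradients comparable across temperatures.
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * (t * t)
```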
1 code implementation • 25 Nov 2022 • Haotian Cui, Chenglong Wang, JunJie Huang, Jeevana Priya Inala, Todd Mytkowicz, Bo wang, Jianfeng Gao, Nan Duan
Our experiments show that (1) our refined training dataset lets models achieve better performance in the explanation generation tasks compared to larger unrefined data (15x larger), and (2) fine-tuned models can generate well-structured long docstrings comparable to human-written ones.
no code implementations • CVPR 2023 • Zhengyuan Yang, JianFeng Wang, Zhe Gan, Linjie Li, Kevin Lin, Chenfei Wu, Nan Duan, Zicheng Liu, Ce Liu, Michael Zeng, Lijuan Wang
Human evaluation on PaintSkill shows that ReCo is +19.28% and +17.21% more accurate in generating images with correct object count and spatial relationship than the T2I model.
2 code implementations • 18 Nov 2022 • Biyang Guo, Yeyun Gong, Yelong Shen, Songqiao Han, Hailiang Huang, Nan Duan, Weizhu Chen
We introduce GENIUS: a conditional text generation model using sketches as input, which can fill in the missing contexts for a given sketch (key information consisting of textual spans, phrases, or words, concatenated by mask tokens).
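Sketch-to-text infilling of this kind can be approximated with T5's span-infilling interface; a rough, hedged analogy (this uses a stock T5 checkpoint, not the GENIUS model):

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Key spans joined by sentinel (mask) tokens; the model fills in the gaps.
sketch = "<extra_id_0> machine learning <extra_id_1> healthcare <extra_id_2>"
ids = tok(sketch, return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=40)
print(tok.decode(out[0], skip_special_tokens=False))
```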
1 code implementation • 17 Nov 2022 • JunJie Huang, Chenglong Wang, Jipeng Zhang, Cong Yan, Haotian Cui, Jeevana Priya Inala, Colin Clement, Nan Duan, Jianfeng Gao
Code generation models can benefit data scientists' productivity by automatically generating code from context and text descriptions.
no code implementations • 16 Nov 2022 • Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing-Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan
This technical report describes the CONE approach for Ego4D Natural Language Queries (NLQ) Challenge in ECCV 2022.
1 code implementation • 21 Oct 2022 • Kun Zhou, Yeyun Gong, Xiao Liu, Wayne Xin Zhao, Yelong Shen, Anlei Dong, Jingwen Lu, Rangan Majumder, Ji-Rong Wen, Nan Duan, Weizhu Chen
Thus, we propose a simple ambiguous negatives sampling method, SimANS, which incorporates a new sampling probability distribution to sample more ambiguous negatives.
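A sketch of the sampling rule: weight each negative by a density that peaks where its retrieval score is close to the positive's, so the negatives drawn are neither trivially easy nor likely false negatives. The Gaussian-shaped form and hyperparameters below are our reading of the idea; the paper's exact distribution may differ.

```python
import numpy as np

def sample_ambiguous_negatives(neg_scores, pos_score, n, a=1.0, b=0.0, rng=None):
    rng = rng or np.random.default_rng()
    # Peak near the positive's score: ambiguous negatives get the most mass.
    weights = np.exp(-a * (np.asarray(neg_scores) - pos_score - b) ** 2)
    probs = weights / weights.sum()
    return rng.choice(len(neg_scores), size=n, replace=False, p=probs)
```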
no code implementations • 21 Oct 2022 • Xingwei He, Yeyun Gong, A-Long Jin, Weizhen Qi, Hang Zhang, Jian Jiao, Bartuer Zhou, Biao Cheng, SM Yiu, Nan Duan
Commonsense generation aims to generate a realistic sentence describing a daily scene under the given concepts, which is very challenging, since it requires models to have relational reasoning and compositional generalization capabilities.
no code implementations • 20 Oct 2022 • Wanjun Zhong, Tingting Ma, Jiahai Wang, Jian Yin, Tiejun Zhao, Chin-Yew Lin, Nan Duan
This paper presents ReasonFormer, a unified reasoning framework for mirroring the modular and compositional reasoning process of humans in complex decision-making.
1 code implementation • 18 Oct 2022 • Xiaonan Li, Daya Guo, Yeyun Gong, Yun Lin, Yelong Shen, Xipeng Qiu, Daxin Jiang, Weizhu Chen, Nan Duan
In this paper, we present SCodeR, a Soft-labeled contrastive pre-training framework with two positive sample construction methods to learn functional-level Code Representation.
1 code implementation • 18 Oct 2022 • Shuai Fan, Chen Lin, Haonan Li, Zhenghao Lin, Jinsong Su, Hang Zhang, Yeyun Gong, Jian Guo, Nan Duan
Most existing pre-trained language representation models (PLMs) are sub-optimal in sentiment analysis tasks, as they capture sentiment information at the word level while under-considering sentence-level information.
1 code implementation • 11 Oct 2022 • JunJie Huang, Wanjun Zhong, Qian Liu, Ming Gong, Daxin Jiang, Nan Duan
However, training an effective dense table-text retriever is difficult due to the challenges of table-text discrepancy and data sparsity.
no code implementations • 10 Oct 2022 • Kun Yan, Lei Ji, Chenfei Wu, Jian Liang, Ming Zhou, Nan Duan, Shuai Ma
Panorama synthesis endeavors to craft captivating 360-degree visual landscapes, immersing users in the heart of virtual worlds.
no code implementations • Conference 2022 • Yongjie Zhu, Chunhui Han, Yuefeng Zhan, Bochen Pang, Zhaoju Li, Hao Sun, Si Li, Boxin Shi, Nan Duan, Ruofei Zhang, Liangjie Zhang, Weiwei Deng, Qi Zhang
Sponsored search advertisements (ads) appear next to search results when consumers look for products and services on search engines.
Ranked #3 on Image-text matching on CommercialAdsDataset
1 code implementation • 27 Sep 2022 • Zhenghao Lin, Yeyun Gong, Xiao Liu, Hang Zhang, Chen Lin, Anlei Dong, Jian Jiao, Jingwen Lu, Daxin Jiang, Rangan Majumder, Nan Duan
It is common for a better teacher model to produce a worse student via distillation, owing to the non-negligible gap between teacher and student.
1 code implementation • 22 Sep 2022 • Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing-Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan
This paper tackles an emerging and challenging problem of long video temporal grounding (VTG) that localizes video moments related to a natural language (NL) query.
no code implementations • 5 Aug 2022 • Wanjun Zhong, Yifan Gao, Ning Ding, Zhiyuan Liu, Ming Zhou, Jiahai Wang, Jian Yin, Nan Duan
Task generalization has been a long-standing challenge in Natural Language Processing (NLP).
1 code implementation • 20 Jul 2022 • Chenfei Wu, Jian Liang, Xiaowei Hu, Zhe Gan, JianFeng Wang, Lijuan Wang, Zicheng Liu, Yuejian Fang, Nan Duan
In this paper, we present NUWA-Infinity, a generative model for infinite visual synthesis, which is defined as the task of generating arbitrarily-sized high-resolution images or long-duration videos.
Ranked #1 on Image Outpainting on LHQC
2 code implementations • 28 Jun 2022 • Weizhou Shen, Yeyun Gong, Yelong Shen, Song Wang, Xiaojun Quan, Nan Duan, Weizhu Chen
Generate-then-rank is a widely used mechanism for text generation, where a generator produces multiple text candidates and a ranker chooses the best one among the text candidates.
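The mechanism in one line of control flow, with `generate_candidates` and `rank_score` as hypothetical stand-ins (e.g. a sampled seq2seq generator and a cross-encoder ranker):

```python
def generate_then_rank(prompt, generate_candidates, rank_score, k=8):
    candidates = generate_candidates(prompt, k)                   # sample k candidates
    return max(candidates, key=lambda c: rank_score(prompt, c))  # keep the best one
```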
2 code implementations • 17 Jun 2022 • Xiao Xu, Chenfei Wu, Shachar Rosenman, Vasudev Lal, Wanxiang Che, Nan Duan
Vision-Language (VL) models with the Two-Tower architecture have dominated visual-language representation learning in recent years.
1 code implementation • 7 Jun 2022 • Ning Wu, Yaobo Liang, Houxing Ren, Linjun Shou, Nan Duan, Ming Gong, Daxin Jiang
On the multilingual sentence retrieval task Tatoeba, our model achieves new SOTA results among methods without using bilingual data.
no code implementations • 1 Jun 2022 • Jie Shi, Chenfei Wu, Jian Liang, Xiang Liu, Nan Duan
Our work proposes a VQ-VAE architecture model with a diffusion decoder (DiVAE) to work as the reconstructing component in image synthesis.
no code implementations • 23 May 2022 • Weizhen Qi, Yeyun Gong, Yelong Shen, Jian Jiao, Yu Yan, Houqiang Li, Ruofei Zhang, Weizhu Chen, Nan Duan
To further illustrate the commercial value of our approach, we conduct experiments on three generation tasks in real-world advertisements applications.
1 code implementation • 18 May 2022 • Xinyu Pi, Wanjun Zhong, Yan Gao, Nan Duan, Jian-Guang Lou
We present LogiGAN, an unsupervised adversarial pre-training framework for improving logical reasoning abilities of language models.
1 code implementation • NAACL 2022 • Wanjun Zhong, Yifan Gao, Ning Ding, Yujia Qin, Zhiyuan Liu, Ming Zhou, Jiahai Wang, Jian Yin, Nan Duan
Furthermore, ProQA exhibits strong ability in both continual learning and transfer learning by taking the advantages of the structural prompt.
1 code implementation • ACL 2022 • Wei Chen, Yeyun Gong, Song Wang, Bolun Yao, Weizhen Qi, Zhongyu Wei, Xiaowu Hu, Bartuer Zhou, Yi Mao, Weizhu Chen, Biao Cheng, Nan Duan
Dialog response generation in open domain is an important research topic where the main challenge is to generate relevant and diverse responses.
1 code implementation • CVPR 2022 • Estelle Aflalo, Meng Du, Shao-Yen Tseng, Yongfei Liu, Chenfei Wu, Nan Duan, Vasudev Lal
Breakthroughs in transformer-based models have revolutionized not only the NLP field, but also vision and multimodal systems.
2 code implementations • 17 Mar 2022 • Zhiyu Li, Shuai Lu, Daya Guo, Nan Duan, Shailesh Jannu, Grant Jenks, Deep Majumder, Jared Green, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan
In this research, we focus on utilizing pre-training techniques for the tasks in the code review scenario.
no code implementations • ACL 2022 • Shunyu Zhang, Yaobo Liang, Ming Gong, Daxin Jiang, Nan Duan
Second, to prevent multi-view embeddings from collapsing to the same one, we further propose a global-local loss with annealed temperature to encourage the multiple viewers to better align with different potential queries.
no code implementations • ACL 2022 • Yuan Chai, Yaobo Liang, Nan Duan
Our main conclusion is that the contribution of constituent order and word co-occurrence is limited, while composition is more crucial to the success of cross-lingual transfer.
1 code implementation • ACL 2022 • Shuai Lu, Nan Duan, Hojae Han, Daya Guo, Seung-won Hwang, Alexey Svyatkovskiy
Code completion, which aims to predict the following code token(s) according to the code context, can improve the productivity of software development.
1 code implementation • Findings (ACL) 2022 • Canwen Xu, Daya Guo, Nan Duan, Julian McAuley
Experimental results show that LaPraDoR achieves state-of-the-art performance compared with supervised dense retrieval models, and further analysis reveals the effectiveness of our training strategy and objectives.
2 code implementations • ACL 2022 • Daya Guo, Shuai Lu, Nan Duan, Yanlin Wang, Ming Zhou, Jian Yin
Furthermore, we propose to utilize multi-modal contents to learn representation of code fragment with contrastive learning, and then align representations among programming languages using a cross-modal generation task.
no code implementations • 10 Feb 2022 • Minheng Ni, Chenfei Wu, Haoyang Huang, Daxin Jiang, WangMeng Zuo, Nan Duan
Language guided image inpainting aims to fill in the defective regions of an image under the guidance of text while keeping non-defective regions unchanged.
1 code implementation • 26 Jan 2022 • Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan
For bimodal contrastive learning, we leverage the documentation and in-line comments of code to build code-text pairs.
no code implementations • 15 Jan 2022 • Wanjun Zhong, JunJie Huang, Qian Liu, Ming Zhou, Jiahai Wang, Jian Yin, Nan Duan
CARP utilizes a hybrid chain to model the explicit intermediate reasoning process across table and text for question answering.
Ranked #2 on Question Answering on OTT-QA
no code implementations • NeurIPS 2021 • Weijiang Yu, Haoteng Zheng, Mengfei Li, Lei Ji, Lijun Wu, Nong Xiao, Nan Duan
To consider the interdependent knowledge between contextual clips into the network inference, we propose a Siamese Sampling and Reasoning (SiaSamRea) approach, which consists of a siamese sampling mechanism to generate sparse and similar clips (i.e., siamese clips) from the same video, and a novel reasoning strategy for integrating the interdependent knowledge between contextual clips into the network.
1 code implementation • 24 Nov 2021 • Chenfei Wu, Jian Liang, Lei Ji, Fan Yang, Yuejian Fang, Daxin Jiang, Nan Duan
To cover language, image, and video at the same time for different scenarios, a 3D transformer encoder-decoder framework is designed, which can not only deal with videos as 3D data but also adapt to texts and images as 1D and 2D data, respectively.
Ranked #1 on Text-to-Video Generation on Kinetics
1 code implementation • ICLR 2022 • Hang Zhang, Yeyun Gong, Yelong Shen, Jiancheng Lv, Nan Duan, Weizhu Chen
To address these challenges, we present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder retriever plus a cross-encoder ranker.
2 code implementations • 26 Sep 2021 • Xiaoze Jiang, Yaobo Liang, Weizhu Chen, Nan Duan
The results on MLQA and NER exhibit the superiority of XLM-K in knowledge related tasks.
1 code implementation • ACL 2022 • Wei Chen, Yeyun Gong, Can Xu, Huang Hu, Bolun Yao, Zhongyu Wei, Zhihao Fan, Xiaowu Hu, Bartuer Zhou, Biao Cheng, Daxin Jiang, Nan Duan
We study the problem of coarse-grained response selection in retrieval-based dialogue systems.
1 code implementation • Findings (NAACL) 2022 • Yongfei Liu, Chenfei Wu, Shao-Yen Tseng, Vasudev Lal, Xuming He, Nan Duan
Self-supervised vision-and-language pretraining (VLP) aims to learn transferable multi-modal representations from large-scale image-text data and to achieve strong performances on a broad scope of vision-language tasks after finetuning.
no code implementations • EMNLP 2021 • Colin B. Clement, Shuai Lu, Xiaoyu Liu, Michele Tufano, Dawn Drain, Nan Duan, Neel Sundaresan, Alexey Svyatkovskiy
While there are many efforts to extend the context window, we introduce an architecture-independent approach for leveraging the syntactic hierarchies of source code for incorporating entire file-level context into a fixed-length window.
no code implementations • 14 Sep 2021 • Haonan Li, Yeyun Gong, Jian Jiao, Ruofei Zhang, Timothy Baldwin, Nan Duan
Pre-trained language models have led to substantial gains over a broad range of natural language processing (NLP) tasks, but have been shown to have limitations for natural language generation tasks with high-quality requirements on the output, such as commonsense generation and ad keyword generation.
no code implementations • Findings (EMNLP) 2021 • Yimin Fan, Yaobo Liang, Alexandre Muzio, Hany Hassan, Houqiang Li, Ming Zhou, Nan Duan
Then we cluster all the target languages into multiple groups and name each group as a representation sprachbund.
1 code implementation • 5 Aug 2021 • Weijiang Yu, Jian Liang, Lei Ji, Lu Li, Yuejian Fang, Nong Xiao, Nan Duan
Firstly, we develop multi-commonsense learning for semantic-level reasoning by jointly training different commonsense types in a unified network, which encourages the interaction between the clues of multiple commonsense descriptions, event-wise captions and videos.
1 code implementation • 2 Aug 2021 • Siyuan Wang, Zhongkun Liu, Wanjun Zhong, Ming Zhou, Zhongyu Wei, Zhumin Chen, Nan Duan
Complex reasoning aims to draw a correct inference based on complex rules.
no code implementations • ACL 2021 • Kun Yan, Lei Ji, Huaishao Luo, Ming Zhou, Nan Duan, Shuai Ma
Moreover, the controllability and explainability of LoopCAG are validated by analyzing spatial and temporal sensitivity during the generation process.
Ranked #1 on Image Captioning on Localized Narratives
1 code implementation • ACL 2021 • Linmei Hu, Tianchi Yang, Luhao Zhang, Wanjun Zhong, Duyu Tang, Chuan Shi, Nan Duan, Ming Zhou
Specifically, we first construct a directed heterogeneous document graph for each news item, incorporating topics and entities.
no code implementations • ICLR 2022 • Daya Guo, Alexey Svyatkovskiy, Jian Yin, Nan Duan, Marc Brockschmidt, Miltiadis Allamanis
To evaluate models, we consider both ROUGE as well as a new metric RegexAcc that measures success of generating completions matching long outputs with as few holes as possible.
1 code implementation • Findings (ACL) 2021 • Lin Su, Nan Duan, Edward Cui, Lei Ji, Chenfei Wu, Huaishao Luo, Yongfei Liu, Ming Zhong, Taroon Bharti, Arun Sacheti
Compared with existing multimodal datasets such as MSCOCO and Flickr30K for image-language tasks, and YouCook2 and MSR-VTT for video-language tasks, GEM is not only the largest vision-language dataset covering image-language and video-language tasks at the same time, but is also labeled in multiple languages.
1 code implementation • ACL 2021 • Yu Yan, Fei Hu, Jiusheng Chen, Nikhil Bhendawade, Ting Ye, Yeyun Gong, Nan Duan, Desheng Cui, Bingyu Chi, Ruofei Zhang
Transformer-based models have made tremendous impacts in natural language generation.
1 code implementation • ACL 2021 • JunJie Huang, Duyu Tang, Linjun Shou, Ming Gong, Ke Xu, Daxin Jiang, Ming Zhou, Nan Duan
Finding code given a natural language query is beneficial to the productivity of software developers.
1 code implementation • 11 May 2021 • Yu Yan, Jiusheng Chen, Weizhen Qi, Nikhil Bhendawade, Yeyun Gong, Nan Duan, Ruofei Zhang
Transformer model with multi-head attention requires caching intermediate results for efficient inference in generation tasks.
no code implementations • 10 May 2021 • Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, Weizhu Chen
We first evaluate Poolingformer on two long sequence QA tasks: the monolingual NQ and the multilingual TyDi QA.
2 code implementations • Findings (ACL) 2022 • Siyuan Wang, Wanjun Zhong, Duyu Tang, Zhongyu Wei, Zhihao Fan, Daxin Jiang, Ming Zhou, Nan Duan
Logical reasoning of text requires understanding critical logical information in the text and performing inference over them.
Ranked #7 on Reading Comprehension on ReClor
1 code implementation • 30 Apr 2021 • Chenfei Wu, Lun Huang, Qianxi Zhang, Binyang Li, Lei Ji, Fan Yang, Guillermo Sapiro, Nan Duan
Generating videos from text is a challenging task due to its high computational requirements for training and infinite possible answers for evaluation.
Ranked #17 on Text-to-Video Generation on MSR-VTT (CLIPSIM metric)
5 code implementations • 18 Apr 2021 • Huaishao Luo, Lei Ji, Ming Zhong, Yang Chen, Wen Lei, Nan Duan, Tianrui Li
In this paper, we propose a CLIP4Clip model to transfer the knowledge of the CLIP model to video-language retrieval in an end-to-end manner.
Ranked #1 on Text to Video Retrieval on MSR-VTT
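A minimal sketch of the parameter-free ("meanP") similarity variant of CLIP4Clip: encode sampled frames with CLIP's image tower, mean-pool over time, and compare with the text embedding. `image_encoder` and `text_encoder` stand in for the CLIP towers.

```python
import torch.nn.functional as F

def video_text_score(frames, text, image_encoder, text_encoder):
    frame_emb = image_encoder(frames)      # (num_frames, d) frame embeddings
    video_emb = frame_emb.mean(dim=0)      # temporal mean pooling
    text_emb = text_encoder(text)          # (d,) text embedding
    return F.cosine_similarity(video_emb, text_emb, dim=0)
```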
3 code implementations • ACL 2021 • Weizhen Qi, Yeyun Gong, Yu Yan, Can Xu, Bolun Yao, Bartuer Zhou, Biao Cheng, Daxin Jiang, Jiusheng Chen, Ruofei Zhang, Houqiang Li, Nan Duan
ProphetNet is a pre-training based natural language generation method which shows powerful performance on English text summarization and question generation tasks.
1 code implementation • 14 Apr 2021 • Wanjun Zhong, Siyuan Wang, Duyu Tang, Zenan Xu, Daya Guo, Jiahai Wang, Jian Yin, Ming Zhou, Nan Duan
Analytical reasoning is an essential and challenging task that requires a system to analyze a scenario involving a set of particular circumstances and perform reasoning over it to make conclusions.
1 code implementation • Findings (EMNLP) 2021 • JunJie Huang, Duyu Tang, Wanjun Zhong, Shuai Lu, Linjun Shou, Ming Gong, Daxin Jiang, Nan Duan
In this work, we conduct a thorough examination of pretrained model based unsupervised sentence embeddings.
1 code implementation • NAACL 2021 • Zhihao Fan, Yeyun Gong, Dayiheng Liu, Zhongyu Wei, Siyuan Wang, Jian Jiao, Nan Duan, Ruofei Zhang, Xuanjing Huang
We therefore introduce a new layer named dynamic mask attention network (DMAN) with a learnable mask matrix which is able to model localness adaptively.
Ranked #11 on Machine Translation on WMT2014 English-German
5 code implementations • 9 Feb 2021 • Shuai Lu, Daya Guo, Shuo Ren, JunJie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Duyu Tang, Ge Li, Lidong Zhou, Linjun Shou, Long Zhou, Michele Tufano, Ming Gong, Ming Zhou, Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shujie Liu
Benchmark datasets have a significant impact on accelerating research in programming language tasks.
Ranked #1 on Cloze Test on CodeXGLUE - CT-maxmin
1 code implementation • 31 Dec 2020 • Weizhen Qi, Yeyun Gong, Jian Jiao, Yu Yan, Weizhu Chen, Dayiheng Liu, Kewen Tang, Houqiang Li, Jiusheng Chen, Ruofei Zhang, Ming Zhou, Nan Duan
In this paper, we propose BANG, a new pretraining model to Bridge the gap between Autoregressive (AR) and Non-autoregressive (NAR) Generation.
1 code implementation • ACL 2021 • Zenan Xu, Daya Guo, Duyu Tang, Qinliang Su, Linjun Shou, Ming Gong, Wanjun Zhong, Xiaojun Quan, Nan Duan, Daxin Jiang
We study the problem of leveraging the syntactic structure of text to enhance pre-trained models such as BERT and RoBERTa.
no code implementations • COLING 2020 • Zhihao Fan, Yeyun Gong, Zhongyu Wei, Siyuan Wang, Yameng Huang, Jian Jiao, Xuanjing Huang, Nan Duan, Ruofei Zhang
Commonsense generation aims at generating plausible everyday scenario description based on a set of provided concepts.
no code implementations • COLING 2020 • Bo Shao, Yeyun Gong, Weizhen Qi, Nan Duan, Xiaola Lin
In this paper, we present a multi-level alignment pretraining method in a unified architecture for multi-lingual semantic parsing.
1 code implementation • Findings (ACL) 2021 • Dayiheng Liu, Yu Yan, Yeyun Gong, Weizhen Qi, Hang Zhang, Jian Jiao, Weizhu Chen, Jie Fu, Linjun Shou, Ming Gong, Pengcheng Wang, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Ruofei Zhang, Winnie Wu, Ming Zhou, Nan Duan
Multi-task benchmarks such as GLUE and SuperGLUE have driven great progress of pretraining and transfer learning in Natural Language Processing (NLP).
no code implementations • EMNLP 2020 • Nan Duan, Duyu Tang, Ming Zhou
Machine reasoning research aims to build interpretable AI systems that can solve problems or draw conclusions from what they are told (i.e., facts and observations) and already know (i.e., models, common sense and knowledge) under certain constraints.
3 code implementations • Findings of the Association for Computational Linguistics 2020 • Weizhen Qi, Yu Yan, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou
This paper presents a new sequence-to-sequence pre-training model called ProphetNet, which introduces a novel self-supervised objective named future n-gram prediction and the proposed n-stream self-attention mechanism.
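The future n-gram objective can be sketched as extra predicting streams, where stream i at position t is trained on token t+i. A hedged PyTorch illustration; the stream weighting and masking details are simplified relative to the paper.

```python
import torch.nn.functional as F

def future_ngram_loss(stream_logits, targets, weights=(0.5, 0.5)):
    # stream_logits: list of (B, T, V) tensors, one per predicting stream.
    loss = 0.0
    for i, (logits, w) in enumerate(zip(stream_logits, weights), start=1):
        shifted = targets[:, i:]              # stream i predicts the token at t+i
        pred = logits[:, : shifted.size(1)]   # drop positions with no future target
        loss = loss + w * F.cross_entropy(
            pred.reshape(-1, pred.size(-1)), shifted.reshape(-1)
        )
    return loss
```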
1 code implementation • 21 Oct 2020 • Weizhen Qi, Yeyun Gong, Yu Yan, Jian Jiao, Bo Shao, Ruofei Zhang, Houqiang Li, Nan Duan, Ming Zhou
We build a dataset from a real-world sponsored search engine and carry out experiments to analyze different generative retrieval models.
1 code implementation • EMNLP 2020 • Wanjun Zhong, Duyu Tang, Zenan Xu, Ruize Wang, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin
To address this, we propose a graph-based model that utilizes the factual structure of a document for deepfake detection of text.
1 code implementation • EMNLP 2020 • Dayiheng Liu, Yeyun Gong, Jie Fu, Yu Yan, Jiusheng Chen, Jiancheng Lv, Nan Duan, Ming Zhou
In this paper, we propose a novel data augmentation method, referred to as Controllable Rewriting based Question Data Augmentation (CRQDA), for machine reading comprehension (MRC), question generation, and question-answering natural language inference tasks.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Xuguang Wang, Linjun Shou, Ming Gong, Nan Duan, Daxin Jiang
The Natural Questions (NQ) benchmark set brings new challenges to Machine Reading Comprehension: the answers are not only at different levels of granularity (long and short), but also of richer types (including no-answer, yes/no, single-span and multi-span).
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Huaishao Luo, Lei Ji, Tianrui Li, Nan Duan, Daxin Jiang
Specifically, a cascaded labeling module is developed to enhance the interchange between aspect terms and improve the attention of sentiment tokens when labeling sentiment polarities.
1 code implementation • ICLR 2021 • Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, Ming Zhou
Instead of taking syntactic-level structure of code like abstract syntax tree (AST), we use data flow in the pre-training stage, which is a semantic-level structure of code that encodes the relation of "where-the-value-comes-from" between variables.
Ranked #3 on Type prediction on ManyTypes4TypeScript
no code implementations • 16 Sep 2020 • Martin Kuo, Yaobo Liang, Lei Ji, Nan Duan, Linjun Shou, Ming Gong, Peng Chen
The semi-structured answer has two advantages over a span answer: it is more readable and more falsifiable.
1 code implementation • ACL 2020 • Linmei Hu, Siyong Xu, Chen Li, Cheng Yang, Chuan Shi, Nan Duan, Xing Xie, Ming Zhou
Furthermore, the learned representations are disentangled with latent preference factors by a neighborhood routing algorithm, which can enhance expressiveness and interpretability.
1 code implementation • ACL 2020 • Daya Guo, Duyu Tang, Nan Duan, Jian Yin, Daxin Jiang, Ming Zhou
Generating inferential texts about an event in different perspectives requires reasoning over different contexts that the event occurs.
Ranked #1 on Common Sense Reasoning on Event2Mind test (BLEU metric)
1 code implementation • CVPR 2021 • Minheng Ni, Haoyang Huang, Lin Su, Edward Cui, Taroon Bharti, Lijuan Wang, Jianfeng Gao, Dongdong Zhang, Nan Duan
We present M3P, a Multitask Multilingual Multimodal Pre-trained model that combines multilingual pre-training and multimodal pre-training into a unified framework via multitask pre-training.
1 code implementation • ACL 2020 • Bo Zheng, Haoyang Wen, Yaobo Liang, Nan Duan, Wanxiang Che, Daxin Jiang, Ming Zhou, Ting Liu
Natural Questions is a new challenging machine reading comprehension benchmark with two-grained answers, which are a long answer (typically a paragraph) and a short answer (one or more entities inside the long answer).
1 code implementation • EMNLP (nlpbt) 2020 • Frank F. Xu, Lei Ji, Botian Shi, Junyi Du, Graham Neubig, Yonatan Bisk, Nan Duan
Watching instructional videos is a common way to learn about procedures.
no code implementations • ACL 2020 • Dayiheng Liu, Yeyun Gong, Jie Fu, Yu Yan, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Nan Duan
The representations are then fed into the predictor to obtain the span of the short answer, the paragraph of the long answer, and the answer type in a cascaded manner.
no code implementations • EMNLP 2020 • Ruize Wang, Duyu Tang, Nan Duan, Wanjun Zhong, Zhongyu Wei, Xuanjing Huang, Daxin Jiang, Ming Zhou
We study the detection of propagandistic text fragments in news articles.
no code implementations • ACL 2020 • Fei Yuan, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, Daxin Jiang
Multilingual pre-trained models could leverage the training data from a rich source language (such as English) to improve performance on low resource languages.
no code implementations • ACL 2020 • Wanjun Zhong, Duyu Tang, Zhangyin Feng, Nan Duan, Ming Zhou, Ming Gong, Linjun Shou, Daxin Jiang, Jiahai Wang, Jian Yin
The graph is used to obtain graph-enhanced contextual representations of words in Transformer-based architecture.
no code implementations • 25 Apr 2020 • Wanjun Zhong, Duyu Tang, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin
We study question answering over a dynamic textual environment.
no code implementations • 12 Apr 2020 • Shangwen Lv, Yuechen Wang, Daya Guo, Duyu Tang, Nan Duan, Fuqing Zhu, Ming Gong, Linjun Shou, Ryan Ma, Daxin Jiang, Guihong Cao, Ming Zhou, Songlin Hu
In this work, we introduce a learning algorithm which directly optimizes model's ability to learn text representations for effective learning of downstream tasks.
1 code implementation • EMNLP 2020 • Dayiheng Liu, Yeyun Gong, Jie Fu, Wei Liu, Yu Yan, Bo Shao, Daxin Jiang, Jiancheng Lv, Nan Duan
Furthermore, we propose a simple and effective method to mine the keyphrases of interest in a news article, and build the first large-scale keyphrase-aware news headline corpus, which contains over 180K aligned triples of <news article, headline, keyphrase>.
no code implementations • 7 Apr 2020 • Daya Guo, Akari Asai, Duyu Tang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Jian Yin, Ming Zhou
In this work, we use multiple knowledge sources as fuels for the model.
2 code implementations • 3 Apr 2020 • Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming Zhou
In this paper, we introduce XGLUE, a new benchmark dataset that can be used to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora and evaluate their performance across a diverse set of cross-lingual tasks.
no code implementations • 3 Mar 2020 • Qiaolin Xia, Haoyang Huang, Nan Duan, Dong-dong Zhang, Lei Ji, Zhifang Sui, Edward Cui, Taroon Bharti, Xin Liu, Ming Zhou
While many BERT-based cross-modal pre-trained models produce excellent results on downstream understanding tasks like image-text retrieval and VQA, they cannot be applied to generation tasks directly.
9 code implementations • Findings of the Association for Computational Linguistics 2020 • Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou
Results show that CodeBERT achieves state-of-the-art performance on both natural language code search and code documentation generation tasks.
Ranked #1 on Code Documentation Generation on CodeSearchNet - Go
3 code implementations • 15 Feb 2020 • Huaishao Luo, Lei Ji, Botian Shi, Haoyang Huang, Nan Duan, Tianrui Li, Jason Li, Taroon Bharti, Ming Zhou
However, most of the existing multimodal models are pre-trained for understanding tasks, leading to a pretrain-finetune discrepancy for generation tasks.
Ranked #2 on Action Segmentation on COIN (using extra training data)
2 code implementations • Findings (ACL) 2021 • Ruize Wang, Duyu Tang, Nan Duan, Zhongyu Wei, Xuanjing Huang, Jianshu ji, Guihong Cao, Daxin Jiang, Ming Zhou
We study the problem of injecting knowledge into large pre-trained models like BERT and RoBERTa.
Ranked #1 on Entity Typing on Open Entity
5 code implementations • 13 Jan 2020 • Weizhen Qi, Yu Yan, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou
This paper presents a new sequence-to-sequence pre-training model called ProphetNet, which introduces a novel self-supervised objective named future n-gram prediction and the proposed n-stream self-attention mechanism.
Ranked #6 on Question Generation on SQuAD1.1 (using extra training data)
no code implementations • IJCNLP 2019 • Jingjing Xu, Yuechen Wang, Duyu Tang, Nan Duan, Pengcheng Yang, Qi Zeng, Ming Zhou, Xu sun
We provide representative baselines for these tasks and further introduce a coarse-to-fine model for clarification question generation.
no code implementations • IJCNLP 2019 • Bo Shao, Yeyun Gong, Weizhen Qi, Nan Duan, Xiaola Lin
Given a sentence pair, we extract the output representations of it from BERT.
1 code implementation • IJCNLP 2019 • Tao Shen, Xiubo Geng, Tao Qin, Daya Guo, Duyu Tang, Nan Duan, Guodong Long, Daxin Jiang
We consider the problem of conversational question answering over a large-scale knowledge base.
no code implementations • 25 Sep 2019 • Xindian Ma, Peng Zhang, Xiaoliu Mao, Yehua Zhang, Nan Duan, Yuexian Hou, Ming Zhou
Then, we show that the lower bound of such a separation rank can reveal the quantitative relation between the network structure (e.g., depth/width) and the modeling ability for the contextual dependency.
no code implementations • 12 Sep 2019 • Yibo Sun, Duyu Tang, Nan Duan, Yeyun Gong, Xiaocheng Feng, Bing Qin, Daxin Jiang
Neural semantic parsing has achieved impressive results in recent years, yet its success relies on the availability of large amounts of supervised data.
1 code implementation • 9 Sep 2019 • Shangwen Lv, Daya Guo, Jingjing Xu, Duyu Tang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Songlin Hu
In this work, we propose to automatically extract evidence from heterogeneous knowledge sources, and answer questions based on the extracted evidence.
Ranked #14 on Common Sense Reasoning on CommonsenseQA
no code implementations • ACL 2020 • Wanjun Zhong, Jingjing Xu, Duyu Tang, Zenan Xu, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin
We evaluate our system on FEVER, a benchmark dataset for fact checking, and find that rich structural information is helpful and both our graph-based mechanisms improve the accuracy.
Ranked #2 on Fact Verification on FEVER
no code implementations • IJCNLP 2019 • Haoyang Huang, Yaobo Liang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Ming Zhou
On XNLI, a 1.8% average accuracy improvement (across 15 languages) is obtained.
no code implementations • 16 Aug 2019 • Gen Li, Nan Duan, Yuejian Fang, Ming Gong, Daxin Jiang, Ming Zhou
We propose Unicoder-VL, a universal encoder that aims to learn joint representations of vision and language in a pre-training manner.
Ranked #5 on Image-to-Text Retrieval on MS COCO (Recall@10 metric)
no code implementations • International Joint Conferences on Artificial Intelligence (IJCAI) 2019 • Botian Shi, Lei Ji, Pan Lu, Zhendong Niu, Nan Duan
In this paper, we develop a Scene Concept Graph (SCG) by aggregating image scene graphs and extracting frequently co-occurred concept pairs as scene common-sense knowledge.
no code implementations • ACL 2019 • Changzhi Sun, Yeyun Gong, Yuanbin Wu, Ming Gong, Daxin Jiang, Man Lan, Shiliang Sun, Nan Duan
We develop a new paradigm for the task of joint entity relation extraction.
Ranked #1 on Relation Extraction on ACE 2005 (Sentence Encoder metric)
no code implementations • ACL 2019 • Botian Shi, Lei Ji, Yaobo Liang, Nan Duan, Peng Chen, Zhendong Niu, Ming Zhou
Understanding narrated instructional videos is important for both research and real-world web applications.
1 code implementation • NeurIPS 2019 • Xindian Ma, Peng Zhang, Shuai Zhang, Nan Duan, Yuexian Hou, Dawei Song, Ming Zhou
In this paper, based on the ideas of tensor decomposition and parameters sharing, we propose a novel self-attention model (namely Multi-linear attention) with Block-Term Tensor Decomposition (BTD).
no code implementations • ACL 2019 • Daya Guo, Duyu Tang, Nan Duan, Ming Zhou, Jian Yin
In this paper, we present an approach to incorporate retrieved datapoints as supporting evidence for context-dependent semantic parsing, such as generating source code conditioned on the class environment.
no code implementations • 24 May 2019 • Chenfei Wu, Yanzhao Zhou, Gen Li, Nan Duan, Duyu Tang, Xiaojie Wang
This paper presents a strong baseline for real-world visual reasoning (GQA), which achieves 60.93% accuracy in the GQA 2019 challenge and won sixth place.
1 code implementation • NeurIPS 2019 • Yikang Li, Tao Ma, Yeqi Bai, Nan Duan, Sining Wei, Xiaogang Wang
Therefore, to generate the images with preferred objects and rich interactions, we propose a semi-parametric method, PasteGAN, for generating the image from the scene graph and the image crops, where spatial arrangements of the objects and their pair-wise relationships are defined by the scene graph and the object appearances are determined by the given object crops.
1 code implementation • NeurIPS 2018 • Daya Guo, Duyu Tang, Nan Duan, Ming Zhou, Jian Yin
We present an approach to map utterances in conversation to logical forms, which will be executed on a large-scale knowledge base.
no code implementations • 12 Sep 2018 • Yibo Sun, Duyu Tang, Nan Duan, Jingjing Xu, Xiaocheng Feng, Bing Qin
Results show that our knowledge-aware model outperforms the state-of-the-art approaches.
no code implementations • 12 Sep 2018 • Yibo Sun, Daya Guo, Duyu Tang, Nan Duan, Zhao Yan, Xiaocheng Feng, Bing Qin
Machine reading comprehension (MRC) requires reasoning about both the knowledge involved in a document and knowledge about the world.
no code implementations • 5 Sep 2018 • Wanjun Zhong, Duyu Tang, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin
Although neural network approaches achieve remarkable success on a variety of NLP tasks, many of them struggle to answer questions that require commonsense knowledge.
no code implementations • EMNLP 2018 • Daya Guo, Yibo Sun, Duyu Tang, Nan Duan, Jian Yin, Hong Chi, James Cao, Peng Chen, Ming Zhou
We study how to learn a semantic parser of state-of-the-art accuracy with less supervised training data.
no code implementations • NAACL 2018 • Duyu Tang, Nan Duan, Zhao Yan, Zhirui Zhang, Yibo Sun, Shujie Liu, Yuanhua Lv, Ming Zhou
Secondly, directly applying a GAN that regards all the generated questions as negative instances could not improve the accuracy of the QA model.
no code implementations • 29 May 2018 • Junwei Bao, Duyu Tang, Nan Duan, Zhao Yan, Yuanhua Lv, Ming Zhou, Tiejun Zhao
The model maps a row from a table to a continuous vector and then generates a natural language sentence by leveraging the semantics of a table.
1 code implementation • 24 May 2018 • Pan Lu, Lei Ji, Wei zhang, Nan Duan, Ming Zhou, Jianyong Wang
To better utilize semantic knowledge in images, we propose a novel framework to learn visual relation facts for VQA.
no code implementations • ACL 2018 • Yibo Sun, Duyu Tang, Nan Duan, Jianshu ji, Guihong Cao, Xiaocheng Feng, Bing Qin, Ting Liu, Ming Zhou
We present a generative model to map natural language questions into SQL queries.
Ranked #4 on Code Generation on WikiSQL
no code implementations • 23 Jan 2018 • Zhao Yan, Duyu Tang, Nan Duan, Shujie Liu, Wendi Wang, Daxin Jiang, Ming Zhou, Zhoujun Li
We present assertion based question answering (ABQA), an open domain question answering task that takes a question and a passage as inputs, and outputs a semi-structured assertion consisting of a subject, a predicate and a list of arguments.
no code implementations • CVPR 2018 • Yikang Li, Nan Duan, Bolei Zhou, Xiao Chu, Wanli Ouyang, Xiaogang Wang
Recently, visual question answering (VQA) and visual question generation (VQG) have been two trending topics in computer vision, which have been explored separately.
no code implementations • EMNLP 2017 • Nan Duan, Duyu Tang, Peng Chen, Ming Zhou
This paper presents how to generate questions from given passages using neural networks, where large-scale QA pairs are automatically crawled and processed from a community QA website and used as training data.
no code implementations • 8 Jun 2017 • Zhao Yan, Duyu Tang, Nan Duan, Junwei Bao, Yuanhua Lv, Ming Zhou, Zhoujun Li
Understanding the connections between unstructured text and semi-structured table is an important yet neglected problem in natural language processing.
no code implementations • 7 Jun 2017 • Duyu Tang, Nan Duan, Tao Qin, Zhao Yan, Ming Zhou
On one side, the QA model judges whether the generated question of a QG model is relevant to the answer.
1 code implementation • COLING 2016 • Junwei Bao, Nan Duan, Zhao Yan, Ming Zhou, Tiejun Zhao
WebQuestions and SimpleQuestions are two benchmark data-sets commonly used in recent knowledge-based question answering (KBQA) work.