no code implementations • 3 Dec 2024 • Zhibo Yang, Jun Tang, Zhaohai Li, Pengfei Wang, Jianqiang Wan, Humen Zhong, Xuejing Liu, Mingkun Yang, Peng Wang, Yuliang Liu, Lianwen Jin, Xiang Bai, Shuai Bai, Junyang Lin
The current landscape lacks a comprehensive benchmark to effectively measure the literate capabilities of LMMs.
no code implementations • 14 Nov 2024 • Yidan Zhang, Boyi Deng, Yu Wan, Baosong Yang, Haoran Wei, Bowen Yu, Junyang Lin, Fei Huang, Jingren Zhou
Recent advancements in large language models (LLMs) showcase varied multilingual capabilities across tasks like translation, code generation, and reasoning.
1 code implementation • 31 Oct 2024 • Shanghaoran Quan, Tianyi Tang, Bowen Yu, An Yang, Dayiheng Liu, Bofei Gao, Jianhong Tu, Yichang Zhang, Jingren Zhou, Junyang Lin
The framework consists of two roles: the Generator and the Extender.
no code implementations • 24 Oct 2024 • Yibo Miao, Bofei Gao, Shanghaoran Quan, Junyang Lin, Daoguang Zan, Jiaheng Liu, Jian Yang, Tianyu Liu, Zhijie Deng
We also contribute a pipeline for collecting preference pairs for DPO on CodeLLMs.
1 code implementation • 22 Oct 2024 • Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Le Sun, Jingren Zhou, Junyang Lin
The key to automated alignment lies in providing learnable and accurate preference signals for preference learning without human annotation.
1 code implementation • 12 Oct 2024 • Tingyu Xia, Bowen Yu, Kai Dang, An Yang, Yuan Wu, Yuan Tian, Yi Chang, Junyang Lin
Supervised fine-tuning (SFT) is crucial for aligning Large Language Models (LLMs) with human instructions.
1 code implementation • 2 Oct 2024 • Liang Chen, Sinan Tan, Zefan Cai, Weichu Xie, Haozhe Zhao, Yichi Zhang, Junyang Lin, Jinze Bai, Tianyu Liu, Baobao Chang
This work tackles the information loss bottleneck of vector-quantization (VQ) autoregressive image generation by introducing a novel model architecture called the 2-Dimensional Autoregression (DnD) Transformer.
no code implementations • 30 Sep 2024 • Ke Yi, Zengke Liu, Jianwei Zhang, Chengyuan Li, Tong Zhang, Junyang Lin, Jingren Zhou
Based on observing activations from large language models, outliers can be classified into channel-wise and spike outliers.
no code implementations • 28 Sep 2024 • Wenrui Liu, Zhifang Guo, Jin Xu, YuanJun Lv, Yunfei Chu, Zhou Zhao, Junyang Lin
This inconsistency can lead to a single audio segment being represented by multiple divergent sequences, which creates confusion in neural codec language models and results in omissions and repetitions during speech generation.
no code implementations • 18 Sep 2024 • An Yang, Beichen Zhang, Binyuan Hui, Bofei Gao, Bowen Yu, Chengpeng Li, Dayiheng Liu, Jianhong Tu, Jingren Zhou, Junyang Lin, Keming Lu, Mingfeng Xue, Runji Lin, Tianyu Liu, Xingzhang Ren, Zhenru Zhang
This RM is then applied to the iterative evolution of data in supervised fine-tuning (SFT).
Ranked #1 on Math Word Problem Solving on MATH (using extra training data)
4 code implementations • 18 Sep 2024 • Peng Wang, Shuai Bai, Sinan Tan, Shijie Wang, Zhihao Fan, Jinze Bai, Keqin Chen, Xuejing Liu, Jialin Wang, Wenbin Ge, Yang Fan, Kai Dang, Mengfei Du, Xuancheng Ren, Rui Men, Dayiheng Liu, Chang Zhou, Jingren Zhou, Junyang Lin
We present the Qwen2-VL Series, an advanced upgrade of the previous Qwen-VL models that redefines the conventional predetermined-resolution approach in visual processing.
Ranked #3 on Temporal Relation Extraction on Vinoground
2 code implementations • 18 Sep 2024 • Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Keming Lu, Kai Dang, Yang Fan, Yichang Zhang, An Yang, Rui Men, Fei Huang, Bo Zheng, Yibo Miao, Shanghaoran Quan, Yunlong Feng, Xingzhang Ren, Xuancheng Ren, Jingren Zhou, Junyang Lin
In this report, we introduce the Qwen2.5-Coder series, a significant upgrade from its predecessor, CodeQwen1.5.
no code implementations • 6 Aug 2024 • Jiaxi Yang, Binyuan Hui, Min Yang, Jian Yang, Junyang Lin, Chang Zhou
The capability gap between open-source and closed-source large language models (LLMs) remains a challenge in text-to-SQL tasks.
2 code implementations • 23 Jul 2024 • Xingyao Wang, Boxuan Li, Yufan Song, Frank F. Xu, Xiangru Tang, Mingchen Zhuge, Jiayi Pan, Yueqi Song, Bowen Li, Jaskirat Singh, Hoang H. Tran, Fuqiang Li, Ren Ma, Mingzhang Zheng, Bill Qian, Yanjun Shao, Niklas Muennighoff, Yizhe Zhang, Binyuan Hui, Junyang Lin, Robert Brennan, Hao Peng, Heng Ji, Graham Neubig
OpenDevin), a platform for the development of powerful and flexible AI agents that interact with the world in similar ways to those of a human developer: by writing code, interacting with a command line, and browsing the web.
2 code implementations • 15 Jul 2024 • Yunfei Chu, Jin Xu, Qian Yang, Haojie Wei, Xipin Wei, Zhifang Guo, Yichong Leng, YuanJun Lv, Jinzheng He, Junyang Lin, Chang Zhou, Jingren Zhou
We introduce the latest progress of Qwen-Audio, a large-scale audio-language model called Qwen2-Audio, which is capable of accepting various audio signal inputs and performing audio analysis or direct textual responses with regard to speech instructions.
4 code implementations • 15 Jul 2024 • An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, Kai Dang, Keming Lu, Keqin Chen, Kexin Yang, Mei Li, Mingfeng Xue, Na Ni, Pei Zhang, Peng Wang, Ru Peng, Rui Men, Ruize Gao, Runji Lin, Shijie Wang, Shuai Bai, Sinan Tan, Tianhang Zhu, TianHao Li, Tianyu Liu, Wenbin Ge, Xiaodong Deng, Xiaohuan Zhou, Xingzhang Ren, Xinyu Zhang, Xipin Wei, Xuancheng Ren, Xuejing Liu, Yang Fan, Yang Yao, Yichang Zhang, Yu Wan, Yunfei Chu, Yuqiong Liu, Zeyu Cui, Zhenru Zhang, Zhifang Guo, Zhihao Fan
This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models.
Ranked #1 on Arithmetic Reasoning on GSM8K (using extra training data)
1 code implementation • 18 Jun 2024 • Zhe Yang, Yichang Zhang, Tianyu Liu, Jian Yang, Junyang Lin, Chang Zhou, Zhifang Sui
Furthermore, we introduce the concept of consistency score to quantitatively measure this inconsistency and analyze the potential for improvement in consistency by relative consistency score.
1 code implementation • 11 Mar 2024 • Liang Chen, Haozhe Zhao, Tianyu Liu, Shuai Bai, Junyang Lin, Chang Zhou, Baobao Chang
To this end, we introduce FastV, a versatile plug-and-play method designed to optimize computational efficiency by learning adaptive attention patterns in early layers and pruning visual tokens in subsequent ones.
no code implementations • 15 Nov 2023 • Keming Lu, Hongyi Yuan, Runji Lin, Junyang Lin, Zheng Yuan, Chang Zhou, Jingren Zhou
Zooter shows computation efficiency in inference as it introduces only a minor computation overhead of a routing function compared with reward model ranking methods.
1 code implementation • 14 Nov 2023 • Shengguang Wu, Keming Lu, Benfeng Xu, Junyang Lin, Qi Su, Chang Zhou
The key to our data sampling technique lies in the enhancement of diversity in the chosen subsets, as the model selects new data points most distinct from any existing ones according to its current embedding space.
2 code implementations • 28 Sep 2023 • Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, Tianhang Zhu
Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans.
Ranked #3 on Multi-Label Text Classification on CC3M-TagMask
1 code implementation • 31 Aug 2023 • Shuai Bai, Shusheng Yang, Jinze Bai, Peng Wang, Xingxuan Zhang, Junyang Lin, Xinggang Wang, Chang Zhou, Jingren Zhou
Large vision-language models (LVLMs) have recently witnessed rapid advancements, exhibiting a remarkable capacity for perceiving, understanding, and processing visual information by connecting visual receptor with large language models (LLMs).
2 code implementations • 24 Aug 2023 • Jinze Bai, Shuai Bai, Shusheng Yang, Shijie Wang, Sinan Tan, Peng Wang, Junyang Lin, Chang Zhou, Jingren Zhou
In this work, we introduce the Qwen-VL series, a set of large-scale vision-language models (LVLMs) designed to perceive and understand both texts and images.
Ranked #3 on Visual Question Answering (VQA) on InfiMM-Eval
1 code implementation • 14 Aug 2023 • Keming Lu, Hongyi Yuan, Zheng Yuan, Runji Lin, Junyang Lin, Chuanqi Tan, Chang Zhou, Jingren Zhou
Based on this observation, we propose a data selector based on InsTag to select 6K diverse and complex samples from open-source datasets and fine-tune models on InsTag-selected data.
2 code implementations • 24 May 2023 • Benfeng Xu, An Yang, Junyang Lin, Quan Wang, Chang Zhou, Yongdong Zhang, Zhendong Mao
The answering quality of an aligned large language model (LLM) can be drastically improved if treated with proper crafting of prompts.
2 code implementations • 18 May 2023 • Peng Wang, Shijie Wang, Junyang Lin, Shuai Bai, Xiaohuan Zhou, Jingren Zhou, Xinggang Wang, Chang Zhou
In this work, we explore a scalable way for building a general representation model toward unlimited modalities.
Ranked #1 on Semantic Segmentation on ADE20K (using extra training data)
1 code implementation • 19 Dec 2022 • Junyang Lin, Xuancheng Ren, Yichang Zhang, Gao Liu, Peng Wang, An Yang, Chang Zhou
This paper proposes a new method, OFA-OCR, to transfer multimodal pretrained models to text recognition.
1 code implementation • 8 Dec 2022 • Jinze Bai, Rui Men, Hao Yang, Xuancheng Ren, Kai Dang, Yichang Zhang, Xiaohuan Zhou, Peng Wang, Sinan Tan, An Yang, Zeyu Cui, Yu Han, Shuai Bai, Wenbin Ge, Jianxin Ma, Junyang Lin, Jingren Zhou, Chang Zhou
As a starting point, we provide presets of 7 different modalities and 23 highly-diverse example tasks in OFASys, with which we also develop a first-in-kind, single model, OFA+, that can handle text, image, speech, video, and motion data.
1 code implementation • 2 Nov 2022 • An Yang, Junshu Pan, Junyang Lin, Rui Men, Yichang Zhang, Jingren Zhou, Chang Zhou
The tremendous success of CLIP (Radford et al., 2021) has promoted the research and application of contrastive learning for vision-language pretraining.
Ranked #1 on Zero-shot Image Retrieval on MUGE Retrieval
1 code implementation • 4 Aug 2022 • Hao Yang, Junyang Lin, An Yang, Peng Wang, Chang Zhou, Hongxia Yang
Prompt tuning has become a new paradigm for model tuning and it has demonstrated success in natural language pretraining and even vision pretraining.
Ranked #2 on Visual Entailment on SNLI-VE test
no code implementations • 4 Jun 2022 • Yuezihan Jiang, Hao Yang, Junyang Lin, Hanyu Zhao, An Yang, Chang Zhou, Hongxia Yang, Zhi Yang, Bin Cui
Prompt Learning has recently gained great popularity in bridging the gap between pretraining tasks and various downstream tasks.
no code implementations • 23 Mar 2022 • Yu Huang, Junyang Lin, Chang Zhou, Hongxia Yang, Longbo Huang
Recently, it has been observed that the best uni-modal network outperforms the jointly trained multi-modal network, which is counter-intuitive since multiple signals generally bring more information.
4 code implementations • 7 Feb 2022 • Peng Wang, An Yang, Rui Men, Junyang Lin, Shuai Bai, Zhikang Li, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
In this work, we pursue a unified paradigm for multimodal pretraining to break the scaffolds of complex task/modality-specific customization.
Ranked #1 on Visual Question Answering on VQA v2 test-std (yes/no metric)
1 code implementation • 26 Nov 2021 • Jingjing Xu, Liang Zhao, Junyang Lin, Rundong Gao, Xu Sun, Hongxia Yang
Many existing neural architecture search (NAS) solutions rely on downstream training for architecture evaluation, which takes enormous computations.
no code implementations • 8 Oct 2021 • Junyang Lin, An Yang, Jinze Bai, Chang Zhou, Le Jiang, Xianyan Jia, Ang Wang, Jie Zhang, Yong Li, Wei Lin, Jingren Zhou, Hongxia Yang
Recent expeditious developments in deep learning algorithms, distributed training, and even hardware design for large models have enabled training extreme-scale models, say GPT-3 and Switch Transformer possessing hundreds of billions or even trillions of parameters.
no code implementations • Findings (ACL) 2021 • Peng Wang, Junyang Lin, An Yang, Chang Zhou, Yichang Zhang, Jingren Zhou, Hongxia Yang
Experimental results demonstrate that our method outperforms the previous state-of-the-art methods in both automatic and human evaluation, especially on coverage and faithfulness.
no code implementations • 31 May 2021 • An Yang, Junyang Lin, Rui Men, Chang Zhou, Le Jiang, Xianyan Jia, Ang Wang, Jie Zhang, Jiamang Wang, Yong Li, Di Zhang, Wei Lin, Lin Qu, Jingren Zhou, Hongxia Yang
Mixture-of-Experts (MoE) models can achieve promising results with an outrageously large number of parameters at constant computation cost, and thus they have become a trend in model scaling.
1 code implementation • 31 May 2021 • Shuai Bai, Zhedong Zheng, Xiaohan Wang, Junyang Lin, Zhu Zhang, Chang Zhou, Yi Yang, Hongxia Yang
In this paper, we apply one new modality, i.e., the language description, to search the vehicle of interest and explore the potential of this task in the real-world scenario.
1 code implementation • ACL 2021 • Shuhuai Ren, Junyang Lin, Guangxiang Zhao, Rui Men, An Yang, Jingren Zhou, Xu Sun, Hongxia Yang
To bridge the semantic gap between the two modalities, previous studies mainly focus on word-region alignment at the object level, lacking the matching between the linguistic relation among the words and the visual relation among the regions.
Ranked #5 on Image-to-Text Retrieval on MS COCO
4 code implementations • NeurIPS 2021 • Ming Ding, Zhuoyi Yang, Wenyi Hong, Wendi Zheng, Chang Zhou, Da Yin, Junyang Lin, Xu Zou, Zhou Shao, Hongxia Yang, Jie Tang
Text-to-Image generation in the general domain has long been an open problem, which requires both a powerful generative model and cross-modal understanding.
Ranked #53 on Text-to-Image Generation on MS COCO (using extra training data)
no code implementations • 1 Mar 2021 • Junyang Lin, Rui Men, An Yang, Chang Zhou, Ming Ding, Yichang Zhang, Peng Wang, Ang Wang, Le Jiang, Xianyan Jia, Jie Zhang, Jianwei Zhang, Xu Zou, Zhikang Li, Xiaodong Deng, Jie Liu, Jinbao Xue, Huiling Zhou, Jianxin Ma, Jin Yu, Yong Li, Wei Lin, Jingren Zhou, Jie Tang, Hongxia Yang
In this work, we construct the largest dataset for multimodal pretraining in Chinese, which consists of over 1.9TB images and 292GB texts that cover a wide range of domains.
no code implementations • 1 Jan 2021 • Jingjing Xu, Liang Zhao, Junyang Lin, Xu Sun, Hongxia Yang
Inspired by our new finding, we explore a simple yet effective network architecture search (NAS) approach that leverages gradient correlation and gradient values to find well-performing architectures.
no code implementations • 28 Sep 2020 • Liang Zhao, Jingjing Xu, Junyang Lin, Yichang Zhang, Hongxia Yang, Xu Sun
The reasoning module is responsible for searching skeleton paths from a knowledge graph to imitate the imagination process in the human writing for semantic transfer.
no code implementations • 30 Mar 2020 • Junyang Lin, An Yang, Yichang Zhang, Jie Liu, Jingren Zhou, Hongxia Yang
We pretrain the model with three pretraining tasks, including masked segment modeling (MSM), masked region modeling (MRM) and image-text matching (ITM); and finetune the model on a series of vision-and-language downstream tasks.
2 code implementations • 25 Dec 2019 • Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun
The self-attention-based Transformer has demonstrated state-of-the-art performance in a number of natural language processing tasks.
2 code implementations • NeurIPS 2019 • Jingjing Xu, Xu Sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin
Unlike them, we find that the derivatives of the mean and variance are more important than forward normalization by re-centering and re-scaling backward gradients.
Ranked #5 on Machine Translation on IWSLT2015 English-Vietnamese
no code implementations • IJCNLP 2019 • Pengcheng Yang, Junyang Lin, Jingjing Xu, Jun Xie, Qi Su, Xu Sun
The task of unsupervised sentiment modification aims to reverse the sentiment polarity of the input text while preserving its semantic content without any parallel data.
no code implementations • 25 Sep 2019 • Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Xu Sun
Extensive experimental results on a series of natural language processing tasks, including neural machine translation, image captioning, and language modeling, all demonstrate the advantages of Sparse Transformer in model performance.
1 code implementation • IJCNLP 2019 • Qibin Chen, Junyang Lin, Yichang Zhang, Ming Ding, Yukuo Cen, Hongxia Yang, Jie Tang
In this paper, we propose a novel end-to-end framework called KBRD, which stands for Knowledge-Based Recommender Dialog System.
Ranked #5 on Text Generation on ReDial
1 code implementation • ACL 2019 • Pengcheng Yang, Fuli Luo, Shuming Ma, Junyang Lin, Xu Sun
In this way, we can reduce the dependence of the model on the label order, as well as capture high-order correlations between labels.
no code implementations • ACL 2019 • Bingzhen Wei, Mingxuan Wang, Hao Zhou, Junyang Lin, Jun Xie, Xu Sun
Non-autoregressive translation models (NAT) have achieved impressive inference speedup.
4 code implementations • 29 Mar 2019 • Qibin Chen, Junyang Lin, Yichang Zhang, Hongxia Yang, Jingren Zhou, Jie Tang
In order to make the description both informative and personalized, KOBE considers a variety of important factors during text generation, including product aspects, user categories, and knowledge base, etc.
1 code implementation • EMNLP 2018 • Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
Existing text generation methods tend to produce repeated and "boring" expressions.
no code implementations • 10 Sep 2018 • Pengcheng Yang, Shuming Ma, Yi Zhang, Junyang Lin, Qi Su, Xu Sun
However, the Seq2Seq model is not suitable for the MLTC task in essence.
no code implementations • 2 Sep 2018 • Bingzhen Wei, Junyang Lin
We propose a novel model for Neural Machine Translation (NMT).
1 code implementation • EMNLP 2018 • Liangchen Luo, Jingjing Xu, Junyang Lin, Qi Zeng, Xu Sun
Different from conventional text generation tasks, the mapping between inputs and responses in conversations is more complicated, which highly demands the understanding of utterance-level semantic dependency, a relation between the whole meanings of inputs and outputs.
Ranked #2 on Text Generation on DailyDialog
1 code implementation • EMNLP 2018 • Junyang Lin, Qi Su, Pengcheng Yang, Shuming Ma, Xu Sun
We propose a novel model for multi-label text classification, which is based on sequence-to-sequence learning.
1 code implementation • EMNLP 2018 • Junyang Lin, Xu Sun, Xuancheng Ren, Muyu Li, Qi Su
Most of the Neural Machine Translation (NMT) models are based on the sequence-to-sequence (Seq2Seq) model with an encoder-decoder framework equipped with the attention mechanism.
Ranked #7 on Machine Translation on IWSLT2015 English-Vietnamese
1 code implementation • COLING 2018 • Junyang Lin, Xu Sun, Xuancheng Ren, Shuming Ma, Jinsong Su, Qi Su
A great proportion of sequence-to-sequence (Seq2Seq) models for Neural Machine Translation (NMT) adopt Recurrent Neural Network (RNN) to generate translation word by word following a sequential order.
Ranked #9 on Machine Translation on IWSLT2015 English-Vietnamese
1 code implementation • ACL 2018 • Shuming Ma, Xu Sun, Yizhong Wang, Junyang Lin
However, most of the existing neural machine translation models only use one of the correct translations as the targets, and the other correct sentences are punished as the incorrect sentences in the training stage.
1 code implementation • ACL 2018 • Shuming Ma, Xu Sun, Junyang Lin, Houfeng Wang
In this work, we supervise the learning of the representation of the source content with that of the summary.
4 code implementations • ACL 2018 • Junyang Lin, Xu Sun, Shuming Ma, Qi Su
To tackle the problem, we propose a global encoding framework, which controls the information flow from the encoder to the decoder based on the global information of the source context.
Ranked #29 on Text Summarization on GigaWord
no code implementations • 3 May 2018 • Shuming Ma, Xu Sun, Junyang Lin, Xuancheng Ren
Text summarization and sentiment classification both aim to capture the main ideas of the text but at different levels.
no code implementations • 6 Feb 2018 • Junyang Lin, Shuming Ma, Qi Su, Xu Sun
ACA learns to control the attention by keeping track of the decoding history and the current information with a memory vector, so that the model can take the translated contents and the current information into consideration.
3 code implementations • 5 Feb 2018 • Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
Existing text generation methods tend to produce repeated and "boring" expressions.