no code implementations • EMNLP 2021 • Bowen Yu, Yucheng Wang, Tingwen Liu, Hongsong Zhu, Limin Sun, Bin Wang
However, popular OpenIE systems usually output facts sequentially by predicting the next fact conditioned on the previously decoded ones, which enforces an unnecessary order on the facts and introduces error accumulation between autoregressive steps.
3 code implementations • ACL 2022 • Yanzeng Li, Jiangxia Cao, Xin Cong, Zhenyu Zhang, Bowen Yu, Hongsong Zhu, Tingwen Liu
Chinese pre-trained language models usually exploit contextual character information to learn representations while ignoring linguistic knowledge, e.g., word and sentence information.
no code implementations • 31 Oct 2024 • Shanghaoran Quan, Tianyi Tang, Bowen Yu, An Yang, Dayiheng Liu, Bofei Gao, Jianhong Tu, Yichang Zhang, Jingren Zhou, Junyang Lin
The framework consists of two roles: the Generator and the Extender.
no code implementations • 28 Oct 2024 • Xinyu Lu, Xueru Wen, Yaojie Lu, Bowen Yu, Hongyu Lin, Haiyang Yu, Le Sun, Xianpei Han, Yongbin Li
After training this network on a small base model using demonstrations, it can be seamlessly integrated with other pre-trained models during inference, enabling them to achieve similar capability enhancements.
1 code implementation • 22 Oct 2024 • Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Le Sun, Jingren Zhou, Junyang Lin
The key to automated alignment lies in providing learnable and accurate preference signals for preference learning without human annotation.
no code implementations • 17 Oct 2024 • Qiaoyu Tang, Le Yu, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Le Sun
Post-training has emerged as a crucial paradigm for adapting large-scale pre-trained models to various tasks, whose effects are fully reflected by delta parameters (i.e., the disparity between post-trained and pre-trained parameters).
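To make the notion concrete, here is a minimal sketch of computing delta parameters from two checkpoints; the checkpoint dictionaries and values below are hypothetical stand-ins, not the paper's setup.

```python
import torch

def delta_parameters(pretrained: dict, post_trained: dict) -> dict:
    """Delta parameters: post-trained weights minus pre-trained weights."""
    return {name: post_trained[name] - pretrained[name] for name in pretrained}

# Toy checkpoints sharing a single weight matrix.
pre = {"linear.weight": torch.zeros(2, 2)}
post = {"linear.weight": torch.full((2, 2), 0.001)}

delta = delta_parameters(pre, post)
print(delta["linear.weight"])  # the disparity introduced by post-training
```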
1 code implementation • 12 Oct 2024 • Tingyu Xia, Bowen Yu, Kai Dang, An Yang, Yuan Wu, Yuan Tian, Yi Chang, Junyang Lin
Supervised fine-tuning (SFT) is crucial for aligning Large Language Models (LLMs) with human instructions.
no code implementations • 18 Sep 2024 • An Yang, Beichen Zhang, Binyuan Hui, Bofei Gao, Bowen Yu, Chengpeng Li, Dayiheng Liu, Jianhong Tu, Jingren Zhou, Junyang Lin, Keming Lu, Mingfeng Xue, Runji Lin, Tianyu Liu, Xingzhang Ren, Zhenru Zhang
This RM is then applied to the iterative evolution of data in supervised fine-tuning (SFT).
Ranked #1 on Math Word Problem Solving on MATH (using extra training data)
2 code implementations • 18 Sep 2024 • Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Kai Dang, An Yang, Rui Men, Fei Huang, Xingzhang Ren, Xuancheng Ren, Jingren Zhou, Junyang Lin
In this report, we introduce the Qwen2.5-Coder series, a significant upgrade from its predecessor, CodeQwen1.5.
1 code implementation • 4 Sep 2024 • Bofei Gao, Feifan Song, Yibo Miao, Zefan Cai, Zhe Yang, Liang Chen, Helan Hu, Runxin Xu, Qingxiu Dong, Ce Zheng, Shanghaoran Quan, Wen Xiao, Ge Zhang, Daoguang Zan, Keming Lu, Bowen Yu, Dayiheng Liu, Zeyu Cui, Jian Yang, Lei Sha, Houfeng Wang, Zhifang Sui, Peiyi Wang, Tianyu Liu, Baobao Chang
Finally, based on our unified perspective, we explore the challenges and future research directions for aligning large language models with human preferences.
no code implementations • 20 Aug 2024 • Chenhan Yuan, Fei Huang, Ru Peng, Keming Lu, Bowen Yu, Chang Zhou, Jingren Zhou
Transformer-based large language models (LLMs) exhibit limitations such as generating unsafe responses, unreliable reasoning, etc.
4 code implementations • 15 Jul 2024 • An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, Kai Dang, Keming Lu, Keqin Chen, Kexin Yang, Mei Li, Mingfeng Xue, Na Ni, Pei Zhang, Peng Wang, Ru Peng, Rui Men, Ruize Gao, Runji Lin, Shijie Wang, Shuai Bai, Sinan Tan, Tianhang Zhu, TianHao Li, Tianyu Liu, Wenbin Ge, Xiaodong Deng, Xiaohuan Zhou, Xingzhang Ren, Xinyu Zhang, Xipin Wei, Xuancheng Ren, Xuejing Liu, Yang Fan, Yang Yao, Yichang Zhang, Yu Wan, Yunfei Chu, Yuqiong Liu, Zeyu Cui, Zhenru Zhang, Zhifang Guo, Zhihao Fan
This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models.
Ranked #1 on Arithmetic Reasoning on GSM8K (using extra training data)
no code implementations • 3 Jul 2024 • Hongke Zhao, Songming Zheng, Likang Wu, Bowen Yu, Jing Wang
The explainability of recommendation systems is crucial for enhancing user trust and satisfaction.
1 code implementation • 19 Jun 2024 • Guanting Dong, Keming Lu, Chengpeng Li, Tingyu Xia, Bowen Yu, Chang Zhou, Jingren Zhou
AutoIF transforms the validation of instruction-following data quality into code verification, requiring LLMs to generate instructions, the corresponding code to check the correctness of the instruction responses, and unit test samples to verify the code's correctness.
Ranked #1 on Instruction Following on IFEval
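As a rough illustration of the execution-based checking that the AutoIF entry above describes, the sketch below hard-codes one hypothetical instruction, verifier, and test set in place of LLM generations; it shows only the validation idea, not the paper's pipeline.

```python
# All strings below are hypothetical stand-ins for LLM-generated artifacts.
instruction = "Answer in exactly three words."

# A verifier function the LLM would be asked to generate for this instruction.
verifier_code = """
def verify(response: str) -> bool:
    return len(response.split()) == 3
"""

# Unit tests the LLM would also generate, used to vet the verifier itself.
test_cases = [("one two three", True), ("too short", False)]

namespace = {}
exec(verifier_code, namespace)  # compile the generated verifier
verify = namespace["verify"]

# Keep the verifier only if it passes its own unit tests.
verifier_ok = all(verify(r) == expected for r, expected in test_cases)

# A response survives as training data only if the vetted verifier accepts it.
candidate_response = "Just three words"
if verifier_ok and verify(candidate_response):
    print("kept:", (instruction, candidate_response))
```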
1 code implementation • 3 Jun 2024 • Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, Bowen Yu
Alignment is the most critical step in building large language models (LLMs) that meet human needs.
1 code implementation • 28 May 2024 • Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, Chang Zhou
Effectively aligning Large Language Models (LLMs) with human-centric values while preventing the degradation of abilities acquired through Pre-training and Supervised Fine-tuning (SFT) poses a central challenge in Reinforcement Learning from Human Feedback (RLHF).
1 code implementation • 17 May 2024 • Tingyu Xia, Bowen Yu, Yuan Wu, Yi Chang, Chang Zhou
In this paper, we begin by demonstrating that, when tasked with responding to queries, more adept Large Language Models (LLMs) display a more even probability distribution over their answers than their less skilled counterparts.
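One way to quantify "evenness" is Shannon entropy over the answer distribution; the sketch below uses two made-up distributions purely to illustrate the comparison, not numbers from the paper.

```python
import math

def entropy(probs):
    """Shannon entropy; higher values indicate a more even distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical answer distributions over four candidate answers.
adept_model = [0.30, 0.26, 0.24, 0.20]  # more even
weak_model = [0.85, 0.07, 0.05, 0.03]   # more peaked

print(entropy(adept_model) > entropy(weak_model))  # True
```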
1 code implementation • 17 Mar 2024 • Feifan Song, Bowen Yu, Hao Lang, Haiyang Yu, Fei Huang, Houfeng Wang, Yongbin Li
Additionally, the concept of diversity for prompts can be more complex than for responses, which are typically quantified by single digits.
1 code implementation • 27 Feb 2024 • Xinyu Lu, Bowen Yu, Yaojie Lu, Hongyu Lin, Haiyang Yu, Le Sun, Xianpei Han, Yongbin Li
The alignment problem in Large Language Models (LLMs) involves adapting them to the broad spectrum of human values.
no code implementations • 23 Feb 2024 • Qiaoyu Tang, Jiawei Chen, Bowen Yu, Yaojie Lu, Cheng Fu, Haiyang Yu, Hongyu Lin, Fei Huang, Ben He, Xianpei Han, Le Sun, Yongbin Li
The rise of large language models (LLMs) has transformed the role of information retrieval (IR) systems in how humans access information.
1 code implementation • 23 Jan 2024 • Keming Lu, Bowen Yu, Chang Zhou, Jingren Zhou
Nevertheless, we posit that LLMs inherently harbor role-play capabilities, owing to the extensive knowledge of characters and potential dialogues ingrained in their vast training corpora.
3 code implementations • 6 Nov 2023 • Le Yu, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li
We experiment with encoder- and decoder-based LMs, showing that: (1) SFT delta parameter value ranges are typically small (within 0.002) with extreme redundancy, and DARE can effortlessly eliminate 90% or even 99% of them; (2) DARE can merge multiple task-specific LMs into one LM with diverse capabilities.
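A minimal sketch of the drop-and-rescale operation in DARE, assuming delta parameters are dropped uniformly at random with rate p and the survivors rescaled by 1/(1 - p); the tensor shapes and values are illustrative.

```python
import torch

def dare(delta: torch.Tensor, drop_rate: float = 0.9) -> torch.Tensor:
    """Randomly zero out delta parameters, rescale survivors by 1/(1 - p)."""
    keep_mask = torch.bernoulli(torch.full_like(delta, 1.0 - drop_rate))
    return delta * keep_mask / (1.0 - drop_rate)

# Sparsify an SFT delta, then graft it back onto the pre-trained weights.
pretrained = torch.randn(4, 4)
delta = 0.001 * torch.randn(4, 4)   # typically small, per the finding above
merged = pretrained + dare(delta)    # ~90% of the delta eliminated
print(merged.shape)
```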
1 code implementation • 23 Oct 2023 • Qi Gou, Zehua Xia, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li, Nguyen Cam-Tu
Given a textual passage and an answer, humans are able to ask questions with various expressions, but this ability is still challenging for most question generation (QG) systems.
1 code implementation • 20 Oct 2023 • Zehua Xia, Qi Gou, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li, Cam-Tu Nguyen
Previous studies have suggested that key phrase selection is essential for question generation (QG), yet it is still challenging to connect such disjointed phrases into meaningful questions, particularly over long contexts.
no code implementations • 4 Oct 2023 • Julius Adebayo, Melissa Hall, Bowen Yu, Bobbie Chern
We empirically assess the proposed approach on a variety of datasets and find significant improvement, compared to alternative approaches, in identifying training inputs that improve a model's disparity metric.
2 code implementations • 28 Sep 2023 • Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, Tianhang Zhu
Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans.
Ranked #3 on Multi-Label Text Classification on CC3M-TagMask
1 code implementation • 10 Aug 2023 • Yingxiu Zhao, Bowen Yu, Binyuan Hui, Haiyang Yu, Fei Huang, Yongbin Li, Nevin L. Zhang
Training large language models (LLMs) with open-domain instruction data has yielded remarkable success in aligning to end tasks and human preferences.
1 code implementation • 3 Aug 2023 • Xinghua Zhang, Bowen Yu, Haiyang Yu, Yangyu Lv, Tingwen Liu, Fei Huang, Hongbo Xu, Yongbin Li
Each perspective corresponds to the role of a specific LLM neuron in the first layer.
1 code implementation • 12 Jul 2023 • Xiangpeng Wei, Haoran Wei, Huan Lin, TianHao Li, Pei Zhang, Xingzhang Ren, Mei Li, Yu Wan, Zhiwei Cao, Binbin Xie, Tianxiang Hu, Shangjie Li, Binyuan Hui, Bowen Yu, Dayiheng Liu, Baosong Yang, Fei Huang, Jun Xie
Large language models (LLMs) demonstrate a remarkable ability to comprehend, reason, and generate text following natural language instructions.
1 code implementation • 30 Jun 2023 • Feifan Song, Bowen Yu, Minghao Li, Haiyang Yu, Fei Huang, Yongbin Li, Houfeng Wang
In this manner, PRO effectively transforms human alignment into aligning the probability ranking of n responses generated by the LLM with the preference ranking of humans towards these responses.
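The sketch below shows a list-wise ranking loss in the spirit of that description: given model scores for n responses already sorted by human preference, each response is pushed to outscore every response ranked below it. Consult the paper for the exact objective; this is only an illustrative form.

```python
import torch

def ranking_loss(scores: torch.Tensor) -> torch.Tensor:
    """List-wise loss over scores sorted best-to-worst (e.g., log-probs)."""
    loss = scores.new_zeros(())
    for k in range(scores.size(0) - 1):
        # Softmax of response k against itself and all lower-ranked responses.
        loss = loss - torch.log_softmax(scores[k:], dim=0)[0]
    return loss

scores = torch.tensor([2.0, 1.0, 0.5])  # hypothetical scores, best first
print(ranking_loss(scores))
```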
no code implementations • 29 Jun 2023 • Bowen Yu, Cheng Fu, Haiyang Yu, Fei Huang, Yongbin Li
When trying to answer complex questions, people often rely on multiple sources of information, such as visual, textual, and tabular data.
1 code implementation • 18 May 2023 • Yingxiu Zhao, Bowen Yu, Haiyang Yu, Bowen Li, Jinyang Li, Chao Wang, Fei Huang, Yongbin Li, Nevin L. Zhang
To tackle this issue, we are the first to present a causally-complete dataset construction strategy for building million-level DocGD pre-training corpora.
1 code implementation • 11 May 2023 • Yi Dai, Hao Lang, Yinhe Zheng, Bowen Yu, Fei Huang, Yongbin Li
Specifically, we dedicate task-level prompts to capture task-specific knowledge to retain high LL performance, and maintain instance-level prompts to learn knowledge shared across input samples to improve the model's generalization performance.
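A toy sketch of that two-level prompt scheme, with one learned prompt per task plus a pool of shared prompts selected per input; all sizes and the nearest-key selection rule are illustrative assumptions, not the paper's design.

```python
import torch

d_model, prompt_len, n_tasks, n_shared = 16, 4, 3, 5
task_prompts = torch.nn.Parameter(torch.randn(n_tasks, prompt_len, d_model))
shared_prompts = torch.nn.Parameter(torch.randn(n_shared, prompt_len, d_model))

def assemble_prompt(task_id: int, x: torch.Tensor) -> torch.Tensor:
    """Concatenate the task-level prompt with the closest shared prompt."""
    keys = shared_prompts.mean(dim=1)                 # (n_shared, d_model)
    idx = torch.cdist(x.unsqueeze(0), keys).argmin()  # nearest shared prompt
    return torch.cat([task_prompts[task_id], shared_prompts[idx]], dim=0)

x = torch.randn(d_model)  # stand-in input representation
print(assemble_prompt(task_id=1, x=x).shape)  # (2 * prompt_len, d_model)
```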
no code implementations • 20 Apr 2023 • Gehang Zhang, Bowen Yu, Jiangxia Cao, Xinghua Zhang, Jiawei Sheng, Chuan Zhou, Tingwen Liu
Graph contrastive learning (GCL) has recently achieved substantial advancements.
1 code implementation • 14 Apr 2023 • Minghao Li, Yingxiu Zhao, Bowen Yu, Feifan Song, Hangyu Li, Haiyang Yu, Zhoujun Li, Fei Huang, Yongbin Li
(2) How can we enhance LLMs' ability to utilize tools?
1 code implementation • Conference on Empirical Methods in Natural Language Processing 2022 • Mengxiao Song, Bowen Yu, Li Quangang, Wang Yubin, Tingwen Liu, Hongbo Xu
To be specific, an intent-slot co-occurrence graph is constructed based on the entire training corpus to globally discover correlations between intents and slots.
Ranked #7 on Slot Filling on MixATIS
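The graph construction in the entry above reduces to counting, over the full corpus, how often each intent co-occurs with each slot; here is a minimal sketch over a made-up corpus.

```python
from collections import Counter

# Toy training corpus: each utterance carries an intent and its slot labels.
corpus = [
    {"intent": "book_flight", "slots": ["from_city", "to_city", "date"]},
    {"intent": "book_flight", "slots": ["from_city", "to_city"]},
    {"intent": "check_weather", "slots": ["city", "date"]},
]

# Global intent-slot co-occurrence counts become weighted graph edges.
edges = Counter()
for example in corpus:
    for slot in example["slots"]:
        edges[(example["intent"], slot)] += 1

for (intent, slot), weight in edges.items():
    print(f"{intent} -- {slot}: {weight}")
```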
no code implementations • 29 Nov 2022 • Bowen Yu, Zhenyu Zhang, Jingyang Li, Haiyang Yu, Tingwen Liu, Jian Sun, Yongbin Li, Bin Wang
Open Information Extraction (OpenIE) facilitates the open-domain discovery of textual facts.
1 code implementation • 23 Nov 2022 • Yingxiu Zhao, Yinhe Zheng, Bowen Yu, Zhiliang Tian, Dongkyu Lee, Jian Sun, Haiyang Yu, Yongbin Li, Nevin L. Zhang
In this paper, we explore a novel setting, semi-supervised lifelong language learning (SSLL), where a model learns sequentially arriving language tasks with both labeled and unlabeled data.
1 code implementation • 14 Oct 2022 • Yingxiu Zhao, Yinhe Zheng, Zhiliang Tian, Chang Gao, Bowen Yu, Haiyang Yu, Yongbin Li, Jian Sun, Nevin L. Zhang
Lifelong learning (LL) is vital for advanced task-oriented dialogue (ToD) systems.
no code implementations • 14 Jul 2022 • Zhenyu Zhang, Bowen Yu, Haiyang Yu, Tingwen Liu, Cheng Fu, Jingyang Li, Chengguang Tang, Jian Sun, Yongbin Li
In this paper, we propose a Layout-aware document-level Information Extraction dataset, LIE, to facilitate the study of extracting both structural and semantic knowledge from visually rich documents (VRDs), so as to generate accurate responses in dialogue systems.
1 code implementation • SIGIR 2022 • Xin Cong, Jiawei Sheng, Shiyao Cui, Bowen Yu, Tingwen Liu, Bin Wang
To instantiate this strategy, we further propose a model, RelATE, which builds dual-level attention to aggregate relation-relevant information to detect the relation occurrence and utilizes the annotated samples of the detected relations to extract the corresponding head/tail entities.
no code implementations • 24 May 2022 • Shaowen Zhou, Bowen Yu, Aixin Sun, Cheng Long, Jingyang Li, Haiyang Yu, Jian Sun, Yongbin Li
Open Information Extraction (OpenIE) facilitates domain-independent discovery of relational facts from large corpora.
Ranked #1 on Open Information Extraction on CaRB
no code implementations • 20 Apr 2022 • Bowen Yu, Yingxia Shao, Ang Li
With the rapid growth of Internet data in recent years, the number and variety of scientific and technological resources have also been expanding rapidly.
no code implementations • 21 Mar 2022 • Bowen Yu, Junping Du, Yingxia Shao
As web resources grow rapidly in number and variety, a single strategy remains insufficient for extracting textual information from different kinds of pages.
no code implementations • 7 Feb 2022 • Shiyao Cui, Xin Cong, Bowen Yu, Tingwen Liu, Yucheng Wang, Jinqiao Shi
Meanwhile, rough reading is performed in multiple rounds to discover undetected events, thereby handling the multi-event problem.
1 code implementation • EMNLP 2021 • Xinghua Zhang, Bowen Yu, Tingwen Liu, Zhenyu Zhang, Jiawei Sheng, Mengge Xue, Hongbo Xu
Distantly supervised named entity recognition (DS-NER) efficiently reduces labor costs but intrinsically suffers from label noise due to the strong assumption of distant supervision.
1 code implementation • Findings (ACL) 2021 • Jiawei Sheng, Shu Guo, Bowen Yu, Qian Li, Yiming Hei, Lihong Wang, Tingwen Liu, Hongbo Xu
Event extraction (EE) is a crucial information extraction task that aims to extract event information in texts.
1 code implementation • ACL 2021 • Yucheng Wang, Bowen Yu, Hongsong Zhu, Tingwen Liu, Nan Yu, Limin Sun
Named entity recognition (NER) remains challenging when entity mentions can be discontinuous.
no code implementations • NAACL 2021 • Yanzeng Li, Bowen Yu, Li Quangang, Tingwen Liu
In this paper, we introduce FITAnnotator, a generic web-based tool for efficient text annotation.
1 code implementation • Findings (ACL) 2021 • Xin Cong, Shiyao Cui, Bowen Yu, Tingwen Liu, Yubin Wang, Bin Wang
Event detection tends to struggle when it needs to recognize novel event types with a few samples.
no code implementations • 3 Dec 2020 • Shiyao Cui, Bowen Yu, Xin Cong, Tingwen Liu, Quangang Li, Jinqiao Shi
A heterogeneous graph attention network is then introduced to propagate relational messages and enrich information interaction.
no code implementations • COLING 2020 • Zhenyu Zhang, Bowen Yu, Xiaobo Shu, Tingwen Liu, Hengzhu Tang, Wang Yubin, Li Guo
Document-level relation extraction (RE) poses new challenges over its sentence-level counterpart since it requires an adequate comprehension of the whole document and the multi-hop reasoning ability across multiple sentences to reach the final result.
no code implementations • COLING 2020 • Bowen Yu, Xue Mengge, Zhenyu Zhang, Tingwen Liu, Wang Yubin, Bin Wang
Dependency trees have been shown to be effective in capturing long-range relations between target entities.
no code implementations • COLING 2020 • Xue Mengge, Bowen Yu, Tingwen Liu, Yue Zhang, Erli Meng, Bin Wang
Incorporating lexicons into character-level Chinese NER via lattices has proven effective for exploiting rich word boundary information.
1 code implementation • COLING 2020 • Yucheng Wang, Bowen Yu, Yueyang Zhang, Tingwen Liu, Hongsong Zhu, Limin Sun
To mitigate the issue, we propose in this paper a one-stage joint extraction model, namely TPLinker, which is capable of discovering overlapping relations sharing one or both entities while remaining immune to exposure bias.
Ranked #2 on Relation Extraction on NYT11-HRL
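A toy sketch of the token-pair ("handshaking") tagging idea behind TPLinker: every token pair receives a tag, so overlapping spans coexist in one matrix and are decoded in a single stage. The real model tags entity and relation links in separate matrices; only span tagging is shown here.

```python
tokens = ["New", "York", "City", "is", "big"]

# Upper-triangular token-pair matrix: tags[(i, j)] = 1 marks a span i..j.
tags = {(i, j): 0 for i in range(len(tokens)) for j in range(i, len(tokens))}

# Two overlapping entity spans, tagged simultaneously in one pass.
tags[(0, 1)] = 1  # "New York"
tags[(0, 2)] = 1  # "New York City"

spans = [" ".join(tokens[i:j + 1]) for (i, j), t in tags.items() if t == 1]
print(spans)  # ['New York', 'New York City']
```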
1 code implementation • EMNLP 2020 • Mengge Xue, Bowen Yu, Zhenyu Zhang, Tingwen Liu, Yue Zhang, Bin Wang
More recently, Named Entity Recognition has achieved great advances aided by pre-training approaches such as BERT.
1 code implementation • 23 Jun 2020 • Xin Cong, Bowen Yu, Tingwen Liu, Shiyao Cui, Hengzhu Tang, Bin Wang
We first build a representation extractor to derive features for unlabeled data from the target domain (no test data is necessary) and then group them with a cluster miner.
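A minimal sketch of that extract-then-cluster recipe, using random vectors as stand-in representations and scikit-learn's KMeans as the cluster miner:

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for the representation extractor: random features for 100
# unlabeled target-domain examples.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 32))

# Cluster miner: group the unlabeled representations into pseudo-classes.
miner = KMeans(n_clusters=5, n_init=10, random_state=0)
pseudo_labels = miner.fit_predict(features)

print(pseudo_labels[:10])  # cluster assignments usable as pseudo-labels
```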
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Shiyao Cui, Bowen Yu, Tingwen Liu, Zhen-Yu Zhang, Xuebin Wang, Jinqiao Shi
Previous studies on the task have verified the effectiveness of integrating syntactic dependency into graph convolutional networks.
no code implementations • 14 Jan 2020 • C. Estelle Smith, Bowen Yu, Anjali Srivastava, Aaron Halfaker, Loren Terveen, Haiyi Zhu
On Wikipedia, sophisticated algorithmic tools are used to assess the quality of edits and take corrective actions.
1 code implementation • ACL 2020 • Yanzeng Li, Bowen Yu, Mengge Xue, Tingwen Liu
Most Chinese pre-trained models take the character as the basic unit and learn representations according to a character's external contexts, ignoring the semantics expressed in the word, which is the smallest meaningful utterance in Chinese.
1 code implementation • 10 Sep 2019 • Bowen Yu, Zhen-Yu Zhang, Xiaobo Shu, Yubin Wang, Tingwen Liu, Bin Wang, Sujian Li
Joint extraction of entities and relations aims to detect entity pairs along with their relations using a single model.
Ranked #1 on Relation Extraction on NYT-single
1 code implementation • IJCAI 2019 • Bowen Yu, Zhen-Yu Zhang, Tingwen Liu, Bin Wang, Sujian Li, Quangang Li
Relation extraction studies the issue of predicting semantic relations between pairs of entities in sentences.
Ranked #30 on Relation Extraction on TACRED
1 code implementation • 2 Aug 2019 • Bowen Yu, Claudio T. Silva
Dataflow visualization systems enable flexible visual data exploration by allowing the user to construct a dataflow diagram that composes query and visualization modules to specify system functionality.
no code implementations • 5 Jul 2019 • Aécio Santos, Sonia Castelo, Cristian Felix, Jorge Piazentin Ono, Bowen Yu, Sungsoo Hong, Cláudio T. Silva, Enrico Bertini, Juliana Freire
In this paper, we present Visus, a system designed to support the model building process and curation of ML data processing pipelines generated by AutoML systems.
no code implementations • 24 Mar 2017 • Justin Cranshaw, Emad Elwany, Todd Newman, Rafal Kocielnik, Bowen Yu, Sandeep Soni, Jaime Teevan, Andrés Monroy-Hernández
Although information workers may complain about meetings, meetings are an essential part of their work life.