1 code implementation • 28 Feb 2024 • Sirui Hong, Yizhang Lin, Bang Liu, Bangbang Liu, Binhao Wu, Danyang Li, Jiaqi Chen, Jiayi Zhang, Jinlin Wang, Li Zhang, Lingyao Zhang, Min Yang, Mingchen Zhuge, Taicheng Guo, Tuo Zhou, Wei Tao, Wenyi Wang, Xiangru Tang, Xiangtao Lu, Xiawu Zheng, Xinbing Liang, Yaying Fei, Yuheng Cheng, Zongze Xu, Chenglin Wu
Large Language Model (LLM)-based agents have demonstrated remarkable effectiveness.
1 code implementation • NeurIPS 2021 • Xinlin Li, Bang Liu, YaoLiang Yu, Wulong Liu, Chunjing Xu, Vahid Partovi Nia
Shift neural networks reduce computation complexity by removing expensive multiplication operations and quantizing continuous weights into low-bit discrete values, which are fast and energy-efficient compared to conventional neural networks.
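As context for this entry, the core idea of shift networks is that a weight restricted to a signed power of two turns multiplication into a bit shift. A minimal illustrative sketch (not the paper's method; the function name and rounding scheme are assumptions):

```python
import math

def shift_quantize(w, bits=4):
    """Quantize each weight to a signed power of two: w ~ sign(w) * 2^p,
    so multiplying by it reduces to a bit shift in hardware."""
    out = []
    for x in w:
        if x == 0:
            out.append(0.0)
            continue
        sign = 1.0 if x > 0 else -1.0
        p = round(math.log2(abs(x)))            # nearest integer exponent
        p = max(-(2 ** (bits - 1)), min(0, p))  # clip to the low-bit budget
        out.append(sign * 2.0 ** p)
    return out

print(shift_quantize([0.3, -0.12, 0.9]))  # → [0.25, -0.125, 1.0]
```

Each output is exactly representable as ±2^p, which is what makes the discrete weight set cheap to apply.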
1 code implementation • ICCV 2023 • Xinlin Li, Bang Liu, Rui Heng Yang, Vanessa Courville, Chao Xing, Vahid Partovi Nia
We further propose a sign-scale decomposition design to enhance training efficiency and a low-variance random initialization strategy to improve the model's transfer learning performance.
1 code implementation • ACL 2019 • Bang Liu, Di Niu, Haojie Wei, Jinghong Lin, Yancheng He, Kunfeng Lai, Yu Xu
Identifying the relationship between two articles, e.g., whether two articles published from different sources describe the same breaking news, is critical to many document understanding tasks.
1 code implementation • CVPR 2022 • Yang Ding, Jing Yu, Bang Liu, Yue Hu, Mingxin Cui, Qi Wu
Knowledge-based visual question answering requires the ability of associating external knowledge for open-ended cross-modal scene understanding.
2 code implementations • 27 Jan 2020 • Bang Liu, Haojie Wei, Di Niu, Haolan Chen, Yancheng He
In this paper, we propose Answer-Clue-Style-aware Question Generation (ACS-QG), which aims at automatically generating high-quality and diverse question-answer pairs from an unlabeled text corpus at scale by imitating the way a human asks questions.
1 code implementation • 1 Mar 2018 • Bang Liu, Di Niu, Kunfeng Lai, Linglong Kong, Yu Xu
We describe our experience of implementing a news content organization system at Tencent that discovers events from vast streams of breaking news and evolves news story structures in an online fashion.
Ranked #3 on Information Threading on NewSHead
1 code implementation • 5 Apr 2020 • Bang Liu, Weidong Guo, Di Niu, Jinwen Luo, Chaoyue Wang, Zhen Wen, Yu Xu
These services will benefit from a highly structured and web-scale ontology of entities, concepts, events, topics and categories.
1 code implementation • 9 Oct 2022 • Yi Cheng, Wenge Liu, Wenjie Li, Jiashuo Wang, Ruihui Zhao, Bang Liu, Xiaodan Liang, Yefeng Zheng
Providing Emotional Support (ES) to soothe people in emotional distress is an essential capability in social interactions.
1 code implementation • 21 Oct 2022 • Wangjie Jiang, Zhihao Ye, Zijing Ou, Ruihui Zhao, Jianguang Zheng, Yi Liu, Siheng Li, Bang Liu, Yujiu Yang, Yefeng Zheng
In this work, we define the task of Medical-domain Chinese Spelling Correction and propose MCSCSet, a large-scale specialist-annotated dataset that contains about 200k samples.
Optical Character Recognition (OCR)
1 code implementation • 12 Oct 2023 • Yu Song, Santiago Miret, Huan Zhang, Bang Liu
We propose an instruction-based process for trustworthy data curation in materials science (MatSci-Instruct), which we then apply to finetune a LLaMa-based language model targeted for materials science (HoneyBee).
1 code implementation • 29 Feb 2024 • Suyuchen Wang, Ivan Kobyzev, Peng Lu, Mehdi Rezagholizadeh, Bang Liu
This paper addresses the challenge of train-short-test-long (TSTL) scenarios in Large Language Models (LLMs) equipped with Rotary Position Embedding (RoPE), where models pre-trained on shorter sequences face difficulty with out-of-distribution (OOD) token positions in longer sequences.
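To see why longer sequences put RoPE models out of distribution: each position m rotates a query/key feature pair by an angle proportional to m, so positions beyond the training length produce rotation angles the model never saw, while attention scores depend only on the relative offset. A minimal single-pair sketch (illustrative only, not this paper's method):

```python
import math

def rope_rotate(x, m, theta=1.0):
    """Rotate a 2-D feature pair by angle m * theta, encoding
    the absolute position m into the vector itself."""
    x0, x1 = x
    c, s = math.cos(m * theta), math.sin(m * theta)
    return (x0 * c - x1 * s, x0 * s + x1 * c)

# The query-key dot product depends only on the relative offset m - n:
q, k = (1.0, 0.0), (1.0, 0.0)
qm, kn = rope_rotate(q, 5), rope_rotate(k, 2)
rel = qm[0] * kn[0] + qm[1] * kn[1]
print(rel, math.cos(5 - 2))  # both equal cos(m - n)
```

At test time, positions m larger than any seen in training feed unseen angles m * theta into this rotation, which is the OOD effect the paper targets.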
1 code implementation • 27 Jan 2021 • Suyuchen Wang, Ruihui Zhao, Xi Chen, Yefeng Zheng, Bang Liu
Taxonomy is a hierarchically structured knowledge graph that plays a crucial role in machine intelligence.
1 code implementation • 6 Jun 2021 • Qianren Mao, Xi Li, Bang Liu, Shu Guo, Peng Hao, JianXin Li, Lihong Wang
These tokens or phrases may originate from fragmentary textual pieces (e.g., segments) in the original text and are separated into different segments.
1 code implementation • 14 May 2023 • Yu Song, Santiago Miret, Bang Liu
Our experiments in this low-resource training setting show that language models pretrained on scientific text outperform BERT trained on general text.
1 code implementation • Findings (EMNLP) 2021 • Zijing Ou, Qinliang Su, Jianxing Yu, Ruihui Zhao, Yefeng Zheng, Bang Liu
As a first attempt, we modify existing generative hashing models to accommodate the BERT embeddings.
1 code implementation • 28 Mar 2022 • Sijie Cheng, Zhouhong Gu, Bang Liu, Rui Xie, Wei Wu, Yanghua Xiao
Specifically, i) to fully exploit user behavioral information, we extract candidate hyponymy relations that match user interests from query-click concepts; ii) to enhance the semantic information of new concepts and better detect hyponymy relations, we model concepts and relations through both user-generated content and structural information in existing taxonomies and user click logs, by leveraging Pre-trained Language Models and Graph Neural Network combined with Contrastive Learning; iii) to reduce the cost of dataset construction and overcome data skews, we construct a high-quality and balanced training dataset from existing taxonomy with no supervision.
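The contrastive learning mentioned in step ii) typically trains with an InfoNCE-style loss that pulls a concept toward its matched relation and away from negatives. A minimal single-anchor sketch (illustrative; the similarity values and temperature are assumptions, not from the paper):

```python
import math

def info_nce(sim_pos, sims_all, temperature=0.1):
    """InfoNCE contrastive loss for one anchor: maximize the positive
    pair's similarity relative to all candidates (positive + negatives)."""
    logits = [s / temperature for s in sims_all]
    m = max(logits)  # subtract the max for numerical stability
    denom = sum(math.exp(l - m) for l in logits)
    return -(sim_pos / temperature - m - math.log(denom))

# Anchor-positive similarity 0.9 vs. negatives 0.1 and 0.2 → loss near zero
print(info_nce(0.9, [0.9, 0.1, 0.2]))
```

When the positive similarity dominates the negatives, the loss approaches zero; hard negatives with similar scores drive it up, which is what shapes the learned concept embeddings.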
1 code implementation • 12 Oct 2021 • Jiayuan Ding, Tong Xiang, Zijing Ou, Wangyang Zuo, Ruihui Zhao, Chenghua Lin, Yefeng Zheng, Bang Liu
In this paper, we introduce a new task named Reading Path Generation (RPG) which aims at automatically producing a path of papers to read for a given query.
1 code implementation • NAACL 2021 • Zhengxu Hou, Bang Liu, Ruihui Zhao, Zijing Ou, Yafei Liu, Xi Chen, Yefeng Zheng
For task-oriented dialog systems, training a Reinforcement Learning (RL) based Dialog Management module suffers from low sample efficiency and slow convergence speed due to the sparse rewards in RL. To solve this problem, many strategies have been proposed to give proper rewards when training RL, but their rewards lack interpretability and cannot accurately estimate the distribution of state-action pairs in real dialogs.
3 code implementations • ACL 2021 • Zijing Ou, Qinliang Su, Jianxing Yu, Bang Liu, Jingwen Wang, Ruihui Zhao, Changyou Chen, Yefeng Zheng
With the need of fast retrieval speed and small memory footprint, document hashing has been playing a crucial role in large-scale information retrieval.
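Document hashing gets its speed and memory advantages from comparing short binary codes by Hamming distance instead of dense vectors. A minimal retrieval sketch (illustrative; the codes and ids are made up, and this is not the paper's model):

```python
def hamming(a, b):
    """Hamming distance between two equal-length binary codes."""
    return sum(x != y for x, y in zip(a, b))

def retrieve(query_code, index, k=2):
    """Return the k stored doc ids whose codes are nearest to the query
    in Hamming distance; short binary codes make this comparison cheap."""
    return sorted(index, key=lambda d: hamming(index[d], query_code))[:k]

index = {"doc1": [1, 0, 1, 1], "doc2": [0, 0, 0, 1], "doc3": [1, 0, 1, 0]}
print(retrieve([1, 0, 1, 1], index))  # → ['doc1', 'doc3']
```

In practice the codes come from a learned hashing model and the scan is replaced by bitwise operations or multi-index lookup, but the retrieval principle is the same.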
1 code implementation • ICLR 2022 • Shengyao Lu, Bang Liu, Keith G. Mills, Shangling Jui, Di Niu
Systematicity, i.e., the ability to recombine known parts and rules to form new sequences while reasoning over relational data, is critical to machine intelligence.
1 code implementation • 26 Jan 2024 • Shengyao Lu, Keith G. Mills, Jiao He, Bang Liu, Di Niu
Understanding the decision-making process of Graph Neural Networks (GNNs) is crucial to their interpretability.
no code implementations • 1 Mar 2018 • Bang Liu, Ting Zhang, Fred X. Han, Di Niu, Kunfeng Lai, Yu Xu
The proposed sentence factorization technique leads to the invention of: 1) a new unsupervised distance metric which calculates the semantic distance between a pair of text snippets by solving a penalized optimal transport problem while preserving the logical relationship of words in the reordered sentences, and 2) new multi-scale deep learning models for supervised semantic training, based on factorized sentence hierarchies.
no code implementations • 27 Feb 2019 • Ting Zhang, Bang Liu, Di Niu, Kunfeng Lai, Yu Xu
In this paper, we are especially interested in relevance matching between a piece of short text and a long document, which is critical to problems like query-document matching in information retrieval and web searching.
no code implementations • 27 Feb 2019 • Bang Liu, Mingjun Zhao, Di Niu, Kunfeng Lai, Yancheng He, Haojie Wei, Yu Xu
In CGC-QG, we design a multi-task labeling strategy to identify whether a question word should be copied from the input passage or be generated instead, guiding the model to learn the accurate boundaries between copying and generation.
no code implementations • 21 May 2019 • Bang Liu, Weidong Guo, Di Niu, Chaoyue Wang, Shunnan Xu, Jinghong Lin, Kunfeng Lai, Yu Xu
We further present our techniques to tag documents with user-centered concepts and to construct a topic-concept-instance taxonomy, which has helped to improve search as well as news feeds recommendation in Tencent QQ Browser.
no code implementations • 27 Oct 2020 • Mingjun Zhao, ShengLi Yan, Bang Liu, Xinwang Zhong, Qian Hao, Haolan Chen, Di Niu, Bowei Long, Weidong Guo
In this paper, we present QBSUM, a high-quality large-scale dataset consisting of 49,000+ data samples for the task of Chinese query-based document summarization.
no code implementations • ACL 2021 • Yi Cheng, SiYao Li, Bang Liu, Ruihui Zhao, Sujian Li, Chenghua Lin, Yefeng Zheng
This paper explores the task of Difficulty-Controllable Question Generation (DCQG), which aims at generating questions with required difficulty levels.
no code implementations • 28 May 2021 • Junnan Liu, Qianren Mao, Bang Liu, Hao Peng, Hongdong Zhu, JianXin Li
In this paper, we argue that this limitation can be overcome by a semi-supervised approach: consistency training, which leverages large amounts of unlabeled data to improve the performance of supervised learning over a small corpus.
no code implementations • 4 Jun 2021 • Tong Mo, Bang Liu
Keyword spotting aims to identify specific keyword audio utterances.
no code implementations • Findings (ACL) 2021 • Zhexue Chen, Hong Huang, Bang Liu, Xuanhua Shi, Hai Jin
Aspect Sentiment Triplet Extraction (ASTE) aims to extract triplets from sentences, where each triplet includes an entity, its associated sentiment, and the opinion span explaining the reason for the sentiment.
no code implementations • 13 Oct 2021 • Jiuding Yang, Weidong Guo, Bang Liu, Yakun Yu, Chaoyue Wang, Jinwen Luo, Linglong Kong, Di Niu, Zhen Wen
Although conceptualization has been widely studied in semantics and knowledge representation, it is still challenging to find the most accurate concept phrases to characterize the main idea of a text snippet on the fast-growing social media.
no code implementations • 10 Jan 2022 • Martin Weyssow, Houari Sahraoui, Bang Liu
Recent years have seen tremendous progress in code modeling, driven by natural language processing approaches built on state-of-the-art model architectures.
no code implementations • 13 Jan 2022 • Yuyan Chen, Yanghua Xiao, Bang Liu
In this research, we argue that the evidence for an answer is critical to enhancing the interpretability of QA models.
no code implementations • 28 Feb 2022 • Zong-Kai Liu, Li-Hua Zhang, Bang Liu, Zheng-Yuan Zhang, Guang-Can Guo, Dong-Sheng Ding, Bao-Sen Shi
Recognition of multifrequency microwave (MW) electric fields is challenging because of the complex interference of multifrequency fields in practical applications.
no code implementations • ACL 2022 • Xiaoqiang Wang, Bang Liu, Fangli Xu, Bo Long, Siliang Tang, Lingfei Wu
In this paper, we argue that a deep understanding of model capabilities and data properties can help us feed a model with appropriate training data based on its learning status.
no code implementations • 29 Apr 2022 • Xiaoqiang Wang, Bang Liu, Siliang Tang, Lingfei Wu
Existing metrics for assessing question generation not only require costly human references but also fail to take into account the input context of generation, resulting in a shallow understanding of the relevance between the generated questions and their input contexts.
no code implementations • 17 May 2022 • Ailisi Li, Xueyao Jiang, Bang Liu, Jiaqing Liang, Yanghua Xiao
Math Word Problems (MWP) is an important task that requires the ability of understanding and reasoning over mathematical text.
no code implementations • 8 May 2023 • Xiaoqiang Wang, Bang Liu, Siliang Tang, Lingfei Wu
We present SkillQG: a question generation framework with controllable comprehension types for assessing and improving machine reading comprehension models.
no code implementations • 27 May 2023 • Zhong Zhang, Bang Liu, Junming Shao
Pre-trained language models (PLMs) are known to be overly parameterized and to have significant redundancy, indicating that PLMs have a small number of effective degrees of freedom.
no code implementations • 16 Sep 2023 • Hossein Rajabzadeh, Suyuchen Wang, Hyock Ju Kwon, Bang Liu
We employ a tool-interacting divide-and-conquer strategy enabling large language models (LLMs) to answer complex multimodal multi-hop questions.
no code implementations • 1 Dec 2023 • Dekun Wu, Haochen Shi, Zhiyuan Sun, Bang Liu
In this study, we explore the application of Large Language Models (LLMs) in Jubensha, a Chinese detective role-playing game and a novel area in Artificial Intelligence (AI) driven gaming.
no code implementations • 29 Feb 2024 • Xiaoqiang Wang, Bang Liu, Lingfei Wu
In this paper, we present FAC²E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
no code implementations • 5 Mar 2024 • Haochen Shi, Zhiyuan Sun, Xingdi Yuan, Marc-Alexandre Côté, Bang Liu
Embodied Instruction Following (EIF) is a crucial task in embodied learning, requiring agents to interact with their environment through egocentric observations to fulfill natural language instructions.