Search Results for author: Damai Dai

Found 32 papers, 19 papers with code

Chinese Word-Formation Prediction based on Representations of Word-Related Features (基于词信息嵌入的汉语构词结构识别研究)

no code implementations CCL 2021 Hua Zheng, Yaqi Yan, Yue Wang, Damai Dai, Yang Liu

As Chinese is a parataxis-oriented language, its word-formation structures characterize how word-forming constituents combine and are key to perceiving and understanding word meaning. In Chinese information processing, previous work on word-formation structure recognition has mostly reused coarse-grained syntax-level labels and modeled mainly inter-word information such as context, neglecting the role of word-internal information such as morpheme meanings and word meanings. This paper adopts a linguistically grounded label scheme for word-formation structures, builds a dataset of Chinese word-formation structures and related information, and proposes a model based on Bi-LSTM and self-attention to investigate how word-internal and inter-word information affects word-formation structure recognition and what performance can be reached. The experiments achieve good prediction results, with an accuracy of 77.87% and an F1 score of 78.36%. Comparative tests further reveal that word-internal morpheme-meaning information contributes significantly to word-formation structure recognition, whereas inter-word contextual information contributes less and is rather unstable. The proposed method and dataset offer new perspectives and resources for a variety of Chinese information processing tasks, such as morpheme and word-structure analysis, word sense recognition and generation, linguistic research, and lexicography.
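
The abstract above names a Bi-LSTM plus self-attention architecture over word-internal (morpheme) and word-level features. Below is a minimal sketch, assuming a plain PyTorch setup, of what such a classifier over morpheme embeddings could look like; the feature set, dimensions, and label count are illustrative placeholders rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class FormationClassifier(nn.Module):
    """Toy Bi-LSTM + self-attention classifier over a sequence of morpheme embeddings."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128, num_labels=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden_dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_labels)

    def forward(self, morpheme_ids):
        x = self.embed(morpheme_ids)      # (batch, seq, embed_dim)
        h, _ = self.bilstm(x)             # contextualize morphemes in both directions
        a, _ = self.attn(h, h, h)         # self-attention over the morpheme sequence
        pooled = a.mean(dim=1)            # average-pool into a word-level vector
        return self.classifier(pooled)    # word-formation structure logits

# Example: a two-morpheme word, batch of one.
model = FormationClassifier(vocab_size=5000)
logits = model(torch.tensor([[17, 42]]))
```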

Large Language Models Are Unconscious of Unreasonability in Math Problems

no code implementations 28 Mar 2024 Jingyuan Ma, Damai Dai, Lei Sha, Zhifang Sui

Large language models (LLMs) demonstrate substantial capabilities in solving math problems.

Math

PeriodicLoRA: Breaking the Low-Rank Bottleneck in LoRA Optimization

no code implementations 25 Feb 2024 Xiangdi Meng, Damai Dai, Weiyao Luo, Zhe Yang, Shaoxiang Wu, Xiaochen Wang, Peiyi Wang, Qingxiu Dong, Liang Chen, Zhifang Sui

Although LoRA fine-tuning is effective, there is still a performance gap compared to full fine-tuning, since its weight update is limited to low-rank matrices.
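
The excerpt refers to LoRA constraining the weight update to low-rank matrices. A minimal sketch of that standard parameterization follows, with illustrative rank, scaling, and initialization choices; it shows only the base LoRA update the paper builds on, not the PeriodicLoRA schedule itself.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight W plus a trainable low-rank update: W + scaling * B @ A."""
    def __init__(self, in_features, out_features, rank=8, alpha=16.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)  # trainable
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))        # trainable, zero-init
        self.scaling = alpha / rank

    def forward(self, x):
        base = x @ self.weight.T
        update = (x @ self.lora_A.T) @ self.lora_B.T  # rank-r bottleneck limits the update
        return base + self.scaling * update

# The effective weight is W + scaling * B @ A, so the update's rank is at most `rank`.
layer = LoRALinear(768, 768)
y = layer(torch.randn(4, 768))
```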

Language Models Understand Numbers, at Least Partially

no code implementations 8 Jan 2024 Fangwei Zhu, Damai Dai, Zhifang Sui

Large language models (LLMs) have exhibited impressive competence in various tasks, but their opaque internal mechanisms hinder their use in mathematical problems.

Math

Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations

1 code implementation 14 Dec 2023 Peiyi Wang, Lei Li, Zhihong Shao, R. X. Xu, Damai Dai, Yifei Li, Deli Chen, Y. Wu, Zhifang Sui

In this paper, we present an innovative process-oriented math process reward model called Math-Shepherd, which assigns a reward score to each step of math problem solutions.

Ranked #13 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning · GSM8K · +2
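
Math-Shepherd, described above, assigns a reward to each step of a candidate solution. The snippet below is only a schematic of how such per-step rewards could be used to rerank solutions; `score_step` is a hypothetical stand-in for the trained process reward model, and min-aggregation is one common convention, not necessarily the paper's exact choice.

```python
from typing import Callable, List

def rank_solutions(question: str,
                   candidates: List[List[str]],
                   score_step: Callable[[str, List[str]], float]) -> List[int]:
    """Rank candidate solutions (each a list of steps) by their weakest step's reward."""
    solution_scores = []
    for steps in candidates:
        # Score each step given the question and the steps produced so far.
        step_scores = [score_step(question, steps[: i + 1]) for i in range(len(steps))]
        solution_scores.append(min(step_scores))  # a solution is as good as its worst step
    # Indices of candidates, best first.
    return sorted(range(len(candidates)), key=lambda i: solution_scores[i], reverse=True)

# Usage with a dummy scorer (placeholder only; a real scorer would call the reward model).
dummy = lambda q, prefix: 1.0 / len(prefix)
order = rank_solutions("2+2=?", [["2+2=4"], ["2+2=3", "so 3"]], dummy)
```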

Not All Demonstration Examples are Equally Beneficial: Reweighting Demonstration Examples for In-Context Learning

1 code implementation 12 Oct 2023 Zhe Yang, Damai Dai, Peiyi Wang, Zhifang Sui

To assess the quality of weights in the absence of additional validation data, we design a masked self-prediction (MSP) score that exhibits a strong correlation with the final ICL performance.

In-Context Learning · text-classification · +1
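
The excerpt does not spell out how the masked self-prediction (MSP) score is computed; the sketch below is one plausible, heavily simplified reading (mask each demonstration's label and check how well the model recovers it from the remaining demonstrations), with `label_logprob` as a hypothetical LLM-scoring callback.

```python
from typing import Callable, List, Tuple

def masked_self_prediction_weights(
    demos: List[Tuple[str, str]],
    label_logprob: Callable[[str, str], float],
) -> List[float]:
    """Heuristic per-demonstration weights: how well each demo's label is recovered
    when it is masked out and predicted from the remaining demonstrations."""
    weights = []
    for i, (x_i, y_i) in enumerate(demos):
        context = "\n".join(f"{x}\t{y}" for j, (x, y) in enumerate(demos) if j != i)
        prompt = f"{context}\n{x_i}\t"
        weights.append(label_logprob(prompt, y_i))  # higher = more consistent demo
    return weights

# Dummy scorer for illustration only; a real scorer would return log P(label | prompt).
dummy = lambda prompt, label: -float(len(prompt))
w = masked_self_prediction_weights([("good movie", "pos"), ("bad film", "neg")], dummy)
```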

Denoising Bottleneck with Mutual Information Maximization for Video Multimodal Fusion

1 code implementation 24 May 2023 Shaoxiang Wu, Damai Dai, Ziwei Qin, Tianyu Liu, Binghuai Lin, Yunbo Cao, Zhifang Sui

However, unlike other image-text multimodal tasks, video has longer multimodal sequences with more redundancy and noise in both visual and audio modalities.

Denoising · Multimodal Sentiment Analysis

Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning

1 code implementation 23 May 2023 Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun

In-context learning (ICL) emerges as a promising capability of large language models (LLMs) by providing them with demonstration examples to perform diverse tasks.

In-Context Learning

A Survey on In-context Learning

1 code implementation 31 Dec 2022 Qingxiu Dong, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, Lei Li, Zhifang Sui

With the increasing ability of large language models (LLMs), in-context learning (ICL) has become a new paradigm for natural language processing (NLP), where LLMs make predictions based only on contexts augmented with a few examples.

In-Context Learning
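
As the survey's definition above suggests, ICL amounts to conditioning a frozen model on a context augmented with a few input-output examples. A minimal prompt-construction sketch follows; the template is illustrative.

```python
from typing import List, Tuple

def build_icl_prompt(demos: List[Tuple[str, str]], query: str) -> str:
    """Concatenate demonstrations and the query; the model completes the final answer."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

prompt = build_icl_prompt(
    demos=[("I loved this movie.", "positive"), ("Terrible plot.", "negative")],
    query="The acting was superb.",
)
# The prompt is sent to a frozen LLM; no parameters are updated.
```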

Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers

1 code implementation 20 Dec 2022 Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Shuming Ma, Zhifang Sui, Furu Wei

We comprehensively compare the behaviors of in-context learning and explicit finetuning on real tasks to provide empirical evidence that supports our understanding.

In-Context Learning · Open-Ended Question Answering
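
The comparison above rests on the paper's dual view of attention as an implicit weight update. The toy NumPy check below illustrates that dual form under a simplified linear (unnormalized) attention assumption: attending to demonstration tokens equals applying an outer-product update to the zero-shot weight. It is a conceptual sketch, not the paper's full derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
W_zsl = rng.normal(size=(d, d))   # "zero-shot" weight contributed by the query's own context
K = rng.normal(size=(d, 3))       # keys of 3 demonstration tokens
V = rng.normal(size=(d, 3))       # values of 3 demonstration tokens
q = rng.normal(size=(d,))         # query vector

# Linear attention over demonstrations: W_zsl q + V K^T q
attn_out = W_zsl @ q + V @ (K.T @ q)

# Dual form: the demonstrations act as an implicit update delta_W = sum_i v_i k_i^T
delta_W = sum(np.outer(V[:, i], K[:, i]) for i in range(K.shape[1]))
dual_out = (W_zsl + delta_W) @ q

assert np.allclose(attn_out, dual_out)   # the two views coincide
```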

Neural Knowledge Bank for Pretrained Transformers

no code implementations 31 Jul 2022 Damai Dai, Wenbin Jiang, Qingxiu Dong, Yajuan Lyu, Qiaoqiao She, Zhifang Sui

The ability of pretrained Transformers to remember factual knowledge is essential but still limited for existing models.

Language Modelling · Machine Translation · +2

Robust Fine-tuning via Perturbation and Interpolation from In-batch Instances

1 code implementation 2 May 2022 Shoujie Tong, Qingxiu Dong, Damai Dai, YiFan Song, Tianyu Liu, Baobao Chang, Zhifang Sui

For each instance in a batch, we let the other instances in the same batch interact with it.

StableMoE: Stable Routing Strategy for Mixture of Experts

1 code implementation ACL 2022 Damai Dai, Li Dong, Shuming Ma, Bo Zheng, Zhifang Sui, Baobao Chang, Furu Wei

We point out that existing learning-to-route MoE methods suffer from the routing fluctuation issue, i.e., the target expert of the same input may change along with training, whereas only one expert is activated for that input during inference.

Language Modelling · Machine Translation
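
The routing fluctuation issue concerns learning-to-route MoE layers in which each token's argmax expert can drift as the router trains, even though only one expert fires at inference. The sketch below shows a generic top-1 learned router to make that concrete; it does not reproduce StableMoE's own stabilization strategy.

```python
import torch
import torch.nn as nn

class Top1MoE(nn.Module):
    """Generic top-1 learned routing: each token is sent to its argmax-scoring expert."""
    def __init__(self, d_model=64, num_experts=4):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(num_experts))

    def forward(self, x):                       # x: (num_tokens, d_model)
        scores = self.router(x).softmax(dim=-1)
        expert_ids = scores.argmax(dim=-1)      # target expert per token; as the router's
                                                # parameters move during training, this argmax
                                                # can flip (the routing fluctuation issue)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = expert_ids == e
            if mask.any():
                # Scale by the gate value so the router still receives gradient.
                out[mask] = scores[mask, e].unsqueeze(-1) * expert(x[mask])
        return out

moe = Top1MoE()
y = moe(torch.randn(10, 64))
```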

Mixture of Experts for Biomedical Question Answering

no code implementations 15 Apr 2022 Damai Dai, Wenbin Jiang, Jiyuan Zhang, Weihua Peng, Yajuan Lyu, Zhifang Sui, Baobao Chang, Yong Zhu

In this paper, to alleviate the parameter competition problem, we propose a Mixture-of-Experts (MoE) based question answering method called MoEBQA that decouples the computation for different types of questions via sparse routing.

Question Answering

Hierarchical Curriculum Learning for AMR Parsing

1 code implementation ACL 2022 Peiyi Wang, Liang Chen, Tianyu Liu, Damai Dai, Yunbo Cao, Baobao Chang, Zhifang Sui

Abstract Meaning Representation (AMR) parsing aims to translate sentences to semantic representation with a hierarchical structure, and is recently empowered by pretrained sequence-to-sequence models.

AMR Parsing · Representation Learning

Behind the Scenes: An Exploration of Trigger Biases Problem in Few-Shot Event Classification

1 code implementation 29 Aug 2021 Peiyi Wang, Runxin Xu, Tianyu Liu, Damai Dai, Baobao Chang, Zhifang Sui

However, we find they suffer from trigger biases, i.e., a statistical homogeneity between certain trigger words and target event types, which we summarize as trigger overlapping and trigger separability.

Explicit Interaction Network for Aspect Sentiment Triplet Extraction

no code implementations 21 Jun 2021 Peiyi Wang, Tianyu Liu, Damai Dai, Runxin Xu, Baobao Chang, Zhifang Sui

The table encoder extracts sentiment at the token-pair level, so that compositional features between targets and opinions can be easily captured.

Aspect Sentiment Triplet Extraction · Sentence · +1
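
The "table encoder" mentioned above points to a table-filling view of triplet extraction in which every token pair (i, j) receives a joint representation that is then classified. The sketch below shows a generic way to build such a pair table from token encodings; the pairing function and tag set are illustrative, not the paper's exact design.

```python
import torch
import torch.nn as nn

class PairTable(nn.Module):
    """Build a (seq, seq) table of token-pair representations and classify each cell."""
    def __init__(self, hidden=128, num_tags=4):
        super().__init__()
        self.pair_proj = nn.Linear(2 * hidden, hidden)
        self.cell_classifier = nn.Linear(hidden, num_tags)

    def forward(self, h):                          # h: (batch, seq, hidden) token encodings
        b, n, d = h.shape
        rows = h.unsqueeze(2).expand(b, n, n, d)   # token i broadcast along columns
        cols = h.unsqueeze(1).expand(b, n, n, d)   # token j broadcast along rows
        cells = torch.tanh(self.pair_proj(torch.cat([rows, cols], dim=-1)))
        return self.cell_classifier(cells)         # (batch, seq, seq, num_tags)

table = PairTable()
logits = table(torch.randn(2, 6, 128))
```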

Decompose, Fuse and Generate: A Formation-Informed Method for Chinese Definition Generation

no code implementations NAACL 2021 Hua Zheng, Damai Dai, Lei Li, Tianyu Liu, Zhifang Sui, Baobao Chang, Yang Liu

In this paper, we tackle the task of Definition Generation (DG) in Chinese, which aims at automatically generating a definition for a word.

Knowledge Neurons in Pretrained Transformers

3 code implementations ACL 2022 Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, Furu Wei

In this paper, we present preliminary studies on how factual knowledge is stored in pretrained Transformers by introducing the concept of knowledge neurons.
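
Knowledge neurons are located among the FFN activations of a pretrained Transformer. As a simplified illustration of the kind of check used to validate them, the sketch below suppresses a single intermediate activation with a forward hook and measures the effect on the output; the toy FFN, layer choice, and neuron index are placeholders, and the paper's attribution method itself is not reproduced here.

```python
import torch
import torch.nn as nn

# Toy stand-in for one Transformer FFN block; in practice this would be a layer of a
# pretrained model, accessed by name inside the loaded checkpoint.
ffn = nn.Sequential(nn.Linear(16, 64), nn.GELU(), nn.Linear(64, 16))

def scale_neuron(neuron_idx: int, factor: float):
    """Return a forward hook that rescales one intermediate activation."""
    def hook(module, inputs, output):
        output = output.clone()
        output[..., neuron_idx] *= factor   # suppress (factor=0) or amplify (factor=2) it
        return output
    return hook

x = torch.randn(1, 16)
baseline = ffn(x)

handle = ffn[1].register_forward_hook(scale_neuron(neuron_idx=7, factor=0.0))
suppressed = ffn(x)
handle.remove()

# For a real knowledge probe, one would compare the model's probability of the correct
# answer before and after suppression; a large drop marks a candidate knowledge neuron.
effect = (baseline - suppressed).abs().sum()
```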

Incorporating Connections Beyond Knowledge Embeddings: A Plug-and-Play Module to Enhance Commonsense Reasoning in Machine Reading Comprehension

no code implementations 26 Mar 2021 Damai Dai, Hua Zheng, Zhifang Sui, Baobao Chang

Conventional Machine Reading Comprehension (MRC) has been well addressed by pattern matching, but commonsense reasoning remains a gap between humans and machines.

Knowledge Graph Embeddings · Knowledge Graphs · +1

Coarse-to-Fine Entity Representations for Document-level Relation Extraction

1 code implementation 4 Dec 2020 Damai Dai, Jing Ren, Shuang Zeng, Baobao Chang, Zhifang Sui

In classification, we combine the entity representations from both levels into more comprehensive representations for relation extraction.

Document-level Relation Extraction · Relation

Learning to Control the Fine-grained Sentiment for Story Ending Generation

no code implementations ACL 2019 Fuli Luo, Damai Dai, Pengcheng Yang, Tianyu Liu, Baobao Chang, Zhifang Sui, Xu Sun

Therefore, we propose a generic and novel framework which consists of a sentiment analyzer and a sentimental generator, respectively addressing the two challenges.

Text Generation

Sememe Prediction: Learning Semantic Knowledge from Unstructured Textual Wiki Descriptions

no code implementations 16 Aug 2018 Wei Li, Xuancheng Ren, Damai Dai, Yunfang Wu, Houfeng Wang, Xu Sun

In the experiments, we take a real-world sememe knowledge base HowNet and the corresponding descriptions of the words in Baidu Wiki for training and evaluation.
