Search Results for author: Li Du

Found 45 papers, 17 papers with code

CogBERT: Cognition-Guided Pre-trained Language Models

1 code implementation COLING 2022 Xiao Ding, Bowen Chen, Li Du, Bing Qin, Ting Liu

To fill the gap, we propose CogBERT, a framework that can induce fine-grained cognitive features from cognitive data and incorporate cognitive features into BERT by adaptively adjusting the weight of cognitive features for different NLP tasks.

EEG

Neural Natural Logic Inference for Interpretable Question Answering

1 code implementation EMNLP 2021 Jihao Shi, Xiao Ding, Li Du, Ting Liu, Bing Qin

Many open-domain question answering problems can be cast as a textual entailment task, where a question and candidate answers are concatenated to form hypotheses.

Multiple-choice Natural Language Inference +1

SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking

1 code implementation 5 Jul 2024 Xingrun Xing, Boyan Gao, Zheng Zhang, David A. Clifton, Shitao Xiao, Li Du, Guoqi Li, Jiajun Zhang

In contrast, human brains, which contain approximately 86 billion biological neurons, exhibit significantly greater energy efficiency compared to LLMs with a similar number of parameters.

Language Modelling Large Language Model +1

SFC: Achieve Accurate Fast Convolution under Low-precision Arithmetic

no code implementations 3 Jul 2024 Liulu He, Yufei Zhao, Rui Gao, Yuan Du, Li Du

Fast convolution algorithms, including Winograd and FFT, can efficiently accelerate convolution operations in deep models.

Quantization

Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation

1 code implementation 26 May 2024 Rongyu Zhang, Aosong Cheng, Yulin Luo, Gaole Dai, Huanrui Yang, Jiaming Liu, Ran Xu, Li Du, Yuan Du, Yanbing Jiang, Shanghang Zhang

Continual Test-Time Adaptation (CTTA), which aims to adapt the pre-trained model to ever-evolving target domains, emerges as an important task for vision models.

feature selection Test-time Adaptation

Medical Dialogue: A Survey of Categories, Methods, Evaluation and Challenges

no code implementations 17 May 2024 Xiaoming Shi, Zeming Liu, Li Du, Yuxuan Wang, Hongru Wang, Yuhang Guo, Tong Ruan, Jie Xu, Shaoting Zhang

As a result, an overview of the categories, methods, and evaluation of medical dialogue systems remains limited and underspecified, hindering the further improvement of this area.

Intuition-aware Mixture-of-Rank-1-Experts for Parameter Efficient Finetuning

no code implementations 13 Apr 2024 Yijiang Liu, Rongyu Zhang, Huanrui Yang, Kurt Keutzer, Yuan Du, Li Du, Shanghang Zhang

Large Language Models (LLMs) have demonstrated significant potential in performing multiple tasks in multimedia applications, ranging from content generation to interactive entertainment, and artistic creation.

Diversity

Towards Generalizable and Faithful Logic Reasoning over Natural Language via Resolution Refutation

1 code implementation 2 Apr 2024 Zhouhao Sun, Xiao Ding, Li Du, Bibo Cai, Jinglong Gao, Ting Liu, Qin Bing

To address this issue, we propose a novel framework, named Generalizable and Faithful Reasoner (GFaiR), which introduces the paradigm of resolution refutation.

Deciphering the Impact of Pretraining Data on Large Language Models through Machine Unlearning

no code implementations 18 Feb 2024 Yang Zhao, Li Du, Xiao Ding, Kai Xiong, Zhouhao Sun, Jun Shi, Ting Liu, Bing Qin

Through pretraining on a corpus with various sources, Large Language Models (LLMs) have gained impressive performance.

Machine Unlearning

Proximity QA: Unleashing the Power of Multi-Modal Large Language Models for Spatial Proximity Analysis

1 code implementation 31 Jan 2024 Jianing Li, Xi Nan, Ming Lu, Li Du, Shanghang Zhang

To overcome this limitation in MLLMs, we introduce Proximity Question Answering (Proximity QA), a novel framework designed to enable MLLMs to infer the proximity relationship between objects in images.

Multi-Task Learning Question Answering +1

PromptCoT: Align Prompt Distribution via Adapted Chain-of-Thought

no code implementations CVPR 2024 Junyi Yao, Yijiang Liu, Zhen Dong, Mingfei Guo, Helan Hu, Kurt Keutzer, Li Du, Daquan Zhou, Shanghang Zhang

Considering computational efficiency, instead of allocating a dedicated LLM for prompt enhancement to each individual model or dataset, we integrate adapters that facilitate dataset-specific adaptation, leveraging a shared pre-trained LLM as the foundation for this process.

Computational Efficiency Prompt Engineering +1

Principled Gradient-based Markov Chain Monte Carlo for Text Generation

no code implementations 29 Dec 2023 Li Du, Afra Amini, Lucas Torroba Hennigen, Xinyan Velocity Yu, Jason Eisner, Holden Lee, Ryan Cotterell

Recent papers have demonstrated the possibility of energy-based text generation by adapting gradient-based sampling algorithms, a paradigm of MCMC algorithms that promises fast convergence.

Language Modelling Text Generation

Formal Aspects of Language Modeling

no code implementations 7 Nov 2023 Ryan Cotterell, Anej Svete, Clara Meister, Tianyu Liu, Li Du

Large language models have become one of the most commonly deployed NLP inventions.

Language Modelling

On the Representational Capacity of Recurrent Neural Language Models

1 code implementation 19 Oct 2023 Franz Nowak, Anej Svete, Li Du, Ryan Cotterell

We extend the Turing completeness result to the probabilistic case, showing how a rationally weighted RLM with unbounded computation time can simulate any deterministic probabilistic Turing machine (PTM) with rationally weighted transitions.

Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis

no code implementations 11 Sep 2023 Li Du, Yequan Wang, Xingrun Xing, Yiqun Ya, Xiang Li, Xin Jiang, Xuezhi Fang

Although demonstrating superb performance on various NLP tasks, large language models (LLMs) still suffer from the hallucination problem, which threatens the reliability of LLMs.

Hallucination Instruction Following +2

FLM-101B: An Open LLM and How to Train It with $100K Budget

no code implementations 7 Sep 2023 Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin, Zheng Zhang, Aixin Sun, Yequan Wang

We demonstrate that a 101B-parameter LLM with 0.31T tokens can be trained with a budget of 100K US dollars.

Memorization

FPTQ: Fine-grained Post-Training Quantization for Large Language Models

no code implementations 30 Aug 2023 Qingyuan Li, Yifan Zhang, Liang Li, Peng Yao, Bo Zhang, Xiangxiang Chu, Yerui Sun, Li Du, Yuchen Xie

In this study, we propose a novel W4A8 post-training quantization method for the available open-sourced LLMs, which combines the advantages of both two recipes.

Quantization

A Formal Perspective on Byte-Pair Encoding

1 code implementation 29 Jun 2023 Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Tim Vieira, Mrinmaya Sachan, Ryan Cotterell

Via submodular functions, we prove that the iterative greedy version is a $\frac{1}{{\sigma(\boldsymbol{\mu}^\star)}}(1-e^{-{\sigma(\boldsymbol{\mu}^\star)}})$-approximation of an optimal merge sequence, where ${\sigma(\boldsymbol{\mu}^\star)}$ is the total backward curvature with respect to the optimal merge sequence $\boldsymbol{\mu}^\star$.

Combinatorial Optimization

Structured Voronoi Sampling

1 code implementation NeurIPS 2023 Afra Amini, Li Du, Ryan Cotterell

In this paper, we take an important step toward building a principled approach for sampling from language models with gradient-based methods.

Text Generation

Autoregressive Modeling with Lookahead Attention

no code implementations 20 May 2023 Li Du, Hongyuan Mei, Jason Eisner

To predict the next token, autoregressive models ordinarily examine the past.

Morphological Inflection

A Measure-Theoretic Characterization of Tight Language Models

no code implementations 20 Dec 2022 Li Du, Lucas Torroba Hennigen, Tiago Pimentel, Clara Meister, Jason Eisner, Ryan Cotterell

Language modeling, a central task in natural language processing, involves estimating a probability distribution over strings.

Language Modelling

ReCo: Reliable Causal Chain Reasoning via Structural Causal Recurrent Neural Networks

1 code implementation 16 Dec 2022 Kai Xiong, Xiao Ding, Zhongyang Li, Li Du, Bing Qin, Yi Zheng, Baoxing Huai

Causal chain reasoning (CCR) is an essential ability for many decision-making AI systems, which requires the model to build reliable causal chains by connecting causal pairs.

Decision Making

CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification

no code implementations 6 Dec 2022 Lirui Xiao, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang

CSQ stabilizes the bit-level mixed-precision training process with a bi-level gradual continuous sparsification on both the bit values of the quantized weights and the bit selection in determining the quantization precision of each layer.

Quantization

NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers

no code implementations CVPR 2023 Yijiang Liu, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang

Building on the theoretical insight, NoisyQuant achieves the first success on actively altering the heavy-tailed activation distribution with additive noisy bias to fit a given quantizer.

Quantization

Uncertainty Guided Depth Fusion for Spike Camera

no code implementations 26 Aug 2022 Jianing Li, Jiaming Liu, Xiaobao Wei, Jiyuan Zhang, Ming Lu, Lei Ma, Li Du, Tiejun Huang, Shanghang Zhang

In this paper, we propose a novel Uncertainty-Guided Depth Fusion (UGDF) framework to fuse the predictions of monocular and stereo depth estimation networks for spike camera.

Autonomous Driving Stereo Depth Estimation

DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination

no code implementations 21 Aug 2022 Tingting Wu, Xiao Ding, Hao Zhang, Jinglong Gao, Li Du, Bing Qin, Ting Liu

To relieve this issue, curriculum learning is proposed to improve model performance and generalization by ordering training samples in a meaningful (e.g., easy to hard) sequence.

Image Classification regression

Text Difficulty Study: Do machines behave the same as humans regarding text difficulty?

no code implementations 14 Aug 2022 Bowen Chen, Xiao Ding, Li Du, Qin Bing, Ting Liu

Given a task, humans learn from easy to hard, whereas the model learns randomly.

A Graph Enhanced BERT Model for Event Prediction

no code implementations Findings (ACL) 2022 Li Du, Xiao Ding, Yue Zhang, Kai Xiong, Ting Liu, Bing Qin

To this end, we incorporate an additional structured variable into BERT to learn to predict the event connections in the training process.

e-CARE: a New Dataset for Exploring Explainable Causal Reasoning

1 code implementation ACL 2022 Li Du, Xiao Ding, Kai Xiong, Ting Liu, Bing Qin

Understanding causality has vital importance for various Natural Language Processing (NLP) applications.

valid

Binary Neural Networks as a general-propose compute paradigm for on-device computer vision

no code implementations 8 Feb 2022 Guhong Nie, Lirui Xiao, Menglong Zhu, Dongliang Chu, Yue Shen, Peng Li, Kang Yang, Li Du, Bo Chen

For binary neural networks (BNNs) to become the mainstream on-device computer vision algorithm, they must achieve a superior speed-vs-accuracy tradeoff than 8-bit quantization and establish a similar degree of general applicability in vision tasks.

Quantization Super-Resolution

Memory-Efficient CNN Accelerator Based on Interlayer Feature Map Compression

no code implementations 12 Oct 2021 Zhuang Shao, Xiaoliang Chen, Li Du, Lei Chen, Yuan Du, Wei Zhuang, Huadong Wei, Chenjia Xie, Zhongfeng Wang

To maintain real-time processing in embedded systems, large on-chip memory is required to buffer the interlayer feature maps.

Feature Compression Quantization

ExCAR: Event Graph Knowledge Enhanced Explainable Causal Reasoning

1 code implementation ACL 2021 Li Du, Xiao Ding, Kai Xiong, Ting Liu, Bing Qin

ExCAR first acquires additional evidence information from a large-scale causal event graph as logical rules for causal reasoning.

Representation Learning

Learning Event Graph Knowledge for Abductive Reasoning

1 code implementation ACL 2021 Li Du, Xiao Ding, Ting Liu, Bing Qin

Abductive reasoning aims at inferring the most plausible explanation for observed events, which would play critical roles in various NLP applications, such as reading comprehension and question answering.

Question Answering Reading Comprehension

Training Deep Neural Networks Using Posit Number System

no code implementations 6 Sep 2019 Jinming Lu, Siyuan Lu, Zhisheng Wang, Chao Fang, Jun Lin, Zhongfeng Wang, Li Du

With the increasing size of Deep Neural Network (DNN) models, the high memory space requirements and computational complexity have become an obstacle for efficient DNN implementations.

Image Classification

The Curious Case of Neural Text Degeneration

16 code implementations ICLR 2020 Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi

Despite considerable advancements with deep neural language models, the enigma of neural text degeneration persists when these models are tested as text generators.

Diversity Language Modelling

A Streaming Accelerator for Deep Convolutional Neural Networks with Image and Feature Decomposition for Resource-limited System Applications

no code implementations 15 Sep 2017 Yuan Du, Li Du, Yilei Li, Junjie Su, Mau-Chung Frank Chang

Deep convolutional neural networks (CNN) are widely used in modern artificial intelligence (AI) and smart vision systems but also limited by computation latency, throughput, and energy efficiency on a resource-limited scenario, such as mobile devices, internet of things (IoT), unmanned aerial vehicles (UAV), and so on.

A Reconfigurable Streaming Deep Convolutional Neural Network Accelerator for Internet of Things

no code implementations 8 Jul 2017 Li Du, Yuan Du, Yilei Li, Mau-Chung Frank Chang

To implement image detection using CNN in the internet of things (IoT) devices, a streaming hardware accelerator is proposed.
