Search Results for author: Li Du

Found 40 papers, 14 papers with code

CogBERT: Cognition-Guided Pre-trained Language Models

1 code implementation • COLING 2022 • Xiao Ding, Bowen Chen, Li Du, Bing Qin, Ting Liu

To fill the gap, we propose CogBERT, a framework that can induce fine-grained cognitive features from cognitive data and incorporate cognitive features into BERT by adaptively adjusting the weight of cognitive features for different NLP tasks.

EEG

Neural Natural Logic Inference for Interpretable Question Answering

1 code implementation • EMNLP 2021 • Jihao Shi, Xiao Ding, Li Du, Ting Liu, Bing Qin

Many open-domain question answering problems can be cast as a textual entailment task, where a question and candidate answers are concatenated to form hypotheses.

Multiple-choice Natural Language Inference +1

Intuition-aware Mixture-of-Rank-1-Experts for Parameter Efficient Finetuning

no code implementations • 13 Apr 2024 • Yijiang Liu, Rongyu Zhang, Huanrui Yang, Kurt Keutzer, Yuan Du, Li Du, Shanghang Zhang

Large Language Models (LLMs) have demonstrated significant potential in performing multiple tasks in multimedia applications, ranging from content generation to interactive entertainment, and artistic creation.

Towards Generalizable and Faithful Logic Reasoning over Natural Language via Resolution Refutation

no code implementations • 2 Apr 2024 • Zhouhao Sun, Xiao Ding, Li Du, Bibo Cai, Jinglong Gao, Ting Liu, Qin Bing

To address this issue, we propose a novel framework, named Generalizable and Faithful Reasoner (GFaiR), which introduces the paradigm of resolution refutation.

Deciphering the Impact of Pretraining Data on Large Language Models through Machine Unlearning

no code implementations • 18 Feb 2024 • Yang Zhao, Li Du, Xiao Ding, Kai Xiong, Zhouhao Sun, Jun Shi, Ting Liu, Bing Qin

Through pretraining on a corpus with various sources, Large Language Models (LLMs) have gained impressive performance.

Machine Unlearning

Proximity QA: Unleashing the Power of Multi-Modal Large Language Models for Spatial Proximity Analysis

1 code implementation • 31 Jan 2024 • Jianing Li, Xi Nan, Ming Lu, Li Du, Shanghang Zhang

To overcome this limitation in MLLMs, we introduce Proximity Question Answering (Proximity QA), a novel framework designed to enable MLLMs to infer the proximity relationship between objects in images.

Multi-Task Learning Question Answering +1

VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness

no code implementations • 15 Jan 2024 • Rongyu Zhang, Zefan Cai, Huanrui Yang, Zidong Liu, Denis Gudovskiy, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Baobao Chang, Yuan Du, Li Du, Shanghang Zhang

Finetuning a pretrained vision model (PVM) is a common technique for learning downstream vision tasks.

Computational Efficiency Image Classification

Principled Gradient-based Markov Chain Monte Carlo for Text Generation

no code implementations • 29 Dec 2023 • Li Du, Afra Amini, Lucas Torroba Hennigen, Xinyan Velocity Yu, Jason Eisner, Holden Lee, Ryan Cotterell

Recent papers have demonstrated the possibility of energy-based text generation by adapting gradient-based sampling algorithms, a paradigm of MCMC algorithms that promises fast convergence.

Language Modelling Text Generation

BiPFT: Binary Pre-trained Foundation Transformer with Low-rank Estimation of Binarization Residual Polynomials

1 code implementation • 14 Dec 2023 • Xingrun Xing, Li Du, Xinyuan Wang, Xianlin Zeng, Yequan Wang, Zheng Zhang, Jiajun Zhang

Specifically, we first analyze the binarization error in self-attention operations and derive the polynomials of binarization error.

Binarization Natural Language Understanding

Formal Aspects of Language Modeling

no code implementations • 7 Nov 2023 • Ryan Cotterell, Anej Svete, Clara Meister, Tianyu Liu, Li Du

Large language models have become one of the most commonly deployed NLP inventions.

Language Modelling

On the Representational Capacity of Recurrent Neural Language Models

1 code implementation • 19 Oct 2023 • Franz Nowak, Anej Svete, Li Du, Ryan Cotterell

We extend the Turing completeness result to the probabilistic case, showing how a rationally weighted RLM with unbounded computation time can simulate any deterministic probabilistic Turing machine (PTM) with rationally weighted transitions.

Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis

no code implementations • 11 Sep 2023 • Li Du, Yequan Wang, Xingrun Xing, Yiqun Ya, Xiang Li, Xin Jiang, Xuezhi Fang

Although demonstrating superb performance on various NLP tasks, large language models (LLMs) still suffer from the hallucination problem, which threatens the reliability of LLMs.

Hallucination Instruction Following +2

FLM-101B: An Open LLM and How to Train It with $100K Budget

no code implementations • 7 Sep 2023 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin, Zheng Zhang, Aixin Sun, Yequan Wang

We demonstrate that a 101B-parameter LLM with 0.31T tokens can be trained with a budget of 100K US dollars.

Memorization

FPTQ: Fine-grained Post-Training Quantization for Large Language Models

no code implementations • 30 Aug 2023 • Qingyuan Li, Yifan Zhang, Liang Li, Peng Yao, Bo Zhang, Xiangxiang Chu, Yerui Sun, Li Du, Yuchen Xie

In this study, we propose a novel W4A8 post-training quantization method for the available open-sourced LLMs, which combines the advantages of both recipes.

Quantization

QD-BEV : Quantization-aware View-guided Distillation for Multi-view 3D Object Detection

no code implementations • ICCV 2023 • Yifan Zhang, Zhen Dong, Huanrui Yang, Ming Lu, Cheng-Ching Tseng, Yuan Du, Kurt Keutzer, Li Du, Shanghang Zhang

Multi-view 3D detection based on BEV (bird-eye-view) has recently achieved significant improvements.

3D Object Detection Model Compression +2

Tokenization and the Noiseless Channel

1 code implementation • 29 Jun 2023 • Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Mrinmaya Sachan, Ryan Cotterell

Subword tokenization is a key part of many NLP pipelines.

Machine Translation

A Formal Perspective on Byte-Pair Encoding

1 code implementation • 29 Jun 2023 • Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Tim Vieira, Mrinmaya Sachan, Ryan Cotterell

Via submodular functions, we prove that the iterative greedy version is a $\frac{1}{{\sigma(\boldsymbol{\mu}^\star)}}(1-e^{-{\sigma(\boldsymbol{\mu}^\star)}})$-approximation of an optimal merge sequence, where ${\sigma(\boldsymbol{\mu}^\star)}$ is the total backward curvature with respect to the optimal merge sequence $\boldsymbol{\mu}^\star$.

Combinatorial Optimization

Structured Voronoi Sampling

1 code implementation • NeurIPS 2023 • Afra Amini, Li Du, Ryan Cotterell

In this paper, we take an important step toward building a principled approach for sampling from language models with gradient-based methods.

Text Generation

Autoregressive Modeling with Lookahead Attention

no code implementations • 20 May 2023 • Li Du, Hongyuan Mei, Jason Eisner

To predict the next token, autoregressive models ordinarily examine the past.

Morphological Inflection

A Measure-Theoretic Characterization of Tight Language Models

no code implementations • 20 Dec 2022 • Li Du, Lucas Torroba Hennigen, Tiago Pimentel, Clara Meister, Jason Eisner, Ryan Cotterell

Language modeling, a central task in natural language processing, involves estimating a probability distribution over strings.

Language Modelling

ReCo: Reliable Causal Chain Reasoning via Structural Causal Recurrent Neural Networks

1 code implementation • 16 Dec 2022 • Kai Xiong, Xiao Ding, Zhongyang Li, Li Du, Bing Qin, Yi Zheng, Baoxing Huai

Causal chain reasoning (CCR) is an essential ability for many decision-making AI systems, which requires the model to build reliable causal chains by connecting causal pairs.

Decision Making

CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification

no code implementations • 6 Dec 2022 • Lirui Xiao, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang

CSQ stabilizes the bit-level mixed-precision training process with a bi-level gradual continuous sparsification on both the bit values of the quantized weights and the bit selection in determining the quantization precision of each layer.

Quantization

BEV-LGKD: A Unified LiDAR-Guided Knowledge Distillation Framework for BEV 3D Object Detection

1 code implementation • 1 Dec 2022 • Jianing Li, Ming Lu, Jiaming Liu, Yandong Guo, Li Du, Shanghang Zhang

In this paper, we propose a unified framework named BEV-LGKD to transfer the knowledge in a teacher-student manner.

3D Object Detection Autonomous Driving +4

NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers

no code implementations • CVPR 2023 • Yijiang Liu, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang

Building on the theoretical insight, NoisyQuant achieves the first success on actively altering the heavy-tailed activation distribution with additive noisy bias to fit a given quantizer.

Quantization

Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning

no code implementations • 18 Oct 2022 • Shuo Xie, Jiahao Qiu, Ankita Pasad, Li Du, Qing Qu, Hongyuan Mei

We propose to select layers based on the variability of their hidden states given a task-specific corpus.

Language Modelling Transfer Learning

Uncertainty Guided Depth Fusion for Spike Camera

no code implementations • 26 Aug 2022 • Jianing Li, Jiaming Liu, Xiaobao Wei, Jiyuan Zhang, Ming Lu, Lei Ma, Li Du, Tiejun Huang, Shanghang Zhang

In this paper, we propose a novel Uncertainty-Guided Depth Fusion (UGDF) framework to fuse the predictions of monocular and stereo depth estimation networks for spike camera.

Autonomous Driving Stereo Depth Estimation

DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination

no code implementations • 21 Aug 2022 • Tingting Wu, Xiao Ding, Hao Zhang, Jinglong Gao, Li Du, Bing Qin, Ting Liu

To relieve this issue, curriculum learning is proposed to improve model performance and generalization by ordering training samples in a meaningful (e.g., easy to hard) sequence.

Image Classification regression

Text Difficulty Study: Do machines behave the same as humans regarding text difficulty?

no code implementations • 14 Aug 2022 • Bowen Chen, Xiao Ding, Li Du, Qin Bing, Ting Liu

Given a task, humans learn from easy to hard, whereas the model learns randomly.

A Graph Enhanced BERT Model for Event Prediction

no code implementations • Findings (ACL) 2022 • Li Du, Xiao Ding, Yue Zhang, Kai Xiong, Ting Liu, Bing Qin

To this end, we incorporate an additional structured variable into BERT to learn to predict the event connections in the training process.

e-CARE: a New Dataset for Exploring Explainable Causal Reasoning

1 code implementation • ACL 2022 • Li Du, Xiao Ding, Kai Xiong, Ting Liu, Bing Qin

Understanding causality has vital importance for various Natural Language Processing (NLP) applications.

Binary Neural Networks as a general-propose compute paradigm for on-device computer vision

no code implementations • 8 Feb 2022 • Guhong Nie, Lirui Xiao, Menglong Zhu, Dongliang Chu, Yue Shen, Peng Li, Kang Yang, Li Du, Bo Chen

For binary neural networks (BNNs) to become the mainstream on-device computer vision algorithm, they must achieve a better speed-vs-accuracy tradeoff than 8-bit quantization and establish a similar degree of general applicability in vision tasks.

Quantization Super-Resolution

Memory-Efficient CNN Accelerator Based on Interlayer Feature Map Compression

no code implementations • 12 Oct 2021 • Zhuang Shao, Xiaoliang Chen, Li Du, Lei Chen, Yuan Du, Wei Zhuang, Huadong Wei, Chenjia Xie, Zhongfeng Wang

To maintain real-time processing in embedded systems, large on-chip memory is required to buffer the interlayer feature maps.

Feature Compression Quantization

Learning Event Graph Knowledge for Abductive Reasoning

1 code implementation • ACL 2021 • Li Du, Xiao Ding, Ting Liu, Bing Qin

Abductive reasoning aims at inferring the most plausible explanation for observed events, which would play critical roles in various NLP applications, such as reading comprehension and question answering.

Question Answering Reading Comprehension

ExCAR: Event Graph Knowledge Enhanced Explainable Causal Reasoning

1 code implementation • ACL 2021 • Li Du, Xiao Ding, Kai Xiong, Ting Liu, Bing Qin

ExCAR first acquires additional evidence information from a large-scale causal event graph as logical rules for causal reasoning.

Representation Learning

Modeling Event Background for If-Then Commonsense Reasoning Using Context-aware Variational Autoencoder

no code implementations • IJCNLP 2019 • Li Du, Xiao Ding, Ting Liu, Zhongyang Li

Understanding events and event-centered commonsense reasoning are crucial for natural language processing (NLP).

Training Deep Neural Networks Using Posit Number System

no code implementations • 6 Sep 2019 • Jinming Lu, Siyuan Lu, Zhisheng Wang, Chao Fang, Jun Lin, Zhongfeng Wang, Li Du

With the increasing size of Deep Neural Network (DNN) models, the high memory space requirements and computational complexity have become an obstacle for efficient DNN implementations.

Image Classification

The Curious Case of Neural Text Degeneration

16 code implementations • ICLR 2020 • Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi

Despite considerable advancements with deep neural language models, the enigma of neural text degeneration persists when these models are tested as text generators.

Language Modelling

An Analog Neural Network Computing Engine using CMOS-Compatible Charge-Trap-Transistor (CTT)

no code implementations • 19 Sep 2017 • Yuan Du, Li Du, Xuefeng Gu, Jieqiong Du, X. Shawn Wang, Boyu Hu, Mingzhe Jiang, Xiaoliang Chen, Junjie Su, Subramanian S. Iyer, Mau-Chung Frank Chang

The proposed computing engine is composed of a scalable CTT multiplier array and energy efficient analog-digital interfaces.

A Streaming Accelerator for Deep Convolutional Neural Networks with Image and Feature Decomposition for Resource-limited System Applications

no code implementations • 15 Sep 2017 • Yuan Du, Li Du, Yilei Li, Junjie Su, Mau-Chung Frank Chang

Deep convolutional neural networks (CNNs) are widely used in modern artificial intelligence (AI) and smart vision systems but are also limited by computation latency, throughput, and energy efficiency in resource-limited scenarios, such as mobile devices, the Internet of Things (IoT), unmanned aerial vehicles (UAVs), and so on.

A Reconfigurable Streaming Deep Convolutional Neural Network Accelerator for Internet of Things

no code implementations • 8 Jul 2017 • Li Du, Yuan Du, Yilei Li, Mau-Chung Frank Chang

To implement image detection using CNNs in Internet of Things (IoT) devices, a streaming hardware accelerator is proposed.
