Search Results for author: Yuhan Chen

Found 19 papers, 9 papers with code

Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models

1 code implementation • 28 Mar 2024 • Ang Lv, Kaiyi Zhang, Yuhan Chen, Yulong Wang, Lifeng Liu, Ji-Rong Wen, Jian Xie, Rui Yan

In this paper, we investigate in depth the mechanisms that Transformer-based language models employ in factual recall tasks.

AS-ES Learning: Towards Efficient CoT Learning in Small Models

no code implementations • 4 Mar 2024 • Nuwa Xi, Yuhan Chen, Sendong Zhao, Haochun Wang, Bing Qin, Ting Liu

Chain-of-Thought (CoT) serves as a critical emerging ability in LLMs, especially when it comes to logical reasoning.

Data Augmentation, Logical Reasoning

PirateNets: Physics-informed Deep Learning with Residual Adaptive Networks

1 code implementation • 1 Feb 2024 • Sifan Wang, Bowen Li, Yuhan Chen, Paris Perdikaris

While physics-informed neural networks (PINNs) have become a popular deep learning framework for tackling forward and inverse problems governed by partial differential equations (PDEs), their performance is known to degrade when larger and deeper neural network architectures are employed.
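
The excerpt names the problem (degradation with depth) rather than the fix, but the "residual adaptive" idea in the title can be sketched roughly as follows. This is only an illustration under my own assumptions (the block structure, the zero-initialized gate alpha, and all layer sizes are hypothetical, not the authors' reference implementation): a residual block whose skip path is gated by a trainable scalar initialized at zero lets a deep PINN start out as an effectively shallow network and deepen as training proceeds.

```python
import torch
import torch.nn as nn

class AdaptiveResidualBlock(nn.Module):
    """Hypothetical residual block with a trainable, zero-initialized gate.

    With alpha = 0 the block reduces to the identity, so a deep stack of
    such blocks initially behaves like a shallow network; alpha is learned
    jointly with all other parameters.
    """
    def __init__(self, width: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
        )
        self.alpha = nn.Parameter(torch.zeros(1))  # gate starts closed

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.alpha * self.body(x)

# Example trunk for a PINN: inputs (x, t), output u(x, t).
trunk = nn.Sequential(
    nn.Linear(2, 128), nn.Tanh(),
    *[AdaptiveResidualBlock(128) for _ in range(6)],
    nn.Linear(128, 1),
)
print(trunk(torch.randn(16, 2)).shape)  # torch.Size([16, 1])
```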

LCV2: An Efficient Pretraining-Free Framework for Grounded Visual Question Answering

no code implementations • 29 Jan 2024 • Yuhan Chen, Lumei Su, Lihua Chen, Zhiwei Lin

Experiments were conducted under constrained computational and memory resources, evaluating the proposed method's performance on benchmark datasets including GQA, CLEVR, and VizWiz-VQA-Grounding.

Language Modelling, Large Language Model +5

Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning

1 code implementation • 12 Jan 2024 • Kaiyi Zhang, Ang Lv, Yuhan Chen, Hansen Ha, Tao Xu, Rui Yan

In this paper, by treating in-context learning (ICL) as a meta-optimization process, we explain why LLMs are sensitive to the order of ICL examples.

In-Context Learning, Zero-Shot Learning

Analyzing the Inherent Response Tendency of LLMs: Real-World Instructions-Driven Jailbreak

no code implementations • 7 Dec 2023 • Yanrui Du, Sendong Zhao, Ming Ma, Yuhan Chen, Bing Qin

The jailbreak idea of our method is "Inherent Response Tendency Analysis", which identifies real-world instructions that can inherently induce LLMs to generate affirmative responses; the corresponding jailbreak strategy, "Real-World Instructions-Driven Jailbreak", strategically splices the real-world instructions identified by this analysis around the malicious instruction.

Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use

1 code implementation • 7 Dec 2023 • Yuhan Chen, Ang Lv, Ting-En Lin, Changyu Chen, Yuchuan Wu, Fei Huang, Yongbin Li, Rui Yan

Specifically, crucial information in the context may be overlooked by the model when it falls in a trough zone of the attention waveform, leading to decreased performance.

Trajectory Planning

Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse

1 code implementation • 13 Nov 2023 • Ang Lv, Kaiyi Zhang, Shufang Xie, Quan Tu, Yuhan Chen, Ji-Rong Wen, Rui Yan

Recent studies have highlighted a phenomenon in large language models (LLMs) known as "the reversal curse," in which the order of knowledge entities in the training data biases the models' comprehension.

Denoising, Language Modelling

Make Your Decision Convincing! A Unified Two-Stage Framework: Self-Attribution and Decision-Making

no code implementations • 20 Oct 2023 • Yanrui Du, Sendong Zhao, Haochun Wang, Yuhan Chen, Rui Bai, Zewen Qiang, MuZhen Cai, Bing Qin

Through extensive experiments on five reasoning datasets from the ERASER benchmark, we demonstrate that our framework not only establishes a more reliable link between the generated rationale and the model's decision but also achieves competitive results in task performance and rationale quality.

Decision Making

From Artificially Real to Real: Leveraging Pseudo Data from Large Language Models for Low-Resource Molecule Discovery

1 code implementation • 11 Sep 2023 • Yuhan Chen, Nuwa Xi, Yanrui Du, Haochun Wang, Jianyu Chen, Sendong Zhao, Bing Qin

Furthermore, our method shows a sustained improvement as the volume of pseudo data increases, revealing the great potential of pseudo data in advancing low-resource cross-modal molecule discovery.

Descriptive, Domain Adaptation +2

Knowledge-tuning Large Language Models with Structured Medical Knowledge Bases for Reliable Response Generation in Chinese

1 code implementation • 8 Sep 2023 • Haochun Wang, Sendong Zhao, Zewen Qiang, Zijian Li, Nuwa Xi, Yanrui Du, MuZhen Cai, Haoqiang Guo, Yuhan Chen, Haoming Xu, Bing Qin, Ting Liu

To address this challenge, we propose knowledge-tuning, which leverages structured medical knowledge bases for the LLMs to grasp domain knowledge efficiently and facilitate reliable response generation.
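
The excerpt does not describe how the knowledge base is consulted; purely as an illustrative sketch (the knowledge_base dict, retrieve_entries, and the prompt template below are my assumptions, not the paper's pipeline), one simple way to ground responses in a structured medical KB is to retrieve matching entries for a query and prepend them to the prompt:

```python
# Minimal sketch of retrieval-grounded prompting over a structured KB.
# The KB schema, matching rule, and prompt wording are illustrative only.
knowledge_base = {
    "hypertension": "Hypertension: persistently elevated arterial blood pressure; "
                    "managed with lifestyle changes and antihypertensive drugs.",
    "metformin": "Metformin: first-line oral agent for type 2 diabetes; "
                 "common adverse effects are gastrointestinal.",
}

def retrieve_entries(query: str, kb: dict, top_k: int = 2) -> list:
    """Return KB entries whose key term appears in the query (naive matching)."""
    hits = [text for term, text in kb.items() if term in query.lower()]
    return hits[:top_k]

def build_prompt(query: str, kb: dict) -> str:
    """Prepend retrieved structured knowledge to the user question."""
    context = "\n".join(retrieve_entries(query, kb))
    return f"Known medical facts:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("What should I know about metformin?", knowledge_base))
```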

Domain Adaptation, Hallucination +2

DialoGPS: Dialogue Path Sampling in Continuous Semantic Space for Data Augmentation in Multi-Turn Conversations

no code implementations • 29 Jun 2023 • Ang Lv, Jinpeng Li, Yuhan Chen, Xing Gao, Ji Zhang, Rui Yan

In open-domain dialogue generation tasks, contexts and responses in most datasets are one-to-one mapped, violating an important many-to-many characteristic: a context leads to various responses, and a response answers multiple contexts.

Data Augmentation, Dialogue Generation +2

LSGNN: Towards General Graph Neural Network in Node Classification by Local Similarity

1 code implementation • 7 May 2023 • Yuhan Chen, Yihong Luo, Jing Tang, Liang Yang, Siya Qiu, Chuan Wang, Xiaochun Cao

Motivated by this, we propose to use local similarity (LocalSim) to learn node-level weighted fusion, which can also serve as a plug-and-play module.
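
Taken at face value, the sentence says a per-node similarity score between a node and its neighbours decides how different feature channels are fused. A minimal sketch of that idea follows; the cosine-similarity definition of LocalSim, the two-channel (low/high-frequency) fusion, and every name below are illustrative assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn

def local_sim(x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
    """Per-node LocalSim: mean cosine similarity to neighbours (illustrative)."""
    x_norm = nn.functional.normalize(x, dim=1)
    sim = x_norm @ x_norm.t()              # pairwise cosine similarities
    deg = adj.sum(dim=1).clamp(min=1)
    return (adj * sim).sum(dim=1) / deg    # average over each node's neighbours

class LocalSimFusion(nn.Module):
    """Plug-and-play fusion: node-level weights derived from LocalSim."""
    def __init__(self):
        super().__init__()
        self.gate = nn.Linear(1, 1)        # maps LocalSim to a mixing weight

    def forward(self, h_low, h_high, x, adj):
        s = local_sim(x, adj).unsqueeze(1)     # (N, 1) similarity scores
        w = torch.sigmoid(self.gate(s))        # (N, 1) per-node fusion weights
        return w * h_low + (1.0 - w) * h_high  # mix low/high-frequency channels
```

Here h_low and h_high stand for, e.g., neighbour-aggregated and raw node features; a node with high LocalSim (a homophilous neighbourhood) would lean on the smoothed channel.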

Node Classification

Neural Symplectic Form: Learning Hamiltonian Equations on General Coordinate Systems

no code implementations • NeurIPS 2021 • Yuhan Chen, Takashi Matsubara, Takaharu Yaguchi

In this study, we propose a model that learns the symplectic form from data using neural networks, thereby providing a method for learning Hamiltonian equations from data represented in general coordinate systems, which are not limited to the generalized coordinates and the generalized momenta.
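
To make the one-sentence summary concrete, here is a rough sketch of one way "learning the symplectic form" could be realized; the sign convention, network sizes, and use of torch.autograd are my assumptions rather than the authors' code. A 1-form theta and a Hamiltonian H are parameterized by neural networks, the 2-form is the antisymmetrized Jacobian of theta, and the vector field comes from solving the resulting linear system against the gradient of H.

```python
import torch
import torch.nn as nn

def mlp(dim_in: int, dim_out: int, width: int = 64) -> nn.Module:
    return nn.Sequential(
        nn.Linear(dim_in, width), nn.Tanh(),
        nn.Linear(width, width), nn.Tanh(),
        nn.Linear(width, dim_out),
    )

class NeuralSymplecticForm(nn.Module):
    """Sketch: learn a 1-form theta(x) and a Hamiltonian H(x); the 2-form
    is d(theta), i.e. the antisymmetrized Jacobian of theta."""
    def __init__(self, dim: int):
        super().__init__()
        self.theta = mlp(dim, dim)        # learned 1-form
        self.hamiltonian = mlp(dim, 1)    # learned Hamiltonian

    def vector_field(self, x: torch.Tensor) -> torch.Tensor:
        x = x.detach().requires_grad_(True)
        jac = torch.autograd.functional.jacobian(self.theta, x, create_graph=True)
        omega = jac.t() - jac             # omega_ij = d_i theta_j - d_j theta_i
        grad_h = torch.autograd.grad(self.hamiltonian(x).sum(), x, create_graph=True)[0]
        # Hamilton's equations w.r.t. the learned form: omega @ dx/dt = grad H
        return torch.linalg.solve(omega, grad_h)

model = NeuralSymplecticForm(dim=4)        # even dimension keeps omega generically invertible
print(model.vector_field(torch.randn(4)))  # predicted time derivative
```

Training (not shown) would fit the predicted time derivatives to observed trajectory data.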

KAM Theory Meets Statistical Learning Theory: Hamiltonian Neural Networks with Non-Zero Training Loss

no code implementations • 22 Feb 2021 • Yuhan Chen, Takashi Matsubara, Takaharu Yaguchi

To apply the KAM theory, we provide a generalization error bound for Hamiltonian neural networks by deriving an estimate of the covering number of the gradient of the multi-layer perceptron, which is the key ingredient of the model.
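
For readers unfamiliar with this style of argument, covering-number generalization bounds typically follow the textbook template below; this is background only, not the paper's actual bound, which specifically concerns the covering number of the network's gradient in the KAM setting.

```latex
% Standard template: loss bounded in [0,1], an \epsilon-cover of size
% N(\epsilon) in sup norm, n i.i.d. samples; with probability at least 1-\delta,
\sup_{f \in \mathcal{F}} \Big( R(f) - \hat{R}_n(f) \Big)
  \;\le\; 2\epsilon \;+\; \sqrt{\frac{\ln N(\epsilon) + \ln(1/\delta)}{2n}}
```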

Learning Theory

Bayesian graphical compositional regression for microbiome data

2 code implementations • 13 Dec 2017 • Jialiang Mao, Yuhan Chen, Li Ma

An important task in microbiome studies is to test for and characterize differences in microbiome composition across groups of samples.

Methodology, Applications, Computation
