no code implementations • 1 Apr 2024 • Ting En Lam, Yuhan Chen, Elston Tan, Eric Peh, Ruirui Chen, Paritosh Parmar, Basura Fernando
We will release our dataset, code, and models to help future efforts in this domain.
1 code implementation • 28 Mar 2024 • Ang Lv, Kaiyi Zhang, Yuhan Chen, Yulong Wang, Lifeng Liu, Ji-Rong Wen, Jian Xie, Rui Yan
In this paper, we deeply explore the mechanisms employed by Transformer-based language models in factual recall tasks.
no code implementations • 4 Mar 2024 • Nuwa Xi, Yuhan Chen, Sendong Zhao, Haochun Wang, Bing Qin, Ting Liu
Chain-of-Thought (CoT) serves as a critical emerging ability in LLMs, especially when it comes to logical reasoning.
1 code implementation • 1 Feb 2024 • Sifan Wang, Bowen Li, Yuhan Chen, Paris Perdikaris
While physics-informed neural networks (PINNs) have become a popular deep learning framework for tackling forward and inverse problems governed by partial differential equations (PDEs), their performance is known to degrade when larger and deeper neural network architectures are employed.
no code implementations • 29 Jan 2024 • Yuhan Chen, Lumei Su, Lihua Chen, Zhiwei Lin
Experiments were conducted under constrained computational and memory resources, evaluating the proposed method's performance on benchmark datasets including GQA, CLEVR, and VizWiz-VQA-Grounding.
1 code implementation • 12 Jan 2024 • Kaiyi Zhang, Ang Lv, Yuhan Chen, Hansen Ha, Tao Xu, Rui Yan
In this paper, by treating in-context learning (ICL) as a meta-optimization process, we explain why LLMs are sensitive to the order of ICL examples.
no code implementations • 7 Dec 2023 • Yanrui Du, Sendong Zhao, Ming Ma, Yuhan Chen, Bing Qin
The jailbreak idea of our method is "Inherent Response Tendency Analysis," which identifies real-world instructions that can inherently induce LLMs to generate affirmation responses, and the corresponding jailbreak strategy is "Real-World Instructions-Driven Jailbreak," which involves strategically splicing real-world instructions identified through the above analysis around the malicious instruction.
1 code implementation • 7 Dec 2023 • Yuhan Chen, Ang Lv, Ting-En Lin, Changyu Chen, Yuchuan Wu, Fei Huang, Yongbin Li, Rui Yan
Specifically, crucial information in the context may be overlooked by the model when it is positioned in the trough zone of the attention waveform, leading to decreased performance.
Ranked #2 on Trajectory Planning on ToolBench
1 code implementation • 13 Nov 2023 • Ang Lv, Kaiyi Zhang, Shufang Xie, Quan Tu, Yuhan Chen, Ji-Rong Wen, Rui Yan
Recent studies have highlighted a phenomenon in large language models (LLMs) known as "the reversal curse," in which the order of knowledge entities in the training data biases the models' comprehension.
no code implementations • 20 Oct 2023 • Yanrui Du, Sendong Zhao, Haochun Wang, Yuhan Chen, Rui Bai, Zewen Qiang, MuZhen Cai, Bing Qin
Through extensive experiments on five reasoning datasets from the ERASER benchmark, we demonstrate that our framework not only establishes a more reliable link between the generated rationale and model decision but also achieves competitive results in task performance and the quality of rationale.
1 code implementation • 11 Sep 2023 • Yuhan Chen, Nuwa Xi, Yanrui Du, Haochun Wang, Jianyu Chen, Sendong Zhao, Bing Qin
Furthermore, our method shows a sustained improvement as the volume of pseudo data increases, revealing the great potential of pseudo data in advancing low-resource cross-modal molecule discovery.
no code implementations • 8 Sep 2023 • Yanrui Du, Sendong Zhao, Yuhan Chen, Rui Bai, Jing Liu, Hua Wu, Haifeng Wang, Bing Qin
To address this issue, it is crucial to analyze and mitigate the influence of superficial clues on STM models.
1 code implementation • 8 Sep 2023 • Haochun Wang, Sendong Zhao, Zewen Qiang, Zijian Li, Nuwa Xi, Yanrui Du, MuZhen Cai, Haoqiang Guo, Yuhan Chen, Haoming Xu, Bing Qin, Ting Liu
To address this challenge, we propose knowledge-tuning, which leverages structured medical knowledge bases for the LLMs to grasp domain knowledge efficiently and facilitate reliable response generation.
no code implementations • 29 Jun 2023 • Ang Lv, Jinpeng Li, Yuhan Chen, Xing Gao, Ji Zhang, Rui Yan
In open-domain dialogue generation tasks, contexts and responses in most datasets are one-to-one mapped, violating an important many-to-many characteristic: a context leads to various responses, and a response answers multiple contexts.
1 code implementation • 7 May 2023 • Yuhan Chen, Yihong Luo, Jing Tang, Liang Yang, Siya Qiu, Chuan Wang, Xiaochun Cao
Motivated by this, we propose to use local similarity (LocalSim) to learn node-level weighted fusion, which can also serve as a plug-and-play module.
no code implementations • 21 Jun 2022 • Yi Gu, Chao Han, Yuhan Chen, Shenggang Liu, Xinwei Wang
A greedy initialization-based resampling particle swarm optimization (GI-RPSO) algorithm is proposed to solve the model.
no code implementations • NeurIPS 2021 • Yuhan Chen, Takashi Matsubara, Takaharu Yaguchi
In this study, we propose a model that learns the symplectic form from data using neural networks, thereby providing a method for learning Hamiltonian equations from data represented in general coordinate systems, which are not limited to the generalized coordinates and the generalized momenta.
no code implementations • 22 Feb 2021 • Yuhan Chen, Takashi Matsubara, Takaharu Yaguchi
To apply the KAM theory, we provide a generalization error bound for Hamiltonian neural networks by deriving an estimate of the covering number of the gradient of the multi-layer perceptron, which is the key ingredient of the model.
2 code implementations • 13 Dec 2017 • Jialiang Mao, Yuhan Chen, Li Ma
An important task in microbiome studies is to test for the existence of, and to characterize, differences in microbiome composition across groups of samples.
Methodology Applications Computation