Search Results for author: Shiyang Li

Found 17 papers, 10 papers with code

Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection

1 code implementation • 31 Jul 2023 • Jun Yan, Vikas Yadav, Shiyang Li, Lichang Chen, Zheng Tang, Hai Wang, Vijay Srinivasan, Xiang Ren, Hongxia Jin

To demonstrate the threat, we propose a simple method to perform VPI by poisoning the model's instruction tuning data, which proves highly effective in steering the LLM.

Backdoor Attack

Paper
Code

Instruction-following Evaluation through Verbalizer Manipulation

no code implementations • 20 Jul 2023 • Shiyang Li, Jun Yan, Hai Wang, Zheng Tang, Xiang Ren, Vijay Srinivasan, Hongxia Jin

We conduct a comprehensive evaluation of four major model families across nine datasets, employing twelve sets of verbalizers for each of them.

Instruction Following

Paper
Add Code

AlpaGasus: Training A Better Alpaca with Fewer Data

3 code implementations • 17 Jul 2023 • Lichang Chen, Shiyang Li, Jun Yan, Hai Wang, Kalpa Gunaratna, Vikas Yadav, Zheng Tang, Vijay Srinivasan, Tianyi Zhou, Heng Huang, Hongxia Jin

Large language models (LLMs) strengthen instruction-following capability through instruction-finetuning (IFT) on supervised instruction/response data.

Instruction Following

162

Paper
Code

Graph Reasoning for Question Answering with Triplet Retrieval

no code implementations • 30 May 2023 • Shiyang Li, Yifan Gao, Haoming Jiang, Qingyu Yin, Zheng Li, Xifeng Yan, Chao Zhang, Bing Yin

State-of-the-art methods often utilize entities in questions to retrieve local subgraphs, which are then fed into KG encoder, e. g. graph neural networks (GNNs), to model their local structures and integrated into language models for question answering.

Knowledge Graphs Question Answering +1

Paper
Add Code

Enhancing Small Medical Learners with Privacy-preserving Contextual Prompting

1 code implementation • 22 May 2023 • Xinlu Zhang, Shiyang Li, Xianjun Yang, Chenxin Tian, Yao Qin, Linda Ruth Petzold

Large language models (LLMs) demonstrate remarkable medical expertise, but data privacy concerns impede their direct use in healthcare environments.

Decision Making Privacy Preserving

Paper
Code

Time Series as Images: Vision Transformer for Irregularly Sampled Time Series

1 code implementation • NeurIPS 2023 • Zekun Li, Shiyang Li, Xifeng Yan

This paper introduces a novel perspective by converting irregularly sampled time series into line graph images, then utilizing powerful pre-trained vision transformers for time series classification in the same way as image classification.

Image Classification Time Series +1

Paper
Code

Improving Medical Predictions by Irregular Multimodal Electronic Health Records Modeling

1 code implementation • 18 Oct 2022 • Xinlu Zhang, Shiyang Li, Zhiyu Chen, Xifeng Yan, Linda Petzold

Our method first addresses irregularity in each single modality by (1) modeling irregular time series by dynamically incorporating hand-crafted imputation embeddings into learned interpolation embeddings via a gating mechanism, and (2) casting a series of clinical note representations as multivariate irregular time series and tackling irregularity via a time attention mechanism.

Imputation Irregular Time Series +2

Paper
Code

Explanations from Large Language Models Make Small Reasoners Better

no code implementations • 13 Oct 2022 • Shiyang Li, Jianshu Chen, Yelong Shen, Zhiyu Chen, Xinlu Zhang, Zekun Li, Hong Wang, Jing Qian, Baolin Peng, Yi Mao, Wenhu Chen, Xifeng Yan

Integrating free-text explanations to in-context learning of large language models (LLM) is shown to elicit strong reasoning capabilities along with reasonable explanations.

Explanation Generation In-Context Learning +1

Paper
Add Code

Controllable Dialogue Simulation with In-Context Learning

1 code implementation • 9 Oct 2022 • Zekun Li, Wenhu Chen, Shiyang Li, Hong Wang, Jing Qian, Xifeng Yan

Experimental results on the MultiWOZ dataset demonstrate that training a model on the simulated dialogues leads to even better performance than using the same amount of human-generated dialogues under the challenging low-resource settings, with as few as 85 dialogues as a seed.

Data Augmentation In-Context Learning +2

Paper
Code

ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering

1 code implementation • 7 Oct 2022 • Zhiyu Chen, Shiyang Li, Charese Smiley, Zhiqiang Ma, Sameena Shah, William Yang Wang

With the recent advance in large pre-trained language models, researchers have achieved record performances in NLP tasks that mostly focus on language pattern matching.

Ranked #2 on Question Answering on ConvFinQA

Conversational Question Answering

Paper
Code

Limitations of Language Models in Arithmetic and Symbolic Induction

no code implementations • 9 Aug 2022 • Jing Qian, Hong Wang, Zekun Li, Shiyang Li, Xifeng Yan

LMs with tutor is able to deliver 100% accuracy in situations of OOD and repeating symbols, shedding new insights on the boundary of large LMs in induction.

Paper
Add Code

Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding

no code implementations • Findings (EMNLP) 2021 • Shiyang Li, Semih Yavuz, Wenhu Chen, Xifeng Yan

Task-adaptive pre-training (TAPT) and Self-training (ST) have emerged as the major semi-supervised approaches to improve natural language understanding (NLU) tasks with massive amount of unlabeled data.

named-entity-recognition Named Entity Recognition +6

Paper
Add Code

CoCo: Controllable Counterfactuals for Evaluating Dialogue State Trackers

2 code implementations • ICLR 2021 • Shiyang Li, Semih Yavuz, Kazuma Hashimoto, Jia Li, Tong Niu, Nazneen Rajani, Xifeng Yan, Yingbo Zhou, Caiming Xiong

Dialogue state trackers have made significant progress on benchmark datasets, but their generalization capability to novel and realistic scenarios beyond the held-out conversations is less understood.

Ranked #2 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.1 (using extra training data)

counterfactual Dialogue State Tracking +1

Paper
Code

Teaching Pretrained Models with Commonsense Reasoning: A Preliminary KB-Based Approach

no code implementations • 20 Sep 2019 • Shiyang Li, Jianshu Chen, Dian Yu

Recently, pretrained language models (e. g., BERT) have achieved great success on many downstream natural language understanding tasks and exhibit a certain level of commonsense reasoning ability.

Few-Shot Learning Logical Reasoning +2

Paper
Add Code

TabFact: A Large-scale Dataset for Table-based Fact Verification

1 code implementation • ICLR 2020 • Wenhu Chen, Hongmin Wang, Jianshu Chen, Yunkai Zhang, Hong Wang, Shiyang Li, Xiyou Zhou, William Yang Wang

To this end, we construct a large-scale dataset called TabFact with 16k Wikipedia tables as the evidence for 118k human-annotated natural language statements, which are labeled as either ENTAILED or REFUTED.

Ranked #10 on Table-based Fact Verification on TabFact

16k Fact Checking +4

352

Paper
Code

Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting

2 code implementations • NeurIPS 2019 • Shiyang Li, Xiaoyong Jin, Yao Xuan, Xiyou Zhou, Wenhu Chen, Yu-Xiang Wang, Xifeng Yan

Time series forecasting is an important problem across many domains, including predictions of solar plant energy output, electricity consumption, and traffic jam situation.

Ranked #27 on Image Generation on ImageNet 64x64 (Bits per dim metric)

Time Series Time Series Forecasting

1,881

Paper
Code

Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization

no code implementations • NeurIPS 2018 • Tianyi Liu, Shiyang Li, Jianping Shi, Enlu Zhou, Tuo Zhao

Asynchronous momentum stochastic gradient descent algorithms (Async-MSGD) is one of the most popular algorithms in distributed machine learning.

Stochastic Optimization

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.