Search Results for author: Simiao Zuo

Found 23 papers, 14 papers with code

Towards Consistent Natural-Language Explanations via Explanation-Consistency Finetuning

1 code implementation • 25 Jan 2024 • Yanda Chen, Chandan Singh, Xiaodong Liu, Simiao Zuo, Bin Yu, He He, Jianfeng Gao

We propose explanation-consistency finetuning (EC-finetuning), a method that adapts LLMs to generate more consistent natural-language explanations on related examples.

Question Answering

SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process

1 code implementation • 25 Oct 2023 • Zichong Li, Yanbo Xu, Simiao Zuo, Haoming Jiang, Chao Zhang, Tuo Zhao, Hongyuan Zha

We conduct extensive experiments in both event type prediction and uncertainty quantification of arrival time.

Type prediction, Uncertainty Quantification

Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing

no code implementations • 20 Oct 2023 • Xinyu Hu, Pengfei Tang, Simiao Zuo, Zihan Wang, Bowen Song, Qiang Lou, Jian Jiao, Denis Charles

Evoke uses two instances of the same LLM: a reviewer (LLM-Reviewer) that scores the current prompt, and an author (LLM-Author) that edits the prompt by considering the edit history and the reviewer's feedback.

Logical Fallacy Detection

DeepTagger: Knowledge Enhanced Named Entity Recognition for Web-Based Ads Queries

no code implementations • 30 Jun 2023 • Simiao Zuo, Pengfei Tang, Xinyu Hu, Qiang Lou, Jian Jiao, Denis Charles

For model-free enhancement, we collect unlabeled web queries to augment domain knowledge, and we collect web search results to enrich the information in ads queries.

Data Augmentation, named-entity-recognition, +2

Machine Learning Force Fields with Data Cost Aware Training

1 code implementation • 5 Jun 2023 • Alexander Bukharin, Tianyi Liu, Shengjie Wang, Simiao Zuo, Weihao Gao, Wen Yan, Tuo Zhao

To address this issue, we propose ASTEROID, a multi-stage computational framework that lowers the data cost of MLFFs by leveraging a combination of cheap inaccurate data and expensive accurate data.

Efficient Long Sequence Modeling via State Space Augmented Transformer

1 code implementation • 15 Dec 2022 • Simiao Zuo, Xiaodong Liu, Jian Jiao, Denis Charles, Eren Manavoglu, Tuo Zhao, Jianfeng Gao

Specifically, we augment an SSM into the bottom layer of SPADE, and we employ efficient local attention methods for the other layers.

Computational Efficiency, Language Modelling, +2

Less is More: Task-aware Layer-wise Distillation for Language Model Compression

1 code implementation • 4 Oct 2022 • Chen Liang, Simiao Zuo, Qingru Zhang, Pengcheng He, Weizhu Chen, Tuo Zhao

As such, TED reduces the knowledge gap between the two models and helps the student to fit better on the target task.

Language Modelling, Model Compression

DiP-GNN: Discriminative Pre-Training of Graph Neural Networks

no code implementations • 15 Sep 2022 • Simiao Zuo, Haoming Jiang, Qingyu Yin, Xianfeng Tang, Bing Yin, Tuo Zhao

Specifically, we train a generator to recover identities of the masked edges, and simultaneously, we train a discriminator to distinguish the generated edges from the original graph's edges.

Node Classification

Differentially Private Estimation of Hawkes Process

no code implementations • 15 Sep 2022 • Simiao Zuo, Tianyi Liu, Tuo Zhao, Hongyuan Zha

Point process models are of great importance in real-world applications.

Context-Aware Query Rewriting for Improving Users' Search Experience on E-commerce Websites

no code implementations • 15 Sep 2022 • Simiao Zuo, Qingyu Yin, Haoming Jiang, Shaohui Xi, Bing Yin, Chao Zhang, Tuo Zhao

The model subsequently calculates session representations by combining the contextual information with the instant search query using an aggregation network.

Graph Attention

PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance

1 code implementation • 25 Jun 2022 • Qingru Zhang, Simiao Zuo, Chen Liang, Alexander Bukharin, Pengcheng He, Weizhu Chen, Tuo Zhao

Large Transformer-based models have exhibited superior performance in various natural language processing and computer vision tasks.

Image Classification, Natural Language Understanding, +1

MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation

1 code implementation • NAACL 2022 • Simiao Zuo, Qingru Zhang, Chen Liang, Pengcheng He, Tuo Zhao, Weizhu Chen

We propose MoEBERT, which uses a Mixture-of-Experts structure to increase model capacity and inference speed.

Knowledge Distillation, Natural Language Understanding, +1

No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models

1 code implementation • ICLR 2022 • Chen Liang, Haoming Jiang, Simiao Zuo, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao

Analysis shows that the proposed schedule indeed reduces the redundancy and improves generalization performance.

Image Classification, Machine Translation, +2

Taming Sparsely Activated Transformer with Stochastic Experts

1 code implementation • ICLR 2022 • Simiao Zuo, Xiaodong Liu, Jian Jiao, Young Jin Kim, Hany Hassan, Ruofei Zhang, Tuo Zhao, Jianfeng Gao

While most ongoing research focuses on improving SAMs by exploring methods of routing inputs to experts, our analysis reveals that such research might not lead to the solution we expect, i.e., the commonly used routing methods based on gating mechanisms do not work better than randomly routing inputs to experts.

Machine Translation, Translation

Adversarially Regularized Policy Learning Guided by Trajectory Optimization

no code implementations • 16 Sep 2021 • Zhigen Zhao, Simiao Zuo, Tuo Zhao, Ye Zhao

Recent advancement in combining trajectory optimization with function approximation (especially neural networks) shows promise in learning complex control policies for diverse tasks in robot systems.

Robot Manipulation

ARCH: Efficient Adversarial Regularized Training with Caching

1 code implementation • Findings (EMNLP) 2021 • Simiao Zuo, Chen Liang, Haoming Jiang, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao

Adversarial regularization can improve model generalization in many natural language processing tasks.

Machine Translation, Natural Language Understanding, +1

Self-Training with Differentiable Teacher

no code implementations • Findings (NAACL) 2022 • Simiao Zuo, Yue Yu, Chen Liang, Haoming Jiang, Siawpeng Er, Chao Zhang, Tuo Zhao, Hongyuan Zha

In self-training, the student contributes to the prediction performance, and the teacher controls the training process by generating pseudo-labels.

named-entity-recognition, Named Entity Recognition, +3

Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization

1 code implementation • ACL 2021 • Chen Liang, Simiao Zuo, Minshuo Chen, Haoming Jiang, Xiaodong Liu, Pengcheng He, Tuo Zhao, Weizhu Chen

The Lottery Ticket Hypothesis suggests that an over-parametrized network consists of "lottery tickets", and training a certain collection of them (i.e., a subnetwork) can match the performance of the full model.

Model Compression, Multi-Task Learning

Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach

1 code implementation • EMNLP 2021 • Simiao Zuo, Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Jianfeng Gao, Weizhu Chen, Tuo Zhao

Adversarial regularization has been shown to improve the generalization performance of deep learning models in various natural language processing tasks.

Machine Translation, Natural Language Understanding, +1

A Hypergradient Approach to Robust Regression without Correspondence

no code implementations • ICLR 2021 • Yujia Xie, Yixiu Mao, Simiao Zuo, Hongteng Xu, Xiaojing Ye, Tuo Zhao, Hongyuan Zha

Due to the combinatorial nature of the problem, most existing methods are only applicable when the sample size is small, and limited to linear regression models.

Multi-Object Tracking, regression

Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach

1 code implementation • NAACL 2021 • Yue Yu, Simiao Zuo, Haoming Jiang, Wendi Ren, Tuo Zhao, Chao Zhang

To address this problem, we develop a contrastive self-training framework, COSINE, to enable fine-tuning LMs with weak supervision.

Ranked #1 on Word Sense Disambiguation on Words in Context

Language Modelling, Sentence, +2

Transformer Hawkes Process

3 code implementations • ICML 2020 • Simiao Zuo, Haoming Jiang, Zichong Li, Tuo Zhao, Hongyuan Zha

Modern data acquisition routinely produces massive amounts of event sequence data in various domains, such as social media, healthcare, and financial markets.

Computational Efficiency, Point Processes

Image Score: How to Select Useful Samples

no code implementations • ICLR 2019 • Simiao Zuo, Jialin Wu

There have long been debates on how we could interpret neural networks and understand the decisions our models make.

Decision Making
