Search Results for author: Simeng Sun

Found 20 papers, 8 papers with code

How Much Do Modifications to Transformer Language Models Affect Their Ability to Learn Linguistic Knowledge?

no code implementations • insights (ACL) 2022 • Simeng Sun, Brian Dillon, Mohit Iyyer

Recent progress in large pretrained language models (LMs) has led to a growth of analyses examining what kinds of linguistic knowledge are encoded by these models.

IGA: An Intent-Guided Authoring Assistant

no code implementations • EMNLP 2021 • Simeng Sun, Wenlong Zhao, Varun Manjunatha, Rajiv Jain, Vlad Morariu, Franck Dernoncourt, Balaji Vasan Srinivasan, Mohit Iyyer

While large-scale pretrained language models have significantly improved writing assistance functionalities such as autocomplete, more complex and controllable writing assistants have yet to be explored.

Language Modelling • Sentence

TopicGPT: A Prompt-based Topic Modeling Framework

1 code implementation • 2 Nov 2023 • Chau Minh Pham, Alexander Hoyle, Simeng Sun, Mohit Iyyer

Topic modeling is a well-established technique for exploring text corpora.

Topic Models

Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF

1 code implementation • 16 Sep 2023 • Simeng Sun, Dhawal Gupta, Mohit Iyyer

During the last stage of RLHF, a large language model is aligned to human intents via PPO training, a process that generally requires large-scale computational resources.

Language Modelling • Large Language Model
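
To make the low-rank adaptation idea in this entry concrete, below is a minimal PyTorch-style sketch of a LoRA-augmented linear layer: the frozen base weight is combined with a trainable low-rank update, so only a small number of parameters are tuned (e.g., during the PPO stage of RLHF). The class name, rank, and scaling are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative LoRA layer: the base weight stays frozen and only the
    low-rank factors A and B are trained."""

    def __init__(self, base_linear: nn.Linear, rank: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the original weight and bias
        # trainable low-rank factors (hypothetical init scheme)
        self.A = nn.Parameter(torch.randn(rank, base_linear.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base_linear.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen path plus trainable low-rank update (x @ A^T @ B^T)
        return self.base(x) + (x @ self.A.t() @ self.B.t()) * self.scale
```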

PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents

1 code implementation • 23 May 2023 • Simeng Sun, Yang Liu, Shuohang Wang, Chenguang Zhu, Mohit Iyyer

PEARL outperforms zero-shot and chain-of-thought prompting on this dataset, and ablation experiments show that each stage of PEARL is critical to its performance.

How Does In-Context Learning Help Prompt Tuning?

no code implementations • 22 Feb 2023 • Simeng Sun, Yang Liu, Dan Iter, Chenguang Zhu, Mohit Iyyer

This motivates the use of parameter-efficient adaptation methods such as prompt tuning (PT), which adds a small number of tunable embeddings to an otherwise frozen model, and in-context learning (ICL), in which demonstrations of the task are provided to the model in natural language without any additional training.

In-Context Learning • Text Generation
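
As a rough illustration of the prompt tuning setup described in this entry, the sketch below prepends a small matrix of trainable soft-prompt embeddings to an otherwise frozen language model. It assumes a HuggingFace-style model that accepts `inputs_embeds`; all names and sizes are hypothetical, not the paper's code.

```python
import torch
import torch.nn as nn

class PromptTunedLM(nn.Module):
    """Illustrative prompt tuning: only `soft_prompt` is trainable;
    the underlying language model remains frozen."""

    def __init__(self, frozen_lm: nn.Module, num_prompt_tokens: int = 20, embed_dim: int = 768):
        super().__init__()
        self.lm = frozen_lm
        for p in self.lm.parameters():
            p.requires_grad = False  # freeze the base LM
        # small set of tunable prompt embeddings
        self.soft_prompt = nn.Parameter(torch.randn(num_prompt_tokens, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor):
        # input_embeds: (batch, seq_len, embed_dim) token embeddings of the task input
        batch = input_embeds.size(0)
        prompts = self.soft_prompt.unsqueeze(0).expand(batch, -1, -1)
        # prepend the tunable embeddings, then run the frozen model
        return self.lm(inputs_embeds=torch.cat([prompts, input_embeds], dim=1))
```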

Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages

no code implementations • 7 Feb 2023 • Simeng Sun, Maha Elbayad, Anna Sun, James Cross

With multilingual machine translation (MMT) models continuing to grow in size and number of supported languages, it is natural to reuse and upgrade existing models to save computation as data becomes available in more languages.

Machine Translation • Translation

Image Coding for Machines with Omnipotent Feature Learning

no code implementations • 5 Jul 2022 • Ruoyu Feng, Xin Jin, Zongyu Guo, Runsen Feng, Yixin Gao, Tianyu He, Zhizheng Zhang, Simeng Sun, Zhibo Chen

Learning features that are both general (for AI tasks) and compact (for compression) is pivotal to the success of image coding for machines.

Self-Supervised Learning

ChapterBreak: A Challenge Dataset for Long-Range Language Models

1 code implementation • NAACL 2022 • Simeng Sun, Katherine Thai, Mohit Iyyer

While numerous architectures for long-range language models (LRLMs) have recently been proposed, a meaningful evaluation of their discourse-level language understanding capabilities has not yet followed.

Semantically Video Coding: Instill Static-Dynamic Clues into Structured Bitstream for AI Tasks

no code implementations • 25 Jan 2022 • Xin Jin, Ruoyu Feng, Simeng Sun, Runsen Feng, Tianyu He, Zhibo Chen

Traditional media coding schemes typically encode images and videos into a semantically opaque binary stream, which fails to directly support downstream intelligent tasks at the bitstream level.

Action Recognition • Object +8

Alternative Input Signals Ease Transfer in Multilingual Machine Translation

no code implementations • ACL 2022 • Simeng Sun, Angela Fan, James Cross, Vishrav Chaudhary, Chau Tran, Philipp Koehn, Francisco Guzman

Further, we find that incorporating alternative inputs via self-ensemble can be particularly effective when the training set is small, leading to +5 BLEU when only 5% of the total training data is accessible.

Machine Translation • Translation

Do Long-Range Language Models Actually Use Long-Range Context?

no code implementations • EMNLP 2021 • Simeng Sun, Kalpesh Krishna, Andrew Mattarella-Micke, Mohit Iyyer

Language models are generally trained on short, truncated input sequences, which limits their ability to use discourse-level information present in long-range context to improve their predictions.

Sentence

IGA: An Intent-Guided Authoring Assistant

1 code implementation • 14 Apr 2021 • Simeng Sun, Wenlong Zhao, Varun Manjunatha, Rajiv Jain, Vlad Morariu, Franck Dernoncourt, Balaji Vasan Srinivasan, Mohit Iyyer

While large-scale pretrained language models have significantly improved writing assistance functionalities such as autocomplete, more complex and controllable writing assistants have yet to be explored.

Language Modelling • Sentence

Revisiting Simple Neural Probabilistic Language Models

1 code implementation • NAACL 2021 • Simeng Sun, Mohit Iyyer

Recent progress in language modeling has been driven not only by advances in neural architectures, but also through hardware and optimization improvements.

Language Modelling • Word Embeddings

Learning Omni-frequency Region-adaptive Representations for Real Image Super-Resolution

no code implementations • 11 Dec 2020 • Xin Li, Xin Jin, Tao Yu, Yingxue Pang, Simeng Sun, Zhizheng Zhang, Zhibo Chen

Traditional single image super-resolution (SISR) methods that focus on solving a single, uniform degradation (i.e., bicubic down-sampling) typically suffer from poor performance when applied to real-world low-resolution (LR) images due to the complicated realistic degradations.

Image Super-Resolution

Energy-Based Reranking: Improving Neural Machine Translation Using Energy-Based Models

1 code implementation • ACL 2021 • Sumanta Bhattacharyya, Amirmohammad Rooshenas, Subhajit Naskar, Simeng Sun, Mohit Iyyer, Andrew McCallum

To benefit from this observation, we train an energy-based model to mimic the behavior of the task measure (i.e., the energy-based model assigns lower energy to samples with higher BLEU scores), resulting in a re-ranking algorithm based on samples drawn from the NMT model: energy-based re-ranking (EBR).

Computational Efficiency • Machine Translation +4
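
A minimal sketch of the re-ranking step described in this entry: candidate translations are sampled from an NMT model, each is scored by an energy model trained so that lower energy tracks higher BLEU, and the lowest-energy candidate is kept. `sample_translations` and `energy_model` are hypothetical stand-ins, not the paper's code.

```python
def energy_rerank(source, sample_translations, energy_model, num_samples=10):
    """Illustrative energy-based re-ranking (EBR): draw candidates from the
    NMT model and return the one the energy model scores lowest."""
    candidates = sample_translations(source, num_samples)  # hypothetical sampler
    return min(candidates, key=lambda hyp: energy_model(source, hyp))
```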

Multi-scale Grouped Dense Network for VVC Intra Coding

no code implementations • 16 May 2020 • Xin Li, Simeng Sun, Zhizheng Zhang, Zhibo Chen

The Versatile Video Coding (H.266/VVC) standard achieves better image quality at the same number of bits than any other conventional image codec, such as BPG or JPEG.

Generative Adversarial Network

Hard-Coded Gaussian Attention for Neural Machine Translation

1 code implementation • ACL 2020 • Weiqiu You, Simeng Sun, Mohit Iyyer

Recent work has questioned the importance of the Transformer's multi-headed attention for achieving high translation quality.

Machine Translation • Translation

How to Compare Summarizers without Target Length? Pitfalls, Solutions and Re-Examination of the Neural Summarization Literature

no code implementations • WS 2019 • Simeng Sun, Ori Shapira, Ido Dagan, Ani Nenkova

We show that plain ROUGE F1 scores are not ideal for comparing current neural systems, which on average produce summaries of different lengths.
