Search Results for author: Simeng Sun

Found 20 papers, 8 papers with code

How Much Do Modifications to Transformer Language Models Affect Their Ability to Learn Linguistic Knowledge?

no code implementations • insights (ACL) 2022 • Simeng Sun, Brian Dillon, Mohit Iyyer

Recent progress in large pretrained language models (LMs) has led to a growth of analyses examining what kinds of linguistic knowledge are encoded by these models.

IGA: An Intent-Guided Authoring Assistant

no code implementations • EMNLP 2021 • Simeng Sun, Wenlong Zhao, Varun Manjunatha, Rajiv Jain, Vlad Morariu, Franck Dernoncourt, Balaji Vasan Srinivasan, Mohit Iyyer

While large-scale pretrained language models have significantly improved writing assistance functionalities such as autocomplete, more complex and controllable writing assistants have yet to be explored.

Language Modelling • Sentence

TopicGPT: A Prompt-based Topic Modeling Framework

1 code implementation • 2 Nov 2023 • Chau Minh Pham, Alexander Hoyle, Simeng Sun, Mohit Iyyer

Topic modeling is a well-established technique for exploring text corpora.

Topic Models

Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF

1 code implementation • 16 Sep 2023 • Simeng Sun, Dhawal Gupta, Mohit Iyyer

During the last stage of RLHF, a large language model is aligned to human intents via PPO training, a process that generally requires large-scale computational resources.

Language Modelling • Large Language Model
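
To make the low-rank adaptation idea in this entry concrete, below is a minimal PyTorch-style sketch of a LoRA-augmented linear layer: the frozen base weight is combined with a trainable low-rank update, so only a small number of parameters are tuned (e.g., during the PPO stage of RLHF). The class name, rank, and scaling are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative LoRA layer: the base weight stays frozen and only the
    low-rank factors A and B are trained."""

    def __init__(self, base_linear: nn.Linear, rank: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the original weight and bias
        # trainable low-rank factors (hypothetical init scheme)
        self.A = nn.Parameter(torch.randn(rank, base_linear.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base_linear.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen path plus trainable low-rank update (x @ A^T @ B^T)
        return self.base(x) + (x @ self.A.t() @ self.B.t()) * self.scale
```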

PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents

1 code implementation • 23 May 2023 • Simeng Sun, Yang Liu, Shuohang Wang, Chenguang Zhu, Mohit Iyyer

PEARL outperforms zero-shot and chain-of-thought prompting on this dataset, and ablation experiments show that each stage of PEARL is critical to its performance.

How Does In-Context Learning Help Prompt Tuning?

no code implementations • 22 Feb 2023 • Simeng Sun, Yang Liu, Dan Iter, Chenguang Zhu, Mohit Iyyer

This motivates the use of parameter-efficient adaptation methods such as prompt tuning (PT), which adds a small number of tunable embeddings to an otherwise frozen model, and in-context learning (ICL), in which demonstrations of the task are provided to the model in natural language without any additional training.

In-Context Learning • Text Generation
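
As a rough illustration of the prompt tuning setup described in this entry, the sketch below prepends a small matrix of trainable soft-prompt embeddings to an otherwise frozen language model. It assumes a HuggingFace-style model that accepts `inputs_embeds`; all names and sizes are hypothetical, not the paper's code.

```python
import torch
import torch.nn as nn

class PromptTunedLM(nn.Module):
    """Illustrative prompt tuning: only `soft_prompt` is trainable;
    the underlying language model remains frozen."""

    def __init__(self, frozen_lm: nn.Module, num_prompt_tokens: int = 20, embed_dim: int = 768):
        super().__init__()
        self.lm = frozen_lm
        for p in self.lm.parameters():
            p.requires_grad = False  # freeze the base LM
        # small set of tunable prompt embeddings
        self.soft_prompt = nn.Parameter(torch.randn(num_prompt_tokens, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor):
        # input_embeds: (batch, seq_len, embed_dim) token embeddings of the task input
        batch = input_embeds.size(0)
        prompts = self.soft_prompt.unsqueeze(0).expand(batch, -1, -1)
        # prepend the tunable embeddings, then run the frozen model
        return self.lm(inputs_embeds=torch.cat([prompts, input_embeds], dim=1))
```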

Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages

no code implementations • 7 Feb 2023 • Simeng Sun, Maha Elbayad, Anna Sun, James Cross

With multilingual machine translation (MMT) models continuing to grow in size and number of supported languages, it is natural to reuse and upgrade existing models to save computation as data becomes available in more languages.

Machine Translation • Translation

Image Coding for Machines with Omnipotent Feature Learning

no code implementations • 5 Jul 2022 • Ruoyu Feng, Xin Jin, Zongyu Guo, Runsen Feng, Yixin Gao, Tianyu He, Zhizheng Zhang, Simeng Sun, Zhibo Chen

Learning features that are both general (for AI tasks) and compact (for compression) is pivotal to the success of image coding for machines.

Self-Supervised Learning

ChapterBreak: A Challenge Dataset for Long-Range Language Models

1 code implementation • NAACL 2022 • Simeng Sun, Katherine Thai, Mohit Iyyer

While numerous architectures for long-range language models (LRLMs) have recently been proposed, a meaningful evaluation of their discourse-level language understanding capabilities has not yet followed.

Semantically Video Coding: Instill Static-Dynamic Clues into Structured Bitstream for AI Tasks

no code implementations • 25 Jan 2022 • Xin Jin, Ruoyu Feng, Simeng Sun, Runsen Feng, Tianyu He, Zhibo Chen

Traditional media coding schemes typically encode images and videos into a semantically opaque binary stream, which fails to directly support downstream intelligent tasks at the bitstream level.

Action Recognition • Object +8

Alternative Input Signals Ease Transfer in Multilingual Machine Translation

no code implementations • ACL 2022 • Simeng Sun, Angela Fan, James Cross, Vishrav Chaudhary, Chau Tran, Philipp Koehn, Francisco Guzman

Further, we find that incorporating alternative inputs via self-ensemble can be particularly effective when the training set is small, leading to +5 BLEU when only 5% of the total training data is accessible.

Machine Translation • Translation

Do Long-Range Language Models Actually Use Long-Range Context?

no code implementations • EMNLP 2021 • Simeng Sun, Kalpesh Krishna, Andrew Mattarella-Micke, Mohit Iyyer

Language models are generally trained on short, truncated input sequences, which limits their ability to use discourse-level information present in long-range context to improve their predictions.

Sentence

IGA: An Intent-Guided Authoring Assistant

1 code implementation • 14 Apr 2021 • Simeng Sun, Wenlong Zhao, Varun Manjunatha, Rajiv Jain, Vlad Morariu, Franck Dernoncourt, Balaji Vasan Srinivasan, Mohit Iyyer

While large-scale pretrained language models have significantly improved writing assistance functionalities such as autocomplete, more complex and controllable writing assistants have yet to be explored.

Language Modelling • Sentence

Revisiting Simple Neural Probabilistic Language Models

1 code implementation • NAACL 2021 • Simeng Sun, Mohit Iyyer

Recent progress in language modeling has been driven not only by advances in neural architectures, but also through hardware and optimization improvements.

Language Modelling • Word Embeddings

Learning Omni-frequency Region-adaptive Representations for Real Image Super-Resolution

no code implementations • 11 Dec 2020 • Xin Li, Xin Jin, Tao Yu, Yingxue Pang, Simeng Sun, Zhizheng Zhang, Zhibo Chen

Traditional single image super-resolution (SISR) methods that focus on solving a single, uniform degradation (i.e., bicubic down-sampling) typically suffer from poor performance when applied to real-world low-resolution (LR) images due to the complicated realistic degradations.

Image Super-Resolution

Energy-Based Reranking: Improving Neural Machine Translation Using Energy-Based Models

1 code implementation • ACL 2021 • Sumanta Bhattacharyya, Amirmohammad Rooshenas, Subhajit Naskar, Simeng Sun, Mohit Iyyer, Andrew McCallum

To benefit from this observation, we train an energy-based model to mimic the behavior of the task measure (i.e., the energy-based model assigns lower energy to samples with higher BLEU scores), resulting in a re-ranking algorithm based on samples drawn from the NMT model: energy-based re-ranking (EBR).

Computational Efficiency • Machine Translation +4
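
A minimal sketch of the re-ranking step described in this entry: candidate translations are sampled from an NMT model, each is scored by an energy model trained so that lower energy tracks higher BLEU, and the lowest-energy candidate is kept. `sample_translations` and `energy_model` are hypothetical stand-ins, not the paper's code.

```python
def energy_rerank(source, sample_translations, energy_model, num_samples=10):
    """Illustrative energy-based re-ranking (EBR): draw candidates from the
    NMT model and return the one the energy model scores lowest."""
    candidates = sample_translations(source, num_samples)  # hypothetical sampler
    return min(candidates, key=lambda hyp: energy_model(source, hyp))
```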

Multi-scale Grouped Dense Network for VVC Intra Coding

no code implementations • 16 May 2020 • Xin Li, Simeng Sun, Zhizheng Zhang, Zhibo Chen

The Versatile Video Coding (H.266/VVC) standard achieves better image quality at the same number of bits than any other conventional image codec, such as BPG or JPEG.

Generative Adversarial Network

Hard-Coded Gaussian Attention for Neural Machine Translation

1 code implementation • ACL 2020 • Weiqiu You, Simeng Sun, Mohit Iyyer

Recent work has questioned the importance of the Transformer's multi-headed attention for achieving high translation quality.

Machine Translation • Translation

How to Compare Summarizers without Target Length? Pitfalls, Solutions and Re-Examination of the Neural Summarization Literature

no code implementations • WS 2019 • Simeng Sun, Ori Shapira, Ido Dagan, Ani Nenkova

We show that plain ROUGE F1 scores are not ideal for comparing current neural systems, which on average produce summaries of different lengths.
