Search Results for author: Jinfeng Rao

Found 21 papers, 7 papers with code

DSI++: Updating Transformer Memory with New Documents

no code implementations • 19 Dec 2022 • Sanket Vaibhav Mehta, Jai Gupta, Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Jinfeng Rao, Marc Najork, Emma Strubell, Donald Metzler

In this work, we introduce DSI++, a continual learning challenge for DSI to incrementally index new documents while being able to answer queries related to both previously and newly indexed documents.

Continual Learning Natural Questions +1

Paper
Add Code

Transcending Scaling Laws with 0.1% Extra Compute

no code implementations • 20 Oct 2022 • Yi Tay, Jason Wei, Hyung Won Chung, Vinh Q. Tran, David R. So, Siamak Shakeri, Xavier Garcia, Huaixiu Steven Zheng, Jinfeng Rao, Aakanksha Chowdhery, Denny Zhou, Donald Metzler, Slav Petrov, Neil Houlsby, Quoc V. Le, Mostafa Dehghani

This paper proposes UL2R, a method that substantially improves existing language models and their scaling curves with a relatively tiny amount of extra compute.

Ranked #2 on Cross-Lingual Question Answering on TyDiQA-GoldP

Arithmetic Reasoning Cross-Lingual Question Answering +4

Paper
Add Code

Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?

no code implementations • 21 Jul 2022 • Yi Tay, Mostafa Dehghani, Samira Abnar, Hyung Won Chung, William Fedus, Jinfeng Rao, Sharan Narang, Vinh Q. Tran, Dani Yogatama, Donald Metzler

There have been a lot of interest in the scaling properties of Transformer models.

Inductive Bias

Paper
Add Code

ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning

3 code implementations • ICLR 2022 • Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Gupta, Kai Hui, Sebastian Ruder, Donald Metzler

Despite the recent success of multi-task learning and transfer learning for natural language processing (NLP), few works have systematically studied the effect of scaling up the number of tasks during pre-training.

Denoising Multi-Task Learning

5,890

Paper
Code

Improving Compositional Generalization with Self-Training for Data-to-Text Generation

1 code implementation • ACL 2022 • Sanket Vaibhav Mehta, Jinfeng Rao, Yi Tay, Mihir Kale, Ankur P. Parikh, Emma Strubell

Data-to-text generation focuses on generating fluent natural language responses from structured meaning representations (MRs).

Data-to-Text Generation

32,745

Paper
Code

Scale Efficiently: Insights from Pretraining and Finetuning Transformers

no code implementations • ICLR 2022 • Yi Tay, Mostafa Dehghani, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler

The key findings of this paper are as follows: (1) we show that aside from only the model size, model shape matters for downstream fine-tuning, (2) scaling protocols operate differently at different compute regions, (3) widely adopted T5-base and T5-large sizes are Pareto-inefficient.

Paper
Add Code

Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers

3 code implementations • 22 Sep 2021 • Yi Tay, Mostafa Dehghani, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler

32,736

Paper
Code

Long Range Arena : A Benchmark for Efficient Transformers

no code implementations • ICLR 2021 • Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler

Transformers do not scale very well to long sequence lengths largely because of quadratic self-attention complexity.

16k Benchmarking

Paper
Add Code

Long Range Arena: A Benchmark for Efficient Transformers

5 code implementations • 8 Nov 2020 • Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler

In the recent months, a wide spectrum of efficient, fast Transformers have been proposed to tackle this problem, more often than not claiming superior or comparable model quality to vanilla Transformer models.

Ranked #18 on Long-range modeling on LRA (Pathfinder metric)

16k Benchmarking +1

680

Paper
Code

Incorporating Contextual and Syntactic Structures Improves Semantic Similarity Modeling

no code implementations • IJCNLP 2019 • Linqing Liu, Wei Yang, Jinfeng Rao, Raphael Tang, Jimmy Lin

Semantic similarity modeling is central to many NLP problems such as natural language inference and question answering.

Natural Language Inference Question Answering +2

Paper
Add Code

Bridging the Gap between Relevance Matching and Semantic Matching for Short Text Similarity Modeling

no code implementations • IJCNLP 2019 • Jinfeng Rao, Linqing Liu, Yi Tay, Wei Yang, Peng Shi, Jimmy Lin

A core problem of information retrieval (IR) is relevance matching, which is to rank documents by relevance to a user{'}s query.

Information Retrieval Paraphrase Identification +3

Paper
Add Code

The OSU/Facebook Realizer for SRST 2019: Seq2Seq Inflection and Serialized Tree2Tree Linearization

no code implementations • WS 2019 • Kartikeya Upasani, David King, Jinfeng Rao, Anusha Balakrishnan, Michael White

We describe our exploratory system for the shallow surface realization task, which combines morphological inflection using character sequence-to-sequence models with a baseline linearizer that implements a tree-to-tree model using sequence-to-sequence models on serialized trees.

Morphological Inflection valid

Paper
Add Code

A Tree-to-Sequence Model for Neural NLG in Task-Oriented Dialog

no code implementations • WS 2019 • Jinfeng Rao, Kartikeya Upasani, Anusha Balakrishnan, Michael White, Anuj Kumar, Rajen Subba

Generating fluent natural language responses from structured semantic representations is a critical step in task-oriented conversational systems.

Sentence

Paper
Add Code

Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue

1 code implementation • ACL 2019 • Anusha Balakrishnan, Jinfeng Rao, Kartikeya Upasani, Michael White, Rajen Subba

Generating fluent natural language responses from structured semantic representations is a critical step in task-oriented conversational systems.

Sentence

Paper
Code

Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks

1 code implementation • ACL 2019 • Yi Tay, Aston Zhang, Luu Anh Tuan, Jinfeng Rao, Shuai Zhang, Shuohang Wang, Jie Fu, Siu Cheung Hui

Many state-of-the-art neural models for NLP are heavily parameterized and thus memory inefficient.

Paper
Code

Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives

no code implementations • ACL 2019 • Yi Tay, Shuohang Wang, Luu Anh Tuan, Jie Fu, Minh C. Phan, Xingdi Yuan, Jinfeng Rao, Siu Cheung Hui, Aston Zhang

This paper tackles the problem of reading comprehension over long narratives where documents easily span over thousands of tokens.

Reading Comprehension

Paper
Add Code

Simple Attention-Based Representation Learning for Ranking Short Social Media Posts

no code implementations • NAACL 2019 • Peng Shi, Jinfeng Rao, Jimmy Lin

This paper explores the problem of ranking short social media posts with respect to user queries using neural networks.

Representation Learning

Paper
Add Code

Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search

3 code implementations • 21 May 2018 • Jinfeng Rao, Wei Yang, Yuhao Zhang, Ferhan Ture, Jimmy Lin

To our best knowledge, this paper presents the first substantial work tackling search over social media posts using neural ranking models.

Information Retrieval Retrieval

Paper
Code

Integrating Lexical and Temporal Signals in Neural Ranking Models for Searching Social Media Streams

no code implementations • 25 Jul 2017 • Jinfeng Rao, Hua He, Haotian Zhang, Ferhan Ture, Royal Sequiera, Salman Mohammed, Jimmy Lin

To our knowledge, we are the first to integrate lexical and temporal signals in an end-to-end neural network architecture, in which existing neural ranking models are used to generate query-document similarity vectors that feed into a bidirectional LSTM layer for temporal modeling.

Density Estimation Document Ranking

Paper
Add Code

Exploring the Effectiveness of Convolutional Neural Networks for Answer Selection in End-to-End Question Answering

no code implementations • 25 Jul 2017 • Royal Sequiera, Gaurav Baruah, Zhucheng Tu, Salman Mohammed, Jinfeng Rao, Haotian Zhang, Jimmy Lin

Most work on natural language question answering today focuses on answer selection: given a candidate list of sentences, determine which contains the answer.

Answer Selection Retrieval

Paper
Add Code

UMD-TTIC-UW at SemEval-2016 Task 1: Attention-Based Multi-Perspective Convolutional Neural Networks for Textual Similarity Measurement

no code implementations • SEMEVAL 2016 • Hua He, John Wieting, Kevin Gimpel, Jinfeng Rao, Jimmy Lin

Feature Engineering Question Answering +2

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.