Search Results for author: Jinfeng Rao

Found 18 papers, 7 papers with code

ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning

2 code implementations ICLR 2022 Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Gupta, Kai Hui, Sebastian Ruder, Donald Metzler

Despite the recent success of multi-task learning and transfer learning for natural language processing (NLP), few works have systematically studied the effect of scaling up the number of tasks during pre-training.

Denoising Multi-Task Learning

Scale Efficiently: Insights from Pretraining and Finetuning Transformers

no code implementations ICLR 2022 Yi Tay, Mostafa Dehghani, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler

The key findings of this paper are as follows: (1) we show that aside from only the model size, model shape matters for downstream fine-tuning, (2) scaling protocols operate differently at different compute regions, (3) widely adopted T5-base and T5-large sizes are Pareto-inefficient.

Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers

2 code implementations22 Sep 2021 Yi Tay, Mostafa Dehghani, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler

The key findings of this paper are as follows: (1) we show that aside from only the model size, model shape matters for downstream fine-tuning, (2) scaling protocols operate differently at different compute regions, (3) widely adopted T5-base and T5-large sizes are Pareto-inefficient.

Long Range Arena: A Benchmark for Efficient Transformers

5 code implementations8 Nov 2020 Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler

In the recent months, a wide spectrum of efficient, fast Transformers have been proposed to tackle this problem, more often than not claiming superior or comparable model quality to vanilla Transformer models.

Long-range modeling

The OSU/Facebook Realizer for SRST 2019: Seq2Seq Inflection and Serialized Tree2Tree Linearization

no code implementations WS 2019 Kartikeya Upasani, David King, Jinfeng Rao, Anusha Balakrishnan, Michael White

We describe our exploratory system for the shallow surface realization task, which combines morphological inflection using character sequence-to-sequence models with a baseline linearizer that implements a tree-to-tree model using sequence-to-sequence models on serialized trees.

Morphological Inflection

A Tree-to-Sequence Model for Neural NLG in Task-Oriented Dialog

no code implementations WS 2019 Jinfeng Rao, Kartikeya Upasani, Anusha Balakrishnan, Michael White, Anuj Kumar, Rajen Subba

Generating fluent natural language responses from structured semantic representations is a critical step in task-oriented conversational systems.

Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue

1 code implementation ACL 2019 Anusha Balakrishnan, Jinfeng Rao, Kartikeya Upasani, Michael White, Rajen Subba

Generating fluent natural language responses from structured semantic representations is a critical step in task-oriented conversational systems.

Simple Attention-Based Representation Learning for Ranking Short Social Media Posts

no code implementations NAACL 2019 Peng Shi, Jinfeng Rao, Jimmy Lin

This paper explores the problem of ranking short social media posts with respect to user queries using neural networks.

Representation Learning

Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search

3 code implementations21 May 2018 Jinfeng Rao, Wei Yang, Yuhao Zhang, Ferhan Ture, Jimmy Lin

To our best knowledge, this paper presents the first substantial work tackling search over social media posts using neural ranking models.

Information Retrieval

Exploring the Effectiveness of Convolutional Neural Networks for Answer Selection in End-to-End Question Answering

no code implementations25 Jul 2017 Royal Sequiera, Gaurav Baruah, Zhucheng Tu, Salman Mohammed, Jinfeng Rao, Haotian Zhang, Jimmy Lin

Most work on natural language question answering today focuses on answer selection: given a candidate list of sentences, determine which contains the answer.

Answer Selection

Integrating Lexical and Temporal Signals in Neural Ranking Models for Searching Social Media Streams

no code implementations25 Jul 2017 Jinfeng Rao, Hua He, Haotian Zhang, Ferhan Ture, Royal Sequiera, Salman Mohammed, Jimmy Lin

To our knowledge, we are the first to integrate lexical and temporal signals in an end-to-end neural network architecture, in which existing neural ranking models are used to generate query-document similarity vectors that feed into a bidirectional LSTM layer for temporal modeling.

Density Estimation Document Ranking

Cannot find the paper you are looking for? You can Submit a new open access paper.