Search Results for author: Lifu Tu

Found 21 papers, 11 papers with code

XGen-7B Technical Report

1 code implementation • 7 Sep 2023 • Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryściński, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Joty, Caiming Xiong

Most open-source LLMs, on the other hand, are limited in their ability to support longer sequence lengths, which is a key requirement for many tasks that require inference over an input context.

2k • 8k

Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning

1 code implementation • 3 Apr 2023 • Lifu Tu, Jin Qu, Semih Yavuz, Shafiq Joty, Wenhao Liu, Caiming Xiong, Yingbo Zhou

Our results demonstrate the strong and efficient modeling ability of NLI-based classifiers and the large cross-lingual transfer improvements achieved by our aligned prompts, particularly in few-shot settings.

Cross-Lingual Transfer • intent-classification +4
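As a concrete illustration of the NLI-as-classifier idea the snippet mentions, here is a minimal sketch: the utterance is scored for entailment against a hypothesis built from each candidate label. It uses Hugging Face's standard zero-shot pipeline; the model choice and the label set are illustrative assumptions, not the paper's configuration.

```python
from transformers import pipeline

# Zero-shot intent classification via NLI: each candidate label becomes an
# entailment hypothesis scored against the utterance. Model and labels are
# illustrative assumptions, not the paper's setup.
clf = pipeline("zero-shot-classification", model="joeddav/xlm-roberta-large-xnli")

result = clf(
    "Book me a table for two tomorrow night",
    candidate_labels=["restaurant_reservation", "weather_query", "play_music"],
)
print(result["labels"][0])  # highest-scoring intent
```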

Prompt-Tuning Can Be Much Better Than Fine-Tuning on Cross-lingual Understanding With Multilingual Language Models

2 code implementations • 22 Oct 2022 • Lifu Tu, Caiming Xiong, Yingbo Zhou

Pre-trained multilingual language models show significant performance gains for zero-shot cross-lingual model transfer on a wide range of natural language understanding (NLU) tasks.

Cross-Lingual Transfer • Natural Language Understanding +3
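For readers unfamiliar with prompt-tuning, a minimal sketch of the general recipe the title refers to: the multilingual backbone stays frozen and only a small set of soft prompt vectors (plus, here, the classification head) receives gradients. The backbone name, prompt length, and trainable head are assumptions for illustration, not the paper's exact configuration.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Prompt-tuning sketch: freeze the pretrained multilingual encoder and learn
# only soft prompt embeddings prepended to the input. Settings are assumed.
model_name = "xlm-roberta-base"          # assumed backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

for p in model.parameters():
    p.requires_grad = False              # keep the backbone frozen
for p in model.classifier.parameters():
    p.requires_grad = True               # small task head stays trainable

prompt_len = 16                          # assumed prompt length
soft_prompt = torch.nn.Parameter(
    torch.randn(prompt_len, model.config.hidden_size) * 0.02
)

def loss_on_batch(texts, labels):
    batch = tokenizer(texts, return_tensors="pt", padding=True)
    embeds = model.get_input_embeddings()(batch["input_ids"])
    prompts = soft_prompt.unsqueeze(0).expand(embeds.size(0), -1, -1)
    mask = torch.cat(
        [torch.ones(embeds.size(0), prompt_len, dtype=torch.long),
         batch["attention_mask"]], dim=1)
    out = model(inputs_embeds=torch.cat([prompts, embeds], dim=1),
                attention_mask=mask, labels=torch.tensor(labels))
    return out.loss                      # backprop reaches only prompt + head
```

Because only the prompt vectors and head are updated, one frozen backbone can serve many languages or tasks, each with its own small prompt.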

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

7 code implementations • 25 Mar 2022 • Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong

To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER.

Code Generation • HumanEval +3
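The released checkpoints can be sampled with the standard Hugging Face API; a small sketch using one of the lighter CODEGEN variants (model size and sampling settings chosen here for illustration):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sample a completion from a released CodeGen checkpoint. The 350M "mono"
# (Python-finetuned) variant keeps the example light.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
sample = model.generate(**inputs, max_new_tokens=48, do_sample=True,
                        temperature=0.2, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(sample[0], skip_special_tokens=True))
```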

Learning Energy-Based Approximate Inference Networks for Structured Applications in NLP

no code implementations • 27 Aug 2021 • Lifu Tu

In this dissertation, we discuss the concept of the energy function and structured models with different energy functions.

Representation Learning • Structured Prediction
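As a concrete instance of the kind of energy function discussed, here is a toy linear-chain energy for sequence labeling, with labels relaxed to probability vectors so the energy is differentiable. The shapes and the relaxation are illustrative assumptions, not the dissertation's exact models.

```python
import torch

def chain_energy(unary, transition, y):
    """Toy linear-chain energy for sequence labeling.
    unary: (T, L) per-position label scores from some feature network;
    transition: (L, L) label-pair scores; y: (T, L) relaxed one-hot labels.
    Lower energy means a better labeling."""
    local = -(y * unary).sum()
    pairwise = -torch.einsum("ta,ab,tb->", y[:-1], transition, y[1:])
    return local + pairwise
```

With hard one-hot rows in y this reduces to the usual CRF score; relaxing y to the simplex is what makes the gradient-based inference sketched further down possible.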

An Exploration of Arbitrary-Order Sequence Labeling via Energy-Based Inference Networks

1 code implementation • EMNLP 2020 • Lifu Tu, Tianyu Liu, Kevin Gimpel

Many tasks in natural language processing involve predicting structured outputs, e.g., sequence labeling, semantic role labeling, parsing, and machine translation.

Machine Translation • Representation Learning +2

An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models

1 code implementation • 14 Jul 2020 • Lifu Tu, Garima Lalwani, Spandana Gella, He He

Recent work has shown that pre-trained language models such as BERT improve robustness to spurious correlations in the dataset.

Diversity • Multi-Task Learning +2

ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation

1 code implementation • ACL 2020 • Lifu Tu, Richard Yuanzhe Pang, Sam Wiseman, Kevin Gimpel

We propose to train a non-autoregressive machine translation model to minimize the energy defined by a pretrained autoregressive model.

de-en • Machine Translation +1
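A sketch of that objective as described in the snippet: the non-autoregressive model emits a relaxed (softmax) output, and the energy is the pretrained autoregressive model's negative log-likelihood of that relaxed output. The `ar_log_probs` hook here is a hypothetical stand-in for the frozen pretrained model, not ENGINE's actual interface.

```python
import torch
import torch.nn.functional as F

def engine_loss(nar_logits, ar_log_probs, src):
    """nar_logits: (T, V) positionwise logits from the non-autoregressive model.
    ar_log_probs(src, y): hypothetical hook returning (T, V) log-probabilities
    from the frozen pretrained autoregressive model given the relaxed output."""
    y = F.softmax(nar_logits, dim=-1)           # differentiable relaxation
    energy = -(y * ar_log_probs(src, y)).sum()  # expected AR negative log-lik.
    return energy                               # minimized w.r.t. the NAR model
```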

Generating Diverse Story Continuations with Controllable Semantics

no code implementations • WS 2019 • Lifu Tu, Xiaoan Ding, Dong Yu, Kevin Gimpel

We propose a simple and effective modeling framework for controlled generation of multiple, diverse outputs.

Diversity • Sentence

Benchmarking Approximate Inference Methods for Neural Structured Prediction

1 code implementation • NAACL 2019 • Lifu Tu, Kevin Gimpel

One approach is to perform gradient descent with respect to the output structure directly (Belanger and McCallum, 2016).

Benchmarking • Structured Prediction
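The gradient-descent inference baseline the snippet describes fits in a few lines: relax the labeling to a point on the simplex and descend on the energy. This is compatible with the toy chain_energy above; the step count and learning rate are arbitrary choices for the sketch.

```python
import torch

def gd_inference(energy_fn, T, L, steps=50, lr=0.5):
    """Belanger-and-McCallum-style inference: optimize the energy directly
    with respect to a relaxed output. energy_fn maps a (T, L) relaxed
    labeling to a scalar energy."""
    logits = torch.zeros(T, L, requires_grad=True)
    opt = torch.optim.SGD([logits], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        energy_fn(torch.softmax(logits, dim=-1)).backward()
        opt.step()
    return torch.softmax(logits, dim=-1).argmax(dim=-1)  # discretize at the end
```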

Learning Approximate Inference Networks for Structured Prediction

3 code implementations • ICLR 2018 • Lifu Tu, Kevin Gimpel

Prior work used gradient descent for inference, relaxing the structured output to a set of continuous variables and then optimizing the energy with respect to them.

Language Modelling • Multi-Label Classification +2
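The paper's alternative, as the title suggests, is to train a network that maps inputs straight to the relaxed output, amortizing the test-time gradient-descent loop sketched above. A minimal sketch; the one-layer architecture is an assumption for illustration.

```python
import torch

class InferenceNet(torch.nn.Module):
    """Maps token features directly to a relaxed labeling, replacing
    test-time gradient descent with a single forward pass."""
    def __init__(self, d_in, n_labels):
        super().__init__()
        self.proj = torch.nn.Linear(d_in, n_labels)

    def forward(self, x):                       # x: (T, d_in) token features
        return torch.softmax(self.proj(x), dim=-1)

def train_step(net, energy_fn, x, opt):
    opt.zero_grad()
    loss = energy_fn(net(x))    # the energy itself is the training signal
    loss.backward()
    opt.step()
    return loss.item()
```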

Network Inference by Learned Node-Specific Degree Prior

no code implementations • 7 Feb 2016 • Qingming Tang, Lifu Tu, Weiran Wang, Jinbo Xu

We propose a novel method for network inference from partially observed edges using a node-specific degree prior.

Matrix Completion
