Search Results for author: Tianlu Wang

Found 29 papers, 17 papers with code

Calibrate to Discriminate: Improve In-Context Learning with Label-Free Comparative Inference

no code implementations 3 Oct 2024 Wei Cheng, Tianlu Wang, Yanmin Ji, Fan Yang, Keren Tan, Yiyu Zheng

While in-context learning with large language models (LLMs) has shown impressive performance, we have discovered a unique miscalibration behavior where both correct and incorrect predictions are assigned the same level of confidence.

In-Context Learning

Self-Taught Evaluators

no code implementations 5 Aug 2024 Tianlu Wang, Ilia Kulikov, Olga Golovneva, Ping Yu, Weizhe Yuan, Jane Dwivedi-Yu, Richard Yuanzhe Pang, Maryam Fazel-Zarandi, Jason Weston, Xian Li

Model-based evaluation is at the heart of successful model development -- as a reward model for training, and as a replacement for human evaluation.

Contextual Position Encoding: Learning to Count What's Important

no code implementations 29 May 2024 Olga Golovneva, Tianlu Wang, Jason Weston, Sainbayar Sukhbaatar

Incorporating position encoding (PE) makes it possible to address by position, such as attending to the i-th token.

Language Modelling Position +1

Efficient Tool Use with Chain-of-Abstraction Reasoning

no code implementations 30 Jan 2024 Silin Gao, Jane Dwivedi-Yu, Ping Yu, Xiaoqing Ellen Tan, Ramakanth Pasunuru, Olga Golovneva, Koustuv Sinha, Asli Celikyilmaz, Antoine Bosselut, Tianlu Wang

LLM agents trained with our method also show more efficient tool use, with inference speed being on average ~1.4x faster than baseline tool-augmented LLMs.

Math Mathematical Reasoning +1

PathFinder: Guided Search over Multi-Step Reasoning Paths

no code implementations 8 Dec 2023 Olga Golovneva, Sean O'Brien, Ramakanth Pasunuru, Tianlu Wang, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz

Using constrained reasoning, PathFinder integrates novel quality constraints, pruning, and exploration methods to enhance the efficiency and the quality of generation.

Pathfinder

The ART of LLM Refinement: Ask, Refine, and Trust

no code implementations 14 Nov 2023 Kumar Shridhar, Koustuv Sinha, Andrew Cohen, Tianlu Wang, Ping Yu, Ram Pasunuru, Mrinmaya Sachan, Jason Weston, Asli Celikyilmaz

In recent years, Large Language Models (LLMs) have demonstrated remarkable generative abilities, but can they judge the quality of their own generations?

Arithmetic Reasoning GSM8K +2

Shepherd: A Critic for Language Model Generation

1 code implementation 8 Aug 2023 Tianlu Wang, Ping Yu, Xiaoqing Ellen Tan, Sean O'Brien, Ramakanth Pasunuru, Jane Dwivedi-Yu, Olga Golovneva, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz

As large language models improve, there is increasing interest in techniques that leverage these models' capabilities to refine their own outputs.

Language Modelling

Understanding In-Context Learning via Supportive Pretraining Data

no code implementations 26 Jun 2023 Xiaochuang Han, Daniel Simig, Todor Mihaylov, Yulia Tsvetkov, Asli Celikyilmaz, Tianlu Wang

We observe that a continued pretraining on this small subset significantly improves the model's ICL ability, by up to 18%.

In-Context Learning

Open-Domain Text Evaluation via Contrastive Distribution Methods

1 code implementation 20 Jun 2023 Sidi Lu, Hongyi Liu, Asli Celikyilmaz, Tianlu Wang, Nanyun Peng

We investigate CDM for open-domain text generation evaluation under two paradigms: 1) _Generative_ CDM, which harnesses the contrast of two language models' distributions to generate synthetic examples for training discriminator-based metrics; 2) _Discriminative_ CDM, which directly uses distribution disparities between two language models for evaluation.

Abstractive Text Summarization Coherence Evaluation +1

Gender Biases in Automatic Evaluation Metrics for Image Captioning

1 code implementation 24 May 2023 Haoyi Qiu, Zi-Yi Dou, Tianlu Wang, Asli Celikyilmaz, Nanyun Peng

Model-based evaluation metrics (e.g., CLIPScore and GPTScore) have demonstrated decent correlations with human judgments in various language generation tasks.

Fairness Image Captioning +1

Variation of Gender Biases in Visual Recognition Models Before and After Finetuning

no code implementations 14 Mar 2023 Jaspreet Ranjit, Tianlu Wang, Baishakhi Ray, Vicente Ordonez

We also find that (2) models finetuned on larger scale datasets are more likely to introduce new biased associations.

Object Recognition

OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

1 code implementation 22 Dec 2022 Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Daniel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, Xian Li, Brian O'Horo, Gabriel Pereyra, Jeff Wang, Christopher Dewan, Asli Celikyilmaz, Luke Zettlemoyer, Ves Stoyanov

To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalizations: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks.

Language Modelling Meta-Learning +2

ALERT: Adapting Language Models to Reasoning Tasks

no code implementations 16 Dec 2022 Ping Yu, Tianlu Wang, Olga Golovneva, Badr Alkhamissy, Gargi Ghosh, Mona Diab, Asli Celikyilmaz

Current large language models can perform reasonably well on complex tasks that require step-by-step reasoning with few-shot learning.

Few-Shot Learning Language Modelling +1

Text Characterization Toolkit

no code implementations 4 Oct 2022 Daniel Simig, Tianlu Wang, Verna Dankers, Peter Henderson, Khuyagbaatar Batsuren, Dieuwke Hupkes, Mona Diab

In NLP, models are usually evaluated by reporting single-number performance scores on a number of readily available benchmarks, without much deeper analysis.

Selective Annotation Makes Language Models Better Few-Shot Learners

1 code implementation 5 Sep 2022 Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu

Departing from recent in-context learning methods, we formulate an annotation-efficient, two-step framework: selective annotation that chooses a pool of examples to annotate from unlabeled data in advance, followed by prompt retrieval that retrieves task examples from the annotated pool at test time.

Code Generation In-Context Learning +1

Identifying and Mitigating Spurious Correlations for Improving Robustness in NLP Models

1 code implementation Findings (NAACL) 2022 Tianlu Wang, Rohit Sridhar, Diyi Yang, Xuezhi Wang

Recently, NLP models have achieved remarkable progress across a variety of tasks; however, they have also been criticized for being not robust.

General Multi-label Image Classification with Transformers

2 code implementations CVPR 2021 Jack Lanchantin, Tianlu Wang, Vicente Ordonez, Yanjun Qi

Multi-label image classification is the task of predicting a set of labels corresponding to objects, attributes or other entities present in an image.

Classification General Classification +1

CAT-Gen: Improving Robustness in NLP Models via Controlled Adversarial Text Generation

no code implementations EMNLP 2020 Tianlu Wang, Xuezhi Wang, Yao Qin, Ben Packer, Kang Li, Jilin Chen, Alex Beutel, Ed Chi

Experiments on real-world NLP datasets demonstrate that our method can generate more diverse and fluent adversarial texts, compared to many existing adversarial text generation approaches.

Adversarial Text Attribute +3

Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation

1 code implementation ACL 2020 Tianlu Wang, Xi Victoria Lin, Nazneen Fatema Rajani, Bryan McCann, Vicente Ordonez, Caiming Xiong

Word embeddings derived from human-generated corpora inherit strong gender bias which can be further amplified by downstream models.

Word Embeddings

Gender Bias in Contextualized Word Embeddings

2 code implementations NAACL 2019 Jieyu Zhao, Tianlu Wang, Mark Yatskar, Ryan Cotterell, Vicente Ordonez, Kai-Wei Chang

In this paper, we quantify, analyze and mitigate gender bias exhibited in ELMo's contextualized word vectors.

Word Embeddings

Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations

2 code implementations ICCV 2019 Tianlu Wang, Jieyu Zhao, Mark Yatskar, Kai-Wei Chang, Vicente Ordonez

In this work, we present a framework to measure and mitigate intrinsic biases with respect to protected variables, such as gender, in visual recognition tasks.

Temporal Action Localization

Feedback-prop: Convolutional Neural Network Inference under Partial Evidence

1 code implementation CVPR 2018 Tianlu Wang, Kota Yamaguchi, Vicente Ordonez

We propose an inference procedure for deep convolutional neural networks (CNNs) when partial evidence is available.
