Search Results for author: Yingshan Chang

Found 9 papers, 5 papers with code

Tools Fail: Detecting Silent Errors in Faulty Tools

no code implementations • 27 Jun 2024 • Jimin Sun, So Yeon Min, Yingshan Chang, Yonatan Bisk

Tools have become a mainstay of LLMs, allowing them to retrieve knowledge not in their weights, to perform tasks on the web, and even to control robots.

DiffusionPID: Interpreting Diffusion via Partial Information Decomposition

no code implementations • 7 Jun 2024 • Rushikesh Zawar, Shaurya Dewan, Prakanshul Saxena, Yingshan Chang, Andrew Luo, Yonatan Bisk

Text-to-image diffusion models have made significant progress in generating naturalistic images from textual inputs, and demonstrate the capacity to learn and represent complex visual-semantic relationships.

Denoising

Language Models Need Inductive Biases to Count Inductively

1 code implementation • 30 May 2024 • Yingshan Chang, Yonatan Bisk

We find that while traditional RNNs trivially achieve inductive counting, Transformers have to rely on positional embeddings to count out-of-domain (a toy version of this train-short, test-long setup is sketched below).

State Space Models
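
A minimal sketch of the train-short, test-long counting probe referenced above (a hypothetical illustration, not the paper's released code): examples map a run of identical tokens to its count, with training lengths disjoint from test lengths, so success requires inductive generalization rather than memorized positions.

import random

# Hypothetical probe: map a run of 'a' tokens to its count, e.g. "a a a" -> 3.
def make_example(n):
    return " ".join(["a"] * n), n

random.seed(0)
train = [make_example(random.randint(1, 10)) for _ in range(1000)]   # in-domain lengths 1-10
test = [make_example(random.randint(11, 20)) for _ in range(100)]    # out-of-domain lengths 11-20
# A model counts "inductively" if accuracy holds on `test` despite never
# seeing sequence positions beyond 10 during training.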

Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching

2 code implementations • 29 May 2024 • Yasi Zhang, Peiyu Yu, Yaxuan Zhu, Yingshan Chang, Feng Gao, Ying Nian Wu, Oscar Leong

We validate our approach for various linear inverse problems, such as super-resolution, deblurring, inpainting, and compressed sensing, and demonstrate that we can outperform other methods based on flow matching (the general problem setup is recalled below).

Deblurring • Image Generation • +1
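
As background for the entry above, a linear inverse problem takes the standard textbook form (general notation, not necessarily the paper's):

y = Ax + \epsilon

where x is the unknown signal, y the observed measurements, \epsilon noise, and A the known degradation operator: subsampling for super-resolution, convolution with a blur kernel for deblurring, a pixel mask for inpainting, or a random projection for compressed sensing. Methods for this problem class differ mainly in the prior placed on x.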

Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation

1 code implementation • 25 Mar 2024 • Yingshan Chang, Yasi Zhang, Zhiyuan Fang, Ying Nian Wu, Yonatan Bisk, Feng Gao

We hypothesize that the underlying phenomenological coverage has not been proportionally scaled up, leading to a skew of the presented phenomena, which harms generalization.

Relational Reasoning • Text-to-Image Generation

VISREAS: Complex Visual Reasoning with Unanswerable Questions

no code implementations • 23 Feb 2024 • Syeda Nahida Akter, Sangwu Lee, Yingshan Chang, Yonatan Bisk, Eric Nyberg

The unique feature of this task, validating question answerability with respect to an image before answering, together with the poor performance of state-of-the-art models, inspired the design of a new modular baseline, LOGIC2VISION, which reasons by producing and executing pseudocode without any external modules to generate the answer.

Question Answering • Visual Question Answering • +1

Toxicity Detection with Generative Prompt-based Inference

no code implementations • 24 May 2022 • Yau-Shian Wang, Yingshan Chang

It is a long-known risk that language models (LMs), once trained on a corpus containing undesirable content, have the power to manifest biases and toxicity.

Language Modelling • Prompt Engineering

Training Vision-Language Transformers from Captions

1 code implementation • 19 May 2022 • Liangke Gui, Yingshan Chang, Qiuyuan Huang, Subhojit Som, Alex Hauptmann, Jianfeng Gao, Yonatan Bisk

Vision-Language Transformers can be learned without low-level human labels (e.g. class labels, bounding boxes, etc.).

WebQA: Multihop and Multimodal QA

2 code implementations • CVPR 2022 • Yingshan Chang, Mridu Narang, Hisami Suzuki, Guihong Cao, Jianfeng Gao, Yonatan Bisk

Scaling Visual Question Answering (VQA) to the open-domain and multi-hop nature of web searches requires fundamental advances in visual representation learning, knowledge aggregation, and language generation.

Image Retrieval • Multimodal Reasoning • +4
