Search Results for author: Deqing Fu

Found 11 papers, 2 papers with code

VisualLens: Personalization through Visual History

no code implementations • 25 Nov 2024 • Wang Bill Zhu, Deqing Fu, Kai Sun, Yi Lu, Zhaojiang Lin, Seungwhan Moon, Kanika Narang, Mustafa Canim, Yue Liu, Anuj Kumar, Xin Luna Dong

We hypothesize that a user's visual history, with images reflecting their daily life, offers valuable insights into their interests and preferences, and can be leveraged for personalization.

Diversity · Recommendation Systems

TLDR: Token-Level Detective Reward Model for Large Vision Language Models

no code implementations • 7 Oct 2024 • Deqing Fu, Tong Xiao, Rui Wang, Wang Zhu, Pengchuan Zhang, Guan Pang, Robin Jia, Lawrence Chen

Although reward models have been successful in improving multimodal large language models, the reward models themselves remain brutal and contain minimal information.

Hallucination · Hallucination Evaluation

Pre-trained Large Language Models Use Fourier Features to Compute Addition

no code implementations • 5 Jun 2024 • Tianyi Zhou, Deqing Fu, Vatsal Sharan, Robin Jia

This paper shows that pre-trained LLMs add numbers using Fourier features -- dimensions in the hidden state that represent numbers via a set of features sparse in the frequency domain.

Mathematical Reasoning
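The abstract's claim, that numbers live in hidden-state dimensions sparse in the frequency domain, has a compact arithmetic analogue: phases of unit phasors add when the phasors multiply. A minimal sketch of that idea; the periods below are hypothetical choices for illustration, not the features the paper identifies:

```python
import math

PERIODS = [2, 5, 10, 100]  # hypothetical periods, chosen only for this sketch

def fourier_features(n):
    # Represent an integer as unit phasors exp(2*pi*i*n/T) at a few periods:
    # a representation that is sparse in the frequency domain.
    return [complex(math.cos(2 * math.pi * n / T), math.sin(2 * math.pi * n / T))
            for T in PERIODS]

def add_in_feature_space(fa, fb):
    # Multiplying unit phasors adds their phases, so the features of a+b
    # are the elementwise products of the features of a and b.
    return [x * y for x, y in zip(fa, fb)]

def decode_mod(features, idx):
    # Recover n mod T from the phase of the phasor at PERIODS[idx].
    z = features[idx]
    n = round(math.atan2(z.imag, z.real) * PERIODS[idx] / (2 * math.pi))
    return n % PERIODS[idx]
```

For example, combining the features of 37 and 58 and decoding the period-100 phasor recovers 95 without ever adding the integers directly.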

IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations

no code implementations • 1 Apr 2024 • Deqing Fu, Ruohao Guo, Ghazal Khalighinejad, Ollie Liu, Bhuwan Dhingra, Dani Yogatama, Robin Jia, Willie Neiswanger

Current foundation models exhibit impressive capabilities when prompted either with text only or with both image and text inputs.

Benchmarking · Math

Simplicity Bias of Transformers to Learn Low Sensitivity Functions

no code implementations • 11 Mar 2024 • Bhavya Vasudeva, Deqing Fu, Tianyi Zhou, Elliott Kau, Youqi Huang, Vatsal Sharan

Transformers achieve state-of-the-art accuracy and robustness across many tasks, but an understanding of the inductive biases that they have and how those biases are different from other neural network architectures remains elusive.
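"Low sensitivity" in the title has a standard Boolean-function meaning. A minimal sketch of average sensitivity, the quantity this line of work measures, using parity (maximally sensitive) and a dictator function (minimally sensitive) as illustrative examples:

```python
from itertools import product

def avg_sensitivity(f, n):
    # Average sensitivity of f: {0,1}^n -> {0,1}: the expected number of
    # coordinates whose single-bit flip changes the output, averaged over
    # all 2^n inputs under the uniform distribution.
    total = 0
    for x in product([0, 1], repeat=n):
        for i in range(n):
            x_flip = list(x)
            x_flip[i] ^= 1
            total += f(x) != f(tuple(x_flip))
    return total / 2 ** n

parity = lambda x: sum(x) % 2   # every bit flip changes the output: sensitivity n
dictator = lambda x: x[0]       # only one bit matters: sensitivity 1
```

A simplicity bias toward low sensitivity means functions like `dictator` are learned far more readily than functions like `parity`.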

DeLLMa: Decision Making Under Uncertainty with Large Language Models

no code implementations • 4 Feb 2024 • Ollie Liu, Deqing Fu, Dani Yogatama, Willie Neiswanger

The potential of large language models (LLMs) as decision support tools is increasingly being explored in fields such as business, engineering, and medicine, which often face challenging tasks of decision-making under uncertainty.

Decision Making · Decision Making Under Uncertainty · +3
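The abstract frames the problem in classical decision-theoretic terms. A minimal expected-utility sketch with hypothetical states, probabilities, and utilities; DeLLMa-style pipelines elicit such quantities from an LLM rather than hardcoding them as done here:

```python
# Hypothetical belief over world states and utilities of (action, state) pairs.
state_probs = {"demand_high": 0.3, "demand_low": 0.7}
utilities = {
    ("expand", "demand_high"): 10.0,
    ("expand", "demand_low"): -4.0,
    ("hold",   "demand_high"):  2.0,
    ("hold",   "demand_low"):   1.0,
}

def expected_utility(action):
    # Expected utility of an action under the belief over states.
    return sum(p * utilities[(action, s)] for s, p in state_probs.items())

# Decision rule: pick the action maximizing expected utility.
best_action = max(["expand", "hold"], key=expected_utility)
```

Here the riskier action has a higher best-case payoff but a lower expected utility, so the rule selects the safer one.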

DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback

no code implementations • 29 Nov 2023 • Jiao Sun, Deqing Fu, Yushi Hu, Su Wang, Royi Rassin, Da-Cheng Juan, Dana Alon, Charles Herrmann, Sjoerd van Steenkiste, Ranjay Krishna, Cyrus Rashtchian

Then, it uses two VLMs to select the best generation: a Visual Question Answering model that measures the alignment of generated images to the text, and another that measures the generation's aesthetic quality.

Question Answering · Text-to-Image Generation · +1
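The selection step described above, one scorer for text alignment and one for aesthetics, can be sketched as a filter-then-rank rule. The 0.8 threshold and the fallback pool below are assumptions for illustration, not the paper's exact procedure:

```python
def select_best(candidates, alignment_score, aesthetic_score, threshold=0.8):
    # Keep candidates whose alignment with the prompt passes the threshold,
    # then pick the most aesthetic one. If nothing passes, rank all candidates.
    # In DreamSync the two scorers are VLMs: a VQA model for alignment and a
    # second model for aesthetic quality; here they are plain callables.
    aligned = [c for c in candidates if alignment_score(c) >= threshold]
    pool = aligned or candidates
    return max(pool, key=aesthetic_score)
```

For example, a highly aesthetic but poorly aligned image loses to a well-aligned one under this rule.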

Transformers Learn to Achieve Second-Order Convergence Rates for In-Context Linear Regression

1 code implementation • 26 Oct 2023 • Deqing Fu, Tian-Qi Chen, Robin Jia, Vatsal Sharan

Transformers excel at in-context learning (ICL) -- learning from demonstrations without parameter updates -- but how they do so remains a mystery.

In-Context Learning
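Second-order convergence rates for linear regression are what Newton-style iterations achieve. A minimal sketch of the Newton-Schulz iteration for the least-squares solution, one candidate mechanism discussed in this line of work; the step count and initialization are standard textbook choices, not values taken from the paper:

```python
import numpy as np

def newton_schulz_lstsq(X, y, steps=25):
    # Approximate the least-squares solution w = (X^T X)^{-1} X^T y by
    # refining an estimate M of (X^T X)^{-1} with the Newton-Schulz update
    #   M <- M (2I - A M),  A = X^T X,
    # which converges quadratically (second order) from a suitable start.
    A = X.T @ X
    # Standard safe initialization: guarantees the spectral radius of
    # (I - A M) is below 1 for full-rank A.
    M = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
    I = np.eye(A.shape[0])
    for _ in range(steps):
        M = M @ (2 * I - A @ M)
    return M @ X.T @ y
```

Because the error roughly squares at each step, a couple dozen iterations already match a direct solver to machine precision on well-conditioned problems.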

SCENE: Self-Labeled Counterfactuals for Extrapolating to Negative Examples

1 code implementation • 13 May 2023 • Deqing Fu, Ameya Godbole, Robin Jia

In this work, we propose Self-labeled Counterfactuals for Extrapolating to Negative Examples (SCENE), an automatic method for synthesizing training data that greatly improves models' ability to detect challenging negative examples.

Data Augmentation · Natural Language Inference · +2

Topological Regularization for Dense Prediction

no code implementations • 22 Nov 2021 • Deqing Fu, Bradley J. Nelson

Dense prediction tasks such as depth perception and semantic segmentation are important applications in computer vision that have a concrete topological description in terms of partitioning an image into connected components or estimating a function with a small number of local extrema corresponding to objects in the image.

Prediction · Semantic Segmentation

Harnessing the Conditioning Sensorium for Improved Image Translation

no code implementations • ICCV 2021 • Cooper Nederhood, Nicholas Kolkin, Deqing Fu, Jason Salavon

Multi-modal domain translation typically refers to synthesizing a novel image that inherits certain localized attributes from a 'content' image (e.g. layout, semantics, or geometry), and inherits everything else (e.g. texture, lighting, sometimes even semantics) from a 'style' image.

Decoder · Translation
