Search Results for author: Yaser Yacoob

Found 10 papers, 5 papers with code

MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning

3 code implementations • 15 Nov 2023 • Fuxiao Liu, Xiaoyang Wang, Wenlin Yao, Jianshu Chen, Kaiqiang Song, Sangwoo Cho, Yaser Yacoob, Dong Yu

Recognizing the need for a comprehensive evaluation of LMM chart understanding, we also propose the MultiModal Chart Benchmark (MMC-Benchmark), a human-annotated benchmark with nine distinct tasks evaluating reasoning capabilities over charts.

Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

3 code implementations • 26 Jun 2023 • Fuxiao Liu, Kevin Lin, Linjie Li, JianFeng Wang, Yaser Yacoob, Lijuan Wang

To efficiently measure the hallucination generated by LMMs, we propose GPT4-Assisted Visual Instruction Evaluation (GAVIE), a stable approach to evaluate visual instruction tuning like human experts.

Tasks: Hallucination, Visual Question Answering

COVID-VTS: Fact Extraction and Verification on Short Video Platforms

1 code implementation • 15 Feb 2023 • Fuxiao Liu, Yaser Yacoob, Abhinav Shrivastava

We introduce a new benchmark, COVID-VTS, for fact-checking multi-modal information involving short-duration videos with COVID19-focused information from both the real world and machine generation.

Tasks: Fact Checking, Fact Selection (+1 more)

One-Shot Face Video Re-enactment using Hybrid Latent Spaces of StyleGAN2

no code implementations • 15 Feb 2023 • Trevine Oorloff, Yaser Yacoob

While recent research has progressively overcome the low-resolution constraint of one-shot face video re-enactment with the help of StyleGAN's high-fidelity portrait generation, these approaches rely on at least one of the following: explicit 2D/3D priors, optical-flow-based warping as motion descriptors, off-the-shelf encoders, etc., which constrain their performance (e.g., inconsistent predictions, inability to capture fine facial details and accessories, poor generalization, artifacts).

Tasks: Attribute, Disentanglement (+2 more)

Robust One-Shot Face Video Re-enactment using Hybrid Latent Spaces of StyleGAN2

no code implementations • ICCV 2023 • Trevine Oorloff, Yaser Yacoob

Addressing these limitations, we propose a novel framework exploiting the implicit 3D prior and inherent latent properties of StyleGAN2 to facilitate one-shot face re-enactment at 1024x1024 (1) with zero dependencies on explicit structural priors, (2) accommodating attribute edits, and (3) robust to diverse facial expressions and head poses of the source frame.

Tasks: Attribute

Expressive Talking Head Video Encoding in StyleGAN2 Latent-Space

2 code implementations • 28 Mar 2022 • Trevine Oorloff, Yaser Yacoob

To this end, we propose an end-to-end expressive face video encoding approach that facilitates data-efficient high-quality video re-synthesis by optimizing low-dimensional edits of a single Identity-latent.

Modeling Colors of Single Attribute Variations with Application to Food Appearance

no code implementations • 18 Dec 2015 • Yaser Yacoob

This paper considers the intra-image color-space of an object or a scene when these are subject to a dominant single-source of variation.

Tasks: Attribute, Object (+2 more)
