Search Results for author: Yunhao Ge

Found 25 papers, 11 papers with code

BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation

no code implementations CVPR 2024 Yunhao Ge, Yihe Tang, Jiashu Xu, Cem Gokmen, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Martín, Miao Liu, Pengchuan Zhang, Ruohan Zhang, Li Fei-Fei, Jiajun Wu

We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and assets to generate fully customized synthetic data for systematic evaluation of computer vision models, based on the newly developed embodied AI benchmark, BEHAVIOR-1K.

Scene Understanding

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

no code implementations CVPR 2024 Yunhao Ge, Xiaohui Zeng, Jacob Samuel Huffman, Tsung-Yi Lin, Ming-Yu Liu, Yin Cui

VFC consists of three steps: 1) proposal, where image-to-text captioning models propose multiple initial captions; 2) verification, where a large language model (LLM) utilizes tools such as object detection and VQA models to fact-check proposed captions; 3) captioning, where an LLM generates the final caption by summarizing caption proposals and the fact check verification results.

Caption Generation Hallucination +7
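
The three VFC steps summarized in the entry above (proposal, verification, captioning) can be sketched as a small pipeline. This is only an illustrative outline, not the authors' implementation: the function name `fact_checked_caption`, the callable types, and the toy stand-ins for the captioning models, the tool-using verifier, and the summarizing LLM are all assumptions made for the example.

```python
from typing import Any, Callable, List, Sequence

# Hypothetical interfaces; real systems would wrap captioning models,
# detection/VQA tools, and an LLM behind these callables.
Captioner = Callable[[Any], str]
Verifier = Callable[[Any, str], str]
Summarizer = Callable[[Sequence[str], Sequence[str]], str]


def fact_checked_caption(
    image: Any,
    captioners: Sequence[Captioner],
    verify: Verifier,
    summarize: Summarizer,
) -> str:
    # Step 1 (proposal): each captioner proposes an initial caption.
    proposals: List[str] = [captioner(image) for captioner in captioners]
    # Step 2 (verification): fact-check every proposal against the image.
    checks: List[str] = [verify(image, caption) for caption in proposals]
    # Step 3 (captioning): summarize proposals plus verification results.
    return summarize(proposals, checks)


# Toy stand-ins (assumptions for illustration only).
VOCAB = {"dog", "ball", "cat", "sofa"}


def toy_verify(image: dict, caption: str) -> str:
    # Pretend "object detection": compare mentioned objects with known ones.
    mentioned = {word for word in caption.split() if word in VOCAB}
    return "supported" if mentioned <= image["objects"] else "contradicted"


def toy_summarize(proposals: Sequence[str], checks: Sequence[str]) -> str:
    # Keep the first proposal that survived verification.
    kept = [p for p, c in zip(proposals, checks) if c == "supported"]
    return kept[0] if kept else ""


toy_image = {"objects": {"dog", "ball"}}
toy_captioners = [lambda img: "a cat on a sofa", lambda img: "a dog with a ball"]
final_caption = fact_checked_caption(
    toy_image, toy_captioners, toy_verify, toy_summarize
)
```

In this toy run the first proposal mentions objects that are absent from the image, so only the second proposal reaches the final caption.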

DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models

no code implementations 21 Dec 2023 Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet, Yunhao Ge

We introduce a solution that allows a pretrained T2I diffusion model to learn a set of soft prompts, enabling the generation of novel images by sampling prompts from the learned distribution.

Text to 3D

Evaluating Pretrained models for Deployable Lifelong Learning

no code implementations 22 Nov 2023 Kiran Lekkala, Eshan Bhargava, Yunhao Ge, Laurent Itti

We create a novel benchmark for evaluating a Deployable Lifelong Learning system for Visual Reinforcement Learning (RL) that is pretrained on a curated dataset, and propose a novel Scalable Lifelong Learning system capable of retaining knowledge from the previously learnt RL tasks.

Atari Games Few-Shot Class-Incremental Learning +2

CLR: Channel-wise Lightweight Reprogramming for Continual Learning

1 code implementation ICCV 2023 Yunhao Ge, Yuecheng Li, Shuo Ni, Jiaping Zhao, Ming-Hsuan Yang, Laurent Itti

Reprogramming parameters are task-specific and exclusive to each task, which makes our method immune to catastrophic forgetting.

Continual Learning Image Classification

Robustness Analysis on Foundational Segmentation Models

no code implementations 15 Jun 2023 Madeline Chantry Schiappa, Shehreen Azad, Sachidanand VS, Yunhao Ge, Ondrej Miksik, Yogesh S. Rawat, Vibhav Vineet

In this work, we perform a robustness analysis of Visual Foundation Models (VFMs) for segmentation tasks and focus on robustness against real-world distribution shift inspired perturbations.

object-detection Object Detection +1

Building One-class Detector for Anything: Open-vocabulary Zero-shot OOD Detection Using Text-image Models

no code implementations 26 May 2023 Yunhao Ge, Jie Ren, Jiaping Zhao, KaiFeng Chen, Andrew Gallagher, Laurent Itti, Balaji Lakshminarayanan

Despite considerable effort, the problem remains significantly challenging in deep learning models due to their propensity to output over-confident predictions for OOD inputs.

Out of Distribution (OOD) Detection

Lightweight Learner for Shared Knowledge Lifelong Learning

1 code implementation 24 May 2023 Yunhao Ge, Yuecheng Li, Di wu, Ao Xu, Adam M. Jones, Amanda Sofie Rios, Iordanis Fostiropoulos, Shixian Wen, Po-Hsuan Huang, Zachary William Murdock, Gozde Sahin, Shuo Ni, Kiran Lekkala, Sumedh Anand Sontakke, Laurent Itti

We propose a new Shared Knowledge Lifelong Learning (SKILL) challenge, which deploys a decentralized population of LL agents that each sequentially learn different tasks, with all agents operating independently and in parallel.

Image Classification

EM-Paste: EM-guided Cut-Paste with DALL-E Augmentation for Image-level Weakly Supervised Instance Segmentation

1 code implementation 15 Dec 2022 Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Laurent Itti, Vibhav Vineet

Finally, the third component creates a large-scale pseudo-labeled instance segmentation training dataset by compositing the foreground object masks onto the original and generated background images.

Instance Segmentation Object +4
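
The compositing step mentioned in the entry above (pasting masked foreground objects onto background images to obtain pseudo-labeled training data) can be illustrated with a generic cut-paste routine. This is a minimal sketch of mask-based compositing in general, not the EM-Paste code: the function name `paste_foreground`, the nested-list image representation, and the paste position arguments are assumptions for the example.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values (assumption)


def paste_foreground(
    background: Image,
    foreground: Image,
    mask: Image,
    top: int,
    left: int,
) -> Tuple[Image, Image]:
    """Paste masked foreground pixels onto a copy of the background.

    Returns the composited image and a same-sized binary mask that can
    serve as the pseudo instance label for the pasted object.
    """
    height, width = len(background), len(background[0])
    composite = [row[:] for row in background]  # leave the input untouched
    label = [[0] * width for _ in range(height)]
    for i, (fg_row, mask_row) in enumerate(zip(foreground, mask)):
        for j, (pixel, keep) in enumerate(zip(fg_row, mask_row)):
            y, x = top + i, left + j
            # Copy only masked pixels that fall inside the background.
            if keep and 0 <= y < height and 0 <= x < width:
                composite[y][x] = pixel
                label[y][x] = 1
    return composite, label


toy_background = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
toy_foreground = [[9, 9], [9, 9]]
toy_mask = [[1, 0], [1, 1]]
toy_composite, toy_label = paste_foreground(
    toy_background, toy_foreground, toy_mask, top=1, left=1
)
```

Returning the label mask alongside the composite is what makes the pasted image usable as a training example without manual annotation.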

Neural-Sim: Learning to Generate Training Data with NeRF

1 code implementation 22 Jul 2022 Yunhao Ge, Harkirat Behl, Jiashu Xu, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, Vibhav Vineet

However, existing approaches either require human experts to manually tune each scene property or use automatic methods that provide little to no control; this requires rendering large amounts of random data variations, which is slow and is often suboptimal for the target domain.

Object Detection

Contributions of Shape, Texture, and Color in Visual Recognition

1 code implementation 19 Jul 2022 Yunhao Ge, Yao Xiao, Zhi Xu, Xingrui Wang, Laurent Itti

We use human experiments to confirm that both HVE and humans predominantly use some specific features to support the classification of specific classes (e.g., texture is the dominant feature to distinguish a zebra from other quadrupeds, both for humans and HVE).

Attribute General Classification +2

DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection

no code implementations 20 Jun 2022 Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

For foreground object mask generation, we use a simple textual template with object class name as input to DALL-E to generate a diverse set of foreground images.

Image Captioning Image Generation +4

Invariant Structure Learning for Better Generalization and Causal Explainability

no code implementations 13 Jun 2022 Yunhao Ge, Sercan Ö. Arik, Jinsung Yoon, Ao Xu, Laurent Itti, Tomas Pfister

ISL splits the data into different environments, and learns a structure that is invariant to the target across different environments by imposing a consistency constraint.

Self-Supervised Learning

Encouraging Disentangled and Convex Representation with Controllable Interpolation Regularization

no code implementations 6 Dec 2021 Yunhao Ge, Zhi Xu, Yao Xiao, Gan Xin, Yunkui Pang, Laurent Itti

(2) They lack convexity constraints, which are important for meaningfully manipulating specific attributes for downstream tasks.

Data Augmentation Disentanglement +2

Towards Generic Interface for Human-Neural Network Knowledge Exchange

no code implementations 29 Sep 2021 Yunhao Ge, Yao Xiao, Zhi Xu, Linwei Li, Ziyan Wu, Laurent Itti

Taking image classification as an example, HNI visualizes the reasoning logic of a NN with class-specific Structural Concept Graphs (c-SCG), which are human-interpretable.

Image Classification Zero-Shot Learning

Generative Auto-Encoder: Non-adversarial Controllable Synthesis with Disentangled Exploration

no code implementations 1 Jan 2021 Yunhao Ge, Gan Xin, Zhi Xu, Yao Xiao, Yunkui Pang, Yining HE, Laurent Itti

DEAE can become a generative model and synthesize semantically controllable samples by interpolating latent codes, and it can even synthesize novel attribute values never shown in the original dataset.

Attribute Data Augmentation +3

Beneficial Perturbation Network for designing general adaptive artificial intelligence systems

no code implementations 27 Sep 2020 Shixian Wen, Amanda Rios, Yunhao Ge, Laurent Itti

Continual learning of multiple tasks in artificial neural networks using gradient descent leads to catastrophic forgetting, whereby a previously learned mapping of an old task is erased when learning new mappings for new tasks.

Continual Learning

Zero-shot Synthesis with Group-Supervised Learning

1 code implementation ICLR 2021 Yunhao Ge, Sami Abu-El-Haija, Gan Xin, Laurent Itti

Visual cognition of primates is superior to that of artificial neural networks in its ability to 'envision' a visual object, even a newly-introduced one, in different attributes including pose, position, color, texture, etc.

Pose Augmentation: Class-agnostic Object Pose Transformation for Object Recognition

1 code implementation ECCV 2020 Yunhao Ge, Jiaping Zhao, Laurent Itti

After training on unbalanced discrete poses (5 classes with 6 poses per object instance, plus 5 classes with only 2 poses), we show that OPT-Net can synthesize balanced continuous new poses along yaw and pitch axes with high quality.

Object Object Recognition

Synthesis and Inpainting-Based MR-CT Registration for Image-Guided Thermal Ablation of Liver Tumors

no code implementations 30 Jul 2019 Dongming Wei, Sahar Ahmad, Jiayu Huo, Wen Peng, Yunhao Ge, Zhong Xue, Pew-Thian Yap, Wentao Li, Dinggang Shen, Qian Wang

Then, an unsupervised registration network is used to efficiently align the pre-procedural CT (pCT) with the inpainted iCT (inpCT) image.

Image Registration
