Search Results for author: Qihao Liu

Found 19 papers, 8 papers with code

Flowing from Words to Pixels: A Framework for Cross-Modality Evolution

no code implementations 19 Dec 2024 Qihao Liu, Xi Yin, Alan Yuille, Andrew Brown, Mannat Singh

For cross-modal tasks such as text-to-image generation, this same mapping from noise to image is learnt whilst including a conditioning mechanism in the model.

Depth Estimation Image Captioning +2

Automatic programming via large language models with population self-evolution for dynamic job shop scheduling problem

no code implementations 30 Oct 2024 Jin Huang, Xinyu Li, Liang Gao, Qihao Liu, Yue Teng

To enhance the capabilities of LLMs in automatic HDRs design, this paper proposes a novel population self-evolutionary (SeEvo) method, a general search framework inspired by the self-reflective design strategies of human experts.

Deep Reinforcement Learning Evolutionary Algorithms +3

Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data

no code implementations 18 Jul 2024 Wufei Ma, Kai Li, Zhongshi Jiang, Moustafa Meshry, Qihao Liu, Huiyu Wang, Christian Häne, Alan Yuille

In order to narrow the gap between video-text models and human performance on RCAD, we identify a key limitation of current contrastive approaches on video-text data and introduce LLM-teacher, a more effective approach to learn action semantics by leveraging knowledge obtained from a pretrained large language model.

Language Modelling Large Language Model +2

ImageNet3D: Towards General-Purpose Object-Level 3D Understanding

1 code implementation 13 Jun 2024 Wufei Ma, Guanning Zeng, Guofeng Zhang, Qihao Liu, Letian Zhang, Adam Kortylewski, Yaoyao Liu, Alan Yuille

A vision model with general-purpose object-level 3D understanding should be capable of inferring both 2D (e.g., class name and bounding box) and 3D information (e.g., 3D location and 3D viewpoint) for arbitrary rigid objects in natural images.

Image Captioning Linear Probing Object-Level 3D Awareness +2

Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models and Time-Dependent Layer Normalization

no code implementations 13 Jun 2024 Qihao Liu, Zhanpeng Zeng, Ju He, Qihang Yu, Xiaohui Shen, Liang-Chieh Chen

This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization.

Image Generation

DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data

1 code implementation CVPR 2024 Qihao Liu, Yi Zhang, Song Bai, Adam Kortylewski, Alan Yuille

Unlike recent 3D generative models that rely on clean and well-aligned 3D data, limiting them to single or few-class generation, our model is directly trained on extensive noisy and unaligned 'in-the-wild' 3D assets, mitigating the key challenge (i.e., data scarcity) in large-scale 3D generation.

3D Generation Text to 3D

Continual Adversarial Defense

1 code implementation 15 Dec 2023 Qian Wang, Yaoyao Liu, Hefei Ling, Yingwei Li, Qihao Liu, Ping Li, Jiazhong Chen, Alan Yuille, Ning Yu

In response to the rapidly evolving nature of adversarial attacks against visual classifiers on a monthly basis, numerous defenses have been proposed to generalize against as many known attacks as possible.

Adversarial Defense Continual Learning +2

Generating Images with 3D Annotations Using Diffusion Models

no code implementations 13 Jun 2023 Wufei Ma, Qihao Liu, Jiahao Wang, Angtian Wang, Xiaoding Yuan, Yi Zhang, Zihao Xiao, Guofeng Zhang, Beijia Lu, Ruxiao Duan, Yongrui Qi, Adam Kortylewski, Yaoyao Liu, Alan Yuille

With explicit 3D geometry control, we can easily change the 3D structures of the objects in the generated images and obtain ground-truth 3D annotations automatically.

3D geometry 3D Pose Estimation +1

Discovering Failure Modes of Text-guided Diffusion Models via Adversarial Search

no code implementations 1 Jun 2023 Qihao Liu, Adam Kortylewski, Yutong Bai, Song Bai, Alan Yuille

(2) We find regions in the latent space that lead to distorted images independent of the text prompt, suggesting that parts of the latent space are not well-structured.

Adversarial Attack Efficient Exploration +1

InstMove: Instance Motion for Object-centric Video Segmentation

1 code implementation CVPR 2023 Qihao Liu, Junfeng Wu, Yi Jiang, Xiang Bai, Alan Yuille, Song Bai

A common solution is to use optical flow to provide motion information, but essentially it only considers pixel-level motion, which still relies on appearance similarity and hence is often inaccurate under occlusion and fast movement.

Object Optical Flow Estimation +3

PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation

1 code implementation CVPR 2023 Qihao Liu, Adam Kortylewski, Alan Yuille

We introduce a learning-based testing method, termed PoseExaminer, that automatically diagnoses HPS algorithms by searching over the parameter space of human pose images to find the failure modes.

Multi-agent Reinforcement Learning

The Runner-up Solution for YouTube-VIS Long Video Challenge 2022

no code implementations 18 Nov 2022 Junfeng Wu, Yi Jiang, Qihao Liu, Xiang Bai, Song Bai

This technical report describes our 2nd-place solution for the ECCV 2022 YouTube-VIS Long Video Challenge.

Contrastive Learning Instance Segmentation +2

Explicit Occlusion Reasoning for Multi-person 3D Human Pose Estimation

no code implementations 29 Jul 2022 Qihao Liu, Yi Zhang, Song Bai, Alan Yuille

Inspired by the remarkable ability of humans to infer occluded joints from visible cues, we develop a method to explicitly model this process that significantly improves bottom-up multi-person human pose estimation with or without occlusions.

3D Human Pose Estimation 3D Multi-Person Pose Estimation (absolute) +2

In Defense of Online Models for Video Instance Segmentation

2 code implementations 21 Jul 2022 Junfeng Wu, Qihao Liu, Yi Jiang, Song Bai, Alan Yuille, Xiang Bai

In recent years, video instance segmentation (VIS) has been largely advanced by offline models, while online models gradually attracted less attention possibly due to their inferior performance.

Contrastive Learning Instance Segmentation +5

Nothing But Geometric Constraints: A Model-Free Method for Articulated Object Pose Estimation

no code implementations 30 Nov 2020 Qihao Liu, Weichao Qiu, Weiyao Wang, Gregory D. Hager, Alan L. Yuille

We propose an unsupervised vision-based system to estimate the joint configurations of the robot arm from a sequence of RGB or RGB-D images without knowing the model a priori, and then adapt it to the task of category-independent articulated object pose estimation.

Optical Flow Estimation Pose Estimation

PNS: Population-Guided Novelty Search for Reinforcement Learning in Hard Exploration Environments

no code implementations 26 Nov 2018 Qihao Liu, Yujia Wang, Xiaofeng Liu

To balance exploration and exploitation, the Novelty Search (NS) is employed in every chief agent to encourage policies with high novelty while maximizing per-episode performance.

Continuous Control +3
