Search Results for author: Liang-Yan Gui

Found 14 papers, 5 papers with code

SOHES: Self-supervised Open-world Hierarchical Entity Segmentation

no code implementations18 Apr 2024 Shengcao Cao, Jiuxiang Gu, Jason Kuen, Hao Tan, Ruiyi Zhang, Handong Zhao, Ani Nenkova, Liang-Yan Gui, Tong Sun, Yu-Xiong Wang

Using raw images as the sole training data, our method achieves unprecedented performance in self-supervised open-world segmentation, marking a significant milestone towards high-quality open-world entity segmentation in the absence of human-annotated masks.

Segmentation

InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction

no code implementations28 Mar 2024 Sirui Xu, Ziyin Wang, Yu-Xiong Wang, Liang-Yan Gui

However, extending such success to 3D dynamic human-object interaction (HOI) generation faces notable challenges, primarily due to the lack of large-scale interaction data and comprehensive descriptions that align with these interactions.

Human-Object Interaction Detection, Language Modelling +2

HASSOD: Hierarchical Adaptive Self-Supervised Object Detection

1 code implementation NeurIPS 2023 Shengcao Cao, Dhiraj Joshi, Liang-Yan Gui, Yu-Xiong Wang

The human visual perception system demonstrates exceptional capabilities in learning without explicit supervision and understanding the part-to-whole composition of objects.

Object, object-detection +2

Aligning Large Multimodal Models with Factually Augmented RLHF

no code implementations25 Sep 2023 Zhiqing Sun, Sheng Shen, Shengcao Cao, Haotian Liu, Chunyuan Li, Yikang Shen, Chuang Gan, Liang-Yan Gui, Yu-Xiong Wang, Yiming Yang, Kurt Keutzer, Trevor Darrell

Large Multimodal Models (LMM) are built across modalities and the misalignment between two modalities can result in "hallucination", generating textual outputs that are not grounded by the multimodal information in context.

Hallucination, Image Captioning +1

Learning Lightweight Object Detectors via Multi-Teacher Progressive Distillation

no code implementations17 Aug 2023 Shengcao Cao, Mengtian Li, James Hays, Deva Ramanan, Yu-Xiong Wang, Liang-Yan Gui

To distill knowledge from a highly accurate but complex teacher model, we construct a sequence of teachers to help the student gradually adapt.

Edge-computing, Instance Segmentation +5
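
The progressive distillation idea described above lends itself to a simple training loop: the student is distilled against each teacher in a sequence ordered from weakest to strongest, so it adapts gradually instead of matching the strongest teacher directly. The sketch below is a hypothetical PyTorch illustration assuming classification-style logits and a KL distillation loss; the function names, temperature, and schedule are illustrative assumptions, not the paper's actual recipe (which targets detectors).

```python
# Hypothetical sketch of multi-teacher progressive distillation (illustrative only).
import torch
import torch.nn.functional as F

def distill_step(student, teacher, images, optimizer, temperature=2.0):
    """One distillation step against a single frozen teacher."""
    with torch.no_grad():
        teacher_logits = teacher(images)
    student_logits = student(images)
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def progressive_distill(student, teachers, loader, optimizer, epochs_per_teacher=1):
    """Distill from each teacher in turn, ordered from weakest to strongest."""
    for teacher in teachers:  # e.g. [small_teacher, medium_teacher, large_teacher]
        teacher.eval()
        for _ in range(epochs_per_teacher):
            for images, _ in loader:
                distill_step(student, teacher, images, optimizer)
```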

Stochastic Multi-Person 3D Motion Forecasting

1 code implementation8 Jun 2023 Sirui Xu, Yu-Xiong Wang, Liang-Yan Gui

This paper addresses real-world complexities ignored in prior work on human motion forecasting, emphasizing the social properties of multi-person motion, the diversity of motion and social interactions, and the complexity of articulated motion.

Motion Forecasting, Stochastic Human Motion Prediction

DualCross: Cross-Modality Cross-Domain Adaptation for Monocular BEV Perception

no code implementations5 May 2023 Yunze Man, Liang-Yan Gui, Yu-Xiong Wang

In this work, we propose DualCross, a cross-modality cross-domain adaptation framework to facilitate the learning of a more robust monocular bird's-eye-view (BEV) perception model, which transfers the point cloud knowledge from a LiDAR sensor in one domain during the training phase to the camera-only testing scenario in a different domain.

Domain Adaptation
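
As a rough illustration of the cross-modality transfer described above, the snippet below sketches a feature-mimicking loss in which the camera-only BEV features are pulled toward BEV features computed from LiDAR during training, so that the camera branch can run alone at test time in the target domain. The MSE objective, weighting, and names are assumptions for illustration, not DualCross's actual losses.

```python
# Hypothetical cross-modality feature-mimicking loss (illustrative only).
import torch.nn.functional as F

def cross_modality_loss(camera_bev_feat, lidar_bev_feat, det_loss, mimic_weight=1.0):
    """Detection loss on camera predictions plus mimicking toward (detached) LiDAR BEV features."""
    mimic_loss = F.mse_loss(camera_bev_feat, lidar_bev_feat.detach())
    return det_loss + mimic_weight * mimic_loss
```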

Contrastive Mean Teacher for Domain Adaptive Object Detectors

1 code implementation CVPR 2023 Shengcao Cao, Dhiraj Joshi, Liang-Yan Gui, Yu-Xiong Wang

Object detectors often suffer from the domain gap between training (source domain) and real-world applications (target domain).

Contrastive Learning, Object +4
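
The "mean teacher" in the title refers to a teacher detector maintained as an exponential moving average (EMA) of the student, which then supplies targets on the unlabeled target domain. Below is a minimal sketch of that EMA update only; the contrastive objective that CMT adds on top is omitted, and the momentum value and parameter-only update (ignoring buffers) are simplifying assumptions.

```python
# Generic mean-teacher EMA update (simplified sketch, not the paper's exact schedule).
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    # teacher <- momentum * teacher + (1 - momentum) * student, parameter-wise
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)
```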

Diverse Human Motion Prediction Guided by Multi-Level Spatial-Temporal Anchors

1 code implementation9 Feb 2023 Sirui Xu, Yu-Xiong Wang, Liang-Yan Gui

Predicting diverse human motions given a sequence of historical poses has received increasing attention.

 Ranked #1 on Human Pose Forecasting on Human3.6M (MMADE metric)

Human Pose Forecasting, motion prediction +1

BEV-Guided Multi-Modality Fusion for Driving Perception

no code implementations CVPR 2023 Yunze Man, Liang-Yan Gui, Yu-Xiong Wang

We design a BEV-guided multi-sensor attention block to take queries from BEV embeddings and learn the BEV representation from sensor-specific features.

Autonomous Driving, Representation Learning
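
To make the description above concrete, here is a hypothetical single-layer version of a BEV-guided multi-sensor attention block: learnable BEV queries cross-attend to flattened, sensor-specific feature tokens to produce a BEV representation. The grid size, feature dimension, and single-layer design are assumptions; the actual model stacks more machinery around this idea.

```python
# Hypothetical BEV-guided cross-attention block (illustrative sketch).
import torch
import torch.nn as nn

class BEVGuidedAttention(nn.Module):
    def __init__(self, num_bev_queries=50 * 50, dim=256, num_heads=8):
        super().__init__()
        self.bev_queries = nn.Embedding(num_bev_queries, dim)  # one learnable query per BEV cell
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, sensor_tokens):
        """sensor_tokens: (batch, num_tokens, dim) features from all sensors, flattened and concatenated."""
        queries = self.bev_queries.weight.unsqueeze(0).expand(sensor_tokens.size(0), -1, -1)
        bev, _ = self.cross_attn(query=queries, key=sensor_tokens, value=sensor_tokens)
        return bev  # (batch, num_bev_queries, dim) BEV representation
```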

Adversarial Geometry-Aware Human Motion Prediction

no code implementations ECCV 2018 Liang-Yan Gui, Yu-Xiong Wang, Xiaodan Liang, Jose M. F. Moura

We explore an approach to forecasting human motion over the next few milliseconds, given an input 3D skeleton sequence, based on a recurrent encoder-decoder framework.

Human motion prediction, motion prediction
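
The recurrent encoder-decoder framework mentioned above can be sketched roughly as follows: a GRU encodes the observed skeleton sequence, and a GRU cell decodes future poses autoregressively, predicting residual offsets from the last pose. The dimensions and residual design are assumptions; the paper's adversarial and geometry-aware losses are not shown.

```python
# Minimal recurrent encoder-decoder for short-horizon motion forecasting (sketch only).
import torch
import torch.nn as nn

class MotionEncoderDecoder(nn.Module):
    def __init__(self, joint_dim=66, hidden=256):
        super().__init__()
        self.encoder = nn.GRU(joint_dim, hidden, batch_first=True)
        self.decoder_cell = nn.GRUCell(joint_dim, hidden)
        self.out = nn.Linear(hidden, joint_dim)

    def forward(self, past_poses, horizon):
        """past_poses: (batch, T_obs, joint_dim); returns (batch, horizon, joint_dim)."""
        _, h = self.encoder(past_poses)   # summarize the observed sequence
        h = h.squeeze(0)
        pose = past_poses[:, -1]          # start from the last observed pose
        preds = []
        for _ in range(horizon):
            h = self.decoder_cell(pose, h)
            pose = pose + self.out(h)     # residual: predict pose deltas
            preds.append(pose)
        return torch.stack(preds, dim=1)
```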

Few-Shot Human Motion Prediction via Meta-Learning

no code implementations ECCV 2018 Liang-Yan Gui, Yu-Xiong Wang, Deva Ramanan, Jose M. F. Moura

This paper addresses the problem of few-shot human motion prediction, in the spirit of the recent progress on few-shot learning and meta-learning.

Few-Shot Learning, Human motion prediction +1
