Search Results for author: Wei Zhai

Found 43 papers, 26 papers with code

EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting

no code implementations20 Oct 2024 Bohao Liao, Wei Zhai, Zengyu Wan, Tianzhu Zhang, Yang Cao, Zheng-Jun Zha

First, we leverage the Event Generation Model (EGM) to fuse events and frames, supervising the rendered views observed by the event stream.
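
The event generation model itself is not spelled out in this snippet; as a rough illustration, a typical event-frame supervision term compares the log-intensity change between two rendered views with the events accumulated over the same interval. The sketch below is a hypothetical PyTorch version under that standard assumption, not the paper's released code.

```python
import torch
import torch.nn.functional as F

def egm_supervision_loss(render_t0, render_t1, event_map, threshold=0.2, eps=1e-6):
    """Hypothetical sketch: under the standard event generation model,
    accumulated events approximate (log I_t1 - log I_t0) / threshold.
    All tensors are (B, 1, H, W); `event_map` is the accumulated polarity map."""
    log_diff = torch.log(render_t1 + eps) - torch.log(render_t0 + eps)
    pred_events = log_diff / threshold   # events predicted from the rendered views
    return F.l1_loss(pred_events, event_map)
```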

Visual-Geometric Collaborative Guidance for Affordance Learning

1 code implementation15 Oct 2024 Hongchen Luo, Wei Zhai, Jiao Wang, Yang Cao, Zheng-Jun Zha

Perceiving potential "action possibilities" (i.e., affordance) regions of images and learning interactive functionalities of objects from human demonstration is a challenging task due to the diversity of human-object interactions.

Human-Object Interaction Detection Object

MentalGLM Series: Explainable Large Language Models for Mental Health Analysis on Chinese Social Media

1 code implementation14 Oct 2024 Wei Zhai, Nan Bai, Qing Zhao, Jianqiang Li, Fan Wang, Hongzhi Qi, Meng Jiang, Xiaoqin Wang, Bing Xiang Yang, Guanghui Fu

The proposed models were evaluated on three downstream tasks and achieved better or comparable performance compared to deep learning models, generalized LLMs, and task fine-tuned LLMs.

MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling

no code implementations14 Oct 2024 Jian Yang, Dacheng Yin, Yizhou Zhou, Fengyun Rao, Wei Zhai, Yang Cao, Zheng-Jun Zha

However, we have identified that recent methods inevitably suffer from loss of image information during the understanding task, due to either image discretization or diffusion denoising steps.

Denoising Image Generation +2

VMAD: Visual-enhanced Multimodal Large Language Model for Zero-Shot Anomaly Detection

no code implementations30 Sep 2024 Huilin Deng, Hongchen Luo, Wei Zhai, Yang Cao, Yu Kang

Furthermore, we introduce Real Industrial Anomaly Detection (RIAD), a comprehensive IAD dataset with detailed anomaly descriptions and analyses, offering a valuable resource for MLLM-based IAD development.

Anomaly Detection Language Modelling +3

Grounding 3D Scene Affordance From Egocentric Interactions

no code implementations29 Sep 2024 Cuiyu Liu, Wei Zhai, Yuhang Yang, Hongchen Luo, Sen Liang, Yang Cao, Zheng-Jun Zha

To empower the model with such abilities, we introduce a novel task: grounding 3D scene affordance from egocentric interactions, where the goal is to identify the corresponding affordance regions in a 3D scene based on an egocentric video of an interaction.

PEAR: Phrase-Based Hand-Object Interaction Anticipation

no code implementations31 Jul 2024 Zichen Zhang, Hongchen Luo, Wei Zhai, Yang Cao, Yu Kang

To address this, we propose a novel model, PEAR (Phrase-Based Hand-Object Interaction Anticipation), which jointly anticipates interaction intention and manipulation.

Object

CrysToGraph: A Comprehensive Predictive Model for Crystal Materials Properties and the Benchmark

no code implementations23 Jul 2024 Hongyi Wang, Ji Sun, Jinzhe Liang, Li Zhai, Zitian Tang, Zijian Li, Wei Zhai, Xusheng Wang, Weihao Gao, Sheng Gong

In this paper, we propose CrysToGraph (Crystals with Transformers on Graphs), a novel transformer-based geometric graph network designed specifically for unconventional crystalline systems, and UnconvBench, a comprehensive benchmark to evaluate models' predictive performance on unconventional crystal materials such as defected crystals, low-dimensional crystals, and MOFs.

EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views

no code implementations22 May 2024 Yuhang Yang, Wei Zhai, Chengfeng Wang, Chengjun Yu, Yang Cao, Zheng-Jun Zha

For egocentric HOI, in addition to perceiving semantics, e.g., "what" interaction is occurring, capturing "where" the interaction specifically manifests in 3D space is also crucial, which links perception and operation.

Human-Object Interaction Detection Object

ViViD: Video Virtual Try-on using Diffusion Models

1 code implementation20 May 2024 Zixun Fang, Wei Zhai, Aimin Su, Hongliang Song, Kai Zhu, Mao Wang, Yu Chen, Zhiheng Liu, Yang Cao, Zheng-Jun Zha

Video virtual try-on aims to transfer a clothing item onto the video of a target person.

Virtual Try-on

Bidirectional Progressive Transformer for Interaction Intention Anticipation

no code implementations9 May 2024 Zichen Zhang, Hongchen Luo, Wei Zhai, Yang Cao, Yu Kang

Building upon this relationship, we establish a novel Bidirectional prOgressive Transformer (BOT), which introduces a bidirectional progressive mechanism into the anticipation of interaction intention.

Trajectory Forecasting

SOS-1K: A Fine-grained Suicide Risk Classification Dataset for Chinese Social Media Analysis

1 code implementation19 Apr 2024 Hongzhi Qi, Hanfei Liu, Jianqiang Li, Qing Zhao, Wei Zhai, Dan Luo, Tian Yu He, Shuo Liu, Bing Xiang Yang, Guanghui Fu

Seven pre-trained models were evaluated on two tasks: high versus low suicide risk classification, and fine-grained suicide risk classification on a scale of 0 to 10.
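
As a rough illustration of the fine-grained task setup (not the released code), the sketch below fine-tunes a generic Chinese pre-trained encoder for an 11-way classification over risk levels 0-10; the backbone name and data are placeholders.

```python
# Hypothetical sketch: 11-way fine-grained suicide-risk classification (levels 0-10).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-chinese"  # placeholder backbone, not necessarily the one evaluated
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=11)

texts = ["placeholder social media post"]
labels = torch.tensor([3])                           # risk level in [0, 10]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
out = model(**batch, labels=labels)                  # cross-entropy over the 11 levels
out.loss.backward()
```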

Data Augmentation Language Modelling +1

MambaPupil: Bidirectional Selective Recurrent model for Event-based Eye tracking

no code implementations18 Apr 2024 Zhong Wang, Zengyu Wan, Han Han, Bohao Liao, Yuliang Wu, Wei Zhai, Yang Cao, Zheng-Jun Zha

Event-based eye tracking has shown great promise with the high temporal resolution and low redundancy provided by the event camera.

Data Augmentation Diversity

AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts

1 code implementation17 Apr 2024 Meng Jiang, Yi Jing Yu, Qing Zhao, Jianqiang Li, Changwei Song, Hongzhi Qi, Wei Zhai, Dan Luo, Xiaoqin Wang, Guanghui Fu, Bing Xiang Yang

Cognitive Behavioral Therapy (CBT) is an effective technique for addressing the irrational thoughts stemming from mental illnesses, but it necessitates precise identification of cognitive pathways to be successfully implemented in patient care.

Deep Learning Hallucination +3

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

3 code implementations16 Apr 2024 Bin Ren, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, Wei Zhai, Renjing Pei, Jiaming Guo, Songcen Xu, Yang Cao, ZhengJun Zha, Yan Wang, Yi Liu, Qing Wang, Gang Zhang, Liou Zhang, Shijie Zhao, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Xin Liu, Min Yan, Menghan Zhou, Yiqiang Yan, Yixuan Liu, Wensong Chan, Dehua Tang, Dong Zhou, Li Wang, Lu Tian, Barsoum Emad, Bohan Jia, Junbo Qiao, Yunshuai Zhou, Yun Zhang, Wei Li, Shaohui Lin, Shenglong Zhou, Binbin Chen, Jincheng Liao, Suiyi Zhao, Zhao Zhang, Bo wang, Yan Luo, Yanyan Wei, Feng Li, Mingshen Wang, Yawei Li, Jinhan Guan, Dehua Hu, Jiawei Yu, Qisheng Xu, Tao Sun, Long Lan, Kele Xu, Xin Lin, Jingtong Yue, Lehan Yang, Shiyi Du, Lu Qi, Chao Ren, Zeyu Han, YuHan Wang, Chaolin Chen, Haobo Li, Mingjun Zheng, Zhongbao Yang, Lianhong Song, Xingzhuo Yan, Minghan Fu, Jingyi Zhang, Baiang Li, Qi Zhu, Xiaogang Xu, Dan Guo, Chunle Guo, Jiadi Chen, Huanhuan Long, Chunjiang Duanmu, Xiaoyan Lei, Jie Liu, Weilin Jia, Weifeng Cao, Wenlong Zhang, Yanyu Mao, Ruilong Guo, Nihao Zhang, Qian Wang, Manoj Pandey, Maksym Chernozhukov, Giang Le, Shuli Cheng, Hongyuan Wang, Ziyan Wei, Qingting Tang, Liejun Wang, Yongming Li, Yanhui Guo, Hao Xu, Akram Khatami-Rizi, Ahmad Mahmoudi-Aznaveh, Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou, Amogh Joshi, Nikhil Akalwadi, Sampada Malagi, Palani Yashaswini, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi

In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking.

Image Super-Resolution

Event-based Asynchronous HDR Imaging by Temporal Incident Light Modulation

no code implementations14 Mar 2024 Yuliang Wu, Ganchao Tan, Jinze Chen, Wei Zhai, Yang Cao, Zheng-Jun Zha

In this paper, we propose AsynHDR, a Pixel-Asynchronous HDR imaging system, based on key insights into the challenges in HDR imaging and the unique event-generating mechanism of Dynamic Vision Sensors (DVS).

Intention-driven Ego-to-Exo Video Generation

no code implementations14 Mar 2024 Hongchen Luo, Kai Zhu, Wei Zhai, Yang Cao

Finally, the inferred human movement and high-level action descriptions jointly guide the generation of exocentric motion and interaction content (i.e., corresponding optical flow and occlusion maps) in the backward process of the diffusion model, ultimately warping them into the corresponding exocentric video.
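
The snippet mentions warping with predicted optical flow and occlusion maps; the sketch below shows one common way such a warping step can be written in PyTorch. It is an assumption-laden illustration, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def warp_with_flow(frame, flow, occlusion):
    """Hypothetical warping step: move frame content along a predicted optical-flow
    field and mask it with an occlusion map.
    frame: (B, C, H, W), flow: (B, 2, H, W) in pixels, occlusion: (B, 1, H, W) in [0, 1]."""
    b, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().unsqueeze(0).to(frame)  # (1, 2, H, W)
    coords = base + flow
    # Normalize sampling coordinates to [-1, 1] for grid_sample.
    gx = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0
    gy = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)                                # (B, H, W, 2)
    warped = F.grid_sample(frame, grid, align_corners=True)
    return warped * occlusion  # keep only regions marked as visible
```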

Optical Flow Estimation Stereo Matching +1

LEMON: Learning 3D Human-Object Interaction Relation from 2D Images

no code implementations CVPR 2024 Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Zheng-Jun Zha

These methods underexploit certain correlations between the interaction counterparts (human and object) and struggle to address the uncertainty in interactions.

Human-Object Interaction Detection Object +1

Likelihood-Aware Semantic Alignment for Full-Spectrum Out-of-Distribution Detection

1 code implementation4 Dec 2023 Fan Lu, Kai Zhu, Kecheng Zheng, Wei Zhai, Yang Cao

Full-spectrum out-of-distribution (F-OOD) detection aims to accurately recognize in-distribution (ID) samples while encountering semantic and covariate shifts simultaneously.

Out-of-Distribution Detection

Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation

2 code implementations22 Sep 2023 Wei Zhai, Pingyu Wu, Kai Zhu, Yang Cao, Feng Wu, Zheng-Jun Zha

In addition, our method also achieves state-of-the-art weakly supervised semantic segmentation performance on the PASCAL VOC 2012 and MS COCO 2014 datasets.

Object Weakly-Supervised Object Localization +2

Spatial-Aware Token for Weakly Supervised Object Localization

1 code implementation ICCV 2023 Pingyu Wu, Wei Zhai, Yang Cao, Jiebo Luo, Zheng-Jun Zha

Specifically, a spatial token is first introduced in the input space to aggregate representations for the localization task.

Object Weakly-Supervised Object Localization

Uncertainty-Aware Optimal Transport for Semantically Coherent Out-of-Distribution Detection

1 code implementation CVPR 2023 Fan Lu, Kai Zhu, Wei Zhai, Kecheng Zheng, Yang Cao

Semantically coherent out-of-distribution (SCOOD) detection aims to discern outliers from the intended data distribution with access to an unlabeled extra set.

Out-of-Distribution Detection

Grounding 3D Object Affordance from 2D Interactions in Images

1 code implementation ICCV 2023 Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Jiebo Luo, Zheng-Jun Zha

Comprehensive experiments on PIAD demonstrate the reliability of the proposed task and the superiority of our method.

Object

Leverage Interactive Affinity for Affordance Learning

1 code implementation CVPR 2023 Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao

Perceiving potential "action possibilities" (i. e., affordance) regions of images and learning interactive functionalities of objects from human demonstration is a challenging task due to the diversity of human-object interactions.

Human-Object Interaction Detection Object

Grounded Affordance from Exocentric View

2 code implementations28 Aug 2022 Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao

Due to the diversity of interactive affordance, the uniqueness of different individuals leads to diverse interactions, which makes it difficult to establish an explicit link between object parts and affordance labels.

Diversity Human-Object Interaction Detection +2

Location-Free Camouflage Generation Network

1 code implementation18 Mar 2022 Yangyang Li, Wei Zhai, Yang Cao, Zheng-Jun Zha

However, these methods struggle in 1) efficiently generating camouflage images using foreground and background with arbitrary structure; 2) camouflaging foreground objects to regions with multiple appearances (e.g., the junction of the vegetation and the mountains), which limit their practical application.

Learning Affordance Grounding from Exocentric Images

2 code implementations CVPR 2022 Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao

To empower an agent with such ability, this paper proposes a task of affordance grounding from the exocentric view, i.e., given exocentric human-object interaction and egocentric object images, learning the affordance knowledge of the object and transferring it to the egocentric image using only the affordance label as supervision.

Diversity Human-Object Interaction Detection +2

Phrase-Based Affordance Detection via Cyclic Bilateral Interaction

4 code implementations24 Feb 2022 Liangsheng Lu, Wei Zhai, Hongchen Luo, Yu Kang, Yang Cao

In this paper, we explore perceiving affordance from a vision-language perspective and consider the challenging phrase-based affordance detection problem, i.e., given a set of phrases describing the action purposes, all the object regions in a scene with the same affordance should be detected.

Affordance Detection

Background Activation Suppression for Weakly Supervised Object Localization

2 code implementations CVPR 2022 Pingyu Wu, Wei Zhai, Yang Cao

Existing FPM-based methods use cross-entropy (CE) to evaluate the foreground prediction map and to guide the learning of the generator.
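
For context, the FPM-plus-CE baseline this sentence refers to can be sketched as below: mask the backbone features with the predicted foreground map, classify the masked features, and backpropagate a cross-entropy loss into the map generator. This is a hypothetical illustration of the baseline, not the BAS method itself.

```python
import torch
import torch.nn.functional as F

def fpm_cross_entropy(features, fg_map, classifier, labels):
    """Hypothetical FPM + CE baseline: features (B, C, H, W),
    fg_map (B, 1, H, W) in [0, 1], classifier maps C-dim vectors to class logits."""
    masked = features * fg_map            # keep activations inside the predicted foreground
    pooled = masked.mean(dim=(2, 3))      # global average pooling
    logits = classifier(pooled)           # (B, num_classes)
    return F.cross_entropy(logits, labels)
```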

Object Weakly-Supervised Object Localization

On Exploring and Improving Robustness of Scene Text Detection Models

1 code implementation12 Oct 2021 Shilian Wu, Wei Zhai, Yongrui Li, Kewei Wang, Zengfu Wang

It is crucial to understand the robustness of text detection models with regard to extensive corruptions, since scene text detection techniques have many practical applications.

Region Proposal Scene Text Detection +2

Learning Visual Affordance Grounding from Demonstration Videos

no code implementations12 Aug 2021 Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao

For the object branch, we introduce a semantic enhancement module (SEM) to make the network focus on different parts of the object according to the action classes, and we utilize a distillation loss to align the output features of the object branch with those of the video branch, transferring the knowledge in the video branch to the object branch.
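
The distillation step described here can be illustrated with a minimal sketch, assuming a simple feature-matching loss against the detached video-branch features; the actual loss used in the paper may differ.

```python
import torch.nn.functional as F

def branch_distillation_loss(object_feat, video_feat):
    """Hypothetical distillation: pull the object-branch features toward the
    video-branch features so video knowledge transfers to the object branch."""
    return F.mse_loss(object_feat, video_feat.detach())
```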

Action Recognition Object +1

One-Shot Object Affordance Detection in the Wild

1 code implementation8 Aug 2021 Wei Zhai, Hongchen Luo, Jing Zhang, Yang Cao, DaCheng Tao

To empower robots with this ability in unseen scenarios, we first study the challenging one-shot affordance detection problem in this paper, i.e., given a support image that depicts the action purpose, all objects in a scene with the common affordance should be detected.

Action Recognition Affordance Detection +3

One-Shot Affordance Detection

2 code implementations28 Jun 2021 Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao

To empower robots with this ability in unseen scenarios, we consider the challenging one-shot affordance detection problem in this paper, i.e., given a support image that depicts the action purpose, all objects in a scene with the common affordance should be detected.

4k Affordance Detection

Deep Structure-Revealed Network for Texture Recognition

no code implementations CVPR 2020 Wei Zhai, Yang Cao, Zheng-Jun Zha, HaiYong Xie, Feng Wu

Next, these primitives are associated with a dependence learning module (DLM) to generate structural representation, in which a two-way collaborative relationship strategy is introduced to perceive the spatial dependencies among multiple primitives.

Semantic Segmentation

Self-Supervised Tuning for Few-Shot Segmentation

no code implementations12 Apr 2020 Kai Zhu, Wei Zhai, Zheng-Jun Zha, Yang Cao

Few-shot segmentation aims at assigning a category label to each image pixel with few annotated samples.

Meta-Learning Segmentation

Deep Multiple-Attribute-Perceived Network for Real-World Texture Recognition

no code implementations ICCV 2019 Wei Zhai, Yang Cao, Jing Zhang, Zheng-Jun Zha

Texture recognition is a challenging visual task as multiple perceptual attributes may be perceived from the same texture image when combined with different spatial contexts.

Attribute

One-Shot Texture Retrieval with Global Context Metric

no code implementations16 May 2019 Kai Zhu, Wei Zhai, Zheng-Jun Zha, Yang Cao

In this paper, we tackle one-shot texture retrieval: given an example of a new reference texture, detect and segment all the pixels of the same texture category within an arbitrary image.

Relation Relation Network +2
