1 code implementation • Findings (ACL) 2022 • Dawei Li, Yanran Li, Jiayi Zhang, Ke Li, Chen Wei, Jianwei Cui, Bin Wang
Existing commonsense knowledge bases often organize tuples in an isolated manner, which is deficient for commonsense conversational models to plan the next steps.
no code implementations • CCL 2021 • Chen Wei, Chen Xiaoying, Xiong Shengwu
In the paper we propose a global entity alignment model with gated latent space neighborhood aggregation (LatsEA) to address this challenge.
no code implementations • 16 Jun 2025 • Runtao Liu, Jiahao Zhan, Yingqing He, Chen Wei, Alan Yuille, Qifeng Chen
An effective reward model plays a pivotal role in reinforcement learning for post-training enhancement of visual generative models.
no code implementations • 9 Jun 2025 • Yifei Li, Hanane Nour Moussa, Ziru Chen, Shijie Chen, Botao Yu, Mingyi Xue, Benjamin Burns, Tzu-Yao Chiu, Vishal Dey, Zitong Lu, Chen Wei, Qianheng Zhang, Tianyu Zhang, Song Gao, Xuhui Huang, Xia Ning, Nesreen K. Ahmed, Ali Payani, Huan Sun
Using our pipeline, we construct AutoSDT-5K, a dataset of 5, 404 coding tasks for data-driven discovery that covers four scientific disciplines and 756 unique Python packages.
1 code implementation • 9 Jun 2025 • Yunfei Xie, Yinsong Ma, Shiyi Lan, Alan Yuille, Junfei Xiao, Chen Wei
Developing generalizable reasoning capabilities in multimodal large language models (MLLMs) remains challenging.
no code implementations • 6 May 2025 • Chen Wei, Chi Zhang, Jiachen Zou, Haotian Deng, Dietmar Heinke, Quanying Liu
Human decision-making in cognitive tasks and daily life exhibits considerable variability, shaped by factors such as task difficulty, individual preferences, and personal experiences.
no code implementations • 30 Apr 2025 • Zhu Jiawei, Chen Wei
Through an in-depth analysis of the evaluation results, this report reveals the strengths and weaknesses of LLMs in handling high school science questions and discusses their implications for educational practice.
1 code implementation • 22 Apr 2025 • Zebin Yao, Lei Ren, Huixing Jiang, Chen Wei, Xiaojie Wang, Ruifan Li, Fangxiang Feng
Subject-driven image generation aims to synthesize novel scenes that faithfully preserve subject identity from reference images while adhering to textual guidance, yet existing methods struggle with a critical trade-off between fidelity and efficiency.
1 code implementation • 17 Apr 2025 • Daniel Bolya, Po-Yao Huang, Peize Sun, Jang Hyun Cho, Andrea Madotto, Chen Wei, Tengyu Ma, Jiale Zhi, Jathushan Rajasegaran, Hanoona Rasheed, Junke Wang, Marco Monteiro, Hu Xu, Shiyu Dong, Nikhila Ravi, Daniel Li, Piotr Dollár, Christoph Feichtenhofer
Together with the core contrastive checkpoint, our PE family of models achieves state-of-the-art performance on a wide variety of tasks, including zero-shot image and video classification and retrieval; document, image, and video Q&A; and spatial tasks such as detection, depth estimation, and tracking.
Ranked #1 on
Object Detection
on COCO minival
(using extra training data)
3 code implementations • 16 Jan 2025 • Wanqi Yin, Zhongang Cai, Ruisi Wang, Ailing Zeng, Chen Wei, Qingping Sun, Haiyi Mei, Yanjun Wang, Hui En Pang, Mingyuan Zhang, Lei Zhang, Chen Change Loy, Atsushi Yamashita, Lei Yang, Ziwei Liu
To exclude the influence of algorithmic design, we base our experiments on two minimalist architectures: SMPLer-X, which consists of an intermediate step for hand and face localization, and SMPLest-X, an even simpler version that reduces the network to its bare essentials and highlights significant advances in the capture of articulated hands.
no code implementations • CVPR 2025 • Runtao Liu, HaoYu Wu, Zheng Ziqiang, Chen Wei, Yingqing He, Renjie Pi, Qifeng Chen
Unlike previous image alignment methods that focus solely on either (i) visual quality or (ii) semantic alignment between text and videos, we comprehensively consider both dimensions and construct a preference score accordingly, which we term the OmniScore.
no code implementations • 12 Dec 2024 • Taiming Lu, Tianmin Shu, Junfei Xiao, Luoxin Ye, Jiahao Wang, Cheng Peng, Chen Wei, Daniel Khashabi, Rama Chellappa, Alan Yuille, Jieneng Chen
In this work, we take a step toward this goal by introducing GenEx, a system capable of planning complex embodied world exploration, guided by its generative imagination that forms priors (expectations) about the surrounding environments.
1 code implementation • 7 Oct 2024 • Ziru Chen, Shijie Chen, Yuting Ning, Qianheng Zhang, Boshi Wang, Botao Yu, Yifei Li, Zeyi Liao, Chen Wei, Zitong Lu, Vishal Dey, Mingyi Xue, Frazier N. Baker, Benjamin Burns, Daniel Adu-Ampratwum, Xuhui Huang, Xia Ning, Song Gao, Yu Su, Huan Sun
To this end, we present ScienceAgentBench, a new benchmark for evaluating language agents for data-driven scientific discovery.
1 code implementation • 20 Jul 2024 • Chen Wei, Jiachen Zou, Dietmar Heinke, Quanying Liu
Similarity judgment tasks are effective for exploring these concepts.
no code implementations • 2 Jul 2024 • Chaofan Luo, Donglin Di, Xun Yang, Yongjia Ma, Zhou Xue, Chen Wei, Yebin Liu
To validate the effectiveness of our method, we analyze 2D examples to demonstrate the improved consistency with the VCAC module.
no code implementations • 24 May 2024 • Sucheng Ren, Hongru Zhu, Chen Wei, Yijiang Li, Alan Yuille, Cihang Xie
This paper presents a new self-supervised video representation learning framework, ARVideo, which autoregressively predicts the next video token in a tailored sequence order.
1 code implementation • 25 Apr 2024 • Chen Wei, Jiachen Zou, Dietmar Heinke, Quanying Liu
Generating visual stimuli with controlling concepts is the key.
1 code implementation • CVPR 2024 • Qingping Sun, Yanjun Wang, Ailing Zeng, Wanqi Yin, Chen Wei, Wenjia Wang, Haiyi Mei, Chi Sing Leung, Ziwei Liu, Lei Yang, Zhongang Cai
Expressive human pose and shape estimation (a. k. a.
Ranked #1 on
3D Multi-Person Mesh Recovery
on AGORA
1 code implementation • 19 Mar 2024 • Wanqi Yin, Zhongang Cai, Ruisi Wang, Fanzhou Wang, Chen Wei, Haiyi Mei, Weiye Xiao, Zhitao Yang, Qingping Sun, Atsushi Yamashita, Ziwei Liu, Lei Yang
In this study, we aim to recover expressive parametric human models (i. e., SMPL-X) and corresponding camera poses jointly, by leveraging the synergy between three critical players: the world, the human, and the camera.
1 code implementation • CVPR 2024 • Qi Chen, Xiaoxi Chen, Haorui Song, Zhiwei Xiong, Alan Yuille, Chen Wei, Zongwei Zhou
Tumor synthesis enables the creation of artificial tumors in medical images, facilitating the training of AI models for tumor detection and segmentation.
1 code implementation • 15 Feb 2024 • Weixiang Zhao, Zhuojun Li, Shilong Wang, Yang Wang, Yulin Hu, Yanyan Zhao, Chen Wei, Bing Qin
Emotional Intelligence (EI), consisting of emotion perception, emotion cognition and emotion expression, plays the critical roles in improving user interaction experience for the current large language model (LLM) based conversational general AI assistants.
no code implementations • 4 Feb 2024 • Youzhi Qu, Chen Wei, Penghui Du, Wenxin Che, Chi Zhang, Wanli Ouyang, Yatao Bian, Feiyang Xu, Bin Hu, Kai Du, Haiyan Wu, Jia Liu, Quanying Liu
During the evolution of large models, performance evaluation is necessarily performed to assess their capabilities and ensure safety before practical application.
no code implementations • 31 Jan 2024 • Song Wang, Chen Wei, Kexin Lou, Dongfeng Gu, Quanying Liu
Here, we present a novel method which utilizes the Brain Geometric-informed Basis Functions (GBFs) as priors to enhance EEG/MEG source imaging.
no code implementations • 18 Dec 2023 • Bingchen Zhao, Haoqin Tu, Chen Wei, Jieru Mei, Cihang Xie
This paper introduces an efficient strategy to transform Large Language Models (LLMs) into Multi-Modal Large Language Models (MLLMs).
no code implementations • CVPR 2024 • Zhongang Cai, Jianping Jiang, Zhongfei Qing, Xinying Guo, Mingyuan Zhang, Zhengyu Lin, Haiyi Mei, Chen Wei, Ruisi Wang, Wanqi Yin, Xiangyu Fan, Han Du, Liang Pan, Peng Gao, Zhitao Yang, Yang Gao, Jiaqi Li, Tianxiang Ren, Yukun Wei, Xiaogang Wang, Chen Change Loy, Lei Yang, Ziwei Liu
In this work, we present Digital Life Project, a framework utilizing language as the universal medium to build autonomous 3D characters, who are capable of engaging in social interactions and expressing with articulated body motions, thereby simulating life in a digital environment.
Ranked #3 on
Motion Synthesis
on InterHuman
no code implementations • 27 Nov 2023 • Jiang Liu, Chen Wei, Yuxiang Guo, Heng Yu, Alan Yuille, Soheil Feizi, Chun Pong Lau, Rama Chellappa
We propose Instruct2Attack (I2A), a language-guided semantic attack that generates semantically meaningful perturbations according to free-form language instructions.
1 code implementation • CVPR 2024 • Chen Wei, Chenxi Liu, Siyuan Qiao, Zhishuai Zhang, Alan Yuille, Jiahui Yu
We demonstrate text as a strong cross-modal interface.
2 code implementations • NeurIPS 2023 • Zhongang Cai, Wanqi Yin, Ailing Zeng, Chen Wei, Qingping Sun, Yanjun Wang, Hui En Pang, Haiyi Mei, Mingyuan Zhang, Lei Zhang, Chen Change Loy, Lei Yang, Ziwei Liu
1) For the data scaling, we perform a systematic investigation on 32 EHPS datasets, including a wide range of scenarios that a model trained on any single dataset cannot handle.
Ranked #2 on
3D Human Pose Estimation
on UBody
1 code implementation • 13 Sep 2023 • Haoqin Tu, Bingchen Zhao, Chen Wei, Cihang Xie
Multi-modal large language models (MLLMs) are trained based on large language models (LLM), with an enhanced capability to comprehend multi-modal inputs and generate textual responses.
no code implementations • 28 Aug 2023 • Zhongang Cai, Liang Pan, Chen Wei, Wanqi Yin, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu
To tackle these challenges, we propose a principled framework, PointHPS, for accurate 3D HPS from point clouds captured in real-world settings, which iteratively refines point features through a cascaded architecture.
4 code implementations • 1 Jun 2023 • Chaitanya Ryali, Yuan-Ting Hu, Daniel Bolya, Chen Wei, Haoqi Fan, Po-Yao Huang, Vaibhav Aggarwal, Arkabandhu Chowdhury, Omid Poursaeed, Judy Hoffman, Jitendra Malik, Yanghao Li, Christoph Feichtenhofer
Modern hierarchical vision transformers have added several vision-specific components in the pursuit of supervised classification performance.
Ranked #1 on
Image Classification
on iNaturalist 2019
(using extra training data)
1 code implementation • ICCV 2023 • Chen Wei, Karttikeya Mangalam, Po-Yao Huang, Yanghao Li, Haoqi Fan, Hu Xu, Huiyu Wang, Cihang Xie, Alan Yuille, Christoph Feichtenhofer
There has been a longstanding belief that generation can facilitate a true understanding of visual data.
1 code implementation • ICCV 2023 • Zhitao Yang, Zhongang Cai, Haiyi Mei, Shuai Liu, Zhaoxi Chen, Weiye Xiao, Yukun Wei, Zhongfei Qing, Chen Wei, Bo Dai, Wayne Wu, Chen Qian, Dahua Lin, Ziwei Liu, Lei Yang
Synthetic data has emerged as a promising source for 3D human research as it offers low-cost access to large-scale human datasets.
no code implementations • 17 Mar 2023 • Xiuying Chen, Mingzhe Li, Jiayi Zhang, Xiaoqiang Xia, Chen Wei, Jianwei Cui, Xin Gao, Xiangliang Zhang, Rui Yan
As it is cumbersome and expensive to acquire a huge amount of data for training neural dialog models, data augmentation is proposed to effectively utilize existing training samples.
1 code implementation • 20 Dec 2022 • Junyang Wu, Xianhang Li, Chen Wei, Huiyu Wang, Alan Yuille, Yuyin Zhou, Cihang Xie
This paper presents a simple and effective visual prompting method for adapting pre-trained models to downstream recognition tasks.
no code implementations • ICCV 2023 • Yuanze Lin, Chen Wei, Huiyu Wang, Alan Yuille, Cihang Xie
Coupling all these designs allows our method to enjoy both competitive performances on text-to-video retrieval and video question answering tasks, and much less pre-training costs by 1. 9X or more.
1 code implementation • CVPR 2023 • Yutong Bai, Zeyu Wang, Junfei Xiao, Chen Wei, Huiyu Wang, Alan Yuille, Yuyin Zhou, Cihang Xie
For example, by distilling the knowledge from an MAE pre-trained ViT-L into a ViT-B, our method achieves 84. 0% ImageNet top-1 accuracy, outperforming the baseline of directly distilling a fine-tuned ViT-L by 1. 2%.
1 code implementation • 23 Jul 2022 • Chen Wei, Shenghan Ren, Kaitai Guo, Haihong Hu, Jimin Liang
Most of the existing Transformer-based networks for medical image segmentation are U-Net-like architecture that contains an encoder that utilizes a sequence of Transformer blocks to convert the input medical image from high-resolution representation into low-resolution feature maps and a decoder that gradually recovers the high-resolution representation from low-resolution feature maps.
1 code implementation • 3 May 2022 • Xianhang Li, Huiyu Wang, Chen Wei, Jieru Mei, Alan Yuille, Yuyin Zhou, Cihang Xie
Inspired by this observation, we hypothesize that the key to effectively leveraging image pre-training lies in the decomposition of learning spatial and temporal features, and revisiting image pre-training as the appearance prior to initializing 3D kernels.
1 code implementation • 6 Apr 2022 • Dawei Li, Yanran Li, Jiayi Zhang, Ke Li, Chen Wei, Jianwei Cui, Bin Wang
Existing commonsense knowledge bases often organize tuples in an isolated manner, which is deficient for commonsense conversational models to plan the next steps.
1 code implementation • 22 Mar 2022 • Feng Wang, Huiyu Wang, Chen Wei, Alan Yuille, Wei Shen
Recent advances in self-supervised contrastive learning yield good image-level representation, which favors classification tasks but usually neglects pixel-level detailed information, leading to unsatisfactory transfer performance to dense prediction tasks such as semantic segmentation.
6 code implementations • CVPR 2022 • Chen Wei, Haoqi Fan, Saining Xie, Chao-yuan Wu, Alan Yuille, Christoph Feichtenhofer
We present Masked Feature Prediction (MaskFeat) for self-supervised pre-training of video models.
Ranked #8 on
Action Recognition
on AVA v2.2
(using extra training data)
1 code implementation • 2 Dec 2021 • Junjie Yu, Chenyi Li, Kexin Lou, Chen Wei, Quanying Liu
DeepSeparator employs an encoder to extract and amplify the features in the raw EEG, a module called decomposer to extract the trend, detect and suppress artifact and a decoder to reconstruct the denoised signal.
no code implementations • 16 Nov 2021 • Yuxuan Liang, Chuang Niu, Chen Wei, Shenghan Ren, Wenxiang Cong, Ge Wang
The phase function is a key element of a light propagation model for Monte Carlo (MC) simulation, which is usually fitted with an analytic function with associated parameters.
2 code implementations • 15 Nov 2021 • Jinghao Zhou, Chen Wei, Huiyu Wang, Wei Shen, Cihang Xie, Alan Yuille, Tao Kong
We present a self-supervised framework iBOT that can perform masked prediction with an online tokenizer.
Ranked #4 on
Unsupervised Image Classification
on ImageNet
no code implementations • 14 Oct 2021 • Zhongang Cai, Mingyuan Zhang, Jiawei Ren, Chen Wei, Daxuan Ren, Zhengyu Lin, Haiyu Zhao, Lei Yang, Chen Change Loy, Ziwei Liu
Specifically, we contribute GTA-Human, a large-scale 3D human dataset generated with the GTA-V game engine, featuring a highly diverse set of subjects, actions, and scenarios.
no code implementations • ICLR 2022 • Jinghao Zhou, Chen Wei, Huiyu Wang, Wei Shen, Cihang Xie, Alan Yuille, Tao Kong
The success of language Transformers is primarily attributed to the pretext task of masked language modeling (MLM), where texts are first tokenized into semantically meaningful pieces.
no code implementations • 11 May 2021 • Yanran Li, Ke Li, Hongke Ning, Xiaoqiang Xia, Yalong Guo, Chen Wei, Jianwei Cui, Bin Wang
Existing emotion-aware conversational models usually focus on controlling the response contents to align with a specific emotion class, whereas empathy is the ability to understand and concern the feelings and experience of others.
1 code implementation • CVPR 2021 • Chen Wei, Kihyuk Sohn, Clayton Mellina, Alan Yuille, Fan Yang
Semi-supervised learning on class-imbalanced data, although a realistic problem, has been under studied.
no code implementations • 18 Feb 2021 • Chen Wei, Kexin Lou, Zhengyang Wang, Mingqi Zhao, Dante Mantini, Quanying Liu
EEG source localization is an important technical issue in EEG analysis.
1 code implementation • 15 Dec 2020 • Jiayi Zhang, Zhi Cui, Xiaoqiang Xia, Yalong Guo, Yanran Li, Chen Wei, Jianwei Cui
In this paper, we propose a new task of Writing Polishment with Simile (WPS) to investigate whether machines are able to polish texts with similes as we human do.
1 code implementation • 14 Dec 2020 • Xiuying Chen, Zhi Cui, Jiayi Zhang, Chen Wei, Jianwei Cui, Bin Wang, Dongyan Zhao, Rui Yan
Hence, in this paper, we propose to improve the response generation performance by examining the model's ability to answer a reading comprehension question, where the question is focused on the omitted information in the dialog.
no code implementations • 3 Dec 2020 • Chen Dengyi, Hu Yiming, Ma Tao, Su Yang, Yang Jianfeng, Wang Jianping, Xu Guangzhou, Jiang Xiankai, Guo Jianhua, Zhang Yongqiang, Zhang Yan, Chen Wei, Chang Jin, Zhang Zhe
The HXI collimator (HXI-C) is a spatial modulation X-ray telescope designed to observe hard X-rays emitted by energetic electrons in solar flares.
Instrumentation and Methods for Astrophysics Solar and Stellar Astrophysics High Energy Physics - Experiment
1 code implementation • 31 Oct 2020 • Chen Wei, Yiping Tang, Chuang Niu, Haihong Hu, Yue Wang, Jimin Liang
To enhance the predictive performance of neural predictors, we devise two self-supervised learning methods from different perspectives to pre-train the architecture embedding part of neural predictors to generate a meaningful representation of neural architectures.
no code implementations • 22 Oct 2020 • Haoming Zhang, Chen Wei, Mingqi Zhao, Haiyan Wu, Quanying Liu
The recorded electroencephalography (EEG) signals are usually contaminated by many artifacts.
no code implementations • ICLR 2021 • Chen Wei, Huiyu Wang, Wei Shen, Alan Yuille
Regarding the similarity of the query crop to each crop from other images as "unlabeled", the consistency term takes the corresponding similarity of a positive crop as a pseudo label, and encourages consistency between these two similarities.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Zhi Cui, Yan-ran Li, Jiayi Zhang, Jianwei Cui, Chen Wei, Bin Wang
To model diverse responses for a given post, one promising way is to introduce a latent variable into Seq2Seq models.
2 code implementations • 24 Sep 2020 • Haoming Zhang, Mingqi Zhao, Chen Wei, Dante Mantini, Zherui Li, Quanying Liu
Here, we present EEGdenoiseNet, a benchmark EEG dataset that is suited for training and testing deep learning-based denoising models, as well as for performance comparisons across models.
1 code implementation • 28 Mar 2020 • Chen Wei, Chuang Niu, Yiping Tang, Yue Wang, Haihong Hu, Jimin Liang
In this paper, we propose a neural predictor guided evolutionary algorithm to enhance the exploration ability of EA for NAS (NPENAS) and design two kinds of neural predictors.
1 code implementation • CVPR 2019 • Chen Wei, Lingxi Xie, Xutong Ren, Yingda Xia, Chi Su, Jiaying Liu, Qi Tian, Alan L. Yuille
We consider spatial contexts, for which we solve so-called jigsaw puzzles, i. e., each image is cut into grids and then disordered, and the goal is to recover the correct configuration.
no code implementations • 29 Nov 2018 • Xutong Ren, Lingxi Xie, Chen Wei, Siyuan Qiao, Chi Su, Jiaying Liu, Qi Tian, Elliot K. Fishman, Alan L. Yuille
Computer vision is difficult, partly because the desired mathematical function connecting input and output data is often complex, fuzzy and thus hard to learn.
3 code implementations • 14 Aug 2018 • Chen Wei, Wenjing Wang, Wenhan Yang, Jiaying Liu
Based on the decomposition, subsequent lightness enhancement is conducted on illumination by an enhancement network called Enhance-Net, and for joint denoising there is a denoising operation on reflectance.