no code implementations • 10 Sep 2024 • Junyi Chen, Weicai Ye, Yifan Wang, Danpeng Chen, Di Huang, Wanli Ouyang, Guofeng Zhang, Yu Qiao, Tong He
To this end, we propose GigaGS, the first work for high-quality surface reconstruction for large-scale scenes using 3DGS.
no code implementations • 2 Sep 2024 • Xiangyuan Xue, Zeyu Lu, Di Huang, Wanli Ouyang, Lei Bai
Much previous AI research has focused on developing monolithic models to maximize their intelligence and capability, with the primary goal of enhancing performance on specific tasks.
no code implementations • 20 Aug 2024 • Zili Liu, Hao Chen, Lei Bai, Wenyuan Li, Wanli Ouyang, Zhengxia Zou, Zhenwei Shi
In an era of frequent extreme weather and global warming, obtaining precise, fine-grained near-surface weather forecasts is increasingly essential for human activities.
no code implementations • 19 Aug 2024 • Yifan Wang, Di Huang, Weicai Ye, Guofeng Zhang, Wanli Ouyang, Tong He
Signed Distance Function (SDF)-based volume rendering has demonstrated significant capabilities in surface reconstruction.
no code implementations • 14 Aug 2024 • Junxian Li, Di Zhang, Xunzhi Wang, Zeying Hao, Jingdi Lei, Qian Tan, Cai Zhou, Wei Liu, Yaotian Yang, Xinrui Xiong, Weiyun Wang, Zhe Chen, Wenhai Wang, Wei Li, Shufei Zhang, Mao Su, Wanli Ouyang, Yuqiang Li, Dongzhan Zhou
We benchmark ChemVLM against a range of open-source and proprietary multimodal large language models on various tasks.
no code implementations • 13 Aug 2024 • Tianning Zhang, Feng Liu, Yuming Yuan, Rui Su, Wanli Ouyang, Lei Bai
FisH is designed to process real-time streaming seismic data and generate simultaneous results for phase picking, location estimation, and magnitude estimation in an end-to-end fashion.
no code implementations • 21 Jul 2024 • Xiaoyang Wu, Xiang Xu, Lingdong Kong, Liang Pan, Ziwei Liu, Tong He, Wanli Ouyang, Hengshuang Zhao
In this technical report, we detail our first-place solution for the 2024 Waymo Open Dataset Challenge's semantic segmentation track.
no code implementations • 17 Jul 2024 • Sijie Zhao, Hao Chen, Xueliang Zhang, Pengfeng Xiao, Lei Bai, Wanli Ouyang
By capturing the uncertainties in vegetation changes and modeling the complex influence of relevant variables, VegeDiff outperforms existing deterministic methods, providing clear and accurate forecasting results of future vegetation states.
1 code implementation • 16 Jul 2024 • Wang Zeng, Sheng Jin, Lumin Xu, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang
Our dynamic tokens possess two crucial characteristics: (1) representing image regions with similar semantic meanings using the same vision token, even if those regions are not adjacent, and (2) concentrating on regions with valuable details and representing them using fine tokens.
no code implementations • CVPR 2024 • Xiaopei Wu, Yuenan Hou, Xiaoshui Huang, Binbin Lin, Tong He, Xinge Zhu, Yuexin Ma, Boxi Wu, Haifeng Liu, Deng Cai, Wanli Ouyang
To fully exploit rich information hidden in long-term temporal point clouds and images, we present the Temporal Aggregation Network, termed TASeg.
no code implementations • 13 Jul 2024 • Xiaopei Wu, Liang Peng, Liang Xie, Yuenan Hou, Binbin Lin, Xiaoshui Huang, Haifeng Liu, Deng Cai, Wanli Ouyang
In this paper, we propose PatchTeacher, which focuses on partial scene 3D object detection to provide high-quality pseudo labels for the student.
1 code implementation • 11 Jul 2024 • Zidong Wang, Zeyu Lu, Di Huang, Tong He, Xihui Liu, Wanli Ouyang, Lei Bai
In this paper, we introduce PredBench, a benchmark tailored for the holistic evaluation of spatio-temporal prediction networks.
no code implementations • 10 Jul 2024 • Jingwen He, Tianfan Xue, Dongyang Liu, Xinqi Lin, Peng Gao, Dahua Lin, Yu Qiao, Wanli Ouyang, Ziwei Liu
Given a generated low-quality video, our approach can increase its spatial and temporal resolution simultaneously with arbitrary up-sampling space and time scales through a unified video diffusion model.
no code implementations • 20 Jun 2024 • Shilong Li, Yancheng He, Hangyu Guo, Xingyuan Bu, Ge Bai, Jie Liu, Jiaheng Liu, Xingwei Qu, Yangguang Li, Wanli Ouyang, Wenbo Su, Bo Zheng
Long-context capabilities are essential for large language models (LLMs) to tackle complex and long-input tasks.
no code implementations • 17 Jun 2024 • YuAn Wang, Zhao Wang, Junhao Gong, Di Huang, Tong He, Wanli Ouyang, Jile Jiao, Xuetao Feng, Qi Dou, Shixiang Tang, Dan Xu
In this paper, we introduce a novel path to general human motion generation by focusing on 2D space.
no code implementations • 17 Jun 2024 • Jie Liu, Zhanhui Zhou, Jiaheng Liu, Xingyuan Bu, Chao Yang, Han-sen Zhong, Wanli Ouyang
In this work, we identify a pitfall of vanilla iterative DPO: improved response quality can lead to increased verbosity.
1 code implementation • 14 Jun 2024 • Yuchen Ren, ZhiYuan Chen, Lifeng Qiao, Hongtai Jing, Yuchen Cai, Sheng Xu, Peng Ye, Xinzhu Ma, Siqi Sun, Hongliang Yan, Dong Yuan, Wanli Ouyang, Xihui Liu
RNA plays a pivotal role in translating genetic instructions into functional outcomes, underscoring its importance in biological processes and disease mechanisms.
1 code implementation • 11 Jun 2024 • Di Zhang, Xiaoshui Huang, Dongzhan Zhou, Yuqiang Li, Wanli Ouyang
This paper introduces the MCT Self-Refine (MCTSr) algorithm, an innovative integration of Large Language Models (LLMs) with Monte Carlo Tree Search (MCTS), designed to enhance performance in complex mathematical reasoning tasks.
1 code implementation • 5 Jun 2024 • Le Zhuo, Ruoyi Du, Han Xiao, Yangguang Li, Dongyang Liu, Rongjie Huang, Wenze Liu, Lirui Zhao, Fu-Yun Wang, Zhanyu Ma, Xu Luo, Zehan Wang, Kaipeng Zhang, Xiangyang Zhu, Si Liu, Xiangyu Yue, Dingning Liu, Wanli Ouyang, Ziwei Liu, Yu Qiao, Hongsheng Li, Peng Gao
Lumina-T2X is a nascent family of Flow-based Large Diffusion Transformers that establishes a unified framework for transforming noise into various modalities, such as images and videos, conditioned on text instructions.
no code implementations • 5 Jun 2024 • Minglei Li, Peng Ye, Yongqi Huang, Lin Zhang, Tao Chen, Tong He, Jiayuan Fan, Wanli Ouyang
Parameter-efficient fine-tuning (PEFT) has become increasingly important as foundation models continue to grow in both popularity and size.
no code implementations • 3 Jun 2024 • Kun Chen, Tao Chen, Peng Ye, Hao Chen, Kang Chen, Tao Han, Wanli Ouyang, Lei Bai
Data assimilation is a vital component in modern global medium-range weather forecasting systems to obtain the best estimation of the atmospheric state by combining the short-term forecast and observations.
1 code implementation • 29 May 2024 • Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin, Raven Yuan, Tuney Zheng, Wei Pang, Xinrun Du, Yiming Liang, Yinghao Ma, Yizhi Li, Ziyang Ma, Bill Lin, Emmanouil Benetos, Huan Yang, Junting Zhou, Kaijing Ma, Minghao Liu, Morry Niu, Noah Wang, Quehry Que, Ruibo Liu, Sine Liu, Shawn Guo, Soren Gao, Wangchunshu Zhou, Xinyue Zhang, Yizhi Zhou, YuBo Wang, Yuelin Bai, Yuhan Zhang, Yuxiang Zhang, Zenith Wang, Zhenzhu Yang, Zijian Zhao, Jiajun Zhang, Wanli Ouyang, Wenhao Huang, Wenhu Chen
To improve the transparency of LLMs, the research community has formed to open-source truly open LLMs (e.g., Pythia, Amber, OLMo), where more details (e.g., pre-training corpus and training code) are being provided.
1 code implementation • 28 May 2024 • Weizhen He, Yiheng Deng, Yunfeng Yan, Feng Zhu, Yizhou Wang, Lei Bai, Qingsong Xie, Donglian Qi, Wanli Ouyang, Shixiang Tang
In the task-specific evaluation setting, gallery sets are categorized according to specific ReID tasks.
no code implementations • 24 May 2024 • Zijie Guo, Pumeng Lyu, Fenghua Ling, Jing-Jia Luo, Niklas Boers, Wanli Ouyang, Lei Bai
Hindcasts of key oceanic variables demonstrate ORCA's remarkable skill in predicting ocean variations compared with state-of-the-art numerical OGCMs, as well as its ability to capture occurrences of extreme events at the subsurface ocean and ENSO vertical patterns.
1 code implementation • 23 May 2024 • Chenyu Huang, Peng Ye, Tao Chen, Tong He, Xiangyu Yue, Wanli Ouyang
In this paper, we rethink and analyze the existing model merging paradigm.
1 code implementation • 22 May 2024 • Huanjin Yao, Wenhao Wu, Taojiannan Yang, Yuxin Song, Mengxi Zhang, Haocheng Feng, Yifan Sun, Zhiheng Li, Wanli Ouyang, Jingdong Wang
We witness the rise of larger and higher-quality instruction datasets, as well as the involvement of larger-sized LLMs.
1 code implementation • 22 May 2024 • Wanghan Xu, Fenghua Ling, Wenlong Zhang, Tao Han, Hao Chen, Wanli Ouyang, Lei Bai
Data-driven artificial intelligence (AI) models have made significant advancements in weather forecasting, particularly in medium-range forecasting and nowcasting.
no code implementations • 19 May 2024 • Gengchen Wei, Xinle Pang, Tianning Zhang, Yu Sun, Xun Qian, Chen Lin, Han-sen Zhong, Wanli Ouyang
With over 200 million published academic documents and millions of new documents being written each year, academic researchers face the challenge of searching for information within this vast corpus.
1 code implementation • 15 May 2024 • Renqi Chen, Wenwei Han, Haohao Zhang, Haoyang Su, Zhefan Wang, Xiaolei Liu, Hao Jiang, Wanli Ouyang, Nanqing Dong
Genomic selection (GS), as a critical crop breeding strategy, plays a key role in enhancing food production and addressing the global hunger crisis.
no code implementations • 16 Apr 2024 • Ruifeng Li, Dongzhan Zhou, Ancheng Shen, Ao Zhang, Mao Su, Mingqian Li, Hongyang Chen, Gang Chen, Yin Zhang, Shufei Zhang, Yuqiang Li, Wanli Ouyang
Overall, our work illustrates the benefits and potential of using PEMAL in AIDD and other scenarios with data scarcity and noise.
1 code implementation • 11 Apr 2024 • Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella, Xiaoshui Huang, Dinh Phung, Wanli Ouyang, Jianfei Cai
Generative models, e.g., Stable Diffusion, have enabled the creation of photorealistic images from text prompts.
no code implementations • 4 Apr 2024 • Dongang Wang, Peilin Liu, Hengrui Wang, Heidi Beadnall, Kain Kyle, Linda Ly, Mariano Cabezas, Geng Zhan, Ryan Sullivan, Weidong Cai, Wanli Ouyang, Fernando Calamante, Michael Barnett, Chenyu Wang
This paper focuses on an early-stage phase of deep learning research, prior to model development, and proposes a strategic framework for estimating the amount of annotated data required to train patch-based segmentation networks.
1 code implementation • 3 Apr 2024 • Sijie Zhao, Hao Chen, Xueliang Zhang, Pengfeng Xiao, Lei Bai, Wanli Ouyang
RSM is specifically designed to capture the global context of remote sensing images with linear complexity, facilitating the effective processing of large VHR images.
Ranked #1 on Road Segmentation on Massachusetts Roads Dataset (F1 metric)
Building change detection for remote sensing images • Change Detection
no code implementations • 28 Mar 2024 • Zeren Chen, Zhelun Shi, Xiaoya Lu, Lehan He, Sucheng Qian, Hao Shu Fang, Zhenfei Yin, Wanli Ouyang, Jing Shao, Yu Qiao, Cewu Lu, Lu Sheng
The ultimate goal of robotic learning is to acquire a comprehensive and generalizable robotic system capable of performing both seen skills within the training distribution and unseen skills in novel environments.
no code implementations • 19 Mar 2024 • Xianglong He, Junyi Chen, Sida Peng, Di Huang, Yangguang Li, Xiaoshui Huang, Chun Yuan, Wanli Ouyang, Tong He
To better optimize the representation of these details, we propose a unique pruning and densifying method named the Candidate Pool Strategy, enhancing detail fidelity through selective optimization.
1 code implementation • 19 Mar 2024 • Yixuan Wu, Yizhou Wang, Shixiang Tang, Wenhao Wu, Tong He, Wanli Ouyang, Philip Torr, Jian Wu
We present DetToolChain, a novel prompting paradigm, to unleash the zero-shot object detection ability of multimodal large language models (MLLMs), such as GPT-4V and Gemini.
1 code implementation • 18 Mar 2024 • Sha Zhang, Jiajun Deng, Lei Bai, Houqiang Li, Wanli Ouyang, Yanyong Zhang
We present a hybrid-view-based knowledge distillation framework, termed HVDistill, to guide the feature learning of a point cloud neural network with a pre-trained image network in an unsupervised manner.
no code implementations • 18 Mar 2024 • Sha Zhang, Di Huang, Jiajun Deng, Shixiang Tang, Wanli Ouyang, Tong He, Yanyong Zhang
The ability to understand and reason about the 3D real world is a crucial milestone towards artificial general intelligence.
no code implementations • 14 Mar 2024 • Jiajun Deng, Sha Zhang, Feras Dayoub, Wanli Ouyang, Yanyong Zhang, Ian Reid
In particular, our PoIFusion follows the paradigm of query-based object detection, formulating object queries as dynamic 3D boxes and generating a set of PoIs based on each query box.
no code implementations • 4 Mar 2024 • Yu Sun, Dongzhan Zhou, Chen Lin, Conghui He, Wanli Ouyang, Han-sen Zhong
Academic documents are packed with texts, equations, tables, and figures, requiring comprehensive understanding for accurate Optical Character Recognition (OCR).
no code implementations • 2 Mar 2024 • Lian Xu, Mohammed Bennamoun, Farid Boussaid, Wanli Ouyang, Ferdous Sohel, Dan Xu
We propose AuxSegNet+, a weakly supervised auxiliary learning framework to explore the rich information from these saliency maps and the significant inter-task correlation between saliency detection and semantic segmentation.
1 code implementation • 22 Feb 2024 • Chenxi Huang, Yuenan Hou, Weicai Ye, Di Huang, Xiaoshui Huang, Binbin Lin, Deng Cai, Wanli Ouyang
We project the freely available 3D segmentation annotations onto the 2D plane and leverage the corresponding 2D semantic maps as the supervision signal, significantly enhancing the semantic awareness of multi-view detectors.
1 code implementation • 22 Feb 2024 • Yanan Wu, Jie Liu, Xingyuan Bu, Jiaheng Liu, Zhanhui Zhou, Yuanxing Zhang, Chenchen Zhang, Zhiqi Bai, Haibin Chen, Tiezheng Ge, Wanli Ouyang, Wenbo Su, Bo Zheng
This paper introduces ConceptMath, a bilingual (English and Chinese), fine-grained benchmark that evaluates concept-wise mathematical reasoning of Large Language Models (LLMs).
1 code implementation • 22 Feb 2024 • Ge Bai, Jie Liu, Xingyuan Bu, Yancheng He, Jiaheng Liu, Zhanhui Zhou, Zhuoran Lin, Wenbo Su, Tiezheng Ge, Bo Zheng, Wanli Ouyang
By conducting a detailed analysis of real multi-turn dialogue data, we construct a three-tier hierarchical ability taxonomy comprising 4208 turns across 1388 multi-turn dialogues in 13 distinct tasks.
2 code implementations • 19 Feb 2024 • Zidong Wang, Zeyu Lu, Di Huang, Cai Zhou, Wanli Ouyang, Lei Bai
We have released all the code and models at https://github.com/whlzy/FiT to promote the exploration of diffusion transformer models for arbitrary-resolution image generation.
1 code implementation • 19 Feb 2024 • Zhanhui Zhou, Jie Liu, Zhichen Dong, Jiaheng Liu, Chao Yang, Wanli Ouyang, Yu Qiao
Large language models (LLMs) undergo safety alignment to ensure safe conversations with humans.
no code implementations • 15 Feb 2024 • Gengyuan Hu, Gengchen Wei, Zekun Lou, Philip H. S. Torr, Wanli Ouyang, Han-sen Zhong, Chen Lin
Machine learning has emerged as a significant approach to efficiently tackle electronic structure problems.
no code implementations • 13 Feb 2024 • Chen Lin, Liheng Ma, Yiyang Chen, Wanli Ouyang, Michael M. Bronstein, Philip H. S. Torr
Secondly, we propose the Continuous Unified Ricci Curvature (CURC), an extension of the celebrated Ollivier-Ricci Curvature for directed and weighted graphs.
1 code implementation • 10 Feb 2024 • Di Zhang, Wei Liu, Qian Tan, Jingdan Chen, Hang Yan, Yuliang Yan, Jiatong Li, Weiran Huang, Xiangyu Yue, Wanli Ouyang, Dongzhan Zhou, Shufei Zhang, Mao Su, Han-sen Zhong, Yuqiang Li
However, the community lacks an LLM specifically designed for chemistry.
no code implementations • 6 Feb 2024 • Junchao Gong, Lei Bai, Peng Ye, Wanghan Xu, Na Liu, Jianhua Dai, Xiaokang Yang, Wanli Ouyang
Precipitation nowcasting based on radar data plays a crucial role in extreme weather prediction and has broad implications for disaster management.
1 code implementation • 4 Feb 2024 • Haoyi Zhu, Yating Wang, Di Huang, Weicai Ye, Wanli Ouyang, Tong He
These outcomes suggest that the 3D point cloud is a valuable observation modality for intricate robotic tasks.
no code implementations • 4 Feb 2024 • Youzhi Qu, Chen Wei, Penghui Du, Wenxin Che, Chi Zhang, Wanli Ouyang, Yatao Bian, Feiyang Xu, Bin Hu, Kai Du, Haiyan Wu, Jia Liu, Quanying Liu
During the evolution of large models, performance evaluation is necessary to assess their capabilities and ensure safety before practical application.
1 code implementation • 2 Feb 2024 • Wanghan Xu, Kang Chen, Tao Han, Hao Chen, Wanli Ouyang, Lei Bai
Data-driven weather forecast based on machine learning (ML) has experienced rapid development and demonstrated superior performance in the global medium-range forecast compared to traditional physics-based dynamical models.
1 code implementation • 2 Feb 2024 • Jian Liu, Xiaoshui Huang, Tianyu Huang, Lu Chen, Yuenan Hou, Shixiang Tang, Ziwei Liu, Wanli Ouyang, WangMeng Zuo, Junjun Jiang, Xianming Liu
Recent years have witnessed remarkable advances in artificial intelligence generated content (AIGC), with diverse input modalities, e.g., text, image, video, audio and 3D.
1 code implementation • 28 Jan 2024 • Tao Han, Song Guo, Fenghua Ling, Kang Chen, Junchao Gong, Jingjia Luo, Junxia Gu, Kan Dai, Wanli Ouyang, Lei Bai
Therefore, building a kilometer-scale global forecast model is a persistent pursuit in the meteorology domain.
no code implementations • 26 Jan 2024 • Chaochao Lu, Chen Qian, Guodong Zheng, Hongxing Fan, Hongzhi Gao, Jie Zhang, Jing Shao, Jingyi Deng, Jinlan Fu, Kexin Huang, Kunchang Li, Lijun Li, LiMin Wang, Lu Sheng, Meiqi Chen, Ming Zhang, Qibing Ren, Sirui Chen, Tao Gui, Wanli Ouyang, Yali Wang, Yan Teng, Yaru Wang, Yi Wang, Yinan He, Yingchun Wang, Yixu Wang, Yongting Zhang, Yu Qiao, Yujiong Shen, Yurong Mou, Yuxi Chen, Zaibin Zhang, Zhelun Shi, Zhenfei Yin, Zhipin Wang
Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal contents.
1 code implementation • 22 Jan 2024 • Zili Liu, Hao Chen, Lei Bai, Wenyuan Li, Keyan Chen, Zhengyi Wang, Wanli Ouyang, Zhengxia Zou, Zhenwei Shi
In this paper, we extend meteorological downscaling to arbitrary scattered station scales, establish a brand new benchmark and dataset, and retrieve meteorological states at any given station location from a coarse-resolution meteorological field.
1 code implementation • 7 Jan 2024 • Peng Zheng, Dehong Gao, Deng-Ping Fan, Li Liu, Jorma Laaksonen, Wanli Ouyang, Nicu Sebe
It comprises two essential components: the localization module (LM) and the reconstruction module (RM) with our proposed bilateral reference (BiRef).
Ranked #1 on Camouflaged Object Segmentation on COD
Camouflaged Object Segmentation • Dichotomous Image Segmentation
1 code implementation • CVPR 2024 • Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao
This paper is not motivated to seek innovation within the attention mechanism.
no code implementations • 25 Dec 2023 • Peng Ye, Yongqi Huang, Chongjun Tu, Minglei Li, Tao Chen, Tong He, Wanli Ouyang
We first validate eight manually-defined partial fine-tuning strategies across various datasets and vision transformer architectures, and find that some partial fine-tuning strategies (e.g., ffn only or attention only) can achieve better performance with fewer tuned parameters than full fine-tuning, and that selecting appropriate layers is critical to partial fine-tuning.
no code implementations • 25 Dec 2023 • Peng Ye, Chenyu Huang, Mingzhu Shen, Tao Chen, Yongqi Huang, Yuning Zhang, Wanli Ouyang
This work aims to merge various Vision Transformers (ViTs) trained on different tasks (i.e., datasets with different object categories) or domains (i.e., datasets with the same categories but different environments) into one unified model that still performs well on each task or domain.
no code implementations • 21 Dec 2023 • Chongjun Tu, Peng Ye, Weihao Lin, Hancheng Ye, Chong Yu, Tao Chen, Baopu Li, Wanli Ouyang
Improving the efficiency of Neural Architecture Search (NAS) is a challenging but significant task that has received much attention.
1 code implementation • 18 Dec 2023 • Kun Chen, Lei Bai, Fenghua Ling, Peng Ye, Tao Chen, Jing-Jia Luo, Hao Chen, Yi Xiao, Kang Chen, Tao Han, Wanli Ouyang
Initial states are typically generated by traditional data assimilation components, which are computationally expensive and time-consuming.
1 code implementation • 18 Dec 2023 • Zhi Jin, Sheng Xu, Xiang Zhang, Tianze Ling, Nanqing Dong, Wanli Ouyang, Zhiqiang Gao, Cheng Chang, Siqi Sun
De novo peptide sequencing from mass spectrometry (MS) data is a critical task in proteomics research.
no code implementations • 16 Dec 2023 • Pumeng Lyu, Tao Tang, Fenghua Ling, Jing-Jia Luo, Niklas Boers, Wanli Ouyang, Lei Bai
Recent studies have shown that deep learning (DL) models can skillfully predict the El Niño-Southern Oscillation (ENSO) over 1.5 years ahead.
no code implementations • 16 Dec 2023 • Yi Xiao, Lei Bai, Wei Xue, Kang Chen, Tao Han, Wanli Ouyang
Weather forecasting is a crucial yet highly challenging task.
3 code implementations • 15 Dec 2023 • Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao
This paper is not motivated to seek innovation within the attention mechanism.
Ranked #1 on 3D Semantic Segmentation on ScanNet++ (using extra training data)
no code implementations • 14 Dec 2023 • Zexiang Liu, Yangguang Li, Youtian Lin, Xin Yu, Sida Peng, Yan-Pei Cao, Xiaojuan Qi, Xiaoshui Huang, Ding Liang, Wanli Ouyang
Recent advancements in text-to-3D generation technology have significantly improved the conversion of textual descriptions into imaginative 3D objects with sound geometry and fine textures.
1 code implementation • 12 Dec 2023 • Yinmin Zhang, Jie Liu, Chuming Li, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang
In this paper, from a novel perspective, we systematically study the challenges that remain in O2O RL and identify that the reason behind the slow improvement of the performance and the instability of online finetuning lies in the inaccurate Q-value estimation inherited from offline pretraining.
2 code implementations • 4 Dec 2023 • Yizhou Wang, Yixuan Wu, Shixiang Tang, Weizhen He, Xun Guo, Feng Zhu, Lei Bai, Rui Zhao, Jian Wu, Tong He, Wanli Ouyang
Human-centric perception tasks, e.g., pedestrian detection, skeleton-based action recognition, and pose estimation, have wide industrial applications, such as metaverse and sports analysis.
Ranked #1 on Pedestrian Image Caption on CUHK-PEDES
2 code implementations • 27 Nov 2023 • Wenhao Wu, Huanjin Yao, Mengxi Zhang, Yuxin Song, Wanli Ouyang, Jingdong Wang
Our study centers on the evaluation of GPT-4's linguistic and visual capabilities in zero-shot visual recognition tasks: Firstly, we explore the potential of its generated rich textual descriptions across various categories to enhance recognition performance without any training.
no code implementations • CVPR 2024 • Xiao Zheng, Xiaoshui Huang, Guofeng Mei, Yuenan Hou, Zhaoyang Lyu, Bo Dai, Wanli Ouyang, Yongshun Gong
This generator aggregates the features extracted by the backbone and employs them as the condition to guide the point-to-point recovery from the noisy point cloud, thereby assisting the backbone in capturing both local and global geometric priors as well as the global point density distribution of the object.
1 code implementation • 5 Nov 2023 • Zeren Chen, Ziqin Wang, Zhen Wang, Huayang Liu, Zhenfei Yin, Si Liu, Lu Sheng, Wanli Ouyang, Yu Qiao, Jing Shao
While this phenomenon has been overlooked in previous work, we propose a novel and extensible framework, called Octavius, for comprehensive studies and experimentation on multimodal learning with Multimodal Large Language Models (MLLMs).
1 code implementation • 24 Oct 2023 • Yan Lu, Xinzhu Ma, Lei Yang, Tianzhu Zhang, Yating Liu, Qi Chu, Tong He, Yonghui Li, Wanli Ouyang
It models the uncertainty propagation relationship of the geometry projection during training, improving the stability and efficiency of the end-to-end model learning.
no code implementations • 24 Oct 2023 • Yunyao Mao, Jiajun Deng, Wengang Zhou, Zhenbo Lu, Wanli Ouyang, Houqiang Li
Different from existing distillation solutions that transfer the knowledge of a pre-trained and fixed teacher to the student, in CMD, the knowledge is continuously updated and bidirectionally distilled between modalities during pre-training.
no code implementations • 18 Oct 2023 • Jie Liu, Yinmin Zhang, Chuming Li, Chao Yang, Yaodong Yang, Yu Liu, Wanli Ouyang
Building a single generalist agent with strong zero-shot capability has recently sparked significant advancements.
1 code implementation • 12 Oct 2023 • Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Tong He, Wanli Ouyang
In this paper, we introduce a novel universal 3D pre-training framework designed to facilitate the acquisition of efficient 3D representation, thereby establishing a pathway to 3D foundational models.
Ranked #2 on Semantic Segmentation on S3DIS (using extra training data)
1 code implementation • CVPR 2024 • Honghui Yang, Sha Zhang, Di Huang, Xiaoyang Wu, Haoyi Zhu, Tong He, Shixiang Tang, Hengshuang Zhao, Qibo Qiu, Binbin Lin, Xiaofei He, Wanli Ouyang
In the context of autonomous driving, the significance of effective feature learning is widely acknowledged.
no code implementations • 11 Oct 2023 • Chaoqi Liang, Lifeng Qiao, Peng Ye, Nanqing Dong, Jianle Sun, Weiqiang Bai, Yuchen Ren, Xinzhu Ma, Hongliang Yan, Chunfeng Song, Wanli Ouyang, WangMeng Zuo
However, existing pre-training methods for DNA sequences largely rely on direct adoptions of BERT pre-training from NLP, lacking a comprehensive understanding and a specifically tailored approach.
no code implementations • ICCV 2023 • Xinzhu Ma, Yongtao Wang, Yinmin Zhang, Zhiyi Xia, Yuan Meng, Zhihui Wang, Haojie Li, Wanli Ouyang
In this work, we build a modular-designed codebase, formulate strong training recipes, design an error diagnosis toolbox, and discuss current methods for image-based 3D object detection.
2 code implementations • 5 Oct 2023 • Zhanhui Zhou, Jie Liu, Jing Shao, Xiangyu Yue, Chao Yang, Wanli Ouyang, Yu Qiao
A single language model, even when aligned with labelers through reinforcement learning from human feedback (RLHF), may not suit all human preferences.
no code implementations • 3 Oct 2023 • Xiaoyu Yue, Lei Bai, Meng Wei, Jiangmiao Pang, Xihui Liu, Luping Zhou, Wanli Ouyang
Masked AutoEncoder (MAE) has revolutionized the field of self-supervised learning with its simple yet effective masking and reconstruction strategies.
2 code implementations • 1 Oct 2023 • Zekun Moore Wang, Zhongyuan Peng, Haoran Que, Jiaheng Liu, Wangchunshu Zhou, Yuhan Wu, Hongcheng Guo, Ruitong Gan, Zehao Ni, Jian Yang, Man Zhang, Zhaoxiang Zhang, Wanli Ouyang, Ke Xu, Stephen W. Huang, Jie Fu, Junran Peng
The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters.
1 code implementation • ICCV 2023 • Jiawei Yao, Chuming Li, Keqiang Sun, Yingjie Cai, Hao Li, Wanli Ouyang, Hongsheng Li
Monocular 3D Semantic Scene Completion (SSC) has garnered significant attention in recent years due to its potential to predict complex semantics and geometry shapes from a single image, requiring no 3D inputs.
3D Semantic Scene Completion from a single 2D image • 3D Semantic Scene Completion from a single RGB image
no code implementations • 31 Aug 2023 • Lei Bai, Dongang Wang, Michael Barnett, Mariano Cabezas, Weidong Cai, Fernando Calamante, Kain Kyle, Dongnan Liu, Linda Ly, Aria Nguyen, Chun-Chien Shieh, Ryan Sullivan, Hengrui Wang, Geng Zhan, Wanli Ouyang, Chenyu Wang
Our approach enables collaboration among multiple clinical sites without compromising data privacy under a federated learning paradigm that incorporates a noise-robust training strategy based on label correction.
2 code implementations • 29 Aug 2023 • Xinqi Lin, Jingwen He, Ziyan Chen, Zhaoyang Lyu, Bo Dai, Fanghua Yu, Wanli Ouyang, Yu Qiao, Chao Dong
We present DiffBIR, a general restoration pipeline that could handle different blind image restoration tasks in a unified framework.
Ranked #1 on Blind Face Restoration on LFW
1 code implementation • 26 Aug 2023 • Shengji Tang, Peng Ye, Baopu Li, Weihao Lin, Tao Chen, Tong He, Chong Yu, Wanli Ouyang
Specifically, we implicitly divide all subnets into hierarchical groups by subnet-in-subnet sampling, aggregate the knowledge of different subnets in each group during training, and exploit upper-level group knowledge to supervise lower-level subnet groups.
1 code implementation • ICCV 2023 • Tao Han, Lei Bai, Lingbo Liu, Wanli Ouyang
Scale variation is a deep-rooted problem in object counting, which has not been effectively addressed by existing scale-aware algorithms.
1 code implementation • ICCV 2023 • Yunyao Mao, Jiajun Deng, Wengang Zhou, Yao Fang, Wanli Ouyang, Houqiang Li
To be specific, the proposed MAMP takes as input the masked spatio-temporal skeleton sequence and predicts the corresponding temporal motion of the masked human joints.
Ranked #8 on Skeleton Based Action Recognition on NTU RGB+D 120
no code implementations • 11 Aug 2023 • Yongqi Huang, Peng Ye, Xiaoshui Huang, Sheng Li, Tao Chen, Tong He, Wanli Ouyang
As Vision Transformers (ViTs) are gradually surpassing CNNs in various visual tasks, one may ask whether a training scheme specifically for ViTs exists that can also improve performance without increasing inference cost.
1 code implementation • 6 Aug 2023 • Lian Xu, Mohammed Bennamoun, Farid Boussaid, Hamid Laga, Wanli Ouyang, Dan Xu
Building upon the observation that the attended regions of the one-class token in the standard vision transformer can contribute to a class-agnostic localization map, we explore the potential of the transformer model to capture class-specific attention for class-discriminative object localization by learning multiple class tokens.
Object Localization • Weakly supervised Semantic Segmentation
no code implementations • 24 Jul 2023 • Pan Tan, Mingchen Li, Yuanxi Yu, Fan Jiang, Lirong Zheng, Banghao Wu, Xinyu Sun, Liqi Kang, Jie Song, Liang Zhang, Yi Xiong, Wanli Ouyang, Zhiqiang Hu, Guisheng Fan, Yufeng Pei, Liang Hong
Designing protein mutants of both high stability and activity is a critical yet challenging task in protein engineering.
no code implementations • 24 Jul 2023 • Chuming Li, Ruonan Jia, Jie Liu, Yinmin Zhang, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang
Model-based reinforcement learning (RL) has demonstrated remarkable successes on a range of continuous control tasks due to its high sample efficiency.
1 code implementation • 20 Jul 2023 • Yiyuan Zhang, Kaixiong Gong, Kaipeng Zhang, Hongsheng Li, Yu Qiao, Wanli Ouyang, Xiangyu Yue
Multimodal learning aims to build models that can process and relate information from multiple modalities.
2 code implementations • ICCV 2023 • Wenhao Wu, Yuxin Song, Zhun Sun, Jingdong Wang, Chang Xu, Wanli Ouyang
We conduct comprehensive ablation studies on the instantiation of ATMs and demonstrate that this module provides powerful temporal modeling capability at a low computational cost.
Ranked #4 on Action Recognition on Something-Something V1
no code implementations • 19 Jun 2023 • Qinghong Sun, Yangguang Li, Zexiang Liu, Xiaoshui Huang, Fenggang Liu, Xihui Liu, Wanli Ouyang, Jing Shao
However, the quality and diversity of existing 3D object generation methods are constrained by the inadequacies of existing 3D object datasets, including issues related to text quality, the incompleteness of multi-modal data representation encompassing 2D rendered images and 3D assets, as well as the size of the dataset.
no code implementations • 19 Jun 2023 • Yaqi Zhang, Di Huang, Bin Liu, Shixiang Tang, Yan Lu, Lu Chen, Lei Bai, Qi Chu, Nenghai Yu, Wanli Ouyang
Generating realistic human motion from given action descriptions has advanced significantly owing to the emerging demand for digital humans.
no code implementations • 15 Jun 2023 • YiRong Chen, Ziyue Li, Wanli Ouyang, Michael Lepech
In this work, we propose an Adaptive Hierarchical SpatioTemporal Network (AHSTN) to promote traffic forecasting by exploiting the spatial hierarchy and modeling multi-scale spatial correlations.
1 code implementation • CVPR 2024 • Weizhen He, Yiheng Deng, Shixiang Tang, Qihao Chen, Qingsong Xie, Yizhou Wang, Lei Bai, Feng Zhu, Rui Zhao, Wanli Ouyang, Donglian Qi, Yunfeng Yan
This paper strives to resolve this problem by proposing a new instruct-ReID task that requires the model to retrieve images according to the given image or language instructions.
1 code implementation • NeurIPS 2023 • Zhenfei Yin, Jiong Wang, JianJian Cao, Zhelun Shi, Dingning Liu, Mukai Li, Lu Sheng, Lei Bai, Xiaoshui Huang, Zhiyong Wang, Jing Shao, Wanli Ouyang
To the best of our knowledge, we present one of the very first open-source endeavors in the field, LAMM, encompassing a Language-Assisted Multi-Modal instruction tuning dataset, framework, and benchmark.
1 code implementation • CVPR 2023 • Yingjie Wang, Jiajun Deng, Yao Li, Jinshui Hu, Cong Liu, Yu Zhang, Jianmin Ji, Wanli Ouyang, Yanyong Zhang
LiDAR and Radar are two complementary sensing approaches in that LiDAR specializes in capturing an object's 3D shape while Radar provides longer detection ranges as well as velocity hints.
1 code implementation • CVPR 2023 • Honghui Yang, Wenxiao Wang, Minghao Chen, Binbin Lin, Tong He, Hua Chen, Xiaofei He, Wanli Ouyang
The key to associating the two different representations is our input-dependent Query Initialization module, which efficiently generates reference points and content queries.
no code implementations • 10 May 2023 • Xulin Li, Yan Lu, Bin Liu, Yuenan Hou, Yating Liu, Qi Chu, Wanli Ouyang, Nenghai Yu
Clothes-invariant feature extraction is critical to the clothes-changing person re-identification (CC-ReID).
no code implementations • 4 May 2023 • Peng Ye, Tong He, Shengji Tang, Baopu Li, Tao Chen, Lei Bai, Wanli Ouyang
In this work, we aim to re-investigate the training process of residual networks from a novel social psychology perspective of loafing, and further propose a new training scheme as well as three improved strategies for boosting residual networks beyond their performance limits.
1 code implementation • 6 Apr 2023 • Kang Chen, Tao Han, Junchao Gong, Lei Bai, Fenghua Ling, Jing-Jia Luo, Xi Chen, Leiming Ma, Tianning Zhang, Rui Su, Yuanzheng Ci, Bin Li, Xiaokang Yang, Wanli Ouyang
We present FengWu, an advanced data-driven global medium-range weather forecast system based on Artificial Intelligence (AI).
no code implementations • 22 Mar 2023 • Zhilong Liang, Zhenzhi Tan, Ruixin Hong, Wanli Ouyang, Jinying Yuan, ChangShui Zhang
Computer image recognition with machine learning methods can compensate for the shortcomings of manual judging, giving accurate and quantitative judgments.
1 code implementation • CVPR 2023 • Shixiang Tang, Cheng Chen, Qingsong Xie, Meilin Chen, Yizhou Wang, Yuanzheng Ci, Lei Bai, Feng Zhu, Haiyang Yang, Li Yi, Rui Zhao, Wanli Ouyang
Specifically, we propose HumanBench, built on existing datasets, to comprehensively evaluate on common ground the generalization abilities of different pretraining methods on 19 datasets from 6 diverse downstream tasks, including person ReID, pose estimation, human parsing, pedestrian attribute recognition, pedestrian detection, and crowd counting.
Ranked #1 on Pedestrian Attribute Recognition on PA-100K (using extra training data)
1 code implementation • CVPR 2023 • Yuanzheng Ci, Yizhou Wang, Meilin Chen, Shixiang Tang, Lei Bai, Feng Zhu, Rui Zhao, Fengwei Yu, Donglian Qi, Wanli Ouyang
When adapted to a specific task, UniHCP achieves new SOTAs on a wide range of human-centric tasks, e.g., 69.8 mIoU on CIHP for human parsing, 86.18 mA on PA-100K for attribute prediction, 90.3 mAP on Market1501 for ReID, and 85.8 JI on CrowdHuman for pedestrian detection, performing better than specialized models tailored for each task.
Ranked #1 on Pose Estimation on MS-COCO
no code implementations • 3 Mar 2023 • Lintao Wang, Kun Hu, Lei Bai, Yu Ding, Wanli Ouyang, Zhiyong Wang
As past poses often contain useful auxiliary hints, in this paper we propose a task-agnostic deep learning method, namely Multi-scale Control Signal-aware Transformer (MCS-T), with an attention-based encoder-decoder architecture that discovers the auxiliary information implicitly for synthesizing controllable motion, without explicitly requiring auxiliary information such as phase.
no code implementations • 22 Feb 2023 • Meilin Chen, Yizhou Wang, Shixiang Tang, Feng Zhu, Haiyang Yang, Lei Bai, Rui Zhao, Donglian Qi, Wanli Ouyang
Despite being feasible, recent works have largely overlooked discovering the most discriminative regions for contrastive learning of object representations in scene images.
no code implementations • 8 Feb 2023 • Geng Zhan, Dongang Wang, Mariano Cabezas, Lei Bai, Kain Kyle, Wanli Ouyang, Michael Barnett, Chenyu Wang
An accurate and robust quantitative measurement of brain volume change is paramount for translational research and clinical applications.
1 code implementation • 29 Jan 2023 • Yangguang Li, Bin Huang, Zeren Chen, Yufeng Cui, Feng Liang, Mingzhu Shen, Fenggang Liu, Enze Xie, Lu Sheng, Wanli Ouyang, Jing Shao
Our Fast-BEV consists of five parts. We propose (1) a lightweight, deployment-friendly view transformation that quickly transfers 2D image features to 3D voxel space, (2) a multi-scale image encoder that leverages multi-scale information for better performance, (3) an efficient BEV encoder that is designed specifically to speed up on-vehicle inference.
1 code implementation • 16 Jan 2023 • Peng Ye, Tong He, Baopu Li, Tao Chen, Lei Bai, Wanli Ouyang
To address the robustness problem, we first benchmark different NAS methods under a wide range of proxy data, proxy channels, proxy layers and proxy epochs, since the robustness of NAS under different kinds of proxies has not been explored before.
no code implementations • ICCV 2023 • Peixia Li, Pulak Purkait, Thalaiyasingam Ajanthan, Majid Abdolshah, Ravi Garg, Hisham Husain, Chenchen Xu, Stephen Gould, Wanli Ouyang, Anton Van Den Hengel
Each learning group consists of a teacher network, a student network and a novel filter module.
no code implementations • CVPR 2023 • Lian Xu, Wanli Ouyang, Mohammed Bennamoun, Farid Boussaid, Dan Xu
Weakly supervised dense object localization (WSDOL) relies generally on Class Activation Mapping (CAM), which exploits the correlation between the class weights of the image classifier and the pixel-level features.
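As a rough illustration of the Class Activation Mapping idea mentioned above, the following is a minimal sketch of the standard CAM computation, in which the map for a class is the sum of the feature channels weighted by that class's classifier weights. It is not the method proposed in the listed paper; the function names, the nested-list data layout, and the toy values are assumptions made for illustration only.

```python
def class_activation_map(features, class_weights, class_idx):
    """Weighted sum of feature channels for one class (generic CAM).

    features      : list of K channel maps, each an H x W nested list
    class_weights : C x K nested list, the weights of a linear classifier
                    assumed to sit on top of global average pooling
    class_idx     : index of the class to localize
    """
    weights = class_weights[class_idx]
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for w_k, channel in zip(weights, features):
        for y in range(height):
            for x in range(width):
                cam[y][x] += w_k * channel[y][x]
    return cam


def normalize(cam):
    """Min-max scale a map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in cam for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - lo) / span for v in row] for row in cam]


# Toy example (hypothetical values): two 2x2 feature channels, two classes.
features = [
    [[1.0, 0.0], [0.0, 0.0]],  # channel 0 fires at the top-left pixel
    [[0.0, 0.0], [0.0, 1.0]],  # channel 1 fires at the bottom-right pixel
]
class_weights = [
    [1.0, 0.0],  # class 0 relies on channel 0
    [0.0, 2.0],  # class 1 relies on channel 1
]
cam_class1 = normalize(class_activation_map(features, class_weights, 1))
```

In this toy setup the map for class 1 peaks at the bottom-right pixel, because that is where the only channel the class weights rely on is active.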
no code implementations • CVPR 2023 • Yuchen Ren, Zhendong Mao, Shancheng Fang, Yan Lu, Tong He, Hao Du, Yongdong Zhang, Wanli Ouyang
In this paper, we introduce a new setting called Domain Generalization for Image Captioning (DGIC), where the data from the target domain is unseen in the learning process.
no code implementations • CVPR 2023 • Shijie Wang, Jianlong Chang, Haojie Li, Zhihui Wang, Wanli Ouyang, Qi Tian
PLEor leverages a pre-trained CLIP model to infer the discrepancies encompassing both pre-defined and unknown subcategories, called category-specific discrepancies, and transfers them to the backbone network trained in closed-set scenarios.
no code implementations • ICCV 2023 • Di Huang, Sida Peng, Tong He, Honghui Yang, Xiaowei Zhou, Wanli Ouyang
We propose a novel approach to self-supervised learning of point cloud representations by differentiable neural rendering.
5 code implementations • CVPR 2023 • Wenhao Wu, Xiaohan Wang, Haipeng Luo, Jingdong Wang, Yi Yang, Wanli Ouyang
In this paper, we propose a novel framework called BIKE, which utilizes the cross-modal bridge to explore bidirectional knowledge: i) We introduce the Video Attribute Association mechanism, which leverages the Video-to-Text knowledge to generate textual auxiliary attributes for complementing video recognition.
Ranked #1 on Zero-Shot Action Recognition on ActivityNet
4 code implementations • CVPR 2023 • Wenhao Wu, Haipeng Luo, Bo Fang, Jingdong Wang, Wanli Ouyang
Most existing text-video retrieval methods focus on cross-modal matching between the visual content of videos and textual query sentences.
Ranked #7 on Video Retrieval on VATEX
no code implementations • CVPR 2023 • Mingye Xu, Mutian Xu, Tong He, Wanli Ouyang, Yali Wang, Xiaoguang Han, Yu Qiao
Besides, such scenes with progressive masking ratios can also serve to self-distill their intrinsic spatial consistency, which requires learning consistent representations from unmasked areas.
no code implementations • 17 Dec 2022 • Yuan YAO, Yuanhan Zhang, Zhenfei Yin, Jiebo Luo, Wanli Ouyang, Xiaoshui Huang
The recent success of pre-trained 2D vision models is mostly attributable to learning from large-scale datasets.
2 code implementations • 8 Dec 2022 • Xiaoshui Huang, Zhou Huang, Sheng Li, Wentao Qu, Tong He, Yuenan Hou, Yifan Zuo, Wanli Ouyang
These token embeddings are concatenated with a task token and fed into the frozen CLIP transformer to learn point cloud representation.
1 code implementation • CVPR 2023 • Honghui Yang, Tong He, Jiaheng Liu, Hua Chen, Boxi Wu, Binbin Lin, Xiaofei He, Wanli Ouyang
In contrast to previous 3D MAE frameworks, which either design a complex decoder to infer masked information from maintained regions or adopt sophisticated masking strategies, we instead propose a much simpler paradigm.
no code implementations • 30 Nov 2022 • Di Huang, Xiaopeng Ji, Xingyi He, Jiaming Sun, Tong He, Qing Shuai, Wanli Ouyang, Xiaowei Zhou
The key idea is that the hand motion naturally provides multiple views of the object and the motion can be reliably estimated by a hand pose tracker.
1 code implementation • 29 Nov 2022 • Chuming Li, Jie Liu, Yinmin Zhang, Yuhong Wei, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang
In the learning phase, each agent minimizes the TD error that is dependent on how the subsequent agents have reacted to their chosen action.
Ranked #1 on SMAC on SMAC 3s5z_vs_3s6z
no code implementations • 17 Nov 2022 • Jiaheng Liu, Tong He, Honghui Yang, Rui Su, Jiayi Tian, Junran Wu, Hongcheng Guo, Ke Xu, Wanli Ouyang
Previous top-performing methods for 3D instance segmentation often maintain inter-task dependencies and tend to lack robustness.
no code implementations • 14 Nov 2022 • Xiaopei Wu, Yang Zhao, Liang Peng, Hua Chen, Xiaoshui Huang, Binbin Lin, Haifeng Liu, Deng Cai, Wanli Ouyang
When training a teacher-student semi-supervised framework, we randomly select ground-truth (gt) samples and pseudo samples for both labeled and unlabeled frames, providing strong data augmentation for them.
1 code implementation • 11 Oct 2022 • Jingru Tan, Bo Li, Xin Lu, Yongqiang Yao, Fengwei Yu, Tong He, Wanli Ouyang
Long-tail distributions are widespread in real-world applications.
1 code implementation • 9 Oct 2022 • Peng Ye, Shengji Tang, Baopu Li, Tao Chen, Wanli Ouyang
In this work, we aim to re-investigate the training process of residual networks from a novel social psychology perspective of loafing, and further propose a new training strategy to strengthen the performance of residual networks.
1 code implementation • ICCV 2023 • Tianyu Huang, Bowen Dong, Yunhan Yang, Xiaoshui Huang, Rynson W. H. Lau, Wanli Ouyang, WangMeng Zuo
To address this issue, we propose CLIP2Point, an image-depth pre-training method by contrastive learning to transfer CLIP to the 3D domain, and adapt it to point cloud classification.
Ranked #3 on Training-free 3D Point Cloud Classification on ScanObjectNN (using extra training data)
1 code implementation • 23 Sep 2022 • Weitao Feng, Lei Bai, Yongqiang Yao, Fengwei Yu, Wanli Ouyang
In this paper, we propose a Frame Rate Agnostic MOT framework with a Periodic training Scheme (FAPS) to tackle the FraMOT problem for the first time.
1 code implementation • 23 Aug 2022 • Lumin Xu, Sheng Jin, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang
We propose a single-network approach, termed ZoomNet, to take into account the hierarchical structure of the full human body and solve the scale variation of different body parts.
Ranked #5 on 2D Human Pose Estimation on COCO-WholeBody
no code implementations • 15 Aug 2022 • Xinzhu Ma, Yuan Meng, Yinmin Zhang, Lei Bai, Jun Hou, Shuai Yi, Wanli Ouyang
We hope this work can provide insights for the image-based 3D detection community under a semi-supervised setting.
no code implementations • 29 Jul 2022 • Shijie Wang, Jianlong Chang, Zhihui Wang, Haojie Li, Wanli Ouyang, Qi Tian
In this paper, we develop Fine-grained Retrieval Prompt Tuning (FRPT), which steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompting and feature adaptation.
1 code implementation • 22 Jul 2022 • Hao Meng, Sheng Jin, Wentao Liu, Chen Qian, Mengxiang Lin, Wanli Ouyang, Ping Luo
Unlike most previous works that directly predict the 3D poses of two interacting hands simultaneously, we propose to decompose the challenging interacting hand pose estimation task and estimate the pose of each hand separately.