no code implementations • 8 Mar 2025 • Baining Zhao, Jianjie Fang, Zichao Dai, Ziyou Wang, Jirong Zha, Weichen Zhang, Chen Gao, Yue Wang, Jinqiang Cui, Xinlei Chen, Yong Li
Large multimodal models exhibit remarkable intelligence, yet their embodied cognitive abilities during motion in open-ended urban 3D space remain to be explored.
no code implementations • 18 Feb 2025 • Ruiying Peng, Kaiyuan Li, Weichen Zhang, Chen Gao, Xinlei Chen, Yong Li
Recently, 3D-LLMs, which combine point-cloud encoders with large models, have been proposed to tackle complex tasks in embodied intelligence and scene understanding.
no code implementations • 12 Oct 2024 • Chen Gao, Baining Zhao, Weichen Zhang, Jinzhu Mao, Jun Zhang, Zhiheng Zheng, Fanhang Man, Jianjie Fang, Zile Zhou, Jinqiang Cui, Xinlei Chen, Yong Li
To address it, in this paper, we construct a benchmark platform for embodied intelligence evaluation in real-world city environments.
no code implementations • 19 Jun 2024 • Qinchen Wu, Difei Gao, Kevin Qinghong Lin, Zhuoyu Wu, Xiangwu Guo, Peiran Li, Weichen Zhang, Hengxu Wang, Mike Zheng Shou
The advent of Multimodal LLMs has significantly enhanced image OCR capabilities, making GUI automation a viable reality for increasing efficiency in digital tasks.
no code implementations • CVPR 2024 • Difei Gao, Lei Ji, Zechen Bai, Mingyu Ouyang, Peiran Li, Dongxing Mao, Qinchen Wu, Weichen Zhang, Peiyi Wang, Xiangwu Guo, Hengxu Wang, Luowei Zhou, Mike Zheng Shou
Graphical User Interface (GUI) automation holds significant promise for assisting users with complex tasks, thereby boosting human productivity.
1 code implementation • 20 Dec 2023 • Difei Gao, Lei Ji, Zechen Bai, Mingyu Ouyang, Peiran Li, Dongxing Mao, Qinchen Wu, Weichen Zhang, Peiyi Wang, Xiangwu Guo, Hengxu Wang, Luowei Zhou, Mike Zheng Shou
Graphical User Interface (GUI) automation holds significant promise for assisting users with complex tasks, thereby boosting human productivity.
no code implementations • 17 Jun 2023 • Weichen Zhang, Xiang Zhou, Yukang Cao, Wensen Feng, Chun Yuan
We build on NeRF and propose a novel framework that, by leveraging parametric 3DMM models, can reconstruct a high-fidelity drivable face avatar and handle unseen expressions.
no code implementations • 25 Jan 2023 • Yunpeng Bai, Zihan Zhong, Chao Dong, Weichen Zhang, Guowei Xu, Chun Yuan
Then, the text input can be mapped directly into the StyleGAN space and used to find the semantic shift according to the text description.
1 code implementation • CVPR 2021 • Weichen Zhang, Wen Li, Dong Xu
In this work, we propose a new cross-dataset 3D object detection method named Scale-aware and Range-aware Domain Adaptation Network (SRDAN).
1 code implementation • CVPR 2018 • Weichen Zhang, Wanli Ouyang, Wen Li, Dong Xu
In this paper, we propose a new unsupervised domain adaptation approach called Collaborative and Adversarial Network (CAN) through domain-collaborative and domain-adversarial training of neural networks.
no code implementations • ICCV 2015 • Sijin Li, Weichen Zhang, Antoni B. Chan
The score function is then the dot-product between the image and pose embeddings.
Ranked #336 on 3D Human Pose Estimation on Human3.6M
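The dot-product score mentioned in the snippet above can be illustrated with a minimal sketch. It assumes the image and pose embeddings are already available as equal-length vectors; the networks that produce them are not shown, and the names `score` and `best_pose` are hypothetical, not taken from the paper. Choosing the highest-scoring candidate pose is likewise an assumption about how such a score might be used.

```python
from typing import Sequence


def score(image_embedding: Sequence[float], pose_embedding: Sequence[float]) -> float:
    """Dot product between an image embedding and a pose embedding."""
    if len(image_embedding) != len(pose_embedding):
        raise ValueError("embeddings must have the same dimension")
    return sum(a * b for a, b in zip(image_embedding, pose_embedding))


def best_pose(
    image_embedding: Sequence[float],
    candidate_pose_embeddings: Sequence[Sequence[float]],
) -> int:
    """Index of the candidate pose whose embedding scores highest (illustrative use only)."""
    scores = [score(image_embedding, p) for p in candidate_pose_embeddings]
    return max(range(len(scores)), key=scores.__getitem__)
```

A higher score here means the pose embedding is better aligned with the image embedding, so ranking candidate poses reduces to comparing dot products.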
no code implementations • CVPR 2014 • Adeel Mumtaz, Weichen Zhang, Antoni B. Chan
We derive an EM algorithm for estimating the parameters of the FBM.