no code implementations • 24 Feb 2025 • HongYu Zhou, Zorah Lähner
With the rising popularity of 3D Gaussian splatting and its expanding range of applications, from rendering to 3D reconstruction, comes a need for geometry processing applications that operate directly on this new representation.
no code implementations • 19 Feb 2025 • Jiahao Gai, Hao Mark Chen, Zhican Wang, HongYu Zhou, Wanru Zhao, Nicholas Lane, Hongxiang Fan
First, the volume of available HDL training data is substantially smaller than that for software programming languages.
1 code implementation • 17 Feb 2025 • Ailin Huang, Boyong Wu, Bruce Wang, Chao Yan, Chen Hu, Chengli Feng, Fei Tian, Feiyu Shen, Jingbei Li, Mingrui Chen, Peng Liu, Ruihang Miao, Wang You, Xi Chen, Xuerui Yang, Yechang Huang, Yuxiang Zhang, Zheng Gong, Zixin Zhang, HongYu Zhou, Jianjian Sun, Brian Li, Chengting Feng, Changyi Wan, Hanpeng Hu, Jianchang Wu, Jiangjie Zhen, Ranchen Ming, Song Yuan, Xuelin Zhang, Yu Zhou, Bingxin Li, Buyun Ma, Hongyuan Wang, Kang An, Wei Ji, Wen Li, Xuan Wen, Xiangwen Kong, Yuankai Ma, Yuanwei Liang, Yun Mou, Bahtiyar Ahmidi, Bin Wang, Bo Li, Changxin Miao, Chen Xu, Chenrun Wang, Dapeng Shi, Deshan Sun, Dingyuan Hu, Dula Sai, Enle Liu, Guanzhe Huang, Gulin Yan, Heng Wang, Haonan Jia, Haoyang Zhang, Jiahao Gong, Junjing Guo, Jiashuai Liu, Jiahong Liu, Jie Feng, Jie Wu, Jiaoren Wu, Jie Yang, Jinguo Wang, Jingyang Zhang, Junzhe Lin, Kaixiang Li, Lei Xia, Li Zhou, Liang Zhao, Longlong Gu, Mei Chen, Menglin Wu, Ming Li, Mingxiao Li, Mingliang Li, Mingyao Liang, Na Wang, Nie Hao, Qiling Wu, Qinyuan Tan, Ran Sun, Shuai Shuai, Shaoliang Pang, Shiliang Yang, Shuli Gao, Shanshan Yuan, SiQi Liu, Shihong Deng, Shilei Jiang, Sitong Liu, Tiancheng Cao, Tianyu Wang, Wenjin Deng, Wuxun Xie, Weipeng Ming, Wenqing He, Wen Sun, Xin Han, Xin Huang, Xiaomin Deng, Xiaojia Liu, Xin Wu, Xu Zhao, Yanan Wei, Yanbo Yu, Yang Cao, Yangguang Li, Yangzhen Ma, Yanming Xu, Yaoyu Wang, Yaqiang Shi, Yilei Wang, Yizhuang Zhou, Yinmin Zhong, Yang Zhang, Yaoben Wei, Yu Luo, Yuanwei Lu, Yuhe Yin, Yuchu Luo, Yuanhao Ding, Yuting Yan, Yaqi Dai, Yuxiang Yang, Zhe Xie, Zheng Ge, Zheng Sun, Zhewei Huang, Zhichao Chang, Zhisheng Guan, Zidong Yang, Zili Zhang, Binxing Jiao, Daxin Jiang, Heung-Yeung Shum, Jiansheng Chen, Jing Li, Shuchang Zhou, Xiangyu Zhang, Xinhao Zhang, Yibo Zhu
Based on our new StepEval-Audio-360 evaluation benchmark, Step-Audio achieves state-of-the-art performance in human evaluations, especially in terms of instruction following.
no code implementations • 10 Feb 2025 • Tansheng Zhu, HongYu Zhou, Ke Jin, Xusheng Xu, Qiufan Yuan, Lijie Ji
Bayesian optimization is highly effective for optimizing expensive-to-evaluate black-box functions, but the high computational complexity of Gaussian processes makes it costly, resulting in a total time complexity that is quartic in the number of iterations.
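The quartic figure can be reproduced with a short count, sketched here under two assumptions that the snippet above does not state: one new observation is added per iteration, and the Gaussian process is refit exactly (for example by a Cholesky factorization of the $t \times t$ kernel matrix) at iteration $t$, at a cost of order $t^3$. The symbols $t$, $T$, and $c$ are introduced only for this illustration.

```latex
\sum_{t=1}^{T} c\,t^{3} \;=\; c\left(\frac{T(T+1)}{2}\right)^{2} \;=\; \Theta\!\left(T^{4}\right)
```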
no code implementations • 2 Dec 2024 • HongYu Zhou, Longzhong Lin, Jiabao Wang, Yichong Lu, Dongfeng Bai, Bingbing Liu, Yue Wang, Andreas Geiger, Yiyi Liao
In the past few decades, autonomous driving algorithms have made significant progress in perception, planning, and control.
no code implementations • 28 Nov 2024 • Yichong Lu, Yichi Cai, Shangzhan Zhang, HongYu Zhou, Haoji Hu, Huimin Yu, Andreas Geiger, Yiyi Liao
In this work, we introduce UrbanCAD, a framework that pushes the frontier of the photorealism-controllability trade-off by generating highly controllable and photorealistic 3D vehicle digital twins from a single urban image and a collection of free 3D CAD models and handcrafted materials.
no code implementations • 13 Nov 2024 • YuTao Shen, HongYu Zhou, Xin Yang, Xuqi Lu, Ziyue Guo, Lixi Jiang, Yong He, Haiyan Cen
The SAM module achieved high segmentation accuracy, with a mean intersection over union (mIoU) of 0.961 and an F1-score of 0.980.
1 code implementation • 20 Jun 2024 • Hao Mark Chen, Liam Castelli, Martin Ferianc, HongYu Zhou, Shuanglong Liu, Wayne Luk, Hongxiang Fan
Reliable uncertainty estimation plays a crucial role in various safety-critical applications such as medical diagnosis and autonomous driving.
no code implementations • 3 Jun 2024 • Jiahao Shao, Yuanbo Yang, HongYu Zhou, Youmin Zhang, Yujun Shen, Vitor Guizilini, Yue Wang, Matteo Poggi, Yiyi Liao
Moreover, we design an effective training strategy to provide context within a clip.
1 code implementation • CVPR 2024 • Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis, Constantin Aronssohn, Nacim Bouia, Stephanie Fu, Romain Loiseau, Van Nguyen Nguyen, Charles Raude, Elliot Vincent, Lintao XU, HongYu Zhou, Loic Landrieu
Determining the location of an image anywhere on Earth is a complex visual task, which makes it particularly relevant for evaluating computer vision algorithms.
Ranked #2 on Photo geolocation estimation on OpenStreetView-5M
no code implementations • CVPR 2024 • HongYu Zhou, Jiahao Shao, Lu Xu, Dongfeng Bai, Weichao Qiu, Bingbing Liu, Yue Wang, Andreas Geiger, Yiyi Liao
Holistic understanding of urban scenes based on RGB images is a challenging yet important problem.
no code implementations • 28 Sep 2023 • HongYu Zhou, Yichen Song, Vasileios Tzoumas
We study how to safely control nonlinear control-affine systems that are corrupted with bounded non-stochastic noise, i.e., noise that is unknown a priori and that is not necessarily governed by a stochastic model.
1 code implementation • 20 Sep 2023 • Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, Jinrong Yang, Liang Zhao, Jianjian Sun, HongYu Zhou, Haoran Wei, Xiangwen Kong, Xiangyu Zhang, Kaisheng Ma, Li Yi
This paper presents DreamLLM, a learning framework that first achieves versatile Multimodal Large Language Models (MLLMs) empowered with frequently overlooked synergy between multimodal comprehension and creation.
Ranked #5 on Visual Question Answering on MMBench
1 code implementation • 23 Aug 2023 • HongYu Zhou, Vasileios Tzoumas
We study the problem of safe control of linear dynamical systems corrupted with non-stochastic noise, and provide an algorithm that guarantees (i) zero constraint violation of convex time-varying constraints, and (ii) bounded dynamic regret, i.e., bounded suboptimality against an optimal clairvoyant controller that knows the future noise a priori.
no code implementations • 18 Jul 2023 • Liang Zhao, En Yu, Zheng Ge, Jinrong Yang, Haoran Wei, HongYu Zhou, Jianjian Sun, Yuang Peng, Runpei Dong, Chunrui Han, Xiangyu Zhang
Based on precise referring instruction, we propose ChatSpot, a unified end-to-end multimodal large language model that supports diverse forms of interactivity including mouse clicks, drag-and-drop, and drawing boxes, which provides a more flexible and seamless interactive experience.
no code implementations • 30 Jun 2023 • Weixin Mao, Jinrong Yang, Zheng Ge, Lin Song, HongYu Zhou, Tiezheng Mao, Zeming Li, Osamu Yoshie
In light of the success of sample mining techniques in 2D object detection, we propose a simple yet effective mining strategy for improving depth perception in 3D object detection.
no code implementations • 10 Mar 2023 • Chunrui Han, Jinrong Yang, Jianjian Sun, Zheng Ge, Runpei Dong, HongYu Zhou, Weixin Mao, Yuang Peng, Xiangyu Zhang
In this paper, we explore an embarrassingly simple long-term recurrent fusion strategy built upon the LSS-based methods and find it already able to enjoy the merits from both sides, i.e., rich long-term information and efficient fusion pipeline.
2 code implementations • 9 Feb 2023 • HongYu Zhou, Xin Zhou, Zhiwei Zeng, Lingzi Zhang, Zhiqi Shen
Recommendation systems have become popular and effective tools to help users discover items of interest by modeling user preferences and item properties based on implicit interactions (e.g., purchasing and clicking).
1 code implementation • 28 Jan 2023 • HongYu Zhou, Xin Zhou, Lingzi Zhang, Zhiqi Shen
On top of the finding, we propose a model that enhances the dyadic relations by learning Dual RepresentAtions of both users and items via constructing homogeneous Graphs for multimOdal recommeNdation.
no code implementations • 2 Jan 2023 • HongYu Zhou, Zirui Xu, Vasileios Tzoumas
In this paper, we enable projection-free online learning within the framework of Online Convex Optimization with Memory (OCO-M) -- OCO-M captures how the history of decisions affects the current outcome by allowing the online learning loss functions to depend on both current and past decisions.
2 code implementations • ICCV 2023 • HongYu Zhou, Zheng Ge, Zeming Li, Xiangyu Zhang
This paper proposes an efficient multi-camera to Bird's-Eye-View (BEV) view transformation method for 3D perception, dubbed MatrixVT.
Ranked #2 on Bird's-Eye View Semantic Segmentation on nuScenes (IoU lane - 224x480 - 100x100 at 0.5 metric)
no code implementations • 26 Sep 2022 • Zirui Xu, HongYu Zhou, Vasileios Tzoumas
We are motivated by the future of autonomy that involves multiple robots coordinating in dynamic, unstructured, and adversarial environments to complete complex tasks such as target tracking, environmental mapping, and area monitoring.
no code implementations • 19 Aug 2022 • HongYu Zhou, Zheng Ge, Weixin Mao, Zeming Li
To address this problem, we revisit the generation of BEV representation and propose detecting objects in perspective BEV -- a new BEV representation that does not require feature sampling.
no code implementations • 18 Aug 2022 • HongYu Zhou, Vasileios Tzoumas
We present safe control of partially-observed linear time-varying systems in the presence of unknown and unpredictable process and measurement noise.
2 code implementations • 13 Jul 2022 • Xin Zhou, HongYu Zhou, Yong liu, Zhiwei Zeng, Chunyan Miao, Pengwei Wang, Yuan You, Feijun Jiang
Besides the user-item interaction graph, existing state-of-the-art methods usually use auxiliary graphs (e.g., user-user or item-item relation graph) to augment the learned representations of users and/or items.
2 code implementations • 6 Jul 2022 • HongYu Zhou, Zheng Ge, Songtao Liu, Weixin Mao, Zeming Li, Haiyan Yu, Jian Sun
To date, the most powerful semi-supervised object detectors (SS-OD) are based on pseudo-boxes, which need a sequence of post-processing with fine-tuned hyper-parameters.
no code implementations • 14 Oct 2021 • Shijie Liu, HongYu Zhou, Xiaozhou Shi, Junwen Pan
In recent years, as the Transformer has performed increasingly well on NLP tasks, many researchers have ported the Transformer structure to vision tasks, bridging the gap between NLP and CV tasks.
no code implementations • 29 Sep 2021 • Rui Zhou, HongYu Zhou, Huidong Gao, Masayoshi Tomizuka, Jiachen Li, Zhuo Xu
Accurate, long-term forecasting of pedestrian trajectories in highly dynamic and interactive scenes is a long-standing challenge.
no code implementations • 19 Jul 2021 • Cong Xie, Shilei Cao, Dong Wei, HongYu Zhou, Kai Ma, Xianli Zhang, Buyue Qian, Liansheng Wang, Yefeng Zheng
Universal lesion detection in computed tomography (CT) images is an important yet challenging task due to the large variations in lesion type, size, shape, and appearance.
no code implementations • 4 Jun 2021 • HongYu Zhou, Zhengru Ren, Mathias Marley, Roger Skjetne
Autonomous marine vessels are expected to avoid inter-vessel collisions and comply with the international regulations for safe voyages.
no code implementations • 12 May 2021 • Hongxiang Fan, Martin Ferianc, Miguel Rodrigues, HongYu Zhou, Xinyu Niu, Wayne Luk
Neural networks (NNs) have demonstrated their potential in a wide range of applications such as image recognition, decision making or recommendation systems.