1 code implementation • 2 Mar 2025 • Mingcong Lei, Ge Wang, Yiming Zhao, Zhixin Mai, Qing Zhao, Yao Guo, Zhen Li, Shuguang Cui, Yatong Han, Jinke Ren
To address these limitations in dynamic environments, we propose Closed-Loop Embodied Agent (CLEA) -- a novel architecture incorporating four specialized open-source LLMs with functional decoupling for closed-loop task management.
no code implementations • 25 Feb 2025 • Yifan Pu, Yiming Zhao, Zhicong Tang, Ruihong Yin, Haoxing Ye, Yuhui Yuan, Dong Chen, Jianmin Bao, Sirui Zhang, Yanbin Wang, Lin Liang, Lijuan Wang, Ji Li, Xiu Li, Zhouhui Lian, Gao Huang, Baining Guo
In this paper, we introduce the Anonymous Region Transformer (ART), which facilitates the direct generation of variable multi-layer transparent images based on a global text prompt and an anonymous region layout.
no code implementations • 14 Feb 2025 • Mingcong Lei, Yiming Zhao, Ge Wang, Zhixin Mai, Shuguang Cui, Yatong Han, Jinke Ren
A key objective of embodied intelligence is enabling agents to perform long-horizon tasks in dynamic environments while maintaining robust decision-making and adaptability.
no code implementations • 29 Sep 2024 • Yiming Zhao, Dewen Guo, Zhouhui Lian, Yue Gao, Jianhong Han, Jie Feng, Guoping Wang, Bingfeng Zhou, Sheng Li
To bridge the gap between artists and non-specialists, we present a unified framework, Neural-Polyptych, to facilitate the creation of expansive, high-resolution paintings by seamlessly incorporating interactive hand-drawn sketches with fragments from original paintings.
no code implementations • 28 Sep 2024 • Jiwei Tang, Jin Xu, Tingwei Lu, Zhicheng Zhang, Yiming Zhao, Lin Hai, Hai-Tao Zheng
Large language models (LLMs) demonstrate exceptional capabilities in various scenarios.
no code implementations • 3 Sep 2024 • Yiming Zhao, Taein Kwon, Paul Streli, Marc Pollefeys, Christian Holz
However, estimating these interactions from an egocentric camera perspective is challenging, largely due to the lack of comprehensive datasets that provide both accurate hand poses on contacting surfaces and detailed annotations of pressure information.
1 code implementation • 21 Aug 2024 • Guo Pu, Yiming Zhao, Zhouhui Lian
The key idea is to initially construct a preliminary mesh from the input panorama, and iteratively refine this mesh using a panoramic RGBD inpainter while collecting photo-realistic 3D-consistent pseudo novel views.
no code implementations • 14 Jun 2024 • Zeyu Liu, Weicong Liang, Yiming Zhao, Bohan Chen, Lin Liang, Lijuan Wang, Ji Li, Yuhui Yuan
With the combination of these techniques, we deliver a powerful customized multilingual text encoder, Glyph-ByT5-v2, and a strong aesthetic graphic generation model, Glyph-SDXL-v2, that can support accurate spelling in 10 different languages.
1 code implementation • 8 Dec 2023 • Yiming Zhao, Zhouhui Lian
Text-to-Image (T2I) generation methods based on diffusion models have garnered significant attention in the last few years.
no code implementations • 1 Dec 2023 • Yiming Zhao, Tao Zhou, Yunqi Gu, Yi Zhou, Yizhe Zhang, Ye Wu, Huazhu Fu
Specifically, we first propose a Cross-level Enhancement and Aggregation Network (CEA-Net) for weakly-supervised polyp segmentation.
no code implementations • ICCV 2023 • Yiming Zhao, Denys Rozumnyi, Jie Song, Otmar Hilliges, Marc Pollefeys, Martin R. Oswald
The key idea is to tackle the inverse problem of image deblurring by modeling the forward problem with a 3D human model, a texture map, and a sequence of poses to describe human motion.
no code implementations • 9 Jan 2023 • Huanyu Bian, Zhilong Jia, Menghan Dou, Yuan Fang, Lei LI, Yiming Zhao, Hanchao Wang, Zhaohui Zhou, Wei Wang, Wenyu Zhu, Ye Li, Yang Yang, Weiming Zhang, Nenghai Yu, Zhaoyun Chen, Guoping Guo
Therefore, based on VQNet 1.0, we further propose VQNet 2.0, a new generation of unified classical and quantum machine learning framework that supports hybrid optimization.
no code implementations • 12 Jul 2022 • Lin Bai, Yiming Zhao, Xinming Huang
In this system, an FPGA-based deep learning accelerator core (DPU) is placed next to the LiDAR sensor to perform point cloud pre-processing and run the segmentation neural network.
no code implementations • 28 May 2022 • Xinyu Zou, Zhi Hu, Yiming Zhao, Xuchu Ding, Zhongyi Liu, Chenliang Li, Aixin Sun
At each multi-scenario/multi-task layer, a novel expert selection algorithm is proposed to automatically identify scenario-/task-specific and shared experts for each input.
1 code implementation • 16 Sep 2021 • Yiming Zhao, Xiao Zhang, Xinming Huang
The proposed algorithm is implemented in C++ and wrapped as a Python function.
1 code implementation • 8 Sep 2021 • Yiming Zhao, Lin Bai, Xinming Huang
In this paper, we propose a new projection-based LiDAR semantic segmentation pipeline that consists of a novel network structure and an efficient post-processing step.
LIDAR Semantic Segmentation
Robust 3D Semantic Segmentation
1 code implementation • 21 Aug 2021 • Yiming Zhao, Xiao Zhang, Xinming Huang
To the best of our knowledge, we are the first to attempt point cloud panoptic segmentation with clustering algorithms.
no code implementations • 4 May 2021 • Lin Bai, Yiming Zhao, Xinming Huang
Light Detection And Ranging (LiDAR) has been widely used in autonomous vehicles for perception and localization.
1 code implementation • CVPR 2021 • Yiming Zhao, Xinming Huang, Ziming Zhang
With those properties, directly updating the Lucas-Kanade algorithm on our feature maps will precisely align image pairs with large appearance changes.
1 code implementation • 17 Apr 2021 • Yiming Zhao, Lin Bai, Ziming Zhang, Xinming Huang
Therefore, it is assumed that those pixels share the same surface with the nearest LiDAR point, and their respective depths can be estimated as the nearest LiDAR depth value plus a residual error.
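The estimate described in this excerpt, nearest LiDAR depth plus a residual, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `(row, col, depth)` point layout, the use of Euclidean pixel distance, and the residual value are all assumptions made for illustration. In the paper the residual would presumably come from a learned model; here it is passed in as a plain number.

```python
import math

def nearest_lidar_depth(pixel, lidar_points):
    """Depth of the LiDAR point closest to `pixel` in image coordinates.

    `lidar_points` is a list of (row, col, depth) tuples (hypothetical layout).
    """
    row, col = pixel
    nearest = min(lidar_points, key=lambda p: math.hypot(p[0] - row, p[1] - col))
    return nearest[2]

def estimate_depth(pixel, lidar_points, residual):
    """Estimated depth = nearest LiDAR depth + residual error."""
    return nearest_lidar_depth(pixel, lidar_points) + residual

# Two sparse LiDAR returns projected into the image (made-up values).
lidar_points = [(0, 0, 10.0), (4, 4, 20.0)]

# Pixel (1, 1) is closest to the point at (0, 0), so it starts from depth 10.0.
depth = estimate_depth((1, 1), lidar_points, residual=0.25)
```

A brute-force search over all points is used only to keep the sketch short; a spatial index would be the usual choice for real point clouds.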
1 code implementation • 5 Jul 2020 • Lin Bai, Yiming Zhao, Mahdi Elhousni, Xinming Huang
In this paper, a light-weight network is proposed for the task of LiDAR point cloud depth completion.
no code implementations • 3 Sep 2018 • Lin Bai, Yiming Zhao, Xinming Huang
The state-of-the-art CNNs, such as MobileNetV2 and Xception, adopt depthwise separable convolution to replace the standard convolution for embedded platforms.
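To see why depthwise separable convolution suits embedded platforms, the standard weight-count comparison can be worked through as a small sketch. The formulas are the usual textbook ones with bias terms ignored, and the layer sizes below are hypothetical, not taken from the paper.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels

def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 3 x 3 kernel, 32 input channels, 64 output channels.
standard = standard_conv_params(3, 32, 64)
separable = depthwise_separable_params(3, 32, 64)
ratio = separable / standard  # equals 1/out_channels + 1/kernel_size**2
```

For this layer the separable form needs 2,336 weights against 18,432 for the standard convolution, roughly an eighth as many.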