no code implementations • 23 Dec 2016 • Rui Chen, Huizhu Jia, Xiaodong Xie, Wen Gao
In this letter, we propose a novel image denoising method based on correlation preserving sparse coding.
no code implementations • 23 Dec 2016 • Rui Chen, Huizhu Jia, Xiaodong Xie, Wen Gao
Aerial images are often degraded by space-varying motion blur and simultaneous uneven illumination.
no code implementations • 4 Apr 2017 • Rui Chen, Huizhu Jia, Xiaodong Xie, Wen Gao
The multiscale dictionary is considered as the product of an oscillating dictionary and a tolerance dictionary.
no code implementations • 17 May 2017 • Rui Chen, Huizhu Jia, Xiange Wen, Xiaodong Xie
Color artifacts of demosaicked images are often found at contours due to interpolation across edges and cross-channel aliasing.
no code implementations • 9 Dec 2017 • Rui Chen, Changshui Yang, Huizhu Jia, Xiaodong Xie
In this letter, we address the problem of estimating the Gaussian noise level from the trained dictionaries in the update stage.
no code implementations • 12 Apr 2018 • Cong Ma, Changshui Yang, Fan Yang, Yueqing Zhuang, Ziwei Zhang, Huizhu Jia, Xiaodong Xie
In this paper, we propose a novel tracklet processing method that uses a Siamese Bi-Gated Recurrent Unit (GRU) to cleave and re-connect tracklets under crowding or long-term occlusion.
Ranked #20 on Multi-Object Tracking on MOT16
no code implementations • 13 Oct 2018 • Fan Yang, Ke Yan, Shijian Lu, Huizhu Jia, Xiaodong Xie, Wen Gao
Person re-identification (ReID) is a challenging task due to arbitrary human pose variations, background clutter, etc.
no code implementations • 11 Jun 2019 • Yuanchao Bai, Huizhu Jia, Ming Jiang, Xian-Ming Liu, Xiaodong Xie, Wen Gao
Blind image deblurring is a challenging problem in computer vision, which aims to restore both the blur kernel and the latent sharp image from only a blurry observation.
3 code implementations • 18 Nov 2019 • Xu Qin, Zhilin Wang, Yuanchao Bai, Xiaodong Xie, Huizhu Jia
The FFA-Net architecture consists of three key components: 1) A novel Feature Attention (FA) module combines the Channel Attention and Pixel Attention mechanisms, considering that different channel-wise features contain totally different weighted information and that the haze distribution is uneven across image pixels.
Ranked #1 on Image Dehazing on KITTI
1 code implementation • CVPR 2020 • Ziwei Zhang, Chi Su, Liang Zheng, Xiaodong Xie
Compared with the existing practice of feature concatenation, we find that uncovering the correlation among the three factors is a superior way of leveraging the pivotal contextual cues provided by edges and poses.
1 code implementation • IEEE Transactions on Medical Imaging 2020 • Meng Li, William Hsu, Xiaodong Xie, Jason Cong, Wen Gao
We combine these two methods and demonstrate their effectiveness on both CNN-based neural networks and WGAN-based neural networks with comprehensive experiments.
no code implementations • 8 Oct 2021 • Mengxi Guo, Dangqing Huang, Xiaodong Xie
This paper applies the Transformer model and a conditional variational autoencoder (CVAE) to the graphic design layout generation task.
no code implementations • 22 Jan 2022 • Yi Hou, Chengyang Li, Fan Yang, Cong Ma, Liping Zhu, Yuan Li, Huizhu Jia, Xiaodong Xie
Our method can integrate the pedestrian's head and body information to enhance the feature expression ability of the density map.
no code implementations • 22 Jan 2022 • Yi Hou, Chengyang Li, Yuheng Lu, Liping Zhu, Yuan Li, Huizhu Jia, Xiaodong Xie
In this article, we propose a simulated crowd counting dataset CrowdX, which has a large scale, accurate labeling, parameterized realization, and high fidelity.
no code implementations • 5 Jul 2022 • Yuheng Lu, Chenfeng Xu, Xiaobao Wei, Xiaodong Xie, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang
Current point-cloud detection methods have difficulty detecting open-vocabulary objects in the real world, due to their limited generalization capability.
no code implementations • CVPR 2023 • Zhaozhi Wang, Kefan Su, Jian Zhang, Huizhu Jia, Qixiang Ye, Xiaodong Xie, Zongqing Lu
In this paper, we propose multi-agent automated machine learning (MA2ML), which aims to effectively handle the joint optimization of modules in automated machine learning (AutoML).
1 code implementation • 24 Dec 2022 • Rui Ma, Mengxi Guo, Yi Hou, Fan Yang, Yuan Li, Huizhu Jia, Xiaodong Xie
The CIN is composed of an invertible part to achieve high imperceptibility and a non-invertible part to strengthen robustness against strong noise attacks.
1 code implementation • CVPR 2023 • Yuheng Lu, Chenfeng Xu, Xiaobao Wei, Xiaodong Xie, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang
In this paper, we address open-vocabulary 3D point-cloud detection with a divide-and-conquer strategy, which involves: 1) developing a point-cloud detector that can learn a general representation for localizing various objects, and 2) connecting textual and point-cloud representations to enable the detector to classify novel object categories based on text prompting.
no code implementations • 1 Jul 2023 • Peidong Jia, Jiaming Liu, Senqiao Yang, Jiarui Wu, Xiaodong Xie, Shanghang Zhang
PDM comprehensively leverages the prompt memory to extract domain-specific knowledge and explicitly constructs a long-term memory space for the data distribution, which represents better domain diversity compared to existing methods.
no code implementations • 28 Nov 2023 • Peidong Jia, Chenxuan Li, Yuhui Yuan, Zeyu Liu, Yichao Shen, Bohan Chen, Xingru Chen, Yinglin Zheng, Dong Chen, Ji Li, Xiaodong Xie, Shanghang Zhang, Baining Guo
Our COLE system comprises multiple fine-tuned Large Language Models (LLMs), Large Multimodal Models (LMMs), and Diffusion Models (DMs), each specifically tailored for design-aware layer-wise captioning, layout planning, reasoning, and the task of generating images and text.
no code implementations • 22 Dec 2023 • Dongmei Zhang, Chang Li, Ray Zhang, Shenghao Xie, Wei Xue, Xiaodong Xie, Shanghang Zhang
In this work, we propose FM-OV3D, a method of Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection, which improves the open-vocabulary localization and recognition abilities of a 3D model by blending knowledge from multiple pre-trained foundation models, achieving true open-vocabulary detection without being constrained by the original 3D datasets.
no code implementations • 4 Jan 2024 • Rui Ma, Qiang Zhou, Bangjun Xiao, Yizhu Jin, Daquan Zhou, Xiuyu Li, Aishani Singh, Yi Qu, Kurt Keutzer, Xiaodong Xie, Jingtong Hu, Zhen Dong, Shanghang Zhang
Copyright is a legal right that grants creators the exclusive authority to reproduce, distribute, and profit from their creative works.