no code implementations • 24 Oct 2024 • Mengfei Xia, Nan Xue, Yujun Shen, Ran Yi, Tieliang Gong, Yong-Jin Liu
Classifier-Free Guidance (CFG), which combines the conditional and unconditional score functions with two coefficients summing to one, serves as a practical technique for diffusion model sampling.
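As a minimal sketch of the combination described above, assuming the commonly used parameterization with a single guidance weight `w`: the guided score is `(1 + w) * conditional - w * unconditional`, so the two coefficients sum to one. The function name `cfg_score` and the toy score vectors are illustrative assumptions, not taken from the paper.

```python
def cfg_score(cond, uncond, w):
    """Combine conditional and unconditional scores.

    Assumed parameterization: coefficients (1 + w) and -w, which sum to one.
    With w = 0 the result is just the conditional score.
    """
    return [(1.0 + w) * c - w * u for c, u in zip(cond, uncond)]


# Toy score vectors (hypothetical values, for illustration only).
cond = [1.0, 2.0]
uncond = [0.5, 1.0]
guided = cfg_score(cond, uncond, w=2.0)
```

Larger values of `w` push the guided score further from the unconditional one along the direction `cond - uncond`.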
no code implementations • 16 Sep 2024 • Zi-Ming Wang, Nan Xue, Ling Lei, Rebecka Jörnsten, Gui-Song Xia
This paper studies the problem of distribution matching (DM), which is a fundamental machine learning problem seeking to robustly align two probability distributions.
no code implementations • 30 May 2024 • Jianghao Shen, Nan Xue, Tianfu Wu
Child 3D Gaussians are learned via a lightweight Multi-Layer Perceptron (MLP) which takes as input the projected image features of a parent 3D Gaussian and the embedding of a target camera view.
no code implementations • CVPR 2024 • Han Feng, Wenchao Ma, Quankai Gao, Xianwei Zheng, Nan Xue, Huijuan Xu
This task is challenging due to the limited input from Head Mounted Devices, which capture only sparse observations from the head and hands.
no code implementations • CVPR 2024 • Xianpeng Liu, Ce Zheng, Ming Qian, Nan Xue, Chen Chen, Zhebin Zhang, Chen Li, Tianfu Wu
We present Multi-View Attentive Contextualization (MvACon), a simple yet effective method for improving 2D-to-3D feature lifting in query-based multi-view 3D (MV3D) object detection.
no code implementations • 6 May 2024 • Nan Xue, Yaping Sun, Zhiyong Chen, Meixia Tao, Xiaodong Xu, Liang Qian, Shuguang Cui, Ping Zhang
In this paper, we propose a wireless distributed LLMs paradigm based on Mixture of Experts (MoE), named WDMoE, deploying LLMs collaboratively across edge servers of base station (BS) and mobile devices in the wireless communications system.
no code implementations • 26 Apr 2024 • Shangzhan Zhang, Sida Peng, Tao Xu, Yuanbo Yang, Tianrun Chen, Nan Xue, Yujun Shen, Hujun Bao, Ruizhen Hu, Xiaowei Zhou
Instead of relying on extensive paired data, i.e., 3D meshes with material graphs and corresponding text descriptions, to train a material graph generative model, we propose to leverage the pre-trained 2D diffusion model as a bridge to connect the text and material graphs.
no code implementations • 17 Apr 2024 • Zhiheng Liu, Hao Ouyang, Qiuyu Wang, Ka Leong Cheng, Jie Xiao, Kai Zhu, Nan Xue, Yu Liu, Yujun Shen, Yang Cao
3D Gaussians have recently emerged as an efficient representation for novel view synthesis.
1 code implementation • CVPR 2024 • Yuxi Xiao, Qianqian Wang, Shangzhan Zhang, Nan Xue, Sida Peng, Yujun Shen, Xiaowei Zhou
Recovering dense and long-range pixel motion in videos is a challenging problem.
no code implementations • 18 Mar 2024 • Yuting Xiao, Xuan Wang, Jiafei Li, Hongrui Cai, Yanbo Fan, Nan Xue, Minghui Yang, Yujun Shen, Shenghua Gao
To this end, we propose a novel approach, GauMesh, to bridge the 3D Gaussian and Mesh for modeling and rendering the dynamic scenes.
1 code implementation • NeurIPS 2023 • Lunhao Duan, Shanshan Zhao, Nan Xue, Mingming Gong, Gui-Song Xia, Dacheng Tao
Transformers have been recently explored for 3D point cloud understanding with impressive progress achieved.
Ranked #5 on Semantic Segmentation on S3DIS Area5
no code implementations • 28 Nov 2023 • Jiepan Li, Fangxiao Lu, Nan Xue, Zhuohong Li, Hongyan Zhang, Wei He
In this paper, we propose an overlapped window cross-level attention (OWinCA) to achieve the low-level feature enhancement guided by the highest-level features.
no code implementations • 6 Sep 2023 • Jiakun Xu, Bowen Xu, Gui-Song Xia, Liang Dong, Nan Xue
In our experiments, we demonstrate how an effective representation of a road graph significantly enhances the performance of vector road mapping on established benchmarks, without requiring extensive modifications to the neural network architecture.
1 code implementation • CVPR 2024 • Nan Xue, Bin Tan, Yuxi Xiao, Liang Dong, Gui-Song Xia, Tianfu Wu, Yujun Shen
Instead of leveraging matching-based solutions from 2D wireframes (or line segments) for 3D wireframe reconstruction as done in prior art, we present NEAT, a rendering-distilling formulation using neural fields to represent 3D line segments with 2D observations, and bipartite matching for perceiving and distilling a sparse set of 3D global junctions.
1 code implementation • 20 Jun 2023 • Yuxin Jin, Ming Qian, Jincheng Xiong, Nan Xue, Gui-Song Xia
Our method proposes a depth feature distillation strategy to obtain depth knowledge from a pre-trained monocular depth estimation model and uses a DOF-edge loss to understand the relationship between DOF and depth.
Ranked #1 on Defocus Blur Detection on EBD
1 code implementation • CVPR 2023 • Jian Ding, Nan Xue, Gui-Song Xia, Bernt Schiele, Dengxin Dai
This work studies semantic segmentation under the domain generalization setting, where a model is trained only on the source domain and tested on the unseen target domain.
no code implementations • ICCV 2023 • Xianpeng Liu, Ce Zheng, Kelvin Cheng, Nan Xue, Guo-Jun Qi, Tianfu Wu
Motivated by a new and strong observation that this challenge can be remedied by a 3D-space local-grid search scheme in an ideal case, we propose a stage-wise approach, which combines the information flow from 2D-to-3D (3D bounding box proposal generation with a single 2D image) and 3D-to-2D (proposal verification by denoising with 3D-to-2D contexts) in a top-down manner.
1 code implementation • ICCV 2023 • Ming Qian, Jincheng Xiong, Gui-Song Xia, Nan Xue
This paper aims to develop an accurate 3D geometry representation of satellite images using satellite-ground image pairs.
Ranked #1 on Cross-View Image-to-Image Translation on CVACT
1 code implementation • 30 Nov 2022 • Bin Tan, Nan Xue, Tianfu Wu, Gui-Song Xia
This paper studies the challenging two-view 3D reconstruction in a rigorous sparse-view configuration, which suffers from insufficient correspondences in the input image pairs for camera pose estimation.
1 code implementation • CVPR 2023 • Yuxi Xiao, Nan Xue, Tianfu Wu, Gui-Song Xia
This paper presents a neural incremental Structure-from-Motion (SfM) approach, Level-S$^2$fM, which estimates the camera poses and scene geometry from a set of uncalibrated images by learning coordinate MLPs for the implicit surfaces and the radiance fields from the established keypoint correspondences.
1 code implementation • 24 Oct 2022 • Nan Xue, Tianfu Wu, Song Bai, Fu-Dong Wang, Gui-Song Xia, Liangpei Zhang, Philip H. S. Torr
This article presents Holistically-Attracted Wireframe Parsing (HAWP), a method for geometric analysis of 2D images containing wireframes formed by line segments and junctions.
1 code implementation • 15 Aug 2022 • Wenchao Ma, Bin Tan, Nan Xue, Tianfu Wu, Xianwei Zheng, Gui-Song Xia
This paper studies the problem of holistic 3D wireframe perception (HoW-3D), a new task of perceiving both the visible 3D wireframes and the invisible ones from single-view 2D images.
1 code implementation • 1 Aug 2022 • Bowen Xu, Jiakun Xu, Nan Xue, Gui-Song Xia
We addressed such an issue by exploiting the hierarchical supervision (of bottom-level vertices, mid-level line segments and the high-level regional masks) and proposed a novel interaction mechanism of feature embedding sourced from different levels of supervision signals to obtain reversible building masks for polygonal mapping of buildings.
no code implementations • CVPR 2022 • Xiangwei Jiang, Rujiao Long, Nan Xue, Zhibo Yang, Cong Yao, Gui-Song Xia
This paper addresses the problem of document image dewarping, which aims at eliminating the geometric distortion in document images for document digitization.
Ranked #3 on Local Distortion on DocUNet
no code implementations • ICLR 2022 • Zi-Ming Wang, Nan Xue, Ling Lei, Gui-Song Xia
To handle large point sets, we propose a scalable PDM algorithm by utilizing the efficient partial Wasserstein-1 (PW) discrepancy.
1 code implementation • CVPR 2022 • Jian Ding, Nan Xue, Gui-Song Xia, Dengxin Dai
2) a zero-shot classification task on segments.
2 code implementations • 9 Dec 2021 • Xianpeng Liu, Nan Xue, Tianfu Wu
It presents the MonoCon method which learns Monocular Contexts, as auxiliary tasks in training, to help monocular 3D object detection.
Ranked #5 on Monocular 3D Object Detection on KITTI Cars Moderate
1 code implementation • CVPR 2022 • Nan Xue, Tianfu Wu, Gui-Song Xia, Liangpei Zhang
This paper studies the problem of multi-person pose estimation in a bottom-up fashion.
2 code implementations • ICCV 2021 • Rujiao Long, Wen Wang, Nan Xue, Feiyu Gao, Zhibo Yang, Yongpan Wang, Gui-Song Xia
In contrast to existing studies that mainly focus on parsing well-aligned tabular images with simple layouts from scanned PDF documents, we aim to establish a practical table structure parsing system for real-world scenarios where tabular input images are taken or scanned with severe deformation, bending or occlusions.
1 code implementation • 30 Aug 2021 • Gui-Song Xia, Jian Ding, Ming Qian, Nan Xue, Jiaming Han, Xiang Bai, Michael Ying Yang, Shengyang Li, Serge Belongie, Jiebo Luo, Mihai Datcu, Marcello Pelillo, Liangpei Zhang, Qiang Zhou, Chao-hui Yu, Kaixuan Hu, Yingjia Bu, Wenming Tan, Zhe Yang, Wei Li, Shang Liu, Jiaxuan Zhao, Tianzhi Ma, Zi-han Gao, Lingqi Wang, Yi Zuo, Licheng Jiao, Chang Meng, Hao Wang, Jiahao Wang, Yiming Hui, Zhuojun Dong, Jie Zhang, Qianyue Bao, Zixiao Zhang, Fang Liu
This report summarizes the results of the Learning to Understand Aerial Images (LUAI) 2021 challenge held at ICCV 2021, which focuses on object detection and semantic segmentation in aerial images.
no code implementations • ICCV 2021 • Bin Tan, Nan Xue, Song Bai, Tianfu Wu, Gui-Song Xia
This paper presents a neural network built upon Transformers, namely PlaneTR, to simultaneously detect and reconstruct planes from a single image.
4 code implementations • CVPR 2021 • Jiaming Han, Jian Ding, Nan Xue, Gui-Song Xia
More precisely, we incorporate rotation-equivariant networks into the detector to extract rotation-equivariant features, which can accurately predict the orientation and lead to a huge reduction of model size.
Ranked #19 on Object Detection In Aerial Images on DOTA (using extra training data)
1 code implementation • CVPR 2021 • Quankai Gao, Fudong Wang, Nan Xue, Jin-Gang Yu, Gui-Song Xia
Recently, deep learning based methods have demonstrated promising results on the graph matching problem, by relying on the descriptive capability of deep features extracted on graph nodes.
Ranked #8 on Graph Matching on Willow Object Class
2 code implementations • 24 Feb 2021 • Jian Ding, Nan Xue, Gui-Song Xia, Xiang Bai, Wen Yang, Michael Ying Yang, Serge Belongie, Jiebo Luo, Mihai Datcu, Marcello Pelillo, Liangpei Zhang
In this paper, we present a large-scale Dataset of Object deTection in Aerial images (DOTA) and comprehensive baselines for ODAI.
1 code implementation • 19 Nov 2020 • Linxi Huan, Nan Xue, Xianwei Zheng, Wei He, Jianya Gong, Gui-Song Xia
This paper presents a context-aware tracing strategy (CATS) for crisp edge detection with deep edge detectors, based on an observation that the localization ambiguity of deep edge detectors is mainly caused by the mixing phenomenon of convolutional neural networks: feature mixing in edge classification and side mixing during fusing side predictions.
Ranked #2 on Edge Detection on MDBD
1 code implementation • CVPR 2020 • Fu-Dong Wang, Nan Xue, Jin-Gang Yu, Gui-Song Xia
Graph matching (GM), as a longstanding problem in computer vision and pattern recognition, still suffers from numerous cluttered outliers in practical applications.
no code implementations • 25 Mar 2020 • Zhu-Cun Xue, Nan Xue, Gui-Song Xia
This paper presents a novel line-aware rectification network (LaRecNet) to address the problem of fisheye distortion rectification based on the classical observation that straight lines in 3D space should remain straight in image planes.
1 code implementation • CVPR 2020 • Nan Xue, Tianfu Wu, Song Bai, Fu-Dong Wang, Gui-Song Xia, Liangpei Zhang, Philip H. S. Torr
For computing line segment proposals, a novel exact dual representation is proposed which exploits a parsimonious geometric reparameterization for line segments and forms a holistic 4-dimensional attraction field map for an input image.
Ranked #4 on Line Segment Detection on York Urban Dataset
no code implementations • 18 Dec 2019 • Nan Xue, Song Bai, Fu-Dong Wang, Gui-Song Xia, Tianfu Wu, Liangpei Zhang, Philip H. S. Torr
Given a line segment map, the proposed regional attraction first establishes the relationship between line segments and regions in the image lattice.
no code implementations • CVPR 2019 • Zhu-Cun Xue, Nan Xue, Gui-Song Xia, Weiming Shen
This paper presents a new deep-learning based method to simultaneously calibrate the intrinsic parameters of a fisheye lens and rectify the distorted images.
1 code implementation • 16 Jan 2019 • Fu-Dong Wang, Gui-Song Xia, Nan Xue, Yi-Peng Zhang, Marcello Pelillo
In this paper, we present a functional representation for graph matching (FRGM) that aims to provide more geometric insights on the problem and reduce the space and time complexities of corresponding algorithms.
1 code implementation • CVPR 2019 • Nan Xue, Song Bai, Fu-Dong Wang, Gui-Song Xia, Tianfu Wu, Liangpei Zhang
In experiments, our method is tested on the WireFrame dataset and the YorkUrban dataset with state-of-the-art performance obtained.
1 code implementation • 1 Dec 2018 • Jian Ding, Nan Xue, Yang Long, Gui-Song Xia, Qikai Lu
Especially when detecting densely packed objects in aerial images, methods relying on horizontal proposals for common object detection often introduce mismatches between the Regions of Interest (RoIs) and objects.
Ranked #50 on Object Detection In Aerial Images on DOTA (using extra training data)
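The mismatch between horizontal RoIs and oriented objects mentioned above can be illustrated with plain geometry. The following is a hedged sketch, not the paper's method: it computes how much of the tightest axis-aligned (horizontal) box is actually covered by a rotated rectangle. The function name `horizontal_box_fill_ratio` and the example dimensions are assumptions made for illustration.

```python
import math


def horizontal_box_fill_ratio(width, height, angle_deg):
    """Fraction of the tightest horizontal box covered by a rotated rectangle.

    A width x height rectangle rotated by angle_deg fits in an axis-aligned
    box of size (w|cos| + h|sin|) x (w|sin| + h|cos|).
    """
    theta = math.radians(angle_deg)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    box_w = width * c + height * s
    box_h = width * s + height * c
    return (width * height) / (box_w * box_h)


# A long, thin object (hypothetical 10 x 1 units), axis-aligned and at 45 degrees.
aligned = horizontal_box_fill_ratio(10.0, 1.0, 0.0)
tilted = horizontal_box_fill_ratio(10.0, 1.0, 45.0)
```

For the tilted case only about a sixth of the horizontal box belongs to the object, so in densely packed scenes most of the box would cover background or neighbouring objects.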
no code implementations • 7 Nov 2018 • Gui-Song Xia, Jin Huang, Nan Xue, Qikai Lu, Xiaoxiang Zhu
More precisely, given an image, the geometric saliency is derived from a mid-level geometric representation based on meaningful junctions that can locally describe geometrical structures of images.
no code implementations • ECCV 2018 • Fu-Dong Wang, Nan Xue, Yi-Peng Zhang, Xiang Bai, Gui-Song Xia
Due to an efficient Frank-Wolfe method-based optimization strategy, we can handle graphs with hundreds and thousands of nodes within an acceptable amount of time.
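For readers unfamiliar with the Frank-Wolfe method, here is a generic, minimal sketch of the iteration on a toy problem. It is not the paper's graph matching formulation: it minimizes a simple quadratic over the probability simplex, and the names `frank_wolfe_simplex`, `target`, and `grad` are illustrative assumptions. Each step solves a linear subproblem (on the simplex, this just picks the vertex with the smallest gradient entry) and moves toward it with the standard step size `2 / (k + 2)`.

```python
def frank_wolfe_simplex(grad, x0, num_iters=2000):
    """Generic Frank-Wolfe iteration over the probability simplex."""
    x = list(x0)
    for k in range(num_iters):
        g = grad(x)
        # Linear minimization oracle: best simplex vertex for the current gradient.
        i = min(range(len(x)), key=lambda j: g[j])
        gamma = 2.0 / (k + 2.0)
        # Convex combination of the current point and the chosen vertex.
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
    return x


# Toy objective (assumed for illustration): 0.5 * ||x - target||^2.
target = [0.2, 0.5, 0.3]
grad = lambda x: [xj - tj for xj, tj in zip(x, target)]
solution = frank_wolfe_simplex(grad, [1.0, 0.0, 0.0])
```

Because every iterate is a convex combination of feasible points, no projection step is needed, which is one reason the method is attractive for large constrained problems.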
no code implementations • 16 Mar 2017 • Nan Xue, Gui-Song Xia, Xiang Bai, Liangpei Zhang, Weiming Shen
This paper presents a novel approach to junction detection and characterization that exploits the locally anisotropic geometries of a junction and estimates the scales of these geometries using an a contrario model.