no code implementations • CCL 2020 • Yuan Hua, Zheng Huang, Jie Guo, Weidong Qiu
Information extraction from documents such as receipts or invoices is a fundamental and crucial step for office automation.
no code implementations • ECCV 2020 • Quewei Li, Jie Guo, Yang Fei, Qinyu Tang, Wenxiu Sun, Jin Zeng, Yanwen Guo
We propose a deep convolutional neural network (CNN) to estimate surface normals from a single color image accompanied by a low-quality depth channel.
no code implementations • 12 Mar 2025 • Peng Chen, Pi Bu, Yingyao Wang, Xinyi Wang, ZiMing Wang, Jie Guo, Yingxiu Zhao, Qi Zhu, Jun Song, Siran Yang, Jiamang Wang, Bo Zheng
Recent advances in Vision-Language-Action models (VLAs) have expanded the capabilities of embodied intelligence.
no code implementations • 3 Jan 2025 • Nisha Huang, Kaer Huang, Yifan Pu, Jiangshan Wang, Jie Guo, Yiqiang Yan, Xiu Li
However, despite their capabilities, direct conditional guidance approaches often face challenges in balancing the expressiveness of textual semantics with the diversity of output results while capturing stylistic features.
no code implementations • 22 Dec 2024 • Ronghui Li, Youliang Zhang, Yachao Zhang, Yuxiang Zhang, Mingyang Su, Jie Guo, Ziwei Liu, Yebin Liu, Xiu Li
Humans perform a variety of interactive motions, among which duet dance is one of the most challenging interactions.
no code implementations • 27 Oct 2024 • Ronghui Li, Hongwen Zhang, Yachao Zhang, Yuxiang Zhang, Youliang Zhang, Jie Guo, Yan Zhang, Xiu Li, Yebin Liu
We propose Lodge++, a choreography framework to generate high-quality, ultra-long, and vivid dances given the music and desired genre.
Ranked #2 on Motion Synthesis on FineDance
no code implementations • 17 Oct 2024 • Shuichang Lai, Letian Huang, Jie Guo, Kai Cheng, Bowen Pan, Xiaoxiao Long, Jiangjing Lyu, Chengfei Lv, Yanwen Guo
However, these techniques generally have difficulty in producing believable geometries and materials for glossy objects, a challenge that stems from the inherent ambiguities of inverse rendering.
no code implementations • 14 Oct 2024 • Changfeng Ma, Pengxiao Guo, Shuangyu Yang, Yinuo Chen, Jie Guo, Chongjun Wang, Yanwen Guo, Wenping Wang
Extensive evaluations demonstrate the superiority of our method on reconstruction from point cloud, generation, and interpolation.
no code implementations • 19 Sep 2024 • Letian Huang, Jie Guo, Jialin Dan, Ruoyu Fu, Shujie Wang, Yuanqi Li, Yanwen Guo
Recently, 3D Gaussian Splatting (3D-GS) has achieved impressive results in novel view synthesis, demonstrating high fidelity and efficiency.
1 code implementation • 10 Jul 2024 • Mingjin Zhang, YuChun Wang, Jie Guo, Yunsong Li, Xinbo Gao, Jing Zhang
The recent Segment Anything Model (SAM) is a significant advancement in natural image segmentation, exhibiting potent zero-shot performance suitable for various downstream image segmentation tasks.
no code implementations • 26 Jun 2024 • Jiazhou Ji, Ruizhe Li, Shujun Li, Jie Guo, Weidong Qiu, Zheng Huang, Chiyu Chen, Xiaoyu Jiang, Xinru Lu
Instead, we introduce a novel ternary text classification scheme, adding an "undecided" category for texts that could be attributed to either source, and we show that this new category is crucial to understand how to make the detection result more explainable to lay users.
1 code implementation • 17 Apr 2024 • Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, HaoNing Wu, ZiCheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei LI, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, Haiqiang Wang, Xiangguang Chen, Wenhui Meng, Xiang Pan, Huiying Shi, Han Zhu, Xiaozhong Xu, Lei Sun, Zhenzhong Chen, Shan Liu, Fangyuan Kong, Haotian Fan, Yifang Xu, Haoran Xu, Mengduo Yang, Jie zhou, Jiaze Li, Shijie Wen, Mai Xu, Da Li, Shunyu Yao, Jiazhi Du, WangMeng Zuo, Zhibo Li, Shuai He, Anlong Ming, Huiyuan Fu, Huadong Ma, Yong Wu, Fie Xue, Guozhi Zhao, Lina Du, Jie Guo, Yu Zhang, huimin zheng, JunHao Chen, Yue Liu, Dulan Zhou, Kele Xu, Qisheng Xu, Tao Sun, Zhixiang Ding, Yuhang Hu
This paper reviews the NTIRE 2024 Challenge on Shortform UGC Video Quality Assessment (S-UGC VQA), in which various excellent solutions were submitted and evaluated on KVQ, a dataset collected from a popular short-form video platform, i.e., the Kuaishou/Kwai platform.
1 code implementation • CVPR 2024 • Ronghui Li, Yuxiang Zhang, Yachao Zhang, Hongwen Zhang, Jie Guo, Yan Zhang, Yebin Liu, Xiu Li
In contrast, the second stage is the local diffusion, which generates detailed motion sequences in parallel under the guidance of the dance primitives and choreographic rules.
Ranked #3 on Motion Synthesis on FineDance
1 code implementation • CVPR 2024 • Xiaoyu Zhan, Jianxin Yang, Yuanqi Li, Jie Guo, Yanwen Guo, Wenping Wang
SHERT applies semantic- and normal-based sampling between the detailed surface (e.g., mesh and SDF) and the corresponding SMPL-X model to obtain a partially sampled semantic mesh and then generates the complete semantic mesh by our specifically designed self-supervised completion and refinement networks.
no code implementations • 1 Feb 2024 • Jiayang Bai, Letian Huang, Jie Guo, Wen Gong, Yuanqi Li, Yanwen Guo
This technique typically takes perspective images as input and optimizes a set of 3D elliptical Gaussians by splatting them onto the image planes, resulting in 2D Gaussians.
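The splatting step described above can be illustrated with a minimal sketch. It assumes the local affine (first-order) approximation commonly used in splatting, where a 3D covariance in camera coordinates is mapped to a 2D image-space covariance through the Jacobian of a pinhole projection. The function names and the focal lengths `fx` and `fy` are hypothetical and chosen only for illustration; the world-to-camera transform is omitted, and this is not the paper's implementation.

```python
# Sketch: project a 3D Gaussian covariance onto the image plane.
# Assumes a pinhole camera with hypothetical focal lengths fx, fy and a
# Gaussian whose mean and covariance are already in camera coordinates.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]

def project_covariance(cov_cam, mean_cam, fx, fy):
    """Return the 2x2 image-space covariance J * cov_cam * J^T."""
    x, y, z = mean_cam
    # Jacobian of the perspective map (x, y, z) -> (fx*x/z, fy*y/z),
    # evaluated at the Gaussian mean.
    jac = [[fx / z, 0.0, -fx * x / (z * z)],
           [0.0, fy / z, -fy * y / (z * z)]]
    return matmul(matmul(jac, cov_cam), transpose(jac))

# An axis-aligned elliptical Gaussian centered on the optical axis, 2 units away.
cov = [[0.04, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.09]]
cov2d = project_covariance(cov, mean_cam=(0.0, 0.0, 2.0), fx=100.0, fy=100.0)
```

For a Gaussian on the optical axis the depth column of the Jacobian vanishes, so the depth variance does not affect the 2D footprint; away from the axis it does, which is where the first-order approximation starts to introduce error.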
no code implementations • 1 Feb 2024 • Letian Huang, Jiayang Bai, Jie Guo, Yuanqi Li, Yanwen Guo
This paper addresses the projection error function of 3D Gaussian Splatting, commencing with the residual error from the first-order Taylor expansion of the projection function.
no code implementations • 18 Jan 2024 • Jie Guo, Hao Chen, Bin Song, Yuhao Chi, Chau Yuen, Fei Richard Yu, Geoffrey Ye Li, Dusit Niyato
In this article, we present a novel framework, named distributed task-oriented communication networks (DTCN), based on recent advances in multimodal semantic transmission and edge intelligence.
no code implementations • CVPR 2024 • Yanwen Guo, Yuanqi Li, Dayong Ren, Xiaohong Zhang, Jiawei Li, Liang Pu, Changfeng Ma, Xiaoyu Zhan, Jie Guo, Mingqiang Wei, Yan Zhang, Piaopiao Yu, Shuangyu Yang, Donghao Ji, Huisheng Ye, Hao Sun, Yansong Liu, Yinuo Chen, Jiaqi Zhu, Hongyu Liu
In this paper, we present LiDAR-Net, a new real-scanned indoor point cloud dataset containing nearly 3.6 billion precisely point-level annotated points covering an expansive area of 30,000 m^2.
1 code implementation • CVPR 2024 • Xiaohong Zhang, Huisheng Ye, Jingwen Li, Qinyu Tang, Yuanqi Li, Yanwen Guo, Jie Guo
Explicitly, our method focuses on enhancing labeling using synthetic scenes crafted from 3D shapes generated via random prompts.
no code implementations • CVPR 2024 • Zhenyu Chen, Jie Guo, Shuichang Lai, Ruoyu Fu, Mengxun Kong, Chen Wang, Hongyu Sun, Zhebin Zhang, Chen Li, Yanwen Guo
Material appearance is a key component of photorealism with a pronounced impact on human perception.
no code implementations • 1 Jan 2024 • Ronghui Li, Yuqin Dai, Yachao Zhang, Jun Li, Jian Yang, Jie Guo, Xiu Li
Existing music-driven 3D dance generation methods mainly concentrate on high-quality dance generation, but lack sufficient control during the generation process.
1 code implementation • 3 Nov 2023 • Xin Yuan, Jie Guo, Weidong Qiu, Zheng Huang, Shujun Li
Mis- and disinformation online have become a major societal problem as major sources of online harms of different kinds.
no code implementations • 24 Sep 2023 • Zhimin Fan, Pengpei Hong, Jie Guo, Changqing Zou, Yanwen Guo, Ling-Qi Yan
We verify that importance sampling the seed chain in the continuous space reaches the goal of importance sampling the discrete admissible specular chain.
1 code implementation • ICCV 2023 • Mingjin Zhang, Chi Zhang, Qiming Zhang, Jie Guo, Xinbo Gao, Jing Zhang
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
no code implementations • 24 Apr 2023 • Yan Zhou, Jie Guo, Hao Sun, Bin Song, Fei Richard Yu
The main idea of multimodal recommendation is the rational utilization of the item's multimodal information to improve the recommendation performance.
no code implementations • 14 Apr 2023 • Jie Guo, Qimeng Wang, Yan Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Baochang Zhang
CLIP (Contrastive Language-Image Pretraining) is well-developed for open-vocabulary zero-shot image-level recognition, while its applications in pixel-level tasks are less investigated, where most efforts directly adopt CLIP features without deliberative adaptations.
Ranked #5 on Zero-Shot Semantic Segmentation on COCO-Stuff
no code implementations • 18 Mar 2023 • Jiayang Bai, Zhen He, Shan Yang, Jie Guo, Zhenyu Chen, Yan Zhang, Yanwen Guo
Recent methods mostly rely on convolutional neural networks (CNNs) to fill the missing contents in the warped panorama.
no code implementations • 10 Mar 2023 • Jiayang Bai, Letian Huang, Wen Gong, Jie Guo, Yanwen Guo
Recently, Neural Radiance Fields (NeRF) have emerged as a potent method for synthesizing novel views from a dense set of images.
no code implementations • ICCV 2023 • Quewei Li, Feichao Li, Jie Guo, Yanwen Guo
We propose UHDNeRF, a new framework for novel view synthesis on challenging ultra-high-resolution (e.g., 4K) real-world scenes.
no code implementations • CVPR 2023 • Changfeng Ma, Yinuo Chen, Pengxiao Guo, Jie Guo, Chongjun Wang, Yanwen Guo
Extensive experiments and comparisons demonstrate our superiority and generalization and show that our method achieves state-of-the-art performance on unsupervised completion of real scene objects.
no code implementations • 16 Dec 2022 • Jie Guo, Meiting Wang, Yan Zhou, Bin Song, Yuhao Chi, Wei Fan, Jianglong Chang
Then, a multi-granularity shared space is established with a designed Multi-granularity Feature Aggregation and Rearrangement (MFAR) module, which enhances the semantic corresponding relations between the local and global information, and obtains more accurate feature representations for the image and text modalities.
no code implementations • 9 Nov 2022 • Enes Altuncu, Jason R. C. Nurse, Yang Xu, Jie Guo, Shujun Li
Automatic keyword extraction (AKE) has gained more importance with the increasing amount of digital textual data that modern computing systems process.
no code implementations • 30 Jun 2022 • Yuting Wang, Hangning Zhou, Zhigang Zhang, Chen Feng, Huadong Lin, Chaofei Gao, Yizhi Tang, Zhenting Zhao, Shiyu Zhang, Jie Guo, Xuefeng Wang, Ziyao Xu, Chi Zhang
This technical report presents an effective method for motion prediction in autonomous driving.
Ranked #12 on Motion Forecasting on Argoverse CVPR 2020
no code implementations • 18 Mar 2022 • Changfeng Ma, Yang Yang, Jie Guo, Chongjun Wang, Yanwen Guo
In this paper, we propose an end-to-end network, named CS-Net, to complete point clouds that are contaminated by noise or contain outliers.
no code implementations • 17 Mar 2022 • Yuanqi Li, Jianwei Guo, Xinran Yang, Shun Liu, Jie Guo, Xiaopeng Zhang, Yanwen Guo
In this paper, we propose a novel point cloud simplification network (PCS-Net) dedicated to high-quality surface mesh reconstruction while maintaining geometric fidelity.
no code implementations • 13 Feb 2022 • Jiayang Bai, Jie Guo, Chenchen Wan, Zhenyu Chen, Zhen He, Shan Yang, Piaopiao Yu, Yan Zhang, Yanwen Guo
At its core is a new lighting model (dubbed DSGLight) based on depth-augmented Spherical Gaussians (SG) and a Graph Convolutional Network (GCN) that infers the new lighting representation from a single LDR image of limited field-of-view.
1 code implementation • 6 Feb 2022 • Jiayang Bai, Shuichang Lai, Haoyu Qin, Jie Guo, Yanwen Guo
In this paper, we propose a learning-based method for predicting dense depth values of a scene from a monocular omnidirectional image.
Ranked #7 on Depth Estimation on Stanford2D3D Panoramic
1 code implementation • CVPR 2022 • Mingjin Zhang, Rui Zhang, Yuxiang Yang, Haichen Bai, Jing Zhang, Jie Guo
The TOAA block computes low-level information with an attention mechanism in both the row and column directions and fuses it with high-level information to capture the shape characteristics of targets and suppress noise.
no code implementations • 21 Dec 2021 • Xu Kang, Bin Song, Jie Guo, Zhijin Qin, F. Richard Yu
The vigorous development of the Internet of Things makes it possible to extend its computing and storage capabilities to computing tasks in aerial systems through the collaboration of cloud and edge, especially for artificial intelligence (AI) tasks based on deep learning (DL).
no code implementations • 29 Oct 2021 • Tao Wen, Beibei Wang, Lei Zhang, Jie Guo, Nicolas Holzschuch
For efficiency, we train the network in two stages: reusing a trained model to initialize the SVBRDFs, and then fine-tuning it based on the input image.
no code implementations • 7 Oct 2021 • Jinian Luo, Jie Guo, Weidong Qiu, Zheng Huang, Hong Hui
However, most of them ignore the domain generalization scenario and scale variance, and therefore perform poorly under domain shift, a problem that is typically exacerbated by intra-domain and inter-domain scale variance.
no code implementations • 26 Jul 2021 • Fei Pan, Chunlei Xu, Jie Guo, Yanwen Guo
We introduce a transductive maximum margin classifier for few-shot learning (FS-TMMC).
no code implementations • 26 Jul 2021 • Fei Pan, Chunlei Xu, Jie Guo, Yanwen Guo
In order to obtain the similarity of a pair of videos, we predict the alignment scores between all pairs of temporal positions in the two videos with the temporal alignment prediction function.
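As a rough illustration of scoring all pairs of temporal positions, the sketch below builds a pairwise score matrix between two sequences of per-frame feature vectors and reduces it to one video-level similarity. Cosine similarity stands in for the learned temporal alignment prediction function, and the max-then-mean aggregation is an assumption made here for illustration; neither is taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity; a stand-in for a learned alignment score function."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def alignment_scores(video_a, video_b, score_fn=cosine):
    """Score every pair of temporal positions (one row per position in video_a)."""
    return [[score_fn(fa, fb) for fb in video_b] for fa in video_a]

def video_similarity(video_a, video_b):
    """Average, over positions in video_a, of the best score found in video_b."""
    scores = alignment_scores(video_a, video_b)
    return sum(max(row) for row in scores) / len(scores)

# Two toy clips with 2-dimensional per-frame features (hypothetical data).
clip_a = [[1.0, 0.0], [0.0, 1.0]]
clip_b = [[0.0, 2.0], [3.0, 0.0], [1.0, 1.0]]
scores = alignment_scores(clip_a, clip_b)
sim = video_similarity(clip_a, clip_b)
```

Because every frame in `clip_a` has a frame in `clip_b` pointing in the same direction, the similarity here is 1 even though the clips differ in length and frame order.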
no code implementations • CVPR 2021 • Fengmin Shi, Jie Guo, Haonan Zhang, Shan Yang, Xiying Wang, Yanwen Guo
We demonstrate that local geometry has a greater impact on the sound than the global geometry and offers more cues in material recognition.
no code implementations • 7 May 2021 • Mingyuan Mao, Baochang Zhang, David Doermann, Jie Guo, Shumin Han, Yuan Feng, Xiaodi Wang, Errui Ding
This leads to a new problem of confidence discrepancy for the detector ensembles.
no code implementations • 1 Apr 2021 • Xinfang Liu, Xiushan Nie, Zhifang Tan, Jie Guo, Yilong Yin
Natural language video localization (NLVL), which aims to locate a target moment from a video that semantically corresponds to a text query, is a novel and challenging task.
no code implementations • 24 Feb 2021 • Jie Guo, Bingyang Hu, Yanjun Chen, Yuanqi Li, Yanwen Guo, Ling-Qi Yan
We consider the scattering of light in participating media composed of sparsely and randomly distributed discrete particles.
Graphics Optics
no code implementations • ICCV 2021 • Piaopiao Yu, Jie Guo, Fan Huang, Cheng Zhou, Hongwei Che, Xiao Ling, Yanwen Guo
However, naively compressing an outdoor panorama into a low-dimensional latent vector, as existing models have done, causes two major problems.
no code implementations • 22 Sep 2020 • Jie Guo, Hao Yan, Chen Zhang, Steven Hoi
We consider online change detection of high dimensional data streams with sparse changes, where only a subset of data streams can be observed at each sensing time point due to limited sensing capacities.
no code implementations • 6 Jul 2020 • Aocheng Li, Jie Guo, Yanwen Guo
We specifically design a new module to make full use of existing semantic segmentation networks to accommodate planar segmentation.
no code implementations • 17 May 2020 • Jingwu He, Chuan Wang, Yang Zhang, Jie Guo, Yanwen Guo
To the best of our knowledge, we are the first to enhance the facial attractiveness with GANs in both geometry and appearance aspects.
no code implementations • ECCV 2020 • Hongjie Zhang, Ang Li, Jie Guo, Yanwen Guo
We propose the OpenHybrid framework, which is composed of an encoder to encode the input data into a joint embedding space, a classifier to classify samples to inlier classes, and a flow-based density estimator to detect whether a sample belongs to the unknown category.
2 code implementations • CVPR 2019 • Jianchao Wu, Li-Min Wang, Li Wang, Jie Guo, Gangshan Wu
To this end, we propose to build a flexible and efficient Actor Relation Graph (ARG) to simultaneously capture the appearance and position relation between actors.
Ranked #3 on Group Activity Recognition on Collective Activity
no code implementations • 26 Feb 2019 • Xu Kang, Bin Song, Jie Guo, Xiaojiang Du, Mohsen Guizani
In recent years, with the development of the marine industry, the navigation environment has become more complicated.
no code implementations • 13 Feb 2019 • Yinghua Li, Bin Song, Jie Guo, Xiaojiang Du, Mohsen Guizani
The sparsity and self-similarity of the image blocks are taken as the constraints.
no code implementations • 13 Feb 2019 • Junmei Lv, Bin Song, Jie Guo, Xiaojiang Du, Mohsen Guizani
Specifically, the Multimodal IRIS model consists of three modules, i.e., the multimodal feature learning module, the Interest-Related Network (IRN) module, and the item similarity recommendation module.
1 code implementation • ECCV 2018 • Jie Guo, Zuojian Zhou, Li-Min Wang
We propose a sparse and low-rank reflection model for specular highlight detection and removal using a single input image.
no code implementations • 3 Jul 2018 • Jie Guo, Tingfa Xu, Shenwang Jiang, Ziyi Shen
Deep convolutional neural networks (CNNs) have dominated many computer vision domains because of their great power to extract good features automatically.
no code implementations • 5 Apr 2018 • Sijia Chen, Bin Song, Jie Guo, Xiaojiang Du, Mohsen Guizani
The localization of the target object for data retrieval is a key issue in Intelligent and Connected Transportation Systems (ICTS).
no code implementations • 11 Sep 2017 • Yuchen Dai, Zheng Huang, Yuting Gao, Youxuan Xu, Kai Chen, Jie Guo, Weidong Qiu
In this paper, we introduce a novel end-to-end framework for multi-oriented scene text detection from an instance-aware semantic segmentation perspective.
Ranked #12 on Scene Text Detection on MSRA-TD500