no code implementations • 25 Jan 2025 • Peilin Yu, Yuwei Wu, Zhi Gao, Xiaomeng Fan, Yunde Jia
However, existing Riemannian meta-optimization methods take up huge memory footprints in large-scale optimization settings, as the learned optimizer can only adapt gradients of a fixed size and thus cannot be shared across different Riemannian parameters.
no code implementations • 20 Dec 2024 • Zhi Gao, Bofei Zhang, Pengxiang Li, Xiaojian Ma, Tao Yuan, Yue Fan, Yuwei Wu, Yunde Jia, Song-Chun Zhu, Qing Li
The advancement of large language models (LLMs) prompts the development of multi-modal agents, which are used as a controller to call external tools, providing a feasible way to solve practical tasks.
1 code implementation • 18 Dec 2024 • Chuanhao Li, Zhen Li, Chenchen Jing, Xiaomeng Fan, Wenbo Ye, Yuwei Wu, Yunde Jia
Compositional generalization is the capability of a model to understand novel compositions composed of seen concepts.
no code implementations • 9 Dec 2024 • Mingliang Zhai, Cheng Li, Zengyuan Guo, Ningrui Yang, Xiameng Qin, Sanyuan Zhao, Junyu Han, Ji Tao, Yuwei Wu, Yunde Jia
Multi-modal Large Language Models (MLLMs) with extensive world knowledge have revitalized autonomous driving, particularly in reasoning tasks within perceivable regions.
no code implementations • 5 Dec 2024 • Yangkai Xue, Jindou Dai, Zhipeng Lu, Yuwei Wu, Yunde Jia
In this paper, we propose residual hyperbolic graph convolutional networks (R-HGCNs) to address the over-smoothing problem.
no code implementations • 16 Jul 2024 • Pengxiang Li, Zhi Gao, Bofei Zhang, Tao Yuan, Yuwei Wu, Mehrtash Harandi, Yunde Jia, Song-Chun Zhu, Qing Li
Vision language models (VLMs) have achieved impressive progress in diverse applications, becoming a prevalent research direction.
1 code implementation • 16 Jul 2024 • Jiaxi Zeng, Chengtang Yao, Yuwei Wu, Yunde Jia
Based on this coherent state, we introduce a dual-space refinement module to iteratively refine the initialized result in both disparity and disparity gradient spaces, improving estimations in ill-posed regions.
no code implementations • 12 Oct 2023 • Xu Chen, Yunde Jia, Yuwei Wu
In this paper, we propose a fine-grained annotation method for face anti-spoofing.
no code implementations • 30 Jun 2023 • Yi Guo, Che Sun, Yunde Jia, Yuwei Wu
We improve the reconstruction quality of complex geometry scene regions with sparse depth obtained by using the geometric constraints.
no code implementations • 19 May 2023 • Mingliang Zhai, Yulin Li, Xiameng Qin, Chen Yi, Qunyi Xie, Chengquan Zhang, Kun Yao, Yuwei Wu, Yunde Jia
Transformers achieve promising performance in document understanding because of their high effectiveness, but still suffer from computational complexity that is quadratic in the sequence length.
no code implementations • CVPR 2023 • Zhi Gao, Chen Xu, Feng Li, Yunde Jia, Mehrtash Harandi, Yuwei Wu
Our method dynamically expands the geometry of the underlying space to match growing geometric structures induced by new data, and prevents forgetting by taking geometric structures of old data into account.
no code implementations • ICCV 2023 • Jiaxi Zeng, Chengtang Yao, Lidong Yu, Yuwei Wu, Yunde Jia
In this paper, we propose a parameterized cost volume to encode the entire disparity space using multi-Gaussian distribution.
no code implementations • ICCV 2023 • Chengtang Yao, Lidong Yu, Yuwei Wu, Yunde Jia
The high-resolution local information brought by sparse points refines 3D lanes in the BEV space hierarchically from low resolution to high resolution.
no code implementations • ICCV 2023 • Chenrui Shi, Che Sun, Yuwei Wu, Yunde Jia
In this way, our method is able to reduce the difficulties of learning and avoid converging to sub-optimal solutions.
1 code implementation • CVPR 2023 • Chuanhao Li, Zhen Li, Chenchen Jing, Yunde Jia, Yuwei Wu
Compositional generalization is critical to simulate the compositional capability of humans, and has received much attention in the vision-and-language (V&L) community.
1 code implementation • CVPR 2022 • Chenchen Jing, Yunde Jia, Yuwei Wu, Xinyu Liu, Qi Wu
Existing VQA models can answer a compositional question well, but cannot work well in terms of reasoning consistency in answering the compositional question and its sub-questions.
no code implementations • CVPR 2022 • Che Sun, Yunde Jia, Yi Guo, Yuwei Wu
We propose a novel method of registering less-overlap RGB-D scans.
no code implementations • CVPR 2021 • Chengtang Yao, Yunde Jia, Huijun Di, Pengxiang Li, Yuwei Wu
In this paper, we present a decomposition model for stereo matching to solve the problem of excessive growth in computational cost (time and memory cost) as the resolution increases.
no code implementations • CVPR 2021 • Jindou Dai, Yuwei Wu, Zhi Gao, Yunde Jia
Specifically, we developed a manifold-preserving graph convolution that consists of a hyperbolic feature transformation and a hyperbolic neighborhood aggregation.
no code implementations • ICCV 2021 • Zhi Gao, Yuwei Wu, Yunde Jia, Mehrtash Harandi
Few-shot learning describes the challenging problem of recognizing samples from unseen classes given very few labeled examples.
no code implementations • 2 Sep 2020 • Jingyi Hou, Yunde Jia, Xinxiao Wu, Yayun Qi
Through traversing the dependency trees, the sentences are generated to train the captioning model.
no code implementations • 5 Jun 2020 • Chengtang Yao, Yunde Jia, Huijun Di, Yuwei Wu, Lidong Yu
In this paper, we present a content-aware inter-scale cost aggregation method that adaptively aggregates and upsamples the cost volume from coarse-scale to fine-scale by learning dynamic filter weights according to the content of the left and right views on the two scales.
1 code implementation • CVPR 2020 • Sicheng Xu, Jiaolong Yang, Dong Chen, Fang Wen, Yu Deng, Yunde Jia, Xin Tong
We evaluate the accuracy of our method both in 3D and with pose manipulation tasks on 2D images.
no code implementations • 4 Jun 2019 • Jingyi Hou, Xinxiao Wu, Yayun Qi, Wentian Zhao, Jiebo Luo, Yunde Jia
Extensive experiments on the MS-COCO image captioning benchmark and the MSVD video captioning benchmark validate the superiority of our method on leveraging prior commonsense knowledge to enhance relational reasoning for visual captioning.
4 code implementations • 20 Mar 2019 • Yu Deng, Jiaolong Yang, Sicheng Xu, Dong Chen, Yunde Jia, Xin Tong
Recently, deep learning based 3D face reconstruction methods have shown promising results in both quality and efficiency. However, training deep neural networks typically requires a large volume of data, whereas face images with ground-truth 3D face shapes are scarce.
Ranked #3 on 3D Face Reconstruction on Florence (RMSE Cooperative metric)
no code implementations • 15 Mar 2019 • Yanmei Dong, Mingtao Pei, Lijia Zhang, Bin Xu, Yuwei Wu, Yunde Jia
In this paper, we propose to stitch videos from the FF-camera with a wide-angle lens and the DF-camera with a fisheye lens for telepresence robots.
no code implementations • 12 Jan 2018 • Lidong Yu, Yucheng Wang, Yuwei Wu, Yunde Jia
The cost aggregation sub-architecture is realized by a two-stream network: one for the generation of cost aggregation proposals, the other for the selection of the proposals.
no code implementations • 17 Nov 2017 • Zhi Gao, Yuwei Wu, Xingyuan Bu, Yunde Jia
To this end, several new layers are introduced in our network, including a nonlinear kernel aggregation layer, an SPD matrix transformation layer, and a vectorization layer.
no code implementations • 10 Jul 2017 • Changyong Yu, Yunde Jia
We use anisotropic diffusion to enhance the edges and boundary locations of a face image, and a kernel matrix model to extract face image features, which we call diffusion-kernel (D-K) features.
no code implementations • 31 Mar 2017 • Min Yang, Yuwei Wu, Yunde Jia
In this paper, we present a hybrid data association framework with a min-cost multi-commodity network flow for robust online multi-object tracking.
no code implementations • 11 May 2016 • Jiaolong Yang, Hongdong Li, Dylan Campbell, Yunde Jia
The evaluation demonstrates that the proposed method is able to produce reliable registration results regardless of the initialization.
Ranked #6 on Point Cloud Registration on FP-O-H
no code implementations • 10 Oct 2015 • Min Yang, Yunde Jia
The temporal dynamic makes a sufficient complement to the spatial structure of varying appearances in the feature space, which significantly improves the affinity measurement between trajectories and detections.
no code implementations • CVPR 2013 • Xi Song, Tianfu Wu, Yunde Jia, Song-Chun Zhu
This paper presents a method of learning reconfigurable And-Or Tree (AOT) models discriminatively from weakly annotated data for object detection.