no code implementations • 10 Jun 2025 • Peilin Yu, Yuwei Wu, Zhi Gao, Xiaomeng Fan, Shuo Yang, Yunde Jia
Feature augmentation generates novel samples in the feature space, providing an effective way to enhance the generalization ability of learning algorithms with hyperbolic geometry.
1 code implementation • 29 May 2025 • Chuanhao Li, Wenbo Ye, Zhen Li, Yuwei Wu, Yunde Jia
Compositional generalization is the ability to generalize to novel compositions from seen primitives, and has received much attention in vision-and-language (V&L) recently.
no code implementations • 21 May 2025 • Xintong Zhang, Zhi Gao, Bofei Zhang, Pengxiang Li, Xiaowen Zhang, Yang Liu, Tao Yuan, Yuwei Wu, Yunde Jia, Song-Chun Zhu, Qing Li
Vision language models (VLMs) have achieved impressive performance across a variety of computer vision tasks.
no code implementations • 20 May 2025 • Mingliang Zhai, Zhi Gao, Yuwei Wu, Yunde Jia
Unlike planner-centric EQA models where the memory module cannot fully interact with other modules, MemoryEQA flexibly feeds memory information into all modules, thereby enhancing efficiency and accuracy in handling complex tasks, such as those involving multiple targets across different regions.
1 code implementation • 20 May 2025 • Zhidan Liu, Chengtang Yao, Jiaxi Zeng, Yuwei Wu, Yunde Jia
To resolve the multi-label regression problem, we introduce a pixel-wise multivariate Gaussian representation, where the mean vector encodes multiple depth values at the same pixel, and the covariance matrix determines whether a multi-label representation is necessary for a given pixel.
1 code implementation • 20 May 2025 • Chengtang Yao, Lidong Yu, Zhidan Liu, Jiaxi Zeng, Yuwei Wu, Yunde Jia
A direct fusion of a monocular depth map could alleviate the local optima problem, but noisy disparity results computed at the first several iterations will misguide the fusion.
1 code implementation • 19 May 2025 • Chengtang Yao, Zhidan Liu, Jiaxi Zeng, Lidong Yu, Yuwei Wu, Yunde Jia
3D visual illusion is a perceptual phenomenon where a two-dimensional plane is manipulated to simulate three-dimensional spatial relationships, making a flat artwork or object look three-dimensional in the human visual system.
no code implementations • 30 Apr 2025 • Pengxiang Li, Zhi Gao, Bofei Zhang, Yapeng Mi, Xiaojian Ma, Chenrui Shi, Tao Yuan, Yuwei Wu, Yunde Jia, Song-Chun Zhu, Qing Li
The data is subsequently used to update the controller for tool usage through preference tuning, producing a SPORT agent.
no code implementations • 25 Jan 2025 • Peilin Yu, Yuwei Wu, Zhi Gao, Xiaomeng Fan, Yunde Jia
However, existing Riemannian meta-optimization methods take up huge memory footprints in large-scale optimization settings, as the learned optimizer can only adapt gradients of a fixed size and thus cannot be shared across different Riemannian parameters.
no code implementations • 20 Dec 2024 • Zhi Gao, Bofei Zhang, Pengxiang Li, Xiaojian Ma, Tao Yuan, Yue Fan, Yuwei Wu, Yunde Jia, Song-Chun Zhu, Qing Li
The advancement of large language models (LLMs) prompts the development of multi-modal agents, which are used as a controller to call external tools, providing a feasible way to solve practical tasks.
1 code implementation • 18 Dec 2024 • Chuanhao Li, Zhen Li, Chenchen Jing, Xiaomeng Fan, Wenbo Ye, Yuwei Wu, Yunde Jia
Compositional generalization is the capability of a model to understand novel compositions composed of seen concepts.
no code implementations • 9 Dec 2024 • Mingliang Zhai, Cheng Li, Zengyuan Guo, Ningrui Yang, Xiameng Qin, Sanyuan Zhao, Junyu Han, Ji Tao, Yuwei Wu, Yunde Jia
Multi-modal Large Language Models (MLLMs) with extensive world knowledge have revitalized autonomous driving, particularly in reasoning tasks within perceivable regions.
no code implementations • 5 Dec 2024 • Yangkai Xue, Jindou Dai, Zhipeng Lu, Yuwei Wu, Yunde Jia
In this paper, we propose residual hyperbolic graph convolutional networks (R-HGCNs) to address the over-smoothing problem.
no code implementations • 16 Jul 2024 • Pengxiang Li, Zhi Gao, Bofei Zhang, Tao Yuan, Yuwei Wu, Mehrtash Harandi, Yunde Jia, Song-Chun Zhu, Qing Li
Vision language models (VLMs) have achieved impressive progress in diverse applications, becoming a prevalent research direction.
1 code implementation • 16 Jul 2024 • Jiaxi Zeng, Chengtang Yao, Yuwei Wu, Yunde Jia
Based on this coherent state, we introduce a dual-space refinement module to iteratively refine the initialized result in both disparity and disparity gradient spaces, improving estimations in ill-posed regions.
no code implementations • 23 May 2024 • Chuanhao Li, Zhen Li, Chenchen Jing, Shuo Liu, Wenqi Shao, Yuwei Wu, Ping Luo, Yu Qiao, Kaipeng Zhang
In this paper, we propose a plug-and-play framework, dubbed SearchLVLMs, for augmenting existing LVLMs in handling visual question answering (VQA) about up-to-date knowledge.
no code implementations • 16 May 2024 • Anish Bhattacharya, Nishanth Rao, Dhruv Parikh, Pratik Kunapuli, Yuwei Wu, Yuezhan Tao, Nikolai Matni, Vijay Kumar
We demonstrate the capabilities of an attention-based end-to-end approach for high-speed vision-based quadrotor obstacle avoidance in dense, cluttered environments, with comparison to various state-of-the-art learning architectures.
no code implementations • 26 Apr 2024 • Xindi Zheng, Yuwei Wu, Yu Pan, WanYu Lin, Lei Ma, Jianjun Zhao
The crux of our work is that it admits both global and local representations of the input graph signal, which can capture the long-range dependencies.
no code implementations • 23 Feb 2024 • Shijing Si, Yuwei Wu, Le Tang, Yugui Zhang, Jedrek Wosik, Qinliang Su
This study provides insights into the potential and limitations of ChatGPT for spam identification, highlighting its potential as a viable solution for resource-constrained language domains.
no code implementations • 12 Oct 2023 • Xu Chen, Yunde Jia, Yuwei Wu
In this paper, we propose a fine-grained annotation method for face anti-spoofing.
no code implementations • 30 Jun 2023 • Yi Guo, Che Sun, Yunde Jia, Yuwei Wu
We improve the reconstruction quality of geometrically complex scene regions using sparse depth obtained from geometric constraints.
no code implementations • 19 May 2023 • Mingliang Zhai, Yulin Li, Xiameng Qin, Chen Yi, Qunyi Xie, Chengquan Zhang, Kun Yao, Yuwei Wu, Yunde Jia
Transformers achieve promising performance in document understanding because of their high effectiveness, but still suffer from computational complexity that is quadratic in the sequence length.
no code implementations • CVPR 2023 • Zhi Gao, Chen Xu, Feng Li, Yunde Jia, Mehrtash Harandi, Yuwei Wu
Our method dynamically expands the geometry of the underlying space to match growing geometric structures induced by new data, and prevents forgetting by taking geometric structures of old data into account.
1 code implementation • CVPR 2023 • Weixiao Liu, Yuwei Wu, Sipu Ruan, Gregory S. Chirikjian
Representing complex objects with basic geometric primitives has long been a topic in computer vision.
no code implementations • 20 Mar 2023 • Yuwei Wu, Zhe Zhang, Xiaolan Qiu, Yao Zhao, Weidong Yu
repetition frequency (PRF).
1 code implementation • CVPR 2023 • Chuanhao Li, Zhen Li, Chenchen Jing, Yunde Jia, Yuwei Wu
Compositional generalization is critical to simulate the compositional capability of humans, and has received much attention in the vision-and-language (V&L) community.
no code implementations • ICCV 2023 • Jiaxi Zeng, Chengtang Yao, Lidong Yu, Yuwei Wu, Yunde Jia
In this paper, we propose a parameterized cost volume to encode the entire disparity space using multi-Gaussian distribution.
no code implementations • ICCV 2023 • Chengtang Yao, Lidong Yu, Yuwei Wu, Yunde Jia
The high-resolution local information brought by sparse points refines 3D lanes in the BEV space hierarchically from low resolution to high resolution.
no code implementations • ICCV 2023 • Chenrui Shi, Che Sun, Yuwei Wu, Yunde Jia
In this way, our method is able to reduce the difficulties of learning and avoid converging to sub-optimal solutions.
no code implementations • CVPR 2022 • Xinke Li, Henghui Ding, Zekun Tong, Yuwei Wu, Yeow Meng Chee
Further study suggests that our strategy can improve model performance through a pretraining and fine-tuning scheme, especially for small-scale datasets.
no code implementations • 28 Mar 2022 • Yuwei Wu, Weixiao Liu, Sipu Ruan, Gregory S. Chirikjian
In this paper, we propose a novel non-parametric Bayesian statistical method to infer an abstraction, consisting of an unknown number of geometric primitives, from a point cloud.
no code implementations • CVPR 2022 • Che Sun, Yunde Jia, Yi Guo, Yuwei Wu
We propose a novel method of registering less-overlap RGB-D scans.
1 code implementation • CVPR 2022 • Chenchen Jing, Yunde Jia, Yuwei Wu, Xinyu Liu, Qi Wu
Existing VQA models can answer a compositional question well, but cannot work well in terms of reasoning consistency in answering the compositional question and its sub-questions.
1 code implementation • CVPR 2022 • Weixiao Liu, Yuwei Wu, Sipu Ruan, Gregory S. Chirikjian
Among geometric primitives, superquadrics are well known for their ability to represent a wide range of shapes with few parameters.
1 code implementation • NAACL 2021 • Yuwei Wu, Xuezhe Ma, Diyi Yang
Despite the impressive successes of generation and dialogue systems, how to endow a text generation system with particular personality traits to deliver more personalized responses remains under-investigated.
no code implementations • 10 May 2021 • Zilong Wang, Mingjie Zhan, Houxing Ren, Zhaohui Hou, Yuwei Wu, Xingyan Zhang, Ding Liang
Forms are a common type of document in real life and carry rich information through textual contents and the organizational structure.
1 code implementation • CVPR 2021 • Chengtang Yao, Yunde Jia, Huijun Di, Pengxiang Li, Yuwei Wu
In this paper, we present a decomposition model for stereo matching to solve the problem of excessive growth in computational cost (time and memory cost) as the resolution increases.
no code implementations • CVPR 2021 • Jindou Dai, Yuwei Wu, Zhi Gao, Yunde Jia
Specifically, we developed a manifold-preserving graph convolution that consists of a hyperbolic feature transformation and a hyperbolic neighborhood aggregation.
no code implementations • ICCV 2021 • Zhi Gao, Yuwei Wu, Yunde Jia, Mehrtash Harandi
Few-shot learning describes the challenging problem of recognizing samples from unseen classes given very few labeled examples.
no code implementations • 27 Dec 2020 • Zhuosheng Zhang, Yuwei Wu, Junru Zhou, Sufeng Duan, Hai Zhao, Rui Wang
In detail, for self-attention network (SAN) sponsored Transformer-based encoder, we introduce syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention.
1 code implementation • 11 Aug 2020 • Xinke Li, Chongshou Li, Zekun Tong, Andrew Lim, Junsong Yuan, Yuwei Wu, Jing Tang, Raymond Huang
Based on it, we formulate a hierarchical learning problem for 3D point cloud segmentation and propose a measurement evaluating consistency across various hierarchies.
no code implementations • 5 Jun 2020 • Chengtang Yao, Yunde Jia, Huijun Di, Yuwei Wu, Lidong Yu
In this paper, we present a content-aware inter-scale cost aggregation method that adaptively aggregates and upsamples the cost volume from coarse-scale to fine-scale by learning dynamic filter weights according to the content of the left and right views on the two scales.
1 code implementation • 23 Apr 2020 • Jiaao Chen, Yuwei Wu, Diyi Yang
We present semi-supervised models with data augmentation (SMDA), a semi-supervised text classification system to classify interactive affective responses.
1 code implementation • CVPR 2020 • Yue Zhao, Yuwei Wu, Caihua Chen, Andrew Lim
Armed with Thompson Sampling, we develop a black-box attack with a success rate over 95% on the ModelNet40 dataset.
1 code implementation • 5 Sep 2019 • Zhuosheng Zhang, Yuwei Wu, Hai Zhao, Zuchao Li, Shuailiang Zhang, Xi Zhou, Xiang Zhou
The latest work on language representations carefully integrates contextualized features into language model training, which enables a series of successes, especially in various machine reading comprehension and natural language inference tasks.
Ranked #8 on Natural Language Inference on SNLI
2 code implementations • 30 Aug 2019 • Shuailiang Zhang, Hai Zhao, Yuwei Wu, Zhuosheng Zhang, Xi Zhou, Xiang Zhou
Multi-choice reading comprehension is a challenging task that requires selecting an answer from a set of candidate options given a passage and a question.
1 code implementation • 14 Aug 2019 • Zhuosheng Zhang, Yuwei Wu, Junru Zhou, Sufeng Duan, Hai Zhao, Rui Wang
In detail, for self-attention network (SAN) sponsored Transformer-based encoder, we introduce syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention.
Ranked #5 on Question Answering on SQuAD2.0 dev
no code implementations • 15 Mar 2019 • Yanmei Dong, Mingtao Pei, Lijia Zhang, Bin Xu, Yuwei Wu, Yunde Jia
In this paper, we propose to stitch videos from the FF-camera with a wide-angle lens and the DF-camera with a fisheye lens for telepresence robots.
no code implementations • 27 Jan 2019 • Shuailiang Zhang, Hai Zhao, Yuwei Wu, Zhuosheng Zhang, Xi Zhou, Xiang Zhou
Multi-choice reading comprehension is a challenging task that requires a complex reasoning procedure.
Ranked #3 on Question Answering on RACE
1 code implementation • 13 Jan 2019 • Yunxuan Xiao, Yikai Li, Yuwei Wu, LiZhen Zhu
Replacing the background and simultaneously adjusting foreground objects is a challenging task in image editing.
no code implementations • 8 Sep 2018 • Zhuosheng Zhang, Yuwei Wu, Zuchao Li, Hai Zhao
Who did what to whom is a major focus in natural language understanding, which is precisely the aim of the semantic role labeling (SRL) task.
Ranked #11 on Natural Language Inference on SNLI
Machine Reading Comprehension, Natural Language Understanding
no code implementations • 12 Jan 2018 • Lidong Yu, Yucheng Wang, Yuwei Wu, Yunde Jia
The cost aggregation sub-architecture is realized by a two-stream network: one for the generation of cost aggregation proposals, the other for the selection of the proposals.
no code implementations • 17 Nov 2017 • Zhi Gao, Yuwei Wu, Xingyuan Bu, Yunde Jia
To this end, several new layers are introduced in our network, including a nonlinear kernel aggregation layer, an SPD matrix transformation layer, and a vectorization layer.
no code implementations • CVPR 2017 • Tan Yu, Yuwei Wu, Junsong Yuan
This paper tackles the problem of efficient and effective object instance search in videos.
no code implementations • 31 Mar 2017 • Min Yang, Yuwei Wu, Yunde Jia
In this paper, we present a hybrid data association framework with a min-cost multi-commodity network flow for robust online multi-object tracking.