Search Results for author: Yuwei Wu

Found 55 papers, 19 papers with code

Hyperbolic Dual Feature Augmentation for Open-Environment

no code implementations 10 Jun 2025 Peilin Yu, Yuwei Wu, Zhi Gao, Xiaomeng Fan, Shuo Yang, Yunde Jia

Feature augmentation generates novel samples in the feature space, providing an effective way to enhance the generalization ability of learning algorithms with hyperbolic geometry.

class-incremental learning Class Incremental Learning +6

Multi-Sourced Compositional Generalization in Visual Question Answering

1 code implementation 29 May 2025 Chuanhao Li, Wenbo Ye, Zhen Li, Yuwei Wu, Yunde Jia

Compositional generalization is the ability of generalizing novel compositions from seen primitives, and has received much attention in vision-and-language (V&L) recently.

Question Answering Visual Question Answering

Memory-Centric Embodied Question Answer

no code implementations 20 May 2025 Mingliang Zhai, Zhi Gao, Yuwei Wu, Yunde Jia

Unlike planner-centric EQA models where the memory module cannot fully interact with other modules, MemoryEQA flexibly feeds memory information into all modules, thereby enhancing efficiency and accuracy in handling complex tasks, such as those involving multiple targets across different regions.

Embodied Question Answering Large Language Model +1

Multi-Label Stereo Matching for Transparent Scene Depth Estimation

1 code implementation 20 May 2025 Zhidan Liu, Chengtang Yao, Jiaxi Zeng, Yuwei Wu, Yunde Jia

To resolve the multi-label regression problem, we introduce a pixel-wise multivariate Gaussian representation, where the mean vector encodes multiple depth values at the same pixel, and the covariance matrix determines whether a multi-label representation is necessary for a given pixel.

Depth Estimation regression +2

Diving into the Fusion of Monocular Priors for Generalized Stereo Matching

1 code implementation 20 May 2025 Chengtang Yao, Lidong Yu, Zhidan Liu, Jiaxi Zeng, Yuwei Wu, Yunde Jia

A direct fusion of a monocular depth map could alleviate the local optima problem, but noisy disparity results computed at the first several iterations will misguide the fusion.

Stereo Matching

3D Visual Illusion Depth Estimation

1 code implementation 19 May 2025 Chengtang Yao, Zhidan Liu, Jiaxi Zeng, Lidong Yu, Yuwei Wu, Yunde Jia

3D visual illusion is a perceptual phenomenon where a two-dimensional plane is manipulated to simulate three-dimensional spatial relationships, making a flat artwork or object look three-dimensional in the human visual system.

Common Sense Reasoning Depth Estimation +2

Large-Scale Riemannian Meta-Optimization via Subspace Adaptation

no code implementations 25 Jan 2025 Peilin Yu, Yuwei Wu, Zhi Gao, Xiaomeng Fan, Yunde Jia

However, existing Riemannian meta-optimization methods take up huge memory footprints in large-scale optimization settings, as the learned optimizer can only adapt gradients of a fixed size and thus cannot be shared across different Riemannian parameters.

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage

no code implementations 20 Dec 2024 Zhi Gao, Bofei Zhang, Pengxiang Li, Xiaojian Ma, Tao Yuan, Yue Fan, Yuwei Wu, Yunde Jia, Song-Chun Zhu, Qing Li

The advancement of large language models (LLMs) prompts the development of multi-modal agents, which are used as a controller to call external tools, providing a feasible way to solve practical tasks.

Language Modeling Language Modelling

Consistency of Compositional Generalization across Multiple Levels

1 code implementation 18 Dec 2024 Chuanhao Li, Zhen Li, Chenchen Jing, Xiaomeng Fan, Wenbo Ye, Yuwei Wu, Yunde Jia

Compositional generalization is the capability of a model to understand novel compositions composed of seen concepts.

Meta-Learning Question Answering +2

World knowledge-enhanced Reasoning Using Instruction-guided Interactor in Autonomous Driving

no code implementations 9 Dec 2024 Mingliang Zhai, Cheng Li, Zengyuan Guo, Ningrui Yang, Xiameng Qin, Sanyuan Zhao, Junyu Han, Ji Tao, Yuwei Wu, Yunde Jia

The Multi-modal Large Language Models (MLLMs) with extensive world knowledge have revitalized autonomous driving, particularly in reasoning tasks within perceivable regions.

Autonomous Driving World Knowledge

Residual Hyperbolic Graph Convolution Networks

no code implementations 5 Dec 2024 Yangkai Xue, Jindou Dai, Zhipeng Lu, Yuwei Wu, Yunde Jia

In this paper, we propose residual hyperbolic graph convolutional networks (R-HGCNs) to address the over-smoothing problem.

FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models

no code implementations 16 Jul 2024 Pengxiang Li, Zhi Gao, Bofei Zhang, Tao Yuan, Yuwei Wu, Mehrtash Harandi, Yunde Jia, Song-Chun Zhu, Qing Li

Vision language models (VLMs) have achieved impressive progress in diverse applications, becoming a prevalent research direction.

Temporally Consistent Stereo Matching

1 code implementation 16 Jul 2024 Jiaxi Zeng, Chengtang Yao, Yuwei Wu, Yunde Jia

Based on this coherent state, we introduce a dual-space refinement module to iteratively refine the initialized result in both disparity and disparity gradient spaces, improving estimations in ill-posed regions.

Depth Estimation Stereo Matching

SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up-to-Date Internet Knowledge

no code implementations 23 May 2024 Chuanhao Li, Zhen Li, Chenchen Jing, Shuo Liu, Wenqi Shao, Yuwei Wu, Ping Luo, Yu Qiao, Kaipeng Zhang

In this paper, we propose a plug-and-play framework, for augmenting existing LVLMs in handling visual question answering (VQA) about up-to-date knowledge, dubbed SearchLVLMs.

Question Answering RAG +2

Vision Transformers for End-to-End Vision-Based Quadrotor Obstacle Avoidance

no code implementations 16 May 2024 Anish Bhattacharya, Nishanth Rao, Dhruv Parikh, Pratik Kunapuli, Yuwei Wu, Yuezhan Tao, Nikolai Matni, Vijay Kumar

We demonstrate the capabilities of an attention-based end-to-end approach for high-speed vision-based quadrotor obstacle avoidance in dense, cluttered environments, with comparison to various state-of-the-art learning architectures.

DPGAN: A Dual-Path Generative Adversarial Network for Missing Data Imputation in Graphs

no code implementations 26 Apr 2024 Xindi Zheng, Yuwei Wu, Yu Pan, WanYu Lin, Lei Ma, Jianjun Zhao

The crux of our work is that it admits both global and local representations of the input graph signal, which can capture the long-range dependencies.

Generative Adversarial Network Graph Neural Network +1

Evaluating the Performance of ChatGPT for Spam Email Detection

no code implementations 23 Feb 2024 Shijing Si, Yuwei Wu, Le Tang, Yugui Zhang, Jedrek Wosik, Qinliang Su

This study provides insights into the potential and limitations of ChatGPT for spam identification, highlighting its potential as a viable solution for resource-constrained language domains.

In-Context Learning Question Answering +1

Fine-Grained Annotation for Face Anti-Spoofing

no code implementations 12 Oct 2023 Xu Chen, Yunde Jia, Yuwei Wu

In this paper, we propose a fine-grained annotation method for face anti-spoofing.

Face Anti-Spoofing Segmentation

Neural 3D Scene Reconstruction from Multiple 2D Images without 3D Supervision

no code implementations 30 Jun 2023 Yi Guo, Che Sun, Yunde Jia, Yuwei Wu

We improve the reconstruction quality of complex geometry scene regions with sparse depth obtained by using the geometric constraints.

3D Scene Reconstruction

Fast-StrucTexT: An Efficient Hourglass Transformer with Modality-guided Dynamic Token Merge for Document Understanding

no code implementations 19 May 2023 Mingliang Zhai, Yulin Li, Xiameng Qin, Chen Yi, Qunyi Xie, Chengquan Zhang, Kun Yao, Yuwei Wu, Yunde Jia

Transformers achieve promising performance in document understanding because of their high effectiveness, but they still suffer from computational complexity that is quadratic in the sequence length.

document understanding

Exploring Data Geometry for Continual Learning

no code implementations CVPR 2023 Zhi Gao, Chen Xu, Feng Li, Yunde Jia, Mehrtash Harandi, Yuwei Wu

Our method dynamically expands the geometry of the underlying space to match growing geometric structures induced by new data, and prevents forgetting by taking geometric structures of old data into account.

Continual Learning

Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language

1 code implementation CVPR 2023 Chuanhao Li, Zhen Li, Chenchen Jing, Yunde Jia, Yuwei Wu

Compositional generalization is critical to simulate the compositional capability of humans, and has received much attention in the vision-and-language (V&L) community.

Question Answering Self-Supervised Learning +2

Parameterized Cost Volume for Stereo Matching

no code implementations ICCV 2023 Jiaxi Zeng, Chengtang Yao, Lidong Yu, Yuwei Wu, Yunde Jia

In this paper, we propose a parameterized cost volume to encode the entire disparity space using multi-Gaussian distribution.

Stereo Matching

Sparse Point Guided 3D Lane Detection

no code implementations ICCV 2023 Chengtang Yao, Lidong Yu, Yuwei Wu, Yunde Jia

The high-resolution local information brought by sparse points refines 3D lanes in the BEV space hierarchically from low resolution to high resolution.

3D Lane Detection

Primitive3D: 3D Object Dataset Synthesis from Randomly Assembled Primitives

no code implementations CVPR 2022 Xinke Li, Henghui Ding, Zekun Tong, Yuwei Wu, Yeow Meng Chee

Further study suggests that our strategy can improve the model performance through a pretraining and fine-tuning scheme, especially for small-scale datasets.

3D Object Classification Dataset Distillation +2

Primitive-based Shape Abstraction via Nonparametric Bayesian Inference

no code implementations 28 Mar 2022 Yuwei Wu, Weixiao Liu, Sipu Ruan, Gregory S. Chirikjian

In this paper, we propose a novel non-parametric Bayesian statistical method to infer an abstraction, consisting of an unknown number of geometric primitives, from a point cloud.

Bayesian Inference

Maintaining Reasoning Consistency in Compositional Visual Question Answering

1 code implementation CVPR 2022 Chenchen Jing, Yunde Jia, Yuwei Wu, Xinyu Liu, Qi Wu

Existing VQA models can answer a compositional question well, but cannot work well in terms of reasoning consistency in answering the compositional question and its sub-questions.

Question Answering Visual Question Answering

Robust and Accurate Superquadric Recovery: a Probabilistic Approach

1 code implementation CVPR 2022 Weixiao Liu, Yuwei Wu, Sipu Ruan, Gregory S. Chirikjian

Among geometric primitives, superquadrics are well known for their ability to represent a wide range of shapes with few parameters.

Personalized Response Generation via Generative Split Memory Network

1 code implementation NAACL 2021 Yuwei Wu, Xuezhe Ma, Diyi Yang

Despite the impressive successes of generation and dialogue systems, how to endow a text generation system with particular personality traits to deliver more personalized responses remains under-investigated.

Response Generation Text Generation

A Decomposition Model for Stereo Matching

1 code implementation CVPR 2021 Chengtang Yao, Yunde Jia, Huijun Di, Pengxiang Li, Yuwei Wu

In this paper, we present a decomposition model for stereo matching to solve the problem of excessive growth in computational cost (time and memory cost) as the resolution increases.

Disparity Estimation model +1

A Hyperbolic-to-Hyperbolic Graph Convolutional Network

no code implementations CVPR 2021 Jindou Dai, Yuwei Wu, Zhi Gao, Yunde Jia

Specifically, we developed a manifold-preserving graph convolution that consists of a hyperbolic feature transformation and a hyperbolic neighborhood aggregation.

General Classification Graph Classification +2

Curvature Generation in Curved Spaces for Few-Shot Learning

no code implementations ICCV 2021 Zhi Gao, Yuwei Wu, Yunde Jia, Mehrtash Harandi

Few-shot learning describes the challenging problem of recognizing samples from unseen classes given very few labeled examples.

Few-Shot Learning

SG-Net: Syntax Guided Transformer for Language Representation

no code implementations 27 Dec 2020 Zhuosheng Zhang, Yuwei Wu, Junru Zhou, Sufeng Duan, Hai Zhao, Rui Wang

In detail, for self-attention network (SAN) sponsored Transformer-based encoder, we introduce syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention.

Machine Reading Comprehension Machine Translation +2

Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene

1 code implementation 11 Aug 2020 Xinke Li, Chongshou Li, Zekun Tong, Andrew Lim, Junsong Yuan, Yuwei Wu, Jing Tang, Raymond Huang

Based on it, we formulate a hierarchical learning problem for 3D point cloud segmentation and propose a measurement evaluating consistency across various hierarchies.

Instance Segmentation Point Cloud Segmentation +3

Content-Aware Inter-Scale Cost Aggregation for Stereo Matching

no code implementations 5 Jun 2020 Chengtang Yao, Yunde Jia, Huijun Di, Yuwei Wu, Lidong Yu

In this paper, we present a content-aware inter-scale cost aggregation method that adaptively aggregates and upsamples the cost volume from coarse-scale to fine-scale by learning dynamic filter weights according to the content of the left and right views on the two scales.

Depth Estimation Stereo Matching

Semi-Supervised Models via Data Augmentation for Classifying Interactive Affective Responses

1 code implementation 23 Apr 2020 Jiaao Chen, Yuwei Wu, Diyi Yang

We present semi-supervised models with data augmentation (SMDA), a semi-supervised text classification system to classify interactive affective responses.

Data Augmentation Semi-Supervised Text Classification +2

On Isometry Robustness of Deep 3D Point Cloud Models under Adversarial Attacks

1 code implementation CVPR 2020 Yue Zhao, Yuwei Wu, Caihua Chen, Andrew Lim

Armed with the Thompson Sampling, we develop a black-box attack with success rate over 95% on ModelNet40 data set.

Thompson Sampling

Semantics-aware BERT for Language Understanding

1 code implementation 5 Sep 2019 Zhuosheng Zhang, Yuwei Wu, Hai Zhao, Zuchao Li, Shuailiang Zhang, Xi Zhou, Xiang Zhou

The latest work on language representations carefully integrates contextualized features into language model training, which enables a series of success especially in various machine reading comprehension and natural language inference tasks.

Language Modeling Language Modelling +6

DCMN+: Dual Co-Matching Network for Multi-choice Reading Comprehension

2 code implementations 30 Aug 2019 Shuailiang Zhang, Hai Zhao, Yuwei Wu, Zhuosheng Zhang, Xi Zhou, Xiang Zhou

Multi-choice reading comprehension is a challenging task to select an answer from a set of candidate options when given passage and question.

Reading Comprehension Sentence

SG-Net: Syntax-Guided Machine Reading Comprehension

1 code implementation 14 Aug 2019 Zhuosheng Zhang, Yuwei Wu, Junru Zhou, Sufeng Duan, Hai Zhao, Rui Wang

In detail, for self-attention network (SAN) sponsored Transformer-based encoder, we introduce syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention.

Language Modelling Machine Reading Comprehension +1

Stitching Videos from a Fisheye Lens Camera and a Wide-Angle Lens Camera for Telepresence Robots

no code implementations 15 Mar 2019 Yanmei Dong, Mingtao Pei, Lijia Zhang, Bin Xu, Yuwei Wu, Yunde Jia

In this paper, we propose to stitch videos from the FF-camera with a wide-angle lens and the DF-camera with a fisheye lens for telepresence robots.

distortion correction

Auto-Retoucher(ART) - A framework for Background Replacement and Image Editing

1 code implementation 13 Jan 2019 Yunxuan Xiao, Yikai Li, Yuwei Wu, LiZhen Zhu

Replacing the background and simultaneously adjusting foreground objects is a challenging task in image editing.

Image Matting Position +1

Explicit Contextual Semantics for Text Comprehension

no code implementations 8 Sep 2018 Zhuosheng Zhang, Yuwei Wu, Zuchao Li, Hai Zhao

Who did what to whom is a major focus in natural language understanding, which is precisely the aim of the semantic role labeling (SRL) task.

Machine Reading Comprehension Natural Language Understanding +1

Deep Stereo Matching with Explicit Cost Aggregation Sub-Architecture

no code implementations 12 Jan 2018 Lidong Yu, Yucheng Wang, Yuwei Wu, Yunde Jia

The cost aggregation sub-architecture is realized by a two-stream network: one for the generation of cost aggregation proposals, the other for the selection of the proposals.

Stereo Matching Stereo Matching Hand

Learning a Robust Representation via a Deep Network on Symmetric Positive Definite Manifolds

no code implementations 17 Nov 2017 Zhi Gao, Yuwei Wu, Xingyuan Bu, Yunde Jia

To this end, several new layers are introduced in our network, including a nonlinear kernel aggregation layer, an SPD matrix transformation layer, and a vectorization layer.

A Hybrid Data Association Framework for Robust Online Multi-Object Tracking

no code implementations 31 Mar 2017 Min Yang, Yuwei Wu, Yunde Jia

In this paper, we present a hybrid data association framework with a min-cost multi-commodity network flow for robust online multi-object tracking.

global-optimization Multi-Object Tracking +1
