Search Results for author: Min Wang

Found 50 papers, 16 papers with code

Unimodal and Crossmodal Refinement Network for Multimodal Sequence Fusion

no code implementations EMNLP 2021 Xiaobao Guo, Adams Kong, Huan Zhou, Xianfeng Wang, Min Wang

Specifically, to improve unimodal representations, a unimodal refinement module is designed to refine modality-specific learning via iteratively updating the distribution with transformer-based attention layers.

Representation Learning

GaussNav: Gaussian Splatting for Visual Navigation

no code implementations 18 Mar 2024 Xiaohan Lei, Min Wang, Wengang Zhou, Houqiang Li

In embodied vision, Instance ImageGoal Navigation (IIN) requires an agent to locate a specific object depicted in a goal image within an unexplored environment.

Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction

no code implementations 18 Mar 2024 Zhiyang Guo, Wengang Zhou, Li Li, Min Wang, Houqiang Li

To address the above problem, we propose a novel motion-aware enhancement framework for dynamic scene reconstruction, which mines useful motion cues from optical flow to improve different paradigms of dynamic 3DGS.

DEMOS: Dynamic Environment Motion Synthesis in 3D Scenes via Local Spherical-BEV Perception

no code implementations 4 Mar 2024 Jingyu Gong, Min Wang, Wentao Liu, Chen Qian, Zhizhong Zhang, Yuan Xie, Lizhuang Ma

To handle this problem, we propose the first Dynamic Environment MOtion Synthesis framework (DEMOS) to predict future motion instantly according to the current scene, and use it to dynamically update the latent motion for final motion synthesis.

motion prediction Motion Synthesis

Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval

no code implementations 3 Mar 2024 Yongchao Du, Min Wang, Wengang Zhou, Shuping Hui, Houqiang Li

To tackle the above problems, we propose Image2Sentence based Asymmetric zero-shot composed image retrieval (ISA), which takes advantage of the VL model and only relies on unlabeled images for composition learning.

Image Retrieval Language Modelling +2

Structure Similarity Preservation Learning for Asymmetric Image Retrieval

1 code implementation 1 Mar 2024 Hui Wu, Min Wang, Wengang Zhou, Houqiang Li

The centroid vectors in the quantizer serve as anchor points in the embedding space of the gallery model to characterize its structure.

Image Retrieval Retrieval

Asymmetric Feature Fusion for Image Retrieval

no code implementations CVPR 2023 Hui Wu, Min Wang, Wengang Zhou, Zhenbo Lu, Houqiang Li

Then, a dynamic mixer is introduced to aggregate these features into a compact embedding for efficient search.

Image Retrieval Retrieval

Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation

no code implementations 25 Feb 2024 Xiaohan Lei, Min Wang, Wengang Zhou, Li Li, Houqiang Li

In this work, we propose to imitate the human behaviour of "getting closer to confirm" when distinguishing objects from a distance.

Navigate

Self-distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach

1 code implementation 17 Aug 2023 Ziyin Zhang, Ning Lu, Minghui Liao, Yongshuai Huang, Cheng Li, Min Wang, Wei Peng

It incorporates a framewise regularization term in CTC loss to emphasize individual supervision, and leverages the maximizing-a-posteriori of latent alignment to solve the inconsistency problem that arises in distillation between CTC-based models.

Automatic Deduction Path Learning via Reinforcement Learning with Environmental Correction

no code implementations 16 Jun 2023 Shuai Xiao, Chen Pan, Min Wang, Xinxin Zhu, Siqiao Xue, Jing Wang, Yunhua Hu, James Zhang, Jinghua Feng

To this end, we formulate the problem as a partially observable Markov decision problem (POMDP) and employ an environment correction algorithm based on the characteristics of the business.

Hierarchical Reinforcement Learning reinforcement-learning

SDTracker: Synthetic Data Based Multi-Object Tracking

no code implementations 26 Mar 2023 Yingda Guan, Zhengyang Feng, Huiying Chang, Kuo Du, TingTing Li, Min Wang

We present SDTracker, a method that harnesses the potential of synthetic data for multi-object tracking of real-world scenes in a domain generalization and semi-supervised fashion.

Domain Generalization Multi-Object Tracking +1

HandNeRF: Neural Radiance Fields for Animatable Interacting Hands

no code implementations CVPR 2023 Zhiyang Guo, Wengang Zhou, Min Wang, Li Li, Houqiang Li

We propose a novel framework to reconstruct accurate appearance and geometry with neural radiance fields (NeRF) for interacting hands, enabling the rendering of photo-realistic images and videos for gesture animation from arbitrary views.

An Order-Invariant and Interpretable Hierarchical Dilated Convolution Neural Network for Chemical Fault Detection and Diagnosis

no code implementations 13 Feb 2023 Mengxuan Li, Peng Peng, Min Wang, Hongwei Wang

The novelty of HDLCNN lies in its capability of processing tabular data with features of arbitrary order without seeking the optimal order, due to the ability to agglomerate correlated features of feature clustering and the large receptive field of dilated convolution.

Chemical Process Clustering +1

FashionVQA: A Domain-Specific Visual Question Answering System

no code implementations 24 Aug 2022 Min Wang, Ata Mahjoubfar, Anupama Joshi

We see that using the same transformer for encoding the question and decoding the answer, as in language models, achieves maximum accuracy, showing that visual language models (VLMs) make the best visual question answering systems for our dataset.

Question Answering Visual Question Answering

Unsupervisedly Prompting AlphaFold2 for Few-Shot Learning of Accurate Folding Landscape and Protein Structure Prediction

2 code implementations 20 Aug 2022 Jun Zhang, Sirui Liu, Mengyun Chen, Haotian Chu, Min Wang, Zidong Wang, Jialiang Yu, Ningxi Ni, Fan Yu, Diqing Chen, Yi Isaac Yang, Boxin Xue, Lijiang Yang, YuAn Liu, Yi Qin Gao

Data-driven predictive methods which can efficiently and accurately transform protein sequences into biologically active structures are highly valuable for scientific research and medical development.

Denoising Few-Shot Learning +2

A Universal PINNs Method for Solving Partial Differential Equations with a Point Source

1 code implementation Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence 2022 Xiang Huang, Hongsheng Liu, Beiji Shi, Zidong Wang, Kang Yang, Yang Li, Min Wang, Haotian Chu, Jing Zhou, Fan Yu, Bei Hua, Bin Dong, Lei Chen

In recent years, deep learning technology has been used to solve partial differential equations (PDEs), among which the physics-informed neural networks (PINNs) method has emerged as a promising method for solving both forward and inverse PDE problems.

Deeply Supervised Skin Lesions Diagnosis with Stage and Branch Attention

2 code implementations 9 May 2022 Wei Dai, Rui Liu, Tianyi Wu, Min Wang, Jianqin Yin, Jun Liu

Visual features of skin lesions vary significantly because the images are collected from patients with different lesion colours and morphologies by using dissimilar imaging equipment.

Classification

Do Inpainting Yourself: Generative Facial Inpainting Guided by Exemplars

1 code implementation 13 Feb 2022 Wanglong Lu, Hanli Zhao, Xianta Jiang, Xiaogang Jin, YongLiang Yang, Min Wang, Jiankai Lyu, Kaijie Shi

We introduce a novel attribute similarity metric to encourage networks to learn the style of facial attributes from the exemplar in a self-supervised way.

Attribute Facial Inpainting

Rethinking Feature Uncertainty in Stochastic Neural Networks for Adversarial Robustness

no code implementations 1 Jan 2022 Hao Yang, Min Wang, Zhengfei Yu, Yun Zhou

Extensive experiments on well-known white- and black-box attacks show that MFDV-SNN achieves a significant improvement over existing methods, which indicates that it is a simple but effective method to improve model robustness.

Adversarial Robustness

Contextual Similarity Distillation for Asymmetric Image Retrieval

no code implementations CVPR 2022 Hui Wu, Min Wang, Wengang Zhou, Houqiang Li, Qi Tian

To this end, we propose a flexible contextual similarity distillation framework to enhance the small query model and keep its output feature compatible with that of the large gallery model, which is crucial for asymmetric retrieval.

Image Retrieval Retrieval

Learning Token-based Representation for Image Retrieval

1 code implementation 12 Dec 2021 Hui Wu, Min Wang, Wengang Zhou, Yang Hu, Houqiang Li

Next, a refinement block is introduced to enhance the visual tokens with self-attention and cross-attention.

Image Retrieval Retrieval

Meta-Auto-Decoder for Solving Parametric Partial Differential Equations

no code implementations 15 Nov 2021 Xiang Huang, Zhanhong Ye, Hongsheng Liu, Beiji Shi, Zidong Wang, Kang Yang, Yang Li, Bingya Weng, Min Wang, Haotian Chu, Fan Yu, Bei Hua, Lei Chen, Bin Dong

Many important problems in science and engineering require solving the so-called parametric partial differential equations (PDEs), i.e., PDEs with different physical parameters, boundary conditions, shapes of computation domains, etc.

Meta-Learning

Solving Partial Differential Equations with Point Source Based on Physics-Informed Neural Networks

no code implementations 2 Nov 2021 Xiang Huang, Hongsheng Liu, Beiji Shi, Zidong Wang, Kang Yang, Yang Li, Bingya Weng, Min Wang, Haotian Chu, Jing Zhou, Fan Yu, Bei Hua, Lei Chen, Bin Dong

In recent years, deep learning technology has been used to solve partial differential equations (PDEs), among which physics-informed neural networks (PINNs) have emerged as a promising method for solving both forward and inverse PDE problems.

Contextual Similarity Aggregation with Self-attention for Visual Re-ranking

1 code implementation NeurIPS 2021 Jianbo Ouyang, Hui Wu, Min Wang, Wengang Zhou, Houqiang Li

Since our re-ranking model is not directly involved with the visual feature used in the initial retrieval, it is ready to be applied to retrieval result lists obtained from various retrieval algorithms.

Content-Based Image Retrieval Data Augmentation +2

DynSTGAT: Dynamic Spatial-Temporal Graph Attention Network for Traffic Signal Control

no code implementations 12 Sep 2021 Libing Wu, Min Wang, Dan Wu, Jia Wu

Then, to efficiently utilize the historical state information of the intersection, we design a sequence model with the temporal convolutional network (TCN) to capture the historical information and further merge it with the spatial information to improve its performance.

Graph Attention

The 2nd Anti-UAV Workshop & Challenge: Methods and Results

no code implementations 23 Aug 2021 Jian Zhao, Gang Wang, Jianan Li, Lei Jin, Nana Fan, Min Wang, Xiaojuan Wang, Ting Yong, Yafeng Deng, Yandong Guo, Shiming Ge, Guodong Guo

The 2nd Anti-UAV Workshop & Challenge aims to encourage research in developing novel and accurate methods for multi-scale object tracking.

Object Tracking

SKFAC: Training Neural Networks With Faster Kronecker-Factored Approximate Curvature

1 code implementation CVPR 2021 Zedong Tang, Fenlong Jiang, Maoguo Gong, Hao Li, Yue Wu, Fan Yu, Zidong Wang, Min Wang

For the fully connected layers, by utilizing the low-rank property of the Kronecker factors of the Fisher information matrix, our method only requires inverting a small matrix to approximate the curvature with desirable accuracy.

Dimensionality Reduction

The Origin of Corporate Control Power

no code implementations 3 Jun 2021 Jie He, Min Wang

How does the control power of corporate shareholders arise?

SKFAC: Training Neural Networks with Faster Kronecker-Factored Approximate Curvature

1 code implementation Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2021 Zedong Tang, Fenlong Jiang, Maoguo Gong, Hao Li, Yue Wu, Fan Yu, Zidong Wang, Min Wang

For the fully connected layers, by utilizing the low-rank property of the Kronecker factors of the Fisher information matrix, our method only requires inverting a small matrix to approximate the curvature with desirable accuracy.

Dimensionality Reduction

THOR, Trace-based Hardware-adaptive layer-ORiented Natural Gradient Descent Computation

no code implementations AAAI Technical Track on Machine Learning 2021 Mengyun Chen, Kaixin Gao, Xiaolei Liu, Zidong Wang, Ningxi Ni, Qian Zhang, Lei Chen, Chao Ding, ZhengHai Huang, Min Wang, Shuangling Wang, Fan Yu, Xinyuan Zhao, Dachuan Xu

It is well known that second-order optimizers can accelerate the training of deep neural networks; however, the huge computation cost of second-order optimization makes it impractical to apply in practice.

IUP: An Intelligent Utility Prediction Scheme for Solid-State Fermentation in 5G IoT

no code implementations 28 Mar 2021 Min Wang, Shanchen Pang, Tong Ding, Sibo Qiao, Xue Zhai, Shuo Wang, Neal N. Xiong, Zhengwen Huang

In addition, we design a utility prediction model for SSF based on the Generative Adversarial Networks (GAN) and Fully Connected Neural Network (FCNN).

Few-Shot Learning

A Priori Generalization Analysis of the Deep Ritz Method for Solving High Dimensional Elliptic Equations

no code implementations 5 Jan 2021 Jianfeng Lu, Yulong Lu, Min Wang

This paper concerns the a priori generalization analysis of the Deep Ritz Method (DRM) [W. E and B. Yu, 2017], a popular neural-network-based method for solving high dimensional partial differential equations.

Learning Deep Local Features With Multiple Dynamic Attentions for Large-Scale Image Retrieval

1 code implementation ICCV 2021 Hui Wu, Min Wang, Wengang Zhou, Houqiang Li

To this end, we propose a novel deep local feature learning architecture to simultaneously focus on multiple discriminative local patterns in an image.

Image Retrieval Metric Learning +1

AsymptoticNG: A regularized natural gradient optimization algorithm with look-ahead strategy

no code implementations 24 Dec 2020 Zedong Tang, Fenlong Jiang, Junke Song, Maoguo Gong, Hao Li, Fan Yu, Zidong Wang, Min Wang

Optimizers that further adjust the scale of the gradient, such as Adam and Natural Gradient (NG), despite being widely studied and used by the community, are often found to generalize poorly compared with Stochastic Gradient Descent (SGD).

Eigenvalue-corrected Natural Gradient Based on a New Approximation

no code implementations 27 Nov 2020 Kai-Xin Gao, Xiao-Lei Liu, Zheng-Hai Huang, Min Wang, Shuangling Wang, Zidong Wang, Dachuan Xu, Fan Yu

Using second-order optimization methods for training deep neural networks (DNNs) has attracted many researchers.

A Trace-restricted Kronecker-Factored Approximation to Natural Gradient

no code implementations 21 Nov 2020 Kai-Xin Gao, Xiao-Lei Liu, Zheng-Hai Huang, Min Wang, Zidong Wang, Dachuan Xu, Fan Yu

There have been many attempts to use second-order optimization methods for training deep neural networks.

Empirical distributions of the robustified $t$-test statistics

no code implementations 6 Jul 2018 Chanseok Park, Min Wang

Based on the median and the median absolute deviation estimators, and the Hodges-Lehmann and Shamos estimators, robustified analogues of the conventional $t$-test statistic are proposed.

Applications

Weight-importance sparse training in keyword spotting

no code implementations 2 Jul 2018 Sihao Xue, Zhenyi Ying, Fan Mo, Min Wang, Jue Sun

Besides this, ASR systems are most often used to deal with real-time problems such as keyword spotting (KWS).

Keyword Spotting speech-recognition +1

Deep Multiscale Model Learning

no code implementations 13 Jun 2018 Yating Wang, Siu Wun Cheung, Eric T. Chung, Yalchin Efendiev, Min Wang

Numerical results show that using deep learning and multiscale models, we can improve the forward models, which are conditioned to the available data.

DRPose3D: Depth Ranking in 3D Human Pose Estimation

no code implementations 23 May 2018 Min Wang, Xipeng Chen, Wentao Liu, Chen Qian, Liang Lin, Lizhuang Ma

In this paper, we propose a two-stage depth ranking based method (DRPose3D) to tackle the problem of 3D human pose estimation.

3D Human Pose Estimation 3D Pose Estimation

YNUDLG at IJCNLP-2017 Task 5: A CNN-LSTM Model with Attention for Multi-choice Question Answering in Examinations

no code implementations IJCNLP 2017 Min Wang, Qingxun Liu, Peng Ding, Yongbin Li, Xiaobing Zhou

In this paper, we first use convolutional neural networks (CNNs) to learn the joint representations of question-answer pairs, then use these joint representations as inputs to a long short-term memory (LSTM) network with attention to learn the answer sequence of a question for labeling the matching quality of each answer.

Question Answering valid

Design of Efficient Convolutional Layers using Single Intra-channel Convolution, Topological Subdivisioning and Spatial "Bottleneck" Structure

1 code implementation 15 Aug 2016 Min Wang, Baoyuan Liu, Hassan Foroosh

A topological subdivisioning is adopted to reduce the connection between the input channels and output channels.

Sparse Convolutional Neural Networks

no code implementations CVPR 2015 Baoyuan Liu, Min Wang, Hassan Foroosh, Marshall Tappen, Marianna Pensky

Deep neural networks have achieved remarkable performance in both image classification and object detection problems, at the cost of a large number of parameters and computational complexity.

Image Classification object-detection +1
