Search Results for author: Zheng Ma

Found 52 papers, 26 papers with code

Panoptic Video Scene Graph Generation

3 code implementations CVPR 2023 Jingkang Yang, Wenxuan Peng, Xiangtai Li, Zujin Guo, Liangyu Chen, Bo Li, Zheng Ma, Kaiyang Zhou, Wayne Zhang, Chen Change Loy, Ziwei Liu

PVSG relates to the existing video scene graph generation (VidSGG) problem, which focuses on temporal interactions between humans and objects grounded with bounding boxes in videos.

Graph Generation Panoptic Scene Graph Generation +4

An Unsupervised Deep Learning Approach for the Wave Equation Inverse Problem

no code implementations 8 Nov 2023 Xiong-bin Yan, Keke Wu, Zhi-Qin John Xu, Zheng Ma

Full-waveform inversion (FWI) is a powerful geophysical imaging technique that infers high-resolution subsurface physical parameters by solving a non-convex optimization problem.

Bayesian Inference

Dynamic Demonstrations Controller for In-Context Learning

1 code implementation 30 Sep 2023 Fei Zhao, Taotian Pang, Zhen Wu, Zheng Ma, ShuJian Huang, Xinyu Dai

Previous studies have revealed that ICL is sensitive to the selection and the ordering of demonstrations.

Language Modelling Large Language Model

Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models

1 code implementation 6 Aug 2023 Zheng Ma, Mianzhi Pan, Wenhan Wu, Kanzhi Cheng, Jianbing Zhang, ShuJian Huang, Jiajun Chen

Experiments on our proposed datasets demonstrate that popular VLMs underperform in the food domain compared with their performance in the general domain.

Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model

1 code implementation 2 Aug 2023 Kanzhi Cheng, Wenpo Song, Zheng Ma, Wenhao Zhu, Zixuan Zhu, Jianbing Zhang

Considering that Vision-Language Pre-Training (VLP) models master massive such knowledge from large-scale web-harvested data, it is promising to utilize the generalizability of VLP models to incorporate knowledge into image descriptions.

Image Captioning Knowledge Distillation +1

ADS-Cap: A Framework for Accurate and Diverse Stylized Captioning with Unpaired Stylistic Corpora

1 code implementation 2 Aug 2023 Kanzhi Cheng, Zheng Ma, Shi Zong, Jianbing Zhang, Xinyu Dai, Jiajun Chen

Generating visually grounded image captions with specific linguistic styles using unpaired stylistic corpora is a challenging task, especially since we expect stylized captions with a wide variety of stylistic patterns.

Contrastive Learning Image Captioning

Capturing the Diffusive Behavior of the Multiscale Linear Transport Equations by Asymptotic-Preserving Convolutional DeepONets

no code implementations 28 Jun 2023 Keke Wu, Xiong-bin Yan, Shi Jin, Zheng Ma

In this paper, we introduce two types of novel Asymptotic-Preserving Convolutional Deep Operator Networks (APCONs) designed to address the multiscale time-dependent linear transport problem.

Laplace-fPINNs: Laplace-based fractional physics-informed neural networks for solving forward and inverse problems of subdiffusion

no code implementations 3 Apr 2023 Xiong-bin Yan, Zhi-Qin John Xu, Zheng Ma

To address this issue, this paper proposes an extension to PINNs called Laplace-based fractional physics-informed neural networks (Laplace-fPINNs), which can effectively solve the forward and inverse problems of fractional diffusion equations.

Distribution Fitting for Combating Mode Collapse in GANs

no code implementations 3 Dec 2022 Yanxiang Gong, Zhiwei Xie, Guozhen Duan, Zheng Ma, Mei Xie

To alleviate the problem, we propose a global distribution fitting (GDF) method by a penalty term to constrain generated data distribution.

Bayesian Inversion with Neural Operator (BINO) for Modeling Subdiffusion: Forward and Inverse Problems

no code implementations 22 Nov 2022 Xiong-bin Yan, Zhi-Qin John Xu, Zheng Ma

A large number of numerical experiments demonstrate that the operator learning method proposed in this work can efficiently solve the forward problems and Bayesian inverse problems of the subdiffusion equation.

Operator learning

Probing Cross-modal Semantics Alignment Capability from the Textual Perspective

no code implementations 18 Oct 2022 Zheng Ma, Shi Zong, Mianzhi Pan, Jianbing Zhang, ShuJian Huang, Xinyu Dai, Jiajun Chen

In recent years, vision and language pre-training (VLP) models have advanced the state-of-the-art results in a variety of cross-modal downstream tasks.

Image Captioning

GEDI: A Graph-based End-to-end Data Imputation Framework

no code implementations 13 Aug 2022 Katrina Chen, Xiuqin Liang, Zheng Ma, Zhibin Zhang

Data imputation is an effective way to handle missing data, which is common in practical applications.

Graph structure learning Imputation +1

OpenCalib: A Multi-sensor Calibration Toolbox for Autonomous Driving

1 code implementation 27 May 2022 Guohang Yan, Liu Zhuochun, Chengjie Wang, Chunlei Shi, Pengjin Wei, Xinyu Cai, Tao Ma, Zhizheng Liu, Zebin Zhong, Yuqian Liu, Ming Zhao, Zheng Ma, Yikang Li

To this end, we present OpenCalib, a calibration toolbox that contains a rich set of various sensor calibration methods.

Autonomous Driving

MOD-Net: A Machine Learning Approach via Model-Operator-Data Network for Solving PDEs

no code implementations 8 Jul 2021 Lulu Zhang, Tao Luo, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu, Zheng Ma

In this paper, we propose a machine learning approach via model-operator-data network (MOD-Net) for solving PDEs.

Unsupervised domain adaptation via coarse-to-fine feature alignment method using contrastive learning

no code implementations 23 Mar 2021 Shiyu Tang, Peijun Tang, Yanxiang Gong, Zheng Ma, Mei Xie

It draws class-wise features closer than coarse feature alignment or class-wise feature alignment alone, and therefore improves the model's performance to a great extent.

Contrastive Learning Semantic Segmentation +1

Network Pruning via Resource Reallocation

1 code implementation 2 Mar 2021 Yuenan Hou, Zheng Ma, Chunxiao Liu, Zhe Wang, Chen Change Loy

Channel pruning is broadly recognized as an effective approach to obtain a small compact model through eliminating unimportant channels from a large cumbersome network.

Network Pruning

Linear Frequency Principle Model to Understand the Absence of Overfitting in Neural Networks

no code implementations 30 Jan 2021 Yaoyu Zhang, Tao Luo, Zheng Ma, Zhi-Qin John Xu

Why heavily parameterized neural networks (NNs) do not overfit the data is an important long-standing open question.

Open-Ended Question Answering

Fourier-domain Variational Formulation and Its Well-posedness for Supervised Learning

no code implementations 6 Dec 2020 Tao Luo, Zheng Ma, Zhiwei Wang, Zhi-Qin John Xu, Yaoyu Zhang

A supervised learning problem is to find a function in a hypothesis function space given values on isolated data points.

On the exact computation of linear frequency principle dynamics and its generalization

1 code implementation 15 Oct 2020 Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

Recent works show an intriguing phenomenon of Frequency Principle (F-Principle) that deep neural networks (DNNs) fit the target function from low to high frequency during the training, which provides insight into the training and generalization behavior of DNNs in complex tasks.

Fully Decentralized Federated Learning Based Beamforming Design for UAV Communications

no code implementations 27 Jul 2020 Yue Xiao, Yu Ye, Shaocheng Huang, Li Hao, Zheng Ma, Ming Xiao, Shahid Mumtaz

To handle the data explosion in the era of the internet of things (IoT), it is of interest to investigate decentralized networks, with the aim of relaxing the burden on the central server while preserving data privacy.

Signal Processing

Phase diagram for two-layer ReLU neural networks at infinite-width limit

1 code implementation 15 Jul 2020 Tao Luo, Zhi-Qin John Xu, Zheng Ma, Yaoyu Zhang

In this work, inspired by the phase diagram in statistical mechanics, we draw the phase diagram for the two-layer ReLU neural network at the infinite-width limit for a complete characterization of its dynamical regimes and their dependence on hyperparameters related to initialization.

Path Integral Based Convolution and Pooling for Graph Neural Networks

1 code implementation NeurIPS 2020 Zheng Ma, Junyu Xuan, Yu Guang Wang, Ming Li, Pietro Lio

Borrowing ideas from physics, we propose path integral based graph neural networks (PAN) for classification and regression tasks on graphs.

Graph Classification Graph Regression +1

Inter-Region Affinity Distillation for Road Marking Segmentation

1 code implementation CVPR 2020 Yuenan Hou, Zheng Ma, Chunxiao Liu, Tak-Wai Hui, Chen Change Loy

We study the problem of distilling knowledge from a large deep teacher network to a much smaller student network for the task of road marking segmentation.

Knowledge Distillation Lane Detection +1

What's the relationship between CNNs and communication systems?

no code implementations 3 Mar 2020 Hao Ge, Xiaoguang Tu, Yanxiang Gong, Mei Xie, Zheng Ma

The interpretability of Convolutional Neural Networks (CNNs) is an important topic in the field of computer vision.

Defending from adversarial examples with a two-stream architecture

no code implementations 30 Dec 2019 Hao Ge, Xiaoguang Tu, Mei Xie, Zheng Ma

We demonstrate that our two-stream architecture is robust to adversarial examples built by currently known attacking algorithms.

Vocal Bursts Valence Prediction

Haar Graph Pooling

1 code implementation ICML 2020 Yu Guang Wang, Ming Li, Zheng Ma, Guido Montufar, Xiaosheng Zhuang, Yanan Fan

Deep Graph Neural Networks (GNNs) are useful models for graph classification and graph-based regression tasks.

General Classification Graph Classification +1

HaarPooling: Graph Pooling with Compressive Haar Basis

no code implementations 25 Sep 2019 Yu Guang Wang, Ming Li, Zheng Ma, Guido Montufar, Xiaosheng Zhuang, Yanan Fan

The input of each pooling layer is transformed by the compressive Haar basis of the corresponding clustering.

Graph Classification

STELA: A Real-Time Scene Text Detector with Learned Anchor

1 code implementation 17 Sep 2019 Linjie Deng, Yanxiang Gong, Xinchen Lu, Yi Lin, Zheng Ma, Mei Xie

To achieve high coverage of target boxes, a normal strategy of conventional one-stage anchor-based detectors is to utilize multiple priors at each spatial position, especially in scene text detection tasks.

Scene Text Detection Text Detection

Focus-Enhanced Scene Text Recognition with Deformable Convolutions

1 code implementation 29 Aug 2019 Linjie Deng, Yanxiang Gong, Xinchen Lu, Xin Yi, Zheng Ma, Mei Xie

Recently, scene text recognition methods based on deep learning have sprung up in the computer vision area.

Scene Text Recognition

Learning Lightweight Lane Detection CNNs by Self Attention Distillation

2 code implementations ICCV 2019 Yuenan Hou, Zheng Ma, Chunxiao Liu, Chen Change Loy

Training deep models for lane detection is challenging due to the very subtle and sparse supervisory signals inherent in lane annotations.

Knowledge Distillation Lane Detection +1

Fast Haar Transforms for Graph Neural Networks

no code implementations 10 Jul 2019 Ming Li, Zheng Ma, Yu Guang Wang, Xiaosheng Zhuang

Graph Neural Networks (GNNs) have become a topic of intense research recently due to their powerful capability in high-dimensional classification and regression tasks for graph-structured data.

General Classification Node Classification +1

Theory of the Frequency Principle for General Deep Neural Networks

1 code implementation 21 Jun 2019 Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

Along with fruitful applications of Deep Neural Networks (DNNs) to realistic problems, recently, some empirical studies of DNNs reported a universal phenomenon of Frequency Principle (F-Principle): a DNN tends to learn a target function from low to high frequencies during the training.

Explicitizing an Implicit Bias of the Frequency Principle in Two-layer Neural Networks

1 code implementation 24 May 2019 Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

It remains a puzzle why deep neural networks (DNNs), with more parameters than samples, often generalize well.

A type of generalization error induced by initialization in deep neural networks

no code implementations 19 May 2019 Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

Overall, our work serves as a baseline for the further investigation of the impact of initialization and loss function on the generalization of DNNs, which can potentially guide and improve the training of DNNs in practice.

PAN: Path Integral Based Convolution for Deep Graph Neural Networks

no code implementations 24 Apr 2019 Zheng Ma, Ming Li, Yuguang Wang

In this paper, we propose PAN, a new graph convolution framework that involves every path linking the message sender and receiver with learnable weights depending on the path length, which corresponds to the maximal entropy random walk.

A Dual Symmetric Gauss-Seidel Alternating Direction Method of Multipliers for Hyperspectral Sparse Unmixing

no code implementations 25 Feb 2019 Longfei Ren, Chengjing Wang, Peipei Tang, Zheng Ma

Since sparse unmixing has emerged as a promising approach to hyperspectral unmixing, some spatial-contextual information in the hyperspectral images has been exploited to improve the performance of the unmixing recently.

Hyperspectral Unmixing

Generating Text Sequence Images for Recognition

1 code implementation 21 Jan 2019 Yanxiang Gong, Linjie Deng, Zheng Ma, Mei Xie

Recently, methods based on deep learning have dominated the field of text recognition.

Image-to-Image Translation Translation

Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks

3 code implementations 19 Jan 2019 Zhi-Qin John Xu, Yaoyu Zhang, Tao Luo, Yanyang Xiao, Zheng Ma

We study the training process of Deep Neural Networks (DNNs) from the Fourier analysis perspective.

Deep Transfer Across Domains for Face Anti-spoofing

no code implementations 17 Jan 2019 Xiaoguang Tu, Hengsheng Zhang, Mei Xie, Yao Luo, Yuefei Zhang, Zheng Ma

We propose a CNN framework using sparsely labeled data from the target domain to learn features that are invariant across domains for face anti-spoofing.

Face Anti-Spoofing Face Recognition

Enhance the Motion Cues for Face Anti-Spoofing using CNN-LSTM Architecture

no code implementations 17 Jan 2019 Xiaoguang Tu, Hengsheng Zhang, Mei Xie, Yao Luo, Yuefei Zhang, Zheng Ma

Spatio-temporal information is very important to capture the discriminative cues between genuine and fake faces from video sequences.

Face Anti-Spoofing Motion Magnification

Learning Generalizable and Identity-Discriminative Representations for Face Anti-Spoofing

1 code implementation 17 Jan 2019 Xiaoguang Tu, Jian Zhao, Mei Xie, Guodong Du, Hengsheng Zhang, Jianshu Li, Zheng Ma, Jiashi Feng

Face anti-spoofing (a.k.a. presentation attack detection) has drawn growing attention due to the high-security demand in face authentication systems.

Domain Adaptation Face Anti-Spoofing +1

Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks

2 code implementations 7 Nov 2018 Yuenan Hou, Zheng Ma, Chunxiao Liu, Chen Change Loy

In this paper, we considerably improve the accuracy and robustness of predictions through heterogeneous auxiliary networks feature mimicking, a new and effective training method that provides us with much richer contextual signals apart from steering direction.

Image Segmentation Multi-Task Learning +3

Beyond Counting: Comparisons of Density Maps for Crowd Analysis Tasks - Counting, Detection, and Tracking

1 code implementation 29 May 2017 Di Kang, Zheng Ma, Antoni B. Chan

The goal of this paper is to evaluate density maps generated by density estimation methods on a variety of crowd analysis tasks, including counting, detection, and tracking.

Density Estimation regression

Small Instance Detection by Integer Programming on Object Density Maps

no code implementations CVPR 2015 Zheng Ma, Lei Yu, Antoni B. Chan

For each region, a sliding window (ROI) is passed over the density map to calculate the instance count within each ROI.

Object Detection
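
The sliding-window step mentioned in the entry above ("Small Instance Detection by Integer Programming on Object Density Maps") can be illustrated with a minimal sketch. It assumes the usual convention that a density map sums to the number of instances in any region, so the count for an ROI is the sum of the density values inside it. The function name `window_counts`, the window size, the stride, and the toy map are hypothetical and not taken from the paper; the integer-programming stage is not shown.

```python
from typing import List


def window_counts(
    density: List[List[float]], win_h: int, win_w: int, stride: int = 1
) -> List[List[float]]:
    """Sum a 2D density map inside each sliding window (ROI).

    Hypothetical helper for illustration only: each output value is the
    estimated instance count for one window position.
    """
    rows, cols = len(density), len(density[0])
    counts = []
    for top in range(0, rows - win_h + 1, stride):
        row_counts = []
        for left in range(0, cols - win_w + 1, stride):
            # The count in an ROI is the integral (here, sum) of the density.
            total = sum(
                density[r][c]
                for r in range(top, top + win_h)
                for c in range(left, left + win_w)
            )
            row_counts.append(total)
        counts.append(row_counts)
    return counts


# Toy 3x4 density map (made-up values).
density = [
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
counts = window_counts(density, win_h=2, win_w=2)
print(counts)  # [[1.0, 2.0, 1.0], [0.5, 1.0, 1.5]]
```

Overlapping windows share density mass, so the per-window counts are not independent; this sketch only computes them and does not resolve the overlaps.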

Crossing the Line: Crowd Counting by Integer Programming with Local Features

no code implementations CVPR 2013 Zheng Ma, Antoni B. Chan

Next, the number of people is estimated in a set of overlapping sliding windows on the temporal slice image, using a regression function that maps from local features to a count.

Crowd Counting
