Search Results for author: Zheng Ma

Found 60 papers, 27 papers with code

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

1 code implementation • 25 Apr 2024 • Zhe Chen, Weiyun Wang, Hao Tian, Shenglong Ye, Zhangwei Gao, Erfei Cui, Wenwen Tong, Kongzhi Hu, Jiapeng Luo, Zheng Ma, Ji Ma, Jiaqi Wang, Xiaoyi Dong, Hang Yan, Hewei Guo, Conghui He, Zhenjiang Jin, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Lewei Lu, Xizhou Zhu, Tong Lu, Dahua Lin, Yu Qiao

Compared to both open-source and proprietary models, InternVL 1.5 shows competitive performance, achieving state-of-the-art results in 8 of 18 benchmarks.

ODE-DPS: ODE-based Diffusion Posterior Sampling for Inverse Problems in Partial Differential Equation

no code implementations • 21 Apr 2024 • Enze Jiang, Jishen Peng, Zheng Ma, Xiong-bin Yan

In recent years we have witnessed a growth in mathematics for deep learning, which has been used to solve inverse problems of partial differential equations (PDEs).

Capturing Shock Waves by Relaxation Neural Networks

no code implementations • 1 Apr 2024 • Nan Zhou, Zheng Ma

In this paper, we put forward a neural network framework to solve nonlinear hyperbolic systems.

MixRED: A Mix-lingual Relation Extraction Dataset

no code implementations • 23 Mar 2024 • Lingxing Kong, Yougang Chu, Zheng Ma, Jianbing Zhang, Liang He, Jiajun Chen

Relation extraction is a critical task in the field of natural language processing with numerous real-world applications.

Relation • Relation Extraction

A Scoping Review of Energy-Efficient Driving Behaviors and Applied State-of-the-Art AI Methods

no code implementations • 4 Mar 2024 • Zhipeng Ma, Bo Nørregaard Jørgensen, Zheng Ma

However, there is no comprehensive investigation into energy-efficient driving behaviors and strategies.

Cobra Effect in Reference-Free Image Captioning Metrics

no code implementations • 18 Feb 2024 • Zheng Ma, Changxin Wang, Yawen Ouyang, Fei Zhao, Jianbing Zhang, ShuJian Huang, Jiajun Chen

If a certain metric has flaws, it will be exploited by the model and reflected in the generated sentences.

Image Captioning

A Scoping Review of Energy Load Disaggregation

no code implementations • 10 Jan 2024 • Balázs András Tolnai, Zheng Ma, Bo Nørregaard Jørgensen

Energy load disaggregation can contribute to balancing power grids by enhancing the effectiveness of demand-side management and promoting electricity-saving behavior through increased consumer awareness.

Management

Identifying Best Practice Melting Patterns in Induction Furnaces: A Data-Driven Approach Using Time Series KMeans Clustering and Multi-Criteria Decision Making

no code implementations • 9 Jan 2024 • Daniel Anthony Howard, Bo Nørregaard Jørgensen, Zheng Ma

The study successfully identified the cluster with the best performance.

Decision Making • Time Series

Panoptic Video Scene Graph Generation

3 code implementations • CVPR 2023 • Jingkang Yang, Wenxuan Peng, Xiangtai Li, Zujin Guo, Liangyu Chen, Bo Li, Zheng Ma, Kaiyang Zhou, Wayne Zhang, Chen Change Loy, Ziwei Liu

PVSG relates to the existing video scene graph generation (VidSGG) problem, which focuses on temporal interactions between humans and objects grounded with bounding boxes in videos.

Graph Generation • Panoptic Scene Graph Generation

An Unsupervised Deep Learning Approach for the Wave Equation Inverse Problem

no code implementations • 8 Nov 2023 • Xiong-bin Yan, Keke Wu, Zhi-Qin John Xu, Zheng Ma

Full-waveform inversion (FWI) is a powerful geophysical imaging technique that infers high-resolution subsurface physical parameters by solving a non-convex optimization problem.

Bayesian Inference

Bounding and Filling: A Fast and Flexible Framework for Image Captioning

1 code implementation • 15 Oct 2023 • Zheng Ma, Changxin Wang, Bo Huang, Zixuan Zhu, Jianbing Zhang

Several models adopted a non-autoregressive manner to speed up the process.

Image Captioning

Dynamic Demonstrations Controller for In-Context Learning

1 code implementation • 30 Sep 2023 • Fei Zhao, Taotian Pang, Zhen Wu, Zheng Ma, ShuJian Huang, Xinyu Dai

Previous studies have revealed that ICL is sensitive to the selection and the ordering of demonstrations.

In-Context Learning • Language Modelling

Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models

1 code implementation • 6 Aug 2023 • Zheng Ma, Mianzhi Pan, Wenhan Wu, Kanzhi Cheng, Jianbing Zhang, ShuJian Huang, Jiajun Chen

Experiments on our proposed datasets demonstrate that popular VLMs underperform in the food domain compared with their performance in the general domain.

Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model

1 code implementation • 2 Aug 2023 • Kanzhi Cheng, Wenpo Song, Zheng Ma, Wenhao Zhu, Zixuan Zhu, Jianbing Zhang

Since Vision-Language Pre-Training (VLP) models acquire a large amount of such knowledge from large-scale web-harvested data, it is promising to use the generalizability of VLP models to incorporate knowledge into image descriptions.

Hallucination • Image Captioning

ADS-Cap: A Framework for Accurate and Diverse Stylized Captioning with Unpaired Stylistic Corpora

1 code implementation • 2 Aug 2023 • Kanzhi Cheng, Zheng Ma, Shi Zong, Jianbing Zhang, Xinyu Dai, Jiajun Chen

Generating visually grounded image captions with specific linguistic styles using unpaired stylistic corpora is a challenging task, especially since we expect stylized captions with a wide variety of stylistic patterns.

Contrastive Learning • Image Captioning

Capturing the Diffusive Behavior of the Multiscale Linear Transport Equations by Asymptotic-Preserving Convolutional DeepONets

no code implementations • 28 Jun 2023 • Keke Wu, Xiong-bin Yan, Shi Jin, Zheng Ma

In this paper, we introduce two types of novel Asymptotic-Preserving Convolutional Deep Operator Networks (APCONs) designed to address the multiscale time-dependent linear transport problem.

Laplace-fPINNs: Laplace-based fractional physics-informed neural networks for solving forward and inverse problems of subdiffusion

no code implementations • 3 Apr 2023 • Xiong-bin Yan, Zhi-Qin John Xu, Zheng Ma

To address this issue, this paper proposes an extension to PINNs called Laplace-based fractional physics-informed neural networks (Laplace-fPINNs), which can effectively solve the forward and inverse problems of fractional diffusion equations.

Distribution Fitting for Combating Mode Collapse in Generative Adversarial Networks

no code implementations • 3 Dec 2022 • Yanxiang Gong, Zhiwei Xie, Guozhen Duan, Zheng Ma, Mei Xie

To address the issue, we propose a global distribution fitting (GDF) method with a penalty term to confine the generated data distribution.

Bayesian Inversion with Neural Operator (BINO) for Modeling Subdiffusion: Forward and Inverse Problems

no code implementations • 22 Nov 2022 • Xiong-bin Yan, Zhi-Qin John Xu, Zheng Ma

A large number of numerical experiments demonstrate that the operator learning method proposed in this work can efficiently solve the forward problems and Bayesian inverse problems of the subdiffusion equation.

Operator learning

Probing Cross-modal Semantics Alignment Capability from the Textual Perspective

no code implementations • 18 Oct 2022 • Zheng Ma, Shi Zong, Mianzhi Pan, Jianbing Zhang, ShuJian Huang, Xinyu Dai, Jiajun Chen

In recent years, vision and language pre-training (VLP) models have advanced the state-of-the-art results in a variety of cross-modal downstream tasks.

Image Captioning • Sentence

GEDI: A Graph-based End-to-end Data Imputation Framework

no code implementations • 13 Aug 2022 • Katrina Chen, Xiuqin Liang, Zheng Ma, Zhibin Zhang

Data imputation is an effective way to handle missing data, which is common in practical applications.

Graph structure learning • Imputation

OpenCalib: A Multi-sensor Calibration Toolbox for Autonomous Driving

1 code implementation • 27 May 2022 • Guohang Yan, Liu Zhuochun, Chengjie Wang, Chunlei Shi, Pengjin Wei, Xinyu Cai, Tao Ma, Zhizheng Liu, Zebin Zhong, Yuqian Liu, Ming Zhao, Zheng Ma, Yikang Li

To this end, we present OpenCalib, a calibration toolbox that contains a rich set of various sensor calibration methods.

Autonomous Driving

Unified Chinese License Plate Detection and Recognition with High Efficiency

1 code implementation • 7 May 2022 • Yanxiang Gong, Linjie Deng, Shuai Tao, Xinchen Lu, Peicheng Wu, Zhiwei Xie, Zheng Ma, Mei Xie

With CRPD, a unified detection and recognition network with high efficiency is presented as the baseline.

License Plate Detection • Vocal Bursts Intensity Prediction

MOD-Net: A Machine Learning Approach via Model-Operator-Data Network for Solving PDEs

no code implementations • 8 Jul 2021 • Lulu Zhang, Tao Luo, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu, Zheng Ma

In this paper, we propose a machine learning approach via model-operator-data network (MOD-Net) for solving PDEs.

An Upper Limit of Decaying Rate with Respect to Frequency in Deep Neural Network

no code implementations • 25 May 2021 • Tao Luo, Zheng Ma, Zhiwei Wang, Zhi-Qin John Xu, Yaoyu Zhang

frequency in DNN training.

Unsupervised domain adaptation via coarse-to-fine feature alignment method using contrastive learning

no code implementations • 23 Mar 2021 • Shiyu Tang, Peijun Tang, Yanxiang Gong, Zheng Ma, Mei Xie

It draws class-wise features closer than coarse feature alignment or class-wise feature alignment alone, and therefore improves the model's performance to a great extent.

Contrastive Learning • Semantic Segmentation

Network Pruning via Resource Reallocation

1 code implementation • 2 Mar 2021 • Yuenan Hou, Zheng Ma, Chunxiao Liu, Zhe Wang, Chen Change Loy

Channel pruning is broadly recognized as an effective approach to obtain a small compact model through eliminating unimportant channels from a large cumbersome network.

Network Pruning

Linear Frequency Principle Model to Understand the Absence of Overfitting in Neural Networks

no code implementations • 30 Jan 2021 • Yaoyu Zhang, Tao Luo, Zheng Ma, Zhi-Qin John Xu

Why heavily parameterized neural networks (NNs) do not overfit the data is an important long-standing open question.

Open-Ended Question Answering

Fourier-domain Variational Formulation and Its Well-posedness for Supervised Learning

no code implementations • 6 Dec 2020 • Tao Luo, Zheng Ma, Zhiwei Wang, Zhi-Qin John Xu, Yaoyu Zhang

A supervised learning problem is to find a function in a hypothesis function space given values on isolated data points.

On the exact computation of linear frequency principle dynamics and its generalization

1 code implementation • 15 Oct 2020 • Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

Recent works show an intriguing phenomenon of Frequency Principle (F-Principle) that deep neural networks (DNNs) fit the target function from low to high frequency during the training, which provides insight into the training and generalization behavior of DNNs in complex tasks.

Fully Decentralized Federated Learning Based Beamforming Design for UAV Communications

no code implementations • 27 Jul 2020 • Yue Xiao, Yu Ye, Shaocheng Huang, Li Hao, Zheng Ma, Ming Xiao, Shahid Mumtaz

To handle the data explosion in the era of the Internet of Things (IoT), it is of interest to investigate decentralized networks, with the aim of relieving the burden on the central server while preserving data privacy.

Signal Processing

Phase diagram for two-layer ReLU neural networks at infinite-width limit

1 code implementation • 15 Jul 2020 • Tao Luo, Zhi-Qin John Xu, Zheng Ma, Yaoyu Zhang

In this work, inspired by the phase diagram in statistical mechanics, we draw the phase diagram for the two-layer ReLU neural network at the infinite-width limit for a complete characterization of its dynamical regimes and their dependence on hyperparameters related to initialization.

Path Integral Based Convolution and Pooling for Graph Neural Networks

1 code implementation • NeurIPS 2020 • Zheng Ma, Junyu Xuan, Yu Guang Wang, Ming Li, Pietro Lio

Borrowing ideas from physics, we propose path integral based graph neural networks (PAN) for classification and regression tasks on graphs.

Graph Classification • Graph Regression

Inter-Region Affinity Distillation for Road Marking Segmentation

1 code implementation • CVPR 2020 • Yuenan Hou, Zheng Ma, Chunxiao Liu, Tak-Wai Hui, Chen Change Loy

We study the problem of distilling knowledge from a large deep teacher network to a much smaller student network for the task of road marking segmentation.

Ranked #1 on Semantic Segmentation on ApolloScape

Knowledge Distillation • Lane Detection

What's the relationship between CNNs and communication systems?

no code implementations • 3 Mar 2020 • Hao Ge, Xiaoguang Tu, Yanxiang Gong, Mei Xie, Zheng Ma

The interpretability of Convolutional Neural Networks (CNNs) is an important topic in the field of computer vision.

Defending from adversarial examples with a two-stream architecture

no code implementations • 30 Dec 2019 • Hao Ge, Xiaoguang Tu, Mei Xie, Zheng Ma

We demonstrate that our two-stream architecture is robust to adversarial examples built by currently known attacking algorithms.

Vocal Bursts Valence Prediction

HaarPooling: Graph Pooling with Compressive Haar Basis

no code implementations • 25 Sep 2019 • Yu Guang Wang, Ming Li, Zheng Ma, Guido Montufar, Xiaosheng Zhuang, Yanan Fan

The input of each pooling layer is transformed by the compressive Haar basis of the corresponding clustering.

Graph Classification

Haar Graph Pooling

1 code implementation • ICML 2020 • Yu Guang Wang, Ming Li, Zheng Ma, Guido Montufar, Xiaosheng Zhuang, Yanan Fan

Deep Graph Neural Networks (GNNs) are useful models for graph classification and graph-based regression tasks.

General Classification • Graph Classification

STELA: A Real-Time Scene Text Detector with Learned Anchor

1 code implementation • 17 Sep 2019 • Linjie Deng, Yanxiang Gong, Xinchen Lu, Yi Lin, Zheng Ma, Mei Xie

To achieve high coverage of target boxes, a normal strategy of conventional one-stage anchor-based detectors is to utilize multiple priors at each spatial position, especially in scene text detection tasks.

Scene Text Detection • Text Detection

Focus-Enhanced Scene Text Recognition with Deformable Convolutions

1 code implementation • 29 Aug 2019 • Linjie Deng, Yanxiang Gong, Xinchen Lu, Xin Yi, Zheng Ma, Mei Xie

Recently, scene text recognition methods based on deep learning have sprung up in the computer vision area.

Scene Text Recognition

Learning Lightweight Lane Detection CNNs by Self Attention Distillation

2 code implementations • ICCV 2019 • Yuenan Hou, Zheng Ma, Chunxiao Liu, Chen Change Loy

Training deep models for lane detection is challenging due to the very subtle and sparse supervisory signals inherent in lane annotations.

Ranked #5 on Lane Detection on BDD100K val

Knowledge Distillation • Lane Detection

Fast Haar Transforms for Graph Neural Networks

no code implementations • 10 Jul 2019 • Ming Li, Zheng Ma, Yu Guang Wang, Xiaosheng Zhuang

Graph Neural Networks (GNNs) have become a topic of intense research recently due to their powerful capability in high-dimensional classification and regression tasks for graph-structured data.

General Classification • Node Classification

Theory of the Frequency Principle for General Deep Neural Networks

1 code implementation • 21 Jun 2019 • Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

Along with fruitful applications of Deep Neural Networks (DNNs) to realistic problems, recently, some empirical studies of DNNs reported a universal phenomenon of Frequency Principle (F-Principle): a DNN tends to learn a target function from low to high frequencies during the training.

Explicitizing an Implicit Bias of the Frequency Principle in Two-layer Neural Networks

1 code implementation • 24 May 2019 • Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

It remains a puzzle why deep neural networks (DNNs), with more parameters than samples, often generalize well.

A type of generalization error induced by initialization in deep neural networks

no code implementations • 19 May 2019 • Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

Overall, our work serves as a baseline for the further investigation of the impact of initialization and loss function on the generalization of DNNs, which can potentially guide and improve the training of DNNs in practice.

PAN: Path Integral Based Convolution for Deep Graph Neural Networks

no code implementations • 24 Apr 2019 • Zheng Ma, Ming Li, Yuguang Wang

In this paper, we propose PAN, a new graph convolution framework that involves every path linking the message sender and receiver with learnable weights depending on the path length, which corresponds to the maximal entropy random walk.

3D Face Reconstruction from A Single Image Assisted by 2D Face Images in the Wild

2 code implementations • 22 Mar 2019 • Xiaoguang Tu, Jian Zhao, Zi-Hang Jiang, Yao Luo, Mei Xie, Yang Zhao, Linxiao He, Zheng Ma, Jiashi Feng

3D face reconstruction from a single 2D image is a challenging problem with broad applications.

Ranked #7 on Face Alignment on AFLW2000-3D

3D Face Reconstruction • Face Alignment

A Dual Symmetric Gauss-Seidel Alternating Direction Method of Multipliers for Hyperspectral Sparse Unmixing

no code implementations • 25 Feb 2019 • Longfei Ren, Chengjing Wang, Peipei Tang, Zheng Ma

Since sparse unmixing has emerged as a promising approach to hyperspectral unmixing, some spatial-contextual information in hyperspectral images has recently been exploited to improve unmixing performance.

Hyperspectral Unmixing

Generating Text Sequence Images for Recognition

1 code implementation • 21 Jan 2019 • Yanxiang Gong, Linjie Deng, Zheng Ma, Mei Xie

Recently, methods based on deep learning have dominated the field of text recognition.

Image-to-Image Translation • Translation

Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks

3 code implementations • 19 Jan 2019 • Zhi-Qin John Xu, Yaoyu Zhang, Tao Luo, Yanyang Xiao, Zheng Ma

We study the training process of Deep Neural Networks (DNNs) from the Fourier analysis perspective.

Deep Transfer Across Domains for Face Anti-spoofing

no code implementations • 17 Jan 2019 • Xiaoguang Tu, Hengsheng Zhang, Mei Xie, Yao Luo, Yuefei Zhang, Zheng Ma

We propose a CNN framework using sparsely labeled data from the target domain to learn features that are invariant across domains for face anti-spoofing.

Face Anti-Spoofing • Face Recognition

Learning Generalizable and Identity-Discriminative Representations for Face Anti-Spoofing

1 code implementation • 17 Jan 2019 • Xiaoguang Tu, Jian Zhao, Mei Xie, Guodong Du, Hengsheng Zhang, Jianshu Li, Zheng Ma, Jiashi Feng

Face anti-spoofing (a.k.a. presentation attack detection) has drawn growing attention due to the high-security demand in face authentication systems.

Ranked #2 on Face Anti-Spoofing on MSU-MFSD

Domain Adaptation • Face Anti-Spoofing

Enhance the Motion Cues for Face Anti-Spoofing using CNN-LSTM Architecture

no code implementations • 17 Jan 2019 • Xiaoguang Tu, Hengsheng Zhang, Mei Xie, Yao Luo, Yuefei Zhang, Zheng Ma

Spatio-temporal information is very important to capture the discriminative cues between genuine and fake faces from video sequences.

Face Anti-Spoofing • Motion Magnification

NADPEx: An on-policy temporally consistent exploration method for deep reinforcement learning

no code implementations • ICLR 2019 • Sirui Xie, Junning Huang, Lanxin Lei, Chunxiao Liu, Zheng Ma, Wei Zhang, Liang Lin

Reinforcement learning agents need exploratory behaviors to escape from local optima.

Continuous Control • reinforcement-learning

Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks

2 code implementations • 7 Nov 2018 • Yuenan Hou, Zheng Ma, Chunxiao Liu, Chen Change Loy

In this paper, we considerably improve the accuracy and robustness of predictions through heterogeneous auxiliary networks feature mimicking, a new and effective training method that provides us with much richer contextual signals apart from steering direction.

Ranked #1 on Steering Control on BDD100K val

Image Segmentation • Multi-Task Learning

Detecting Multi-Oriented Text with Corner-based Region Proposals

1 code implementation • 8 Apr 2018 • Linjie Deng, Yanxiang Gong, Yi Lin, Jingwen Shuai, Xiaoguang Tu, Yuefei Zhang, Zheng Ma, Mei Xie

Previous approaches for scene text detection usually rely on manually defined sliding windows.

Ranked #1 on Scene Text Detection on COCO-Text

Data Augmentation • Robust classification

Beyond Counting: Comparisons of Density Maps for Crowd Analysis Tasks - Counting, Detection, and Tracking

1 code implementation • 29 May 2017 • Di Kang, Zheng Ma, Antoni B. Chan

The goal of this paper is to evaluate density maps generated by density estimation methods on a variety of crowd analysis tasks, including counting, detection, and tracking.

Density Estimation • regression

Improve Sentiment Analysis of Citations with Author Modelling

no code implementations • WS 2016 • Zheng Ma, Jinseok Nam, Karsten Weihe

Sentiment Analysis

Small Instance Detection by Integer Programming on Object Density Maps

no code implementations • CVPR 2015 • Zheng Ma, Lei Yu, Antoni B. Chan

For each region, a sliding window (ROI) is passed over the density map to calculate the instance count within each ROI.

Novel Object Detection • Object

Crossing the Line: Crowd Counting by Integer Programming with Local Features

no code implementations • CVPR 2013 • Zheng Ma, Antoni B. Chan

Next, the number of people is estimated in a set of overlapping sliding windows on the temporal slice image, using a regression function that maps from local features to a count.

Crowd Counting
