Search Results for author: Chao Ma

Found 132 papers, 49 papers with code

A Mean Field Analysis Of Deep ResNet And Beyond: Towards Provably Optimization Via Overparameterization From Depth

no code implementations ICML 2020 Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying

Specifically, we propose a new continuum limit of deep residual networks, which enjoys a good landscape in the sense that every local minimizer is global.

OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving

no code implementations 23 Apr 2024 Guoqing Wang, Zhongdao Wang, Pin Tang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma

Existing solutions for 3D semantic occupancy prediction typically treat the task as a one-shot 3D voxel-wise segmentation perception problem.

3D Semantic Occupancy Prediction Autonomous Driving +1

SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction

no code implementations 15 Apr 2024 Pin Tang, Zhongdao Wang, Guoqing Wang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma

Vision-based perception for autonomous driving requires an explicit modeling of a 3D space, where 2D latent representations are mapped and subsequent 3D operators are applied.

Autonomous Driving

FiP: a Fixed-Point Approach for Causal Generative Modeling

no code implementations 10 Apr 2024 Meyer Scetbon, Joel Jennings, Agrin Hilmkil, Cheng Zhang, Chao Ma

Based on this, we design a two-stage causal generative model that first infers the causal order from observations in a zero-shot manner, thus by-passing the search, and then learns the generative fixed-point SCM on the ordered variables.

Frankenstein: Generating Semantic-Compositional 3D Scenes in One Tri-Plane

no code implementations 24 Mar 2024 Han Yan, Yang Li, Zhennan Wu, Shenzhou Chen, Weixuan Sun, Taizhang Shang, Weizhe Liu, Tian Chen, Xiaqiang Dai, Chao Ma, Hongdong Li, Pan Ji

We present Frankenstein, a diffusion-based framework that can generate semantic-compositional 3D scenes in a single pass.

Denoising

Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation

1 code implementation 16 Feb 2024 Ziyang Wang, Chao Ma

Medical image segmentation is increasingly reliant on deep learning techniques, yet the promising performance often comes with high annotation costs.

Cardiac Segmentation Decoder +4

Semi-Mamba-UNet: Pixel-Level Contrastive and Pixel-Level Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation

1 code implementation 11 Feb 2024 Chao Ma, Ziyang Wang

Medical image segmentation is essential in diagnostics, treatment planning, and healthcare, with deep learning offering promising advancements.

Cardiac Segmentation Contrastive Learning +4

The Essential Role of Causality in Foundation World Models for Embodied AI

no code implementations 6 Feb 2024 Tarun Gupta, Wenbo Gong, Chao Ma, Nick Pawlowski, Agrin Hilmkil, Meyer Scetbon, Marc Rigter, Ade Famoti, Ashley Juan Llorens, Jianfeng Gao, Stefan Bauer, Danica Kragic, Bernhard Schölkopf, Cheng Zhang

The study of causality lends itself to the construction of veridical world models, which are crucial for accurately predicting the outcomes of possible interactions.

Misconceptions

Vision-Informed Flow Image Super-Resolution with Quaternion Spatial Modeling and Dynamic Flow Convolution

no code implementations 29 Jan 2024 Qinglong Cao, Zhengqin Xu, Chao Ma, Xiaokang Yang, Yuntian Chen

To tackle this dilemma, we comprehensively consider the flow visual properties, including the unique flow imaging principle and morphological information, and propose the first flow visual property-informed FISR algorithm.

Image Super-Resolution

Correlation-Embedded Transformer Tracking: A Single-Branch Framework

1 code implementation 23 Jan 2024 Fei Xie, Wankou Yang, Chunyu Wang, Lei Chu, Yue Cao, Chao Ma, Wenjun Zeng

Thus, we reformulate the two-branch Siamese tracking as a conceptually simple, fully transformer-based Single-Branch Tracking pipeline, dubbed SBT.

Feature Correlation Visual Object Tracking

Understanding the Generalization Benefits of Late Learning Rate Decay

no code implementations 21 Jan 2024 Yinuo Ren, Chao Ma, Lexing Ying

Why do neural networks trained with large learning rates for a longer time often lead to better generalization?

VidToMe: Video Token Merging for Zero-Shot Video Editing

no code implementations 17 Dec 2023 Xirui Li, Chao Ma, Xiaokang Yang, Ming-Hsuan Yang

In this work, we propose a novel approach to enhance temporal consistency in generated videos by merging self-attention tokens across frames.

Video Editing Video Generation

Domain Prompt Learning with Quaternion Networks

no code implementations 12 Dec 2023 Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang

Specifically, the proposed method involves using domain-specific vision features from domain-specific foundation models to guide the transformation of generalized contextual embeddings from the language branch into a specialized space within the quaternion networks.

Contrastive Learning

ODM3D: Alleviating Foreground Sparsity for Semi-Supervised Monocular 3D Object Detection

1 code implementation 28 Oct 2023 Weijia Zhang, Dongnan Liu, Chao Ma, Weidong Cai

Monocular 3D object detection (M3OD) is a significant yet inherently challenging task in autonomous driving due to the absence of explicit depth cues in a single RGB image.

Autonomous Driving Data Augmentation +5

Towards Causal Foundation Model: on Duality between Causal Inference and Attention

no code implementations 1 Oct 2023 JiaQi Zhang, Joel Jennings, Agrin Hilmkil, Nick Pawlowski, Cheng Zhang, Chao Ma

Foundation models have brought changes to the landscape of machine learning, demonstrating sparks of human-level intelligence across a diverse array of tasks.

Causal Inference

Domain-Controlled Prompt Learning

1 code implementation 30 Sep 2023 Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang

Existing prompt learning methods often lack domain-awareness or domain-transfer mechanisms, leading to suboptimal performance due to the misinterpretation of specific images in natural image patterns.

Frame Fusion with Vehicle Motion Prediction for 3D Object Detection

no code implementations 19 Jun 2023 Xirui Li, Feng Wang, Naiyan Wang, Chao Ma

To "forward" frames, we use vehicle motion models to estimate the future pose of the bounding boxes.

3D Object Detection Future prediction +3

Reflection Invariance Learning for Few-shot Semantic Segmentation

no code implementations 1 Jun 2023 Qinglong Cao, Yuntian Chen, Chao Ma, Xiaokang Yang

Few-shot semantic segmentation (FSS) aims to segment objects of unseen classes in query images with only a few annotated support images.

Few-Shot Semantic Segmentation Segmentation +1

Few-Shot Rotation-Invariant Aerial Image Semantic Segmentation

1 code implementation 29 May 2023 Qinglong Cao, Yuntian Chen, Chao Ma, Xiaokang Yang

Few-shot aerial image segmentation is a challenging task that involves precisely parsing objects in query aerial images with limited annotated support.

Image Segmentation Segmentation +1

Understanding Causality with Large Language Models: Feasibility and Opportunities

no code implementations 11 Apr 2023 Cheng Zhang, Stefan Bauer, Paul Bennett, Jiangfeng Gao, Wenbo Gong, Agrin Hilmkil, Joel Jennings, Chao Ma, Tom Minka, Nick Pawlowski, James Vaughan

We assess the ability of large language models (LLMs) to answer causal questions by analyzing their strengths and weaknesses against three types of causal question.

Decision Making

Causal Reasoning in the Presence of Latent Confounders via Neural ADMG Learning

1 code implementation 22 Mar 2023 Matthew Ashman, Chao Ma, Agrin Hilmkil, Joel Jennings, Cheng Zhang

In this work, we further extend the existing body of work and develop a novel gradient-based approach to learning an ADMG with non-linear functional relations from observational data.

Pillar R-CNN for Point Cloud 3D Object Detection

1 code implementation 26 Feb 2023 Guangsheng Shi, Ruifeng Li, Chao Ma

The performance of point cloud 3D object detection hinges on effectively representing raw points, grid-based voxels or pillars.

3D Object Detection Autonomous Driving +1

Causal-Discovery Performance of ChatGPT in the context of Neuropathic Pain Diagnosis

no code implementations 24 Jan 2023 Ruibo Tu, Chao Ma, Cheng Zhang

ChatGPT has demonstrated exceptional proficiency in natural language conversation, e.g., it can answer a wide range of questions while no previous large language models can.

Causal Discovery

HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness

no code implementations 18 Jan 2023 Zongwei Wu, Guillaume Allibert, Fabrice Meriaudeau, Chao Ma, Cédric Demonceaux

In this paper, from a new perspective, we propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection.

Decoder object-detection +3

3D-Aware Face Swapping

no code implementations CVPR 2023 Yixuan Li, Chao Ma, Yichao Yan, Wenhan Zhu, Xiaokang Yang

To achieve this, we take advantage of the strong geometry and texture prior of 3D human faces, where the 2D faces are projected into the latent space of a 3D generative model.

Attribute Face Swapping

PlenVDB: Memory Efficient VDB-Based Radiance Fields for Fast Training and Rendering

no code implementations CVPR 2023 Han Yan, Celong Liu, Chao Ma, Xing Mei

VDB combines the advantages of sparse and dense volumes for compact data representation and efficient data access, making it a promising data structure for NeRF data interpolation and ray marching.

ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation

no code implementations ICCV 2023 Pin Tang, Hai-Ming Xu, Chao Ma

Knowledge transfer from multi-modal data, i.e., LiDAR points and images, to a single LiDAR modality can take advantage of complementary information from modal fusion while keeping single-modality inference speed, showing a promising direction for point cloud semantic segmentation in autonomous driving.

Autonomous Driving Point Cloud Segmentation +2

VideoTrack: Learning To Track Objects via Video Transformer

no code implementations CVPR 2023 Fei Xie, Lei Chu, Jiahao Li, Yan Lu, Chao Ma

Existing Siamese tracking methods, which are built on pair-wise matching between two single frames, rely heavily on additional sophisticated mechanisms to exploit temporal information among successive video frames, hindering their efficiency and industrial deployment.

Visual Tracking

SmartAssign: Learning a Smart Knowledge Assignment Strategy for Deraining and Desnowing

no code implementations CVPR 2023 Yinglong Wang, Chao Ma, Jianzhuang Liu

Extensive experiments on seven benchmark datasets verify that the proposed SmartAssign explores an effective connection between rain and snow, and noticeably improves the performance of both deraining and desnowing.

Multi-Task Learning Rain Removal

Adv-Attribute: Inconspicuous and Transferable Adversarial Attack on Face Recognition

no code implementations 13 Oct 2022 Shuai Jia, Bangjie Yin, Taiping Yao, Shouhong Ding, Chunhua Shen, Xiaokang Yang, Chao Ma

For face recognition attacks, existing methods typically generate $\ell_p$-norm perturbations on pixels, resulting in low attack transferability and high vulnerability to denoising defense models.

Adversarial Attack Attribute +2

Why self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries

no code implementations 13 Oct 2022 Chao Ma, Lexing Ying

The knowledge consists of a set of vectors in the same embedding space as the input sequence, containing the information of the language used to process the input sequence.

The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks

no code implementations 7 Oct 2022 Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli

We introduce the class of quasi-homogeneous models, which is expressive enough to describe nearly all neural networks with homogeneous activations, even those with biases, residual connections, and normalization layers, while structured enough to enable geometric analysis of its gradient dynamics.

Removing Rain Streaks via Task Transfer Learning

no code implementations 28 Aug 2022 Yinglong Wang, Chao Ma, Jianzhuang Liu

Inspired by our studies, we propose to remove rain by learning favorable deraining representations from other connected tasks.

Knowledge Distillation Rain Removal +1

H2-Stereo: High-Speed, High-Resolution Stereoscopic Video System

no code implementations 4 Aug 2022 Ming Cheng, Yiling Xu, Wang Shen, M. Salman Asif, Chao Ma, Jun Sun, Zhan Ma

We utilize a disparity network to transfer spatiotemporal information across views even in large disparity scenes, based on which, we propose disparity-guided flow-based warping for LSR-HFR view and complementary warping for HSR-LFR view.

Super-Resolution Vocal Bursts Intensity Prediction

AiATrack: Attention in Attention for Transformer Visual Tracking

1 code implementation 20 Jul 2022 Shenyuan Gao, Chunluan Zhou, Chao Ma, Xinggang Wang, Junsong Yuan

However, the independent correlation computation in the attention mechanism could result in noisy and ambiguous attention weights, which inhibits further performance improvement.

Visual Object Tracking Visual Tracking

Depth-Adapted CNNs for RGB-D Semantic Segmentation

no code implementations 8 Jun 2022 Zongwei Wu, Guillaume Allibert, Christophe Stolz, Chao Ma, Cédric Demonceaux

Recent RGB-D semantic segmentation has motivated research interest thanks to the accessibility of complementary modalities from the input side.

Segmentation Semantic Segmentation

Generalization Error Bounds for Deep Neural Networks Trained by SGD

no code implementations 7 Jun 2022 Mingze Wang, Chao Ma

Generalization error bounds for deep neural networks trained by stochastic gradient descent (SGD) are derived by combining a dynamical control of an appropriate parameter norm and the Rademacher complexity estimate based on parameter norms.

Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks

1 code implementation 5 Jun 2022 Mingze Wang, Chao Ma

The convergence of GD and SGD when training mildly parameterized neural networks starting from random initialization is studied.

Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes

no code implementations 24 Apr 2022 Chao Ma, Daniel Kunin, Lei Wu, Lexing Ying

Numerically, we observe that neural network loss functions possess a multiscale structure, manifested in two ways: (1) in a neighborhood of minima, the loss mixes a continuum of scales and grows subquadratically, and (2) in a larger region, the loss shows several separate scales clearly.

Facial Geometric Detail Recovery via Implicit Representation

1 code implementation 18 Mar 2022 Xingyu Ren, Alexandros Lattas, Baris Gecer, Jiankang Deng, Chao Ma, Xiaokang Yang, Stefanos Zafeiriou

Learning a dense 3D model with fine-scale details from a single facial image is highly challenging and ill-posed.

Provably convergent quasistatic dynamics for mean-field two-player zero-sum games

no code implementations ICLR 2022 Chao Ma, Lexing Ying

In this paper, we study the problem of finding mixed Nash equilibrium for mean-field two-player zero-sum games.

Missing Data Imputation and Acquisition with Deep Hierarchical Models and Hamiltonian Monte Carlo

1 code implementation 9 Feb 2022 Ignacio Peis, Chao Ma, José Miguel Hernández-Lobato

Our experiments show that HH-VAEM outperforms existing baselines in the tasks of missing data imputation and supervised learning with missing features.

Active Learning Imputation

Deep End-to-end Causal Inference

1 code implementation 4 Feb 2022 Tomas Geffner, Javier Antoran, Adam Foster, Wenbo Gong, Chao Ma, Emre Kiciman, Amit Sharma, Angus Lamb, Martin Kukla, Nick Pawlowski, Miltiadis Allamanis, Cheng Zhang

Causal inference is essential for data-driven decision making across domains such as business engagement, medical treatment and policy making.

Causal Discovery Causal Inference +1

End-to-End Reconstruction-Classification Learning for Face Forgery Detection

1 code implementation CVPR 2022 Junyi Cao, Chao Ma, Taiping Yao, Shen Chen, Shouhong Ding, Xiaokang Yang

Reconstruction learning over real images makes the learned representations aware of forgery patterns, even unknown ones, while classification learning takes charge of mining the essential discrepancy between real and fake images, facilitating the understanding of forgeries.

Classification Decoder

Identifiable Generative Models for Missing Not at Random Data Imputation

1 code implementation NeurIPS 2021 Chao Ma, Cheng Zhang

In this work, we fill in this gap by systematically analyzing the identifiability of generative models under MNAR.

Imputation

A Riemannian Mean Field Formulation for Two-layer Neural Networks with Batch Normalization

no code implementations 17 Oct 2021 Chao Ma, Lexing Ying

Later, the infinite-width limit of the two-layer neural networks with BN is considered, and a mean-field formulation is derived for the training dynamics.

Partial Feature Selection and Alignment for Multi-Source Domain Adaptation

no code implementations CVPR 2021 Yangye Fu, Ming Zhang, Xing Xu, Zuo Cao, Chao Ma, Yanli Ji, Kai Zuo, Huimin Lu

By assuming that the source and target domains share consistent key feature representations and identical label space, existing studies on MSDA typically utilize the entire union set of features from both the source and target domains to obtain the feature map and align the map for each category and domain.

feature selection Partial Domain Adaptation

PointAugmenting: Cross-Modal Augmentation for 3D Object Detection

no code implementations CVPR 2021 Chunwei Wang, Chao Ma, Ming Zhu, Xiaokang Yang

On one hand, PointAugmenting decorates point clouds with corresponding point-wise CNN features extracted by pretrained 2D detection models, and then performs 3D object detection over the decorated point clouds.

3D Object Detection Autonomous Driving +4

Multi-Decoding Deraining Network and Quasi-Sparsity Based Training

no code implementations CVPR 2021 Yinglong Wang, Chao Ma, Bing Zeng

In this work, we aim to exploit the intrinsic priors of rainy images and develop intrinsic loss functions to facilitate training deraining networks, which decompose a rainy image into a rain-free background layer and a rainy layer containing intact rain streaks.

Rain Removal

On Perceptual Lossy Compression: The Cost of Perceptual Reconstruction and An Optimal Training Framework

1 code implementation 5 Jun 2021 Zeyu Yan, Fei Wen, Rendong Ying, Chao Ma, Peilin Liu

This paper provides nontrivial results theoretically revealing that (1) the cost of achieving perfect perception quality is exactly a doubling of the lowest achievable MSE distortion, (2) an optimal encoder for the "classic" rate-distortion problem is also optimal for the perceptual compression problem, and (3) distortion loss is unnecessary for training a perceptual decoder.

Decoder

On Linear Stability of SGD and Input-Smoothness of Neural Networks

1 code implementation NeurIPS 2021 Chao Ma, Lexing Ying

The multiplicative structure of parameters and input data in the first layer of neural networks is explored to build connection between the landscape of the loss function with respect to parameters and the landscape of the model function with respect to input data.

Adversarial Robustness

Nonlinear Weighted Directed Acyclic Graph and A Priori Estimates for Neural Networks

no code implementations 30 Mar 2021 Yuqing Li, Tao Luo, Chao Ma

In an attempt to better understand structural benefits and generalization power of deep neural networks, we firstly present a novel graph theoretical formulation of neural network models, including fully connected, residual network (ResNet) and densely connected networks (DenseNet).

Continual Learning for Blind Image Quality Assessment

1 code implementation 19 Feb 2021 Weixia Zhang, Dingquan Li, Chao Ma, Guangtao Zhai, Xiaokang Yang, Kede Ma

In this paper, we formulate continual learning for BIQA, where a model learns continually from a stream of IQA datasets, building on what was learned from previously seen data.

Blind Image Quality Assessment Continual Learning

Achieving Adversarial Robustness Requires An Active Teacher

no code implementations 14 Dec 2020 Chao Ma, Lexing Ying

A new understanding of adversarial examples and adversarial robustness is proposed by decoupling the data generator and the label generator (which we call the teacher).

Adversarial Robustness

Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning

no code implementations 22 Nov 2020 Weixia Zhang, Chao Ma, Qi Wu, Xiaokang Yang

We then propose to recursively alternate the learning schemes of imitation and exploration to narrow the discrepancy between training and inference.

Imitation Learning Navigate +1

DEAL: Difficulty-aware Active Learning for Semantic Segmentation

1 code implementation 17 Oct 2020 Shuai Xie, Zunlei Feng, Ying Chen, Songtao Sun, Chao Ma, Mingli Song

To deal with this problem, we propose a semantic Difficulty-awarE Active Learning (DEAL) network composed of two branches: the common segmentation branch and the semantic difficulty branch.

Active Learning Segmentation +1

Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning

no code implementations NeurIPS 2020 Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, Weinan E

The result shows that (1) the escaping time of both SGD and ADAM depends on the Radon measure of the basin positively and the heaviness of gradient noise negatively; (2) for the same basin, SGD enjoys smaller escaping time than ADAM, mainly because (a) the geometry adaptation in ADAM via adaptively scaling each gradient coordinate well diminishes the anisotropic structure in gradient noise and results in larger Radon measure of a basin; (b) the exponential gradient average in ADAM smooths its gradient and leads to lighter gradient noise tails than SGD.

DistDGL: Distributed Graph Neural Network Training for Billion-Scale Graphs

1 code implementation 11 Oct 2020 Da Zheng, Chao Ma, Minjie Wang, Jinjing Zhou, Qidong Su, Xiang Song, Quan Gan, Zheng Zhang, George Karypis

To minimize the overheads associated with distributed computations, DistDGL uses a high-quality and light-weight min-cut graph partitioning algorithm along with multiple balancing constraints.

Fraud Detection graph partitioning

Interpretable Neural Computation for Real-World Compositional Visual Question Answering

no code implementations 10 Oct 2020 Ruixue Tang, Chao Ma

There are two main lines of research on visual question answering (VQA): compositional model with explicit multi-hop reasoning, and monolithic network with implicit reasoning in the latent feature space.

Question Answering Visual Question Answering

Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't

no code implementations 22 Sep 2020 Weinan E, Chao Ma, Stephan Wojtowytsch, Lei Wu

The purpose of this article is to review the achievements made in the last few years towards the understanding of the reasons behind the success and subtleties of neural network-based machine learning.

A Qualitative Study of the Dynamic Behavior for Adaptive Gradient Algorithms

no code implementations 14 Sep 2020 Chao Ma, Lei Wu, Weinan E

The dynamic behavior of RMSprop and Adam algorithms is studied through a combination of careful numerical experiments and theoretical explanations.

Complexity Measures for Neural Networks with General Activation Functions Using Path-based Norms

no code implementations 14 Sep 2020 Zhong Li, Chao Ma, Lei Wu

The approach is motivated by approximating the general activation functions with one-dimensional ReLU networks, which reduces the problem to the complexity controls of ReLU networks.

Cross-Modality 3D Object Detection

no code implementations 16 Aug 2020 Ming Zhu, Chao Ma, Pan Ji, Xiaokang Yang

In this paper, we focus on exploring the fusion of images and point clouds for 3D object detection in view of the complementary nature of the two modalities, i.e., images possess more semantic information while point clouds specialize in distance sensing.

3D Classification 3D Object Detection +4

The Slow Deterioration of the Generalization Error of the Random Feature Model

no code implementations 13 Aug 2020 Chao Ma, Lei Wu, Weinan E

The random feature model exhibits a kind of resonance behavior when the number of parameters is close to the training sample size.

Rethinking Image Deraining via Rain Streaks and Vapors

1 code implementation ECCV 2020 Yinglong Wang, Yibing Song, Chao Ma, Bing Zeng

Single image deraining regards an input image as a fusion of a background image, a transmission map, rain streaks, and atmosphere light.

Image Generation Image Restoration +1

Robust Tracking against Adversarial Attacks

2 code implementations ECCV 2020 Shuai Jia, Chao Ma, Yibing Song, Xiaokang Yang

On one hand, we add the temporal perturbations into the original video sequences as adversarial examples to greatly degrade the tracking performance.

Adversarial Attack

Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering

1 code implementation ECCV 2020 Ruixue Tang, Chao Ma, Wei Emma Zhang, Qi Wu, Xiaokang Yang

However, few works study the data augmentation problem for VQA, and none of the existing image-based augmentation schemes (such as rotation and flipping) can be directly applied to VQA due to its semantic structure -- an $\langle image, question, answer\rangle$ triplet needs to be maintained correctly.

Adversarial Attack Data Augmentation +2

The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models

1 code implementation 25 Jun 2020 Chao Ma, Lei Wu, Weinan E

A numerical and phenomenological study of the gradient descent (GD) algorithm for training two-layer neural network models is carried out for different parameter regimes when the target function can be accurately approximated by a relatively small number of neurons.

DGL-KE: Training Knowledge Graph Embeddings at Scale

1 code implementation 18 Apr 2020 Da Zheng, Xiang Song, Chao Ma, Zeyuan Tan, Zihao Ye, Jin Dong, Hao Xiong, Zheng Zhang, George Karypis

Experiments on knowledge graphs consisting of over 86M nodes and 338M edges show that DGL-KE can compute embeddings in 100 minutes on an EC2 instance with 8 GPUs and 30 minutes on an EC2 cluster with 4 machines with 48 cores/machine.

Distributed, Parallel, and Cluster Computing

A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth

no code implementations 11 Mar 2020 Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying

Specifically, we propose a new continuum limit of deep residual networks, which enjoys a good landscape in the sense that every local minimizer is global.

A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth

no code implementations ICLR Workshop DeepDiffEq 2019 Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying

Specifically, we propose a new continuum limit of deep residual networks, which enjoys a good landscape in the sense that every local minimizer is global.

Machine Learning from a Continuous Viewpoint

no code implementations 30 Dec 2019 Weinan E, Chao Ma, Lei Wu

We demonstrate that conventional machine learning models and algorithms, such as the random feature model, the two-layer neural network model and the residual neural network model, can all be recovered (in a scaled form) as particular discretizations of different continuous formulations.

BIG-bench Machine Learning

The Generalization Error of the Minimum-norm Solutions for Over-parameterized Neural Networks

no code implementations 15 Dec 2019 Weinan E, Chao Ma, Lei Wu

We study the generalization properties of minimum-norm solutions for three over-parametrized machine learning models including the random feature model, the two-layer neural network model and the residual network model.

BIG-bench Machine Learning

Deep Image Deraining Via Intrinsic Rainy Image Priors and Multi-scale Auxiliary Decoding

no code implementations 25 Nov 2019 Yinglong Wang, Chao Ma, Bing Zeng

Different rain models and novel network structures have been proposed to remove rain streaks from single rainy images.

Rain Removal

Global Convergence of Gradient Descent for Deep Linear Residual Networks

no code implementations NeurIPS 2019 Lei Wu, Qingcan Wang, Chao Ma

We analyze the global convergence of gradient descent for deep linear residual networks by proposing a new initialization: zero-asymmetric (ZAS) initialization.

Semi-supervised 3D Face Reconstruction with Nonlinear Disentangled Representations

no code implementations 25 Sep 2019 Zhongpai Gao, Juyong Zhang, Yudong Guo, Chao Ma, Guangtao Zhai, Xiaokang Yang

Moreover, the identity and expression representations are entangled in these models, which hinders many facial editing applications.

3D Face Reconstruction Facial Editing

Real-Time Correlation Tracking via Joint Model Compression and Transfer

1 code implementation 23 Jul 2019 Ning Wang, Wengang Zhou, Yibing Song, Chao Ma, Houqiang Li

In the distillation process, we propose a fidelity loss to enable the student network to maintain the representation capability of the teacher network.

Computational Efficiency Image Classification +4

Deep Single Image Deraining Via Estimating Transmission and Atmospheric Light in rainy Scenes

no code implementations 22 Jun 2019 Yinglong Wang, Qinfeng Shi, Ehsan Abbasnejad, Chao Ma, Xiaoping Ma, Bing Zeng

Instead of using the estimated atmospheric light directly to learn a network to calculate transmission, we utilize it as ground truth and design a simple but novel triangle-shaped network structure to learn atmospheric light for every rainy image, then fine-tune the network to obtain a better estimation of atmospheric light during the training of transmission network.

Single Image Deraining

The Barron Space and the Flow-induced Function Spaces for Neural Network Models

no code implementations 18 Jun 2019 Weinan E, Chao Ma, Lei Wu

We define the Barron space and show that it is the right space for two-layer neural network models in the sense that optimal direct and inverse approximation theorems hold for functions in the Barron space.

BIG-bench Machine Learning

A Priori Estimates of the Generalization Error for Two-layer Neural Networks

no code implementations ICLR 2019 Lei Wu, Chao Ma, Weinan E

These new estimates are a priori in nature in the sense that the bounds depend only on some norms of the underlying functions to be fitted, not the parameters in the model.

Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections

no code implementations 10 Apr 2019 Weinan E, Chao Ma, Qingcan Wang, Lei Wu

In addition, it is also shown that the GD path is uniformly close to the functions given by the related random feature model.

A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics

no code implementations 8 Apr 2019 Weinan E, Chao Ma, Lei Wu

In the over-parametrized regime, it is shown that gradient descent dynamics can achieve zero training loss exponentially fast regardless of the quality of the labels.

Target-Aware Deep Tracking

no code implementations CVPR 2019 Xin Li, Chao Ma, Baoyuan Wu, Zhenyu He, Ming-Hsuan Yang

Despite demonstrated successes for numerous vision tasks, the contributions of using pre-trained deep features for visual tracking are not as significant as those for object recognition.

Object Object Recognition +1

Depth-Aware Video Frame Interpolation

5 code implementations CVPR 2019 Wenbo Bao, Wei-Sheng Lai, Chao Ma, Xiaoyun Zhang, Zhiyong Gao, Ming-Hsuan Yang

The proposed model then warps the input frames, depth maps, and contextual features based on the optical flow and local interpolation kernels for synthesizing the output frame.

Optical Flow Estimation Video Frame Interpolation

A Priori Estimates of the Population Risk for Residual Networks

no code implementations 6 Mar 2019 Weinan E, Chao Ma, Qingcan Wang

An important part of the regularized model is the usage of a new path norm, called the weighted path norm, as the regularization term.

How SGD Selects the Global Minima in Over-parameterized Learning: A Dynamical Stability Perspective

1 code implementation NeurIPS 2018 Lei Wu, Chao Ma, Weinan E

The question of which global minima are accessible by a stochastic gradient descent (SGD) algorithm with specific learning rate and batch size is studied from the perspective of dynamical stability.

A Priori Estimates of the Population Risk for Two-layer Neural Networks

no code implementations ICLR 2019 Weinan E, Chao Ma, Lei Wu

New estimates for the population risk are established for two-layer neural networks.

Deep Attentive Tracking via Reciprocative Learning

no code implementations NeurIPS 2018 Shi Pu, Yibing Song, Chao Ma, Honggang Zhang, Ming-Hsuan Yang

Visual attention, derived from cognitive neuroscience, facilitates human perception on the most pertinent subset of the sensory data.

Visual Tracking

Person-Job Fit: Adapting the Right Talent for the Right Job with Joint Representation Learning

no code implementations 8 Oct 2018 Chen Zhu, HengShu Zhu, Hui Xiong, Chao Ma, Fang Xie, Pengliang Ding, Pan Li

To this end, in this paper, we propose a novel end-to-end data-driven model based on Convolutional Neural Network (CNN), namely Person-Job Fit Neural Network (PJFNN), for matching a talent qualification to the requirements of a job.

Data Visualization Representation Learning

EDDI: Efficient Dynamic Discovery of High-Value Information with Partial VAE

1 code implementation ICLR 2019 Chao Ma, Sebastian Tschiatschek, Konstantina Palla, José Miguel Hernández-Lobato, Sebastian Nowozin, Cheng Zhang

Many real-life decision-making situations allow further relevant information to be acquired at a specific cost, for example, in assessing the health status of a patient we may decide to take additional measurements such as diagnostic tests or imaging scans before making a final assessment.

Decision Making Experimental Design +1

Deep Regression Tracking with Shrinkage Loss

1 code implementation ECCV 2018 Xiankai Lu, Chao Ma, Bingbing Ni, Xiaokang Yang, Ian Reid, Ming-Hsuan Yang

Regression trackers directly learn a mapping from regularly dense samples of target objects to soft labels, which are usually generated by a Gaussian function, to estimate target positions.

regression

Model Reduction with Memory and the Machine Learning of Dynamical Systems

no code implementations 10 Aug 2018 Chao Ma, Jianchun Wang, Weinan E

The well-known Mori-Zwanzig theory tells us that model reduction leads to memory effects.

BIG-bench Machine Learning

Joint Neural Entity Disambiguation with Output Space Search

no code implementations COLING 2018 Hamed Shahbazi, Xiaoli Z. Fern, Reza Ghaeini, Chao Ma, Rasha Obeidat, Prasad Tadepalli

In this paper, we present a novel model for entity disambiguation that combines both local contextual information and global evidence through Limited Discrepancy Search (LDS).

Entity Disambiguation

Variational Implicit Processes

1 code implementation 6 Jun 2018 Chao Ma, Yingzhen Li, José Miguel Hernández-Lobato

We introduce implicit processes (IPs), stochastic processes that place implicitly defined multivariate distributions over any finite collection of random variables.

Gaussian Processes Stochastic Optimization

VITAL: VIsual Tracking via Adversarial Learning

no code implementations CVPR 2018 Yibing Song, Chao Ma, Xiaohe Wu, Lijun Gong, Linchao Bao, WangMeng Zuo, Chunhua Shen, Rynson Lau, Ming-Hsuan Yang

To augment positive samples, we use a generative network to randomly generate masks, which are applied to adaptively drop out input features to capture a variety of appearance changes.

General Classification Visual Tracking

CREST: Convolutional Residual Learning for Visual Tracking

no code implementations ICCV 2017 Yibing Song, Chao Ma, Lijun Gong, Jiawei Zhang, Rynson Lau, Ming-Hsuan Yang

Our method integrates feature extraction, response map generation, and model update into the neural network for end-to-end training.

Visual Tracking

Visual Question Answering with Memory-Augmented Networks

no code implementations CVPR 2018 Chao Ma, Chunhua Shen, Anthony Dick, Qi Wu, Peng Wang, Anton Van Den Hengel, Ian Reid

In this paper, we exploit a memory-augmented neural network to predict accurate answers to visual questions, even when those answers occur rarely in the training set.

Question Answering Visual Question Answering

Robust Visual Tracking via Hierarchical Convolutional Features

1 code implementation 12 Jul 2017 Chao Ma, Jia-Bin Huang, Xiaokang Yang, Ming-Hsuan Yang

Specifically, we learn adaptive correlation filters on the outputs from each convolutional layer to encode the target appearance.

Object Recognition Visual Tracking

Adaptive Correlation Filters with Long-Term and Short-Term Memory for Object Tracking

1 code implementation 7 Jul 2017 Chao Ma, Jia-Bin Huang, Xiaokang Yang, Ming-Hsuan Yang

Second, we learn a correlation filter over a feature pyramid centered at the estimated target position for predicting scale changes.

Object Tracking Position

Person Re-Identification via Recurrent Feature Aggregation

1 code implementation 23 Jan 2017 Yichao Yan, Bingbing Ni, Zhichao Song, Chao Ma, Yan Yan, Xiaokang Yang

We address the person re-identification problem by effectively exploiting a globally discriminative feature representation from a sequence of tracked human regions/patches.

Patch Matching Person Re-Identification

Learning a No-Reference Quality Metric for Single-Image Super-Resolution

2 code implementations 18 Dec 2016 Chao Ma, Chih-Yuan Yang, Xiaokang Yang, Ming-Hsuan Yang

Numerous single-image super-resolution algorithms have been proposed in the literature, but few studies address the problem of performance evaluation based on visual perception.

Image Super-Resolution Video Quality Assessment

Deep Extreme Feature Extraction: New MVA Method for Searching Particles in High Energy Physics

no code implementations 24 Mar 2016 Chao Ma, Tianchenghou, Bin Lan, Jinhui Xu, Zhenhua Zhang

Experimental data show that DEFE is able to train an ensemble of discriminative feature learners that boosts the overperformance of the final prediction.

Ensemble Learning

Hierarchical Convolutional Features for Visual Tracking

no code implementations ICCV 2015 Chao Ma, Jia-Bin Huang, Xiaokang Yang, Ming-Hsuan Yang

The outputs of the last convolutional layers encode the semantic information of targets and such representations are robust to significant appearance variations.

Object Recognition Visual Object Tracking +1

Long-Term Correlation Tracking

no code implementations CVPR 2015 Chao Ma, Xiaokang Yang, Chongyang Zhang, Ming-Hsuan Yang

In this paper, we address the problem of long-term visual tracking where the target objects undergo significant appearance variation due to deformation, abrupt motion, heavy occlusion, and moving out of view.

Translation Visual Tracking
