Search Results for author: Zhiyuan Li

Found 72 papers, 22 papers with code

Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent

no code implementations 17 Apr 2024 Wei Chen, Zhiyuan Li

A multimodal AI agent is characterized by its ability to process and learn from various types of data, including natural language, visual, and audio inputs, to inform its actions.

Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization

no code implementations 5 Apr 2024 Shuo Xie, Zhiyuan Li

Adam with decoupled weight decay, also known as AdamW, is widely acclaimed for its superior performance in language modeling tasks, surpassing Adam with $\ell_2$ regularization in terms of generalization and optimization.

Language Modelling
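
To make the decoupling concrete, below is a minimal NumPy sketch (not the paper's code; hyperparameters are illustrative defaults) contrasting Adam with $\ell_2$ regularization against AdamW's decoupled weight decay.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8,
              wd=0.0, decoupled=True):
    """One Adam / AdamW step on parameter w with gradient g.

    decoupled=False: l2 regularization -- wd * w is folded into the
    gradient, so the penalty is rescaled by the adaptive preconditioner.
    decoupled=True (AdamW): the decay is applied directly to w and
    bypasses the preconditioner entirely.
    """
    if not decoupled:
        g = g + wd * w                      # l2 penalty enters the moments
    m = b1 * m + (1 - b1) * g               # first-moment EMA
    v = b2 * v + (1 - b2) * g * g           # second-moment EMA
    m_hat = m / (1 - b1 ** t)               # bias corrections
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    if decoupled:
        w = w - lr * wd * w                 # decoupled decay (AdamW)
    return w, m, v
```

Because the preconditioned update is approximately sign-like per coordinate, the decoupled decay pulls every coordinate toward a common scale, which is the intuition (per the abstract) behind viewing AdamW as $\ell_\infty$-norm constrained optimization.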

Octopus: On-device language model for function calling of software APIs

no code implementations 2 Apr 2024 Wei Chen, Zhiyuan Li, Mingyuan Ma

In the rapidly evolving domain of artificial intelligence, Large Language Models (LLMs) play a crucial role due to their advanced text processing and generation abilities.

Language Modelling

Octopus v2: On-device language model for super agent

no code implementations 2 Apr 2024 Wei Chen, Zhiyuan Li

Current on-device models for function calling face issues with latency and accuracy.

Language Modelling

Traditional Transformation Theory Guided Model for Learned Image Compression

no code implementations 24 Feb 2024 Zhiyuan Li, Chenyang Ge, Shun Li

Recently, many deep image compression methods have been proposed and achieved remarkable performance.

Image Compression

Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

no code implementations 20 Feb 2024 Zhiyuan Li, Hong Liu, Denny Zhou, Tengyu Ma

Given input length $n$, previous works have shown that constant-depth transformers with finite precision $\mathsf{poly}(n)$ embedding size can only solve problems in $\mathsf{TC}^0$ without CoT.
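
As a self-contained illustration (my example, not from the paper): composing a word of permutations over $S_5$ is NC$^1$-complete by Barrington's theorem and is therefore believed to lie outside $\mathsf{TC}^0$, yet it reduces to a trivial left-to-right scan once intermediate results may be written out, which is exactly the extra power a chain of thought provides.

```python
from functools import reduce

def compose(p, q):
    """Permutation p applied after q: (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

# A short "word" of permutations of {0, ..., 4} to be composed in order.
word = [(1, 0, 2, 3, 4), (0, 2, 1, 3, 4), (4, 1, 2, 3, 0)]

# Without CoT: the full composition must be produced in one shot.
one_shot = reduce(compose, word)

# With CoT: each partial composition is emitted as an intermediate step,
# mirroring a transformer appending its running state to the context.
state = (0, 1, 2, 3, 4)                  # identity permutation
for step, p in enumerate(word, 1):
    state = compose(state, p)            # fold in the next factor
    print(f"step {step}: {state}")

assert state == one_shot
```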

AgentMixer: Multi-Agent Correlated Policy Factorization

no code implementations 16 Jan 2024 Zhiyuan Li, Wenshuai Zhao, Lijun Wu, Joni Pajarinen

Inspired by the concept of correlated equilibrium, we propose to introduce a "strategy modification" to provide a mechanism for agents to correlate their policies.

Imitation Learning · Multi-agent Reinforcement Learning

Joint Self-Supervised and Supervised Contrastive Learning for Multimodal MRI Data: Towards Predicting Abnormal Neurodevelopment

no code implementations 22 Dec 2023 Zhiyuan Li, Hailong Li, Anca L. Ralescu, Jonathan R. Dillman, Mekibib Altaye, Kim M. Cecil, Nehal A. Parikh, Lili He

The integration of different imaging modalities, such as structural, diffusion tensor, and functional magnetic resonance imaging, with deep learning models has yielded promising outcomes in discerning phenotypic characteristics and enhancing disease diagnosis.

Contrastive Learning

Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking

1 code implementation 30 Nov 2023 Kaifeng Lyu, Jikai Jin, Zhiyuan Li, Simon S. Du, Jason D. Lee, Wei Hu

Recent work by Power et al. (2022) highlighted a surprising "grokking" phenomenon in learning arithmetic tasks: a neural net first "memorizes" the training set, resulting in perfect training accuracy but near-random test accuracy, and after training for sufficiently longer, it suddenly transitions to perfect test accuracy.

A Coefficient Makes SVRG Effective

1 code implementation 9 Nov 2023 Yida Yin, Zhiqiu Xu, Zhiyuan Li, Trevor Darrell, Zhuang Liu

Stochastic Variance Reduced Gradient (SVRG), introduced by Johnson & Zhang (2013), is a theoretically compelling optimization method.

Image Classification
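
A minimal sketch of the idea as suggested by the title: standard SVRG subtracts a control variate built at a snapshot of the weights, and the modification studied here scales that term by a coefficient. The helper names (grad_fn, full_grad_fn) and the exact placement of the coefficient are my assumptions, not the paper's code.

```python
import numpy as np

def svrg_epoch_with_coefficient(w, data, grad_fn, full_grad_fn,
                                lr=0.1, alpha=0.5, steps=100, rng=None):
    """One outer loop of SVRG whose control variate is scaled by alpha.

    alpha = 1 recovers standard SVRG; alpha = 0 degenerates to plain SGD,
    so alpha interpolates between the two.
    """
    rng = rng or np.random.default_rng(0)
    snapshot = w.copy()                    # reference point for the epoch
    mu = full_grad_fn(snapshot, data)      # full gradient at the snapshot
    for _ in range(steps):
        i = rng.integers(len(data))        # sample one training example
        g = grad_fn(w, data[i]) - alpha * (grad_fn(snapshot, data[i]) - mu)
        w = w - lr * g                     # variance-reduced SGD step
    return w
```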

Complex Organ Mask Guided Radiology Report Generation

1 code implementation 4 Nov 2023 Tiancheng Gu, Dongnan Liu, Zhiyuan Li, Weidong Cai

The goal of automatic report generation is to generate a clinically accurate and coherent phrase from a single given X-ray image, which could alleviate the workload of traditional radiology reporting.

Medical Report Generation

Optimistic Multi-Agent Policy Gradient for Cooperative Tasks

1 code implementation 3 Nov 2023 Wenshuai Zhao, Yi Zhao, Zhiyuan Li, Juho Kannala, Joni Pajarinen

However, with function approximation, optimism can amplify overestimation and thus fail on complex tasks.

Q-Learning

Distributionally Robust Optimization and Invariant Representation Learning for Addressing Subgroup Underrepresentation: Mechanisms and Limitations

no code implementations 12 Aug 2023 Nilesh Kumar, Ruby Shrestha, Zhiyuan Li, Linwei Wang

Spurious correlation caused by subgroup underrepresentation has received increasing attention as a source of bias that can be perpetuated by deep neural networks (DNNs).

Image Classification · Medical Image Classification +1

Exploring Annotation-free Image Captioning with Retrieval-augmented Pseudo Sentence Generation

1 code implementation 27 Jul 2023 Zhiyuan Li, Dongnan Liu, Heng Wang, Chaoyi Zhang, Weidong Cai

We further show that with a simple extension, the generated pseudo sentences can be deployed as weak supervision to boost the 1% semi-supervised image caption benchmark up to 93.4 CIDEr score (+8.9) which showcases the versatility and effectiveness of our approach.

Image Captioning · Model Optimization +2

The Marginal Value of Momentum for Small Learning Rate SGD

no code implementations 27 Jul 2023 Runzhe Wang, Sadhika Malladi, Tianhao Wang, Kaifeng Lyu, Zhiyuan Li

Momentum is known to accelerate the convergence of gradient descent in strongly convex settings without stochastic gradient noise.

Stochastic Optimization

Prototype-Driven and Multi-Expert Integrated Multi-Modal MR Brain Tumor Image Segmentation

1 code implementation 22 Jul 2023 Yafei Zhang, Zhiyuan Li, Huafeng Li, Dapeng Tao

To this end, a multi-modal MR brain tumor segmentation method with tumor prototype-driven and multi-expert integration is proposed.

Brain Tumor Segmentation · Image Segmentation +2

The Inductive Bias of Flatness Regularization for Deep Matrix Factorization

no code implementations 22 Jun 2023 Khashayar Gatmiry, Zhiyuan Li, Ching-Yao Chuang, Sashank Reddi, Tengyu Ma, Stefanie Jegelka

Recent works on over-parameterized neural networks have shown that the stochasticity in optimizers has the implicit regularization effect of minimizing the sharpness of the loss function (in particular, the trace of its Hessian) over the family of zero-loss solutions.

Inductive Bias

Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training

3 code implementations 23 May 2023 Hong Liu, Zhiyuan Li, David Hall, Percy Liang, Tengyu Ma

Given the massive cost of language model pre-training, a non-trivial improvement of the optimization algorithm would lead to a material reduction on the time and cost of training.

Language Modelling · Stochastic Optimization
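
A simplified single-step sketch of a Sophia-style update, assuming the clipped, diagonally preconditioned form described in the paper; the diagonal Hessian estimate h (which the paper refreshes only every few steps with a stochastic estimator) is taken as given here, and the constants are illustrative.

```python
import numpy as np

def sophia_step(w, g, m, h, lr=1e-4, b1=0.965, gamma=0.01, eps=1e-12):
    """One simplified Sophia-style step with a given Hessian diagonal h."""
    m = b1 * m + (1 - b1) * g               # momentum (first-moment EMA)
    # Precondition by the curvature estimate, then clip elementwise so a
    # badly underestimated h cannot produce an unbounded step.
    update = np.clip(m / np.maximum(gamma * h, eps), -1.0, 1.0)
    return w - lr * update, m
```

The elementwise clip is the safeguard: wherever the curvature estimate is too small, the step falls back to a bounded, sign-like update.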

Sequential Latent Variable Models for Few-Shot High-Dimensional Time-Series Forecasting

1 code implementation ICLR 2023 Xiajun Jiang, Ryan Missel, Zhiyuan Li, Linwei Wang

We compared the presented framework with a comprehensive set of baseline models trained 1) globally on the large meta-training set with diverse dynamics, and 2) individually on single dynamics, both with and without fine-tuning to k-shot support series used by the meta-models.

Meta-Learning · Time Series +1

Learning Empirical Bregman Divergence for Uncertain Distance Representation

no code implementations 16 Apr 2023 Zhiyuan Li, Ziru Liu, Anna Zou, Anca L. Ralescu

Deep metric learning techniques have been used for visual representation in various supervised and unsupervised learning tasks through learning embeddings of samples with deep networks.

Metric Learning

A Novel Collaborative Self-Supervised Learning Method for Radiomic Data

1 code implementation 20 Feb 2023 Zhiyuan Li, Hailong Li, Anca L. Ralescu, Jonathan R. Dillman, Nehal A. Parikh, Lili He

We compared our proposed method with other state-of-the-art self-supervised learning methods on a simulation study and two independent datasets.

Self-Supervised Learning

Learning Generalized Hybrid Proximity Representation for Image Recognition

no code implementations 31 Jan 2023 Zhiyuan Li, Anca Ralescu

Recently, deep metric learning techniques have received attention, as the learned distance representations are useful for capturing the similarity relationships among samples and further improving the performance of various supervised and unsupervised learning tasks.

Metric Learning

Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing

no code implementations 27 Jan 2023 Jikai Jin, Zhiyuan Li, Kaifeng Lyu, Simon S. Du, Jason D. Lee

It is believed that Gradient Descent (GD) induces an implicit bias towards good generalization in training machine learning models.

Incremental Learning

How Does Sharpness-Aware Minimization Minimize Sharpness?

no code implementations 10 Nov 2022 Kaiyue Wen, Tengyu Ma, Zhiyuan Li

SAM intends to penalize a notion of sharpness of the model but implements a computationally efficient variant; moreover, a third notion of sharpness was used for proving generalization guarantees.
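
For reference, the computationally efficient variant the abstract mentions is the standard SAM update, sketched below in NumPy; grad_fn is a placeholder for a stochastic gradient oracle, and the constants are illustrative.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05, eps=1e-12):
    """One SAM step: ascend to the (approximate) worst-case point within
    a rho-ball around w, then descend using the gradient computed there."""
    g = grad_fn(w)
    w_adv = w + rho * g / (np.linalg.norm(g) + eps)   # inner ascent step
    return w - lr * grad_fn(w_adv)                    # descend with perturbed grad
```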

Interpretable Modeling and Reduction of Unknown Errors in Mechanistic Operators

no code implementations 2 Nov 2022 Maryam Toloubidokhti, Nilesh Kumar, Zhiyuan Li, Prashnna K. Gyawali, Brian Zenger, Wilson W. Good, Rob S. MacLeod, Linwei Wang

Prior knowledge about the imaging physics provides a mechanistic forward operator that plays an important role in image reconstruction, although myriad sources of possible errors in the operator could negatively impact the reconstruction solutions.

Image Reconstruction

Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models

no code implementations 25 Oct 2022 Hong Liu, Sang Michael Xie, Zhiyuan Li, Tengyu Ma

Toward understanding this implicit bias, we prove that SGD with standard mini-batch noise implicitly prefers flatter minima in language models, and empirically observe a strong correlation between flatness and downstream performance among models with the same minimal pre-training loss.

Language Modelling

Few-shot Generation of Personalized Neural Surrogates for Cardiac Simulation via Bayesian Meta-Learning

1 code implementation 6 Oct 2022 Xiajun Jiang, Zhiyuan Li, Ryan Missel, Md Shakil Zaman, Brian Zenger, Wilson W. Good, Rob S. MacLeod, John L. Sapp, Linwei Wang

At test time, metaPNS delivers a personalized neural surrogate by fast feed-forward embedding of a small and flexible amount of data available from an individual, achieving -- for the first time -- personalization and surrogate construction for expensive simulations in one end-to-end learning framework.

Meta-Learning · Variational Inference

Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent

no code implementations 8 Jul 2022 Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora

Conversely, continuous mirror descent with any Legendre function can be viewed as gradient flow with a related commuting parametrization.

Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction

no code implementations 14 Jun 2022 Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora

Normalization layers (e.g., Batch Normalization, Layer Normalization) were introduced to help with optimization difficulties in very deep nets, but they clearly also help generalization, even in not-so-deep nets.

Learning Progress Driven Multi-Agent Curriculum

no code implementations 20 May 2022 Wenshuai Zhao, Zhiyuan Li, Joni Pajarinen

Inspired by the success of CRL in single-agent settings, a few works have attempted to apply CRL to multi-agent reinforcement learning (MARL) using the number of agents to control task difficulty.

Multi-agent Reinforcement Learning · Open-Ended Question Answering +3

Understanding Gradient Descent on Edge of Stability in Deep Learning

no code implementations 19 May 2022 Sanjeev Arora, Zhiyuan Li, Abhishek Panigrahi

The current paper mathematically analyzes a new mechanism of implicit regularization in the EoS phase, whereby GD updates due to non-smooth loss landscape turn out to evolve along some deterministic flow on the manifold of minimum loss.

Interpretability of Neural Network With Physiological Mechanisms

no code implementations 24 Mar 2022 Anna Zou, Zhiyuan Li

Deep learning continues to serve as a powerful state-of-the-art technique that has achieved extraordinary accuracy in various regression and classification tasks involving image, video, signal, and natural language data.

Robust Training of Neural Networks Using Scale Invariant Architectures

no code implementations 2 Feb 2022 Zhiyuan Li, Srinadh Bhojanapalli, Manzil Zaheer, Sashank J. Reddi, Sanjiv Kumar

In contrast to SGD, adaptive gradient methods like Adam allow robust training of modern deep networks, especially large language models.

Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias

no code implementations NeurIPS 2021 Kaifeng Lyu, Zhiyuan Li, Runzhe Wang, Sanjeev Arora

The current paper is able to establish this global optimality for two-layer Leaky ReLU nets trained with gradient flow on linearly separable and symmetric data, regardless of the width.

Vocal Bursts Valence Prediction

What Happens after SGD Reaches Zero Loss? --A Mathematical Framework

no code implementations ICLR 2022 Zhiyuan Li, Tianhao Wang, Sanjeev Arora

Understanding the implicit bias of Stochastic Gradient Descent (SGD) is one of the key challenges in deep learning, especially for overparametrized models, where the local minimizers of the loss function $L$ can form a manifold.

DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection

1 code implementation ICCV 2021 Limeng Qiao, Yuxuan Zhao, Zhiyuan Li, Xi Qiu, Jianan Wu, Chi Zhang

Few-shot object detection, which aims at detecting novel objects rapidly from extremely few annotated examples of previously unseen classes, has attracted significant research interest in the community.

Classification · Few-Shot Object Detection +1

Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning

no code implementations 25 Mar 2021 Yaqi Duan, Chi Jin, Zhiyuan Li

Concretely, we view the Bellman error as a surrogate loss for the optimality gap, and prove the following: (1) in the double sampling regime, the excess risk of the Empirical Risk Minimizer (ERM) is bounded by the Rademacher complexity of the function class.

Learning Theory · reinforcement-learning +1
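
For orientation, in notation of my choosing rather than the paper's, the squared Bellman error used as the surrogate loss is

```latex
\mathcal{E}(Q) \;=\; \mathbb{E}_{(s,a)\sim\mu}\Big[\big(Q(s,a) - r(s,a)
  - \gamma\, \mathbb{E}_{s'\sim P(\cdot \mid s,a)} \max_{a'} Q(s',a')\big)^{2}\Big].
```

The "double sampling regime" refers to drawing two independent next states for each (s, a), which makes the empirical squared error an unbiased estimate of $\mathcal{E}(Q)$; with a single sampled next state, the estimate is biased upward by the conditional variance of the TD target.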

A Two-Stage Variable Selection Approach for Correlated High Dimensional Predictors

no code implementations 24 Mar 2021 Zhiyuan Li

To address this challenge, we propose a two-stage approach for the group variable selection problem that combines a variable clustering stage with a group variable selection stage.

Clustering · Variable Selection +2

On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)

1 code implementation NeurIPS 2021 Zhiyuan Li, Sadhika Malladi, Sanjeev Arora

It is generally recognized that finite learning rate (LR), in contrast to infinitesimal LR, is important for good generalization in real-life deep nets.

A Supernova-driven, Magnetically-collimated Outflow as the Origin of the Galactic Center Radio Bubbles

no code implementations 26 Jan 2021 Mengfei Zhang, Zhiyuan Li, Mark R. Morris

Our simulations are run with different combinations of two main parameters: the supernova birth rate and the strength of a global magnetic field oriented vertically with respect to the disk.

Astrophysics of Galaxies

Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning

no code implementations ICLR 2021 Zhiyuan Li, Yuping Luo, Kaifeng Lyu

Matrix factorization is a simple and natural test-bed to investigate the implicit regularization of gradient descent.

Why Are Convolutional Nets More Sample-Efficient than Fully-Connected Nets?

no code implementations ICLR 2021 Zhiyuan Li, Yi Zhang, Sanjeev Arora

However, this has not been made mathematically rigorous, and the hurdle is that the fully connected net can always simulate the convolutional net (for a fixed task).

Image Classification · Inductive Bias

Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate

no code implementations NeurIPS 2020 Zhiyuan Li, Kaifeng Lyu, Sanjeev Arora

Recent works (e.g., Li and Arora, 2020) suggest that the use of popular normalization schemes (including Batch Normalization) in today's deep learning can move it far from a traditional optimization viewpoint, e.g., use of exponentially increasing learning rates.

Learning Geometry-Dependent and Physics-Based Inverse Image Reconstruction

no code implementations 18 Jul 2020 Xiajun Jiang, Sandesh Ghimire, Jwala Dhamala, Zhiyuan Li, Prashnna Kumar Gyawali, Linwei Wang

However, many reconstruction problems involve imaging physics that are dependent on the underlying non-Euclidean geometry.

Image Reconstruction

Chemical abundances in Sgr A East: evidence for a type Iax supernova remnant

no code implementations 26 Jun 2020 Ping Zhou, Shing-Chi Leung, Zhiyuan Li, Ken'ichi Nomoto, Jacco Vink, Yang Chen

We report evidence that SNR Sgr A East in the Galactic center resulted from a pure turbulent deflagration of a Chandrasekhar-mass carbon-oxygen white dwarf (WD), an explosion mechanism used for type Iax SNe.

High Energy Astrophysical Phenomena

When is Particle Filtering Efficient for Planning in Partially Observed Linear Dynamical Systems?

no code implementations 10 Jun 2020 Simon S. Du, Wei Hu, Zhiyuan Li, Ruoqi Shen, Zhao Song, Jiajun Wu

Though errors in past actions may affect the future, we are able to bound the number of particles needed so that the long-run reward of the policy based on particle filtering is close to that based on exact inference.

Decision Making

Semi-supervised Medical Image Classification with Global Latent Mixing

1 code implementation 22 May 2020 Prashnna Kumar Gyawali, Sandesh Ghimire, Pradeep Bajracharya, Zhiyuan Li, Linwei Wang

In this work, we argue that regularizing the global smoothness of neural functions by filling the void in between data points can further improve SSL.

General Classification · Image Classification +1

Progressive Learning and Disentanglement of Hierarchical Representations

1 code implementation ICLR 2020 Zhiyuan Li, Jaideep Vitthal Murkute, Prashnna Kumar Gyawali, Linwei Wang

By drawing on the respective advantage of hierarchical representation learning and progressive learning, this is to our knowledge the first attempt to improve disentanglement by progressively growing the capacity of VAE to learn hierarchical representations.

Disentanglement

Enhanced Convolutional Neural Tangent Kernels

no code implementations 3 Nov 2019 Zhiyuan Li, Ruosong Wang, Dingli Yu, Simon S. Du, Wei Hu, Ruslan Salakhutdinov, Sanjeev Arora

An exact algorithm to compute the CNTK (Arora et al., 2019) yielded the finding that the classification accuracy of the CNTK on CIFAR-10 is within 6-7% of that of the corresponding CNN architecture (the best figure being around 78%), which is interesting performance for a fixed kernel.

Data Augmentation · regression

An Exponential Learning Rate Schedule for Deep Learning

no code implementations ICLR 2020 Zhiyuan Li, Sanjeev Arora

This paper suggests that the phenomenon may be due to Batch Normalization or BN, which is ubiquitous and provides benefits in optimization and generalization across all standard architectures.
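
A minimal PyTorch sketch of the equivalence in the momentum-free case (my reading of the result, with the growth constant stated as an assumption): SGD with weight decay lam and fixed learning rate eta on a scale-invariant, batch-normalized network matches SGD without weight decay whose learning rate grows geometrically, by roughly (1 - lam * eta)^-2 per step.

```python
import torch

# Any network whose loss is scale-invariant in the weights feeding a
# normalization layer will do; this toy net is only for illustration.
model = torch.nn.Sequential(
    torch.nn.Linear(32, 64), torch.nn.BatchNorm1d(64),
    torch.nn.ReLU(), torch.nn.Linear(64, 10),
)
eta, lam = 0.1, 5e-4
opt = torch.optim.SGD(model.parameters(), lr=eta)   # note: no weight_decay
growth = (1 - lam * eta) ** (-2)                    # > 1: exponential growth
sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=growth)
# In the training loop, call opt.step() then sched.step() each iteration,
# so the learning rate is multiplied by `growth` at every step.
```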

Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks

4 code implementations ICLR 2020 Sanjeev Arora, Simon S. Du, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, Dingli Yu

On the VOC07 testbed for few-shot image classification tasks on ImageNet with transfer learning (Goyal et al., 2019), replacing the linear SVM currently used with a Convolutional NTK SVM consistently improves performance.

Few-Shot Image Classification · General Classification +3

Improving Disentangled Representation Learning with the Beta Bernoulli Process

1 code implementation 3 Sep 2019 Prashnna Kumar Gyawali, Zhiyuan Li, Cameron Knight, Sandesh Ghimire, B. Milan Horacek, John Sapp, Linwei Wang

We note that the independence within and the complexity of the latent density are two different properties we constrain when regularizing the posterior density: while the former promotes the disentangling ability of VAE, the latter -- if overly limited -- creates an unnecessary competition with the data reconstruction objective in VAE.

Decision Making · Representation Learning

Semi-Supervised Learning by Disentangling and Self-Ensembling Over Stochastic Latent Space

1 code implementation 22 Jul 2019 Prashnna Kumar Gyawali, Zhiyuan Li, Sandesh Ghimire, Linwei Wang

In this work, we hypothesize -- from the generalization perspective -- that self-ensembling can be improved by exploiting the stochasticity of a disentangled latent space.

Data Augmentation · Multi-Label Classification +1

Feature-level and Model-level Audiovisual Fusion for Emotion Recognition in the Wild

no code implementations 6 Jun 2019 Jie Cai, Zibo Meng, Ahmed Shehab Khan, Zhiyuan Li, James O'Reilly, Shizhong Han, Ping Liu, Min Chen, Yan Tong

In this paper, we proposed two strategies to fuse information extracted from different modalities, i.e., audio and visual.

Emotion Recognition

Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee

no code implementations ICLR 2020 Wei Hu, Zhiyuan Li, Dingli Yu

Over-parameterized deep neural networks trained by simple first-order methods are known to be able to fit any labeling of data.

The role of over-parametrization in generalization of neural networks

1 code implementation ICLR 2019 Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann LeCun, Nathan Srebro

Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization.

On Exact Computation with an Infinitely Wide Neural Net

2 code implementations NeurIPS 2019 Sanjeev Arora, Simon S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang

An attraction of such ideas is that a pure kernel-based method is used to capture the power of a fully-trained deep net of infinite width.

Gaussian Processes
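
The exact infinite-width computation the abstract refers to is now available in open-source libraries. Below is a sketch using neural-tangents (google/neural-tangents), a separate JAX library that implements the same NTK machinery; it is not the paper's released code, and the architecture here is only a toy.

```python
import jax.random as random
import neural_tangents as nt
from neural_tangents import stax

key1, key2, key3 = random.split(random.PRNGKey(0), 3)
x_train = random.normal(key1, (20, 32))
y_train = random.normal(key2, (20, 1))
x_test = random.normal(key3, (5, 32))

# An infinite-width 2-layer ReLU net; kernel_fn evaluates its exact NTK.
_, _, kernel_fn = stax.serial(stax.Dense(512), stax.Relu(), stax.Dense(1))

# Kernel regression with the NTK corresponds to training the
# infinite-width net to convergence with gradient descent on MSE.
predict_fn = nt.predict.gradient_descent_mse_ensemble(kernel_fn, x_train, y_train)
y_pred = predict_fn(x_test=x_test, get='ntk')
```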

Identity-Free Facial Expression Recognition using conditional Generative Adversarial Network

no code implementations 19 Mar 2019 Jie Cai, Zibo Meng, Ahmed Shehab Khan, Zhiyuan Li, James O'Reilly, Shizhong Han, Yan Tong

A novel Identity-Free conditional Generative Adversarial Network (IF-GAN) was proposed for Facial Expression Recognition (FER) to explicitly reduce high inter-subject variations caused by identity-related facial attributes, e.g., age, race, and gender.

Facial Expression Recognition · Facial Expression Recognition (FER) +1

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks

no code implementations 24 Jan 2019 Sanjeev Arora, Simon S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang

This paper analyzes training and generalization for a simple 2-layer ReLU net with random initialization, and provides the following improvements over recent works: (i) Using a tighter characterization of training speed than recent papers, an explanation for why training a neural net with random labels leads to slower training, as originally observed in [Zhang et al. ICLR'17].

Probabilistic Attribute Tree in Convolutional Neural Networks for Facial Expression Recognition

no code implementations 17 Dec 2018 Jie Cai, Zibo Meng, Ahmed Shehab Khan, Zhiyuan Li, James O'Reilly, Yan Tong

In this paper, we proposed a novel Probabilistic Attribute Tree-CNN (PAT-CNN) to explicitly deal with the large intra-class variations caused by identity-related attributes, e.g., age, race, and gender.

Attribute · Facial Expression Recognition +1

Theoretical Analysis of Auto Rate-Tuning by Batch Normalization

no code implementations ICLR 2019 Sanjeev Arora, Zhiyuan Li, Kaifeng Lyu

Batch Normalization (BN) has become a cornerstone of deep learning across diverse architectures, appearing to help optimization as well as generalization.
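
A quick numerical check (mine, not the paper's) of the property driving the analysis: with BN after a linear layer, the loss is invariant to rescaling that layer's weights, and the gradient scales inversely, so gradient descent effectively auto-tunes the learning rate along the weight direction.

```python
import torch

torch.manual_seed(0)
x = torch.randn(16, 8)
lin = torch.nn.Linear(8, 4, bias=False)
bn = torch.nn.BatchNorm1d(4)          # training mode: uses batch statistics

def loss_and_gradnorm(scale):
    """Loss and gradient norm when lin's weights are scaled by `scale`."""
    w = (scale * lin.weight.detach()).requires_grad_(True)
    out = bn(x @ w.t()).pow(2).sum()
    out.backward()
    return out.item(), w.grad.norm().item()

l1, g1 = loss_and_gradnorm(1.0)
l2, g2 = loss_and_gradnorm(2.0)
print(l1, l2)       # equal: L(c * w) = L(w) under BN
print(g1, 2 * g2)   # equal: ||grad at c * w|| = ||grad at w|| / c
```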

Deep Template Matching for Offline Handwritten Chinese Character Recognition

no code implementations 15 Nov 2018 Zhiyuan Li, Min Jin, Qi Wu, Huaxiang Lu

Just as with their remarkable achievements in many other computer vision tasks, convolutional neural networks (CNNs) provide an end-to-end solution to handwritten Chinese character recognition (HCCR) with great success.

Binary Classification · Offline Handwritten Chinese Character Recognition +1

Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks

2 code implementations 30 May 2018 Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann LeCun, Nathan Srebro

Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization.

Online Improper Learning with an Approximation Oracle

no code implementations NeurIPS 2018 Elad Hazan, Wei Hu, Yuanzhi Li, Zhiyuan Li

We revisit the question of reducing online learning to approximate optimization of the offline problem.

Building Efficient CNN Architecture for Offline Handwritten Chinese Character Recognition

no code implementations 4 Apr 2018 Zhiyuan Li, Nanjun Teng, Min Jin, Huaxiang Lu

Methods based on deep convolutional networks have brought great breakthroughs in image classification, providing an end-to-end solution to the handwritten Chinese character recognition (HCCR) problem by learning discriminative features automatically.

Offline Handwritten Chinese Character Recognition

Optimizing Filter Size in Convolutional Neural Networks for Facial Action Unit Recognition

no code implementations CVPR 2018 Shizhong Han, Zibo Meng, Zhiyuan Li, James O'Reilly, Jie Cai, Xiao-Feng Wang, Yan Tong

Most recently, Convolutional Neural Networks (CNNs) have shown promise for facial AU recognition, where predefined and fixed convolution filter sizes are employed.

Facial Action Unit Detection

Solving Marginal MAP Problems with NP Oracles and Parity Constraints

no code implementations NeurIPS 2016 Yexiang Xue, Zhiyuan Li, Stefano Ermon, Carla P. Gomes, Bart Selman

Arising from many applications at the intersection of decision making and machine learning, Marginal Maximum A Posteriori (Marginal MAP) Problems unify the two main classes of inference, namely maximization (optimization) and marginal inference (counting), and are believed to have higher complexity than both of them.

BIG-bench Machine Learning · Decision Making

Learning in Games: Robustness of Fast Convergence

no code implementations NeurIPS 2016 Dylan J. Foster, Zhiyuan Li, Thodoris Lykouris, Karthik Sridharan, Eva Tardos

We show that learning algorithms satisfying a $\textit{low approximate regret}$ property experience fast convergence to approximate optimality in a large class of repeated games.
