Search Results for author: Yuandong Tian

Found 107 papers, 63 papers with code

Student Specialization in Deep Rectified Networks With Finite Width and Input Dimension

no code implementations ICML 2020 Yuandong Tian

Under mild conditions on dataset and teacher network, we prove that when the gradient is small at every data sample, each teacher node is specialized by at least one student node at the lowest layer.

Inductive Bias

Learning to Progressively Plan

no code implementations ICLR 2019 Xinyun Chen, Yuandong Tian

For problem solving, making reactive decisions based on problem description is fast but inaccurate, while search-based planning using heuristics gives better solutions but could be exponentially slow.

reinforcement-learning Reinforcement Learning (RL) +1

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

1 code implementation 6 Mar 2024 Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, Anima Anandkumar, Yuandong Tian

Our approach reduces memory usage by up to 65.5% in optimizer states while maintaining both efficiency and performance for pre-training on LLaMA 1B and 7B architectures with C4 dataset with up to 19.7B tokens, and on fine-tuning RoBERTa on GLUE tasks.

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

no code implementations 21 Feb 2024 Lucas Lehnert, Sainbayar Sukhbaatar, Paul McVay, Michael Rabbat, Yuandong Tian

In this work, we demonstrate how to train Transformers to solve complex planning tasks and present Searchformer, a Transformer model that optimally solves previously unseen Sokoban puzzles 93.7% of the time, while using up to 26.8% fewer search steps than standard $A^*$ search.

Decision Making

Diffusion World Model

no code implementations 5 Feb 2024 Zihan Ding, Amy Zhang, Yuandong Tian, Qinqing Zheng

We introduce Diffusion World Model (DWM), a conditional diffusion model capable of predicting multistep future states and rewards concurrently.

D4RL Q-Learning

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

1 code implementation 2 Feb 2024 Jian Xie, Kai Zhang, Jiangjie Chen, Tinghui Zhu, Renze Lou, Yuandong Tian, Yanghua Xiao, Yu Su

Are these language agents capable of planning in more complex settings that are out of the reach of prior AI agents?

Image Classifier Based Generative Method for Planar Antenna Design

no code implementations 16 Dec 2023 Yang Zhong, Weiping Dou, Andrew Cohen, Dia'a Bisharat, Yuandong Tian, Jiang Zhu, Qing Huo Liu

To extend the antenna design on printed circuit boards (PCBs) for more engineers of interest, we propose a simple method that models PCB antennas with a few basic components.

H-GAP: Humanoid Control with a Generalist Planner

no code implementations 5 Dec 2023 Zhengyao Jiang, Yingchen Xu, Nolan Wagener, Yicheng Luo, Michael Janner, Edward Grefenstette, Tim Rocktäschel, Yuandong Tian

However, the extensive collection of human motion-captured data and the derived datasets of humanoid trajectories, such as MoCapAct, paves the way to tackle these challenges.

Humanoid Control Model Predictive Control +1

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

1 code implementation 26 Oct 2023 Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Ré, Beidi Chen

We show that contextual sparsity exists, that it can be accurately predicted, and that we can exploit it to speed up LLM inference in wall-clock time without compromising LLM's quality or in-context learning ability.

In-Context Learning

End-to-end Story Plot Generator

no code implementations 13 Oct 2023 Hanlin Zhu, Andrew Cohen, Danqing Wang, Kevin Yang, Xiaomeng Yang, Jiantao Jiao, Yuandong Tian

Story plots, while short, carry most of the essential information of a full story that may contain tens of thousands of words.

Blocking

Learning Personalized Story Evaluation

no code implementations 5 Oct 2023 Danqing Wang, Kevin Yang, Hanlin Zhu, Xiaomeng Yang, Andrew Cohen, Lei LI, Yuandong Tian

We further develop a personalized story evaluation model PERSE to infer reviewer preferences and provide a personalized evaluation.

Retrieval Text Generation

GenCO: Generating Diverse Solutions to Design Problems with Combinatorial Nature

no code implementations 3 Oct 2023 Aaron Ferber, Arman Zharmagambetov, Taoan Huang, Bistra Dilkina, Yuandong Tian

Generating diverse objects (e.g., images) using generative models (such as GAN or VAE) has achieved impressive results in the recent years, to help solve many design problems that are traditionally done by humans.

Combinatorial Optimization Image Generation

JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention

1 code implementation 1 Oct 2023 Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Du

We propose Joint MLP/Attention (JoMA) dynamics, a novel mathematical framework to understand the training procedure of multilayer Transformer architectures.

Efficient Streaming Language Models with Attention Sinks

5 code implementations 29 Sep 2023 Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis

In this paper, we first demonstrate that the emergence of attention sink is due to the strong attention scores towards initial tokens as a "sink" even if they are not semantically important.

Language Modelling

RLCD: Reinforcement Learning from Contrastive Distillation for Language Model Alignment

2 code implementations 24 Jul 2023 Kevin Yang, Dan Klein, Asli Celikyilmaz, Nanyun Peng, Yuandong Tian

We propose Reinforcement Learning from Contrastive Distillation (RLCD), a method for aligning language models to follow principles expressed in natural language (e.g., to be more harmless) without using human feedback.

Language Modelling reinforcement-learning

Extending Context Window of Large Language Models via Positional Interpolation

4 code implementations 27 Jun 2023 Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian

We present Position Interpolation (PI) that extends the context window sizes of RoPE-based pretrained LLMs such as LLaMA models to up to 32768 with minimal fine-tuning (within 1000 steps), while demonstrating strong empirical results on various tasks that require long context, including passkey retrieval, language modeling, and long document summarization from LLaMA 7B to 65B.

Document Summarization Language Modelling +2
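As a rough illustration of the idea in the Position Interpolation snippet above, the sketch below linearly rescales position indices so that a longer sequence maps back into the position range seen during pre-training, then computes standard RoPE rotation angles for the (possibly fractional) positions. The function names, the 2048-token trained length used in the example, and the RoPE base of 10000 are assumptions made for illustration; they are not taken from this listing.

```python
def interpolated_positions(seq_len, trained_len):
    """Rescale positions 0..seq_len-1 so they stay below trained_len.

    Sequences that already fit in the trained window are left unchanged.
    """
    scale = min(1.0, trained_len / seq_len)
    return [p * scale for p in range(seq_len)]


def rope_angles(position, dim, base=10000.0):
    """RoPE rotation angles for one position (assumed standard form)."""
    return [position * base ** (-2.0 * i / dim) for i in range(dim // 2)]


# Hypothetical setting: a model trained on 2048 positions, run on 8192 tokens.
positions = interpolated_positions(seq_len=8192, trained_len=2048)
angles_last = rope_angles(positions[-1], dim=8)
```

With these assumed numbers every rescaled position stays below 2048, so the rotation angles remain within the range the model saw during pre-training.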

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

1 code implementation 24 Jun 2023 Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen

Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens.
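The eviction policy described in the H$_2$O snippet above can be sketched, in a heavily simplified form, as follows. The sketch assumes that each cached token has an accumulated attention score and that a fixed budget is split between the most recent tokens and the highest-scoring older ones. The function name, the budget split, and the one-shot selection are illustrative assumptions, not the authors' implementation.

```python
def select_kept_tokens(acc_scores, num_heavy, num_recent):
    """Return sorted indices of cache entries to keep.

    acc_scores[i] is the (assumed) accumulated attention score of token i.
    The last num_recent tokens are always kept; among the older tokens,
    the num_heavy with the largest scores are kept as "heavy hitters".
    """
    n = len(acc_scores)
    recent = set(range(max(0, n - num_recent), n))
    older = [i for i in range(n) if i not in recent]
    heavy = sorted(older, key=lambda i: acc_scores[i], reverse=True)[:num_heavy]
    return sorted(recent | set(heavy))


# Hypothetical scores for a six-token cache with a budget of four entries.
kept = select_kept_tokens([0.9, 0.1, 0.8, 0.05, 0.2, 0.3], num_heavy=2, num_recent=2)
```

In this example the two most recent tokens (indices 4 and 5) and the two highest-scoring older tokens (indices 0 and 2) are retained.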

Sample-efficient Surrogate Model for Frequency Response of Linear PDEs using Self-Attentive Complex Polynomials

no code implementations 6 Jan 2023 Andrew Cohen, Weiping Dou, Jiang Zhu, Slawomir Koziel, Peter Renner, Jan-Ove Mattsson, Xiaomeng Yang, Beidi Chen, Kevin Stone, Yuandong Tian

Linear Partial Differential Equations (PDEs) govern the spatial-temporal dynamics of physical systems that are essential to building modern technology.

DOC: Improving Long Story Coherence With Detailed Outline Control

1 code implementation 20 Dec 2022 Kevin Yang, Dan Klein, Nanyun Peng, Yuandong Tian

In human evaluations of automatically generated stories, DOC substantially outperforms a strong Re3 baseline (Yang et al., 2022) on plot coherence (22.5% absolute gain), outline relevance (28.2%), and interestingness (20.7%).

SurCo: Learning Linear Surrogates For Combinatorial Nonlinear Optimization Problems

no code implementations 22 Oct 2022 Aaron Ferber, Taoan Huang, Daochen Zha, Martin Schubert, Benoit Steiner, Bistra Dilkina, Yuandong Tian

Optimization problems with nonlinear cost functions and combinatorial constraints appear in many real-world applications but remain challenging to solve efficiently compared to their linear counterparts.

Combinatorial Optimization

Re3: Generating Longer Stories With Recursive Reprompting and Revision

1 code implementation 13 Oct 2022 Kevin Yang, Yuandong Tian, Nanyun Peng, Dan Klein

We consider the problem of automatically generating longer stories of over two thousand words.

Language Modelling

DreamShard: Generalizable Embedding Table Placement for Recommender Systems

1 code implementation 5 Oct 2022 Daochen Zha, Louis Feng, Qiaoyu Tan, Zirui Liu, Kwei-Herng Lai, Bhargav Bhushanam, Yuandong Tian, Arun Kejariwal, Xia Hu

Although prior work has explored learning-based approaches for the device placement of computational graphs, embedding table placement remains a challenging problem because of 1) the operation fusion of embedding tables, and 2) the generalizability requirement on unseen placement tasks with different numbers of tables and/or devices.

Recommendation Systems Reinforcement Learning (RL)

Efficient Planning in a Compact Latent Action Space

1 code implementation 22 Aug 2022 Zhengyao Jiang, Tianjun Zhang, Michael Janner, Yueying Li, Tim Rocktäschel, Edward Grefenstette, Yuandong Tian

Planning-based reinforcement learning has shown strong performance in tasks in discrete and low-dimensional continuous action spaces.

Continuous Control Decision Making +1

AutoShard: Automated Embedding Table Sharding for Recommender Systems

1 code implementation 12 Aug 2022 Daochen Zha, Louis Feng, Bhargav Bhushanam, Dhruv Choudhary, Jade Nie, Yuandong Tian, Jay Chae, Yinbin Ma, Arun Kejariwal, Xia Hu

This is a significant design challenge of distributed systems named embedding table sharding, i.e., how we should partition the embedding tables to balance the costs across devices, which is a non-trivial task because 1) it is hard to efficiently and precisely measure the cost, and 2) the partition problem is known to be NP-hard.

Recommendation Systems

Understanding the Role of Nonlinearity in Training Dynamics of Contrastive Learning

1 code implementation 2 Jun 2022 Yuandong Tian

First, the presence of nonlinearity can lead to many local optima even in 1-layer setting, each corresponding to certain patterns from the data distribution, while with linear activation, only one major pattern can be learned.

Contrastive Learning Self-Supervised Learning

On the Importance of Asymmetry for Siamese Representation Learning

1 code implementation CVPR 2022 Xiao Wang, Haoqi Fan, Yuandong Tian, Daisuke Kihara, Xinlei Chen

Many recent self-supervised frameworks for visual representation learning are based on certain forms of Siamese networks.

Representation Learning

Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization

1 code implementation 11 Feb 2022 Runlong Zhou, Zelin He, Yuandong Tian, Yi Wu, Simon S. Du

Furthermore, our theory explains the benefit of curriculum learning: it can find a strong sampling policy and reduce the distribution shift, a critical quantity that governs the convergence rate in our theorem.

Combinatorial Optimization Reinforcement Learning (RL)

Understanding Deep Contrastive Learning via Coordinate-wise Optimization

1 code implementation 29 Jan 2022 Yuandong Tian

We show that Contrastive Learning (CL) under a broad family of loss functions (including InfoNCE) has a unified formulation of coordinate-wise optimization on the network parameter $\boldsymbol{\theta}$ and pairwise importance $\alpha$, where the max player $\boldsymbol{\theta}$ learns representation for contrastiveness, and the min player $\alpha$ puts more weights on pairs of distinct samples that share similar representations.

Contrastive Learning Representation Learning

Learning Bounded Context-Free-Grammar via LSTM and the Transformer: Difference and Explanations

1 code implementation 16 Dec 2021 Hui Shi, Sicun Gao, Yuandong Tian, Xinyun Chen, Jishen Zhao

With the forced decomposition, we show that the performance upper bounds of LSTM and Transformer in learning CFL are close: both of them can simulate a stack and perform stack operation along with state transitions.

NovelD: A Simple yet Effective Exploration Criterion

1 code implementation NeurIPS 2021 Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

We analyze NovelD thoroughly in MiniGrid and find that empirically it helps the agent explore the environment more uniformly with a focus on exploring beyond the boundary.

Efficient Exploration Montezuma's Revenge +1

Understanding Dimensional Collapse in Contrastive Self-supervised Learning

1 code implementation ICLR 2022 Li Jing, Pascal Vincent, Yann Lecun, Yuandong Tian

It has been shown that non-contrastive methods suffer from a lesser collapse problem of a different nature: dimensional collapse, whereby the embedding vectors end up spanning a lower-dimensional subspace instead of the entire available embedding space.

Contrastive Learning Learning Theory +2

Towards Demystifying Representation Learning with Non-contrastive Self-supervision

2 code implementations 11 Oct 2021 Xiang Wang, Xinlei Chen, Simon S. Du, Yuandong Tian

Non-contrastive methods of self-supervised learning (such as BYOL and SimSiam) learn representations by minimizing the distance between two views of the same image.

Representation Learning Self-Supervised Learning

Multi-objective Optimization by Learning Space Partitions

1 code implementation 7 Oct 2021 Yiyang Zhao, Linnan Wang, Kevin Yang, Tianjun Zhang, Tian Guo, Yuandong Tian

In this paper, we propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focus on promising regions that are likely to contain a subset of the Pareto frontier.

Neural Architecture Search

Multi-objective Optimization by Learning Space Partition

no code implementations ICLR 2022 Yiyang Zhao, Linnan Wang, Kevin Yang, Tianjun Zhang, Tian Guo, Yuandong Tian

In this paper, we propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focus on promising regions that are likely to contain a subset of the Pareto frontier.

Neural Architecture Search

NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training

1 code implementation ICLR 2022 Chengyue Gong, Dilin Wang, Meng Li, Xinlei Chen, Zhicheng Yan, Yuandong Tian, Qiang Liu, Vikas Chandra

In this work, we observe that the poor performance is due to a gradient conflict issue: the gradients of different sub-networks conflict with that of the supernet more severely in ViTs than CNNs, which leads to early saturation in training and inferior convergence.

Data Augmentation Image Classification +2

CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research

1 code implementation 17 Sep 2021 Chris Cummins, Bram Wasti, Jiadong Guo, Brandon Cui, Jason Ansel, Sahir Gomez, Somya Jain, Jia Liu, Olivier Teytaud, Benoit Steiner, Yuandong Tian, Hugh Leather

What is needed is an easy, reusable experimental infrastructure for real world compiler optimization tasks that can serve as a common benchmark for comparing techniques, and as a platform to accelerate progress in the field.

Compiler Optimization OpenAI Gym

Latent Execution for Neural Program Synthesis

1 code implementation NeurIPS 2021 Xinyun Chen, Dawn Song, Yuandong Tian

While recent works demonstrated limited success on domain-specific languages (DSL), it remains highly challenging to apply them to real-world programming languages, such as C. Due to complicated syntax and token variation, there are three major challenges: (1) unlike many DSLs, programs in languages like C need to compile first and are not executed via interpreters; (2) the program search space grows exponentially when the syntax and semantics of the programming language become more complex; and (3) collecting a large-scale dataset of real-world programs is non-trivial.

C++ code Program Synthesis

Learning Space Partitions for Path Planning

2 code implementations NeurIPS 2021 Kevin Yang, Tianjun Zhang, Chris Cummins, Brandon Cui, Benoit Steiner, Linnan Wang, Joseph E. Gonzalez, Dan Klein, Yuandong Tian

Path planning, the problem of efficiently discovering high-reward trajectories, often requires optimizing a high-dimensional and multimodal reward function.

MADE: Exploration via Maximizing Deviation from Explored Regions

1 code implementation NeurIPS 2021 Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao, Yuandong Tian, Joseph Gonzalez, Stuart Russell

As a proof of concept, we evaluate the new intrinsic reward on tabular examples across a variety of model-based and model-free algorithms, showing improvements over count-only exploration strategies.

Efficient Exploration Reinforcement Learning (RL)

Understanding Robustness in Teacher-Student Setting: A New Perspective

no code implementations 25 Feb 2021 Zhuolin Yang, Zhaoxi Chen, Tiffany Cai, Xinyun Chen, Bo Li, Yuandong Tian

Extensive experiments show that student specialization correlates strongly with model robustness in different scenarios, including student trained via standard training, adversarial training, confidence-calibrated adversarial training, and training with robust feature dataset.

BIG-bench Machine Learning Data Augmentation

Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games

no code implementations 17 Feb 2021 Yulai Zhao, Yuandong Tian, Jason D. Lee, Simon S. Du

Policy-based methods with function approximation are widely used for solving two-player zero-sum games with large state and/or action spaces.

Policy Gradient Methods Vocal Bursts Valence Prediction

Understanding self-supervised Learning Dynamics without Contrastive Pairs

5 code implementations 12 Feb 2021 Yuandong Tian, Xinlei Chen, Surya Ganguli

While contrastive approaches of self-supervised learning (SSL) learn representations by minimizing the distance between two augmented views of the same data point (positive pairs) and maximizing views from different data points (negative pairs), recent non-contrastive SSL (e.g., BYOL and SimSiam) show remarkable performance without negative pairs, with an extra learnable predictor and a stop-gradient operation.

Self-Supervised Learning

BeBold: Exploration Beyond the Boundary of Explored Regions

2 code implementations 15 Dec 2020 Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

In this paper, we analyze the pros and cons of each method and propose the regulated difference of inverse visitation counts as a simple but effective criterion for IR.

Efficient Exploration NetHack
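One way to read the "regulated difference of inverse visitation counts" in the BeBold snippet above is as an intrinsic reward that is positive only when the agent moves from a frequently visited state to a rarely visited one. The sketch below encodes that reading. Clipping at zero as the regulation step, the function name, and the toy state labels are all assumptions for illustration; the full method in the paper may include further terms.

```python
from collections import Counter


def intrinsic_reward(visit_counts, state, next_state):
    """Clipped difference of inverse visitation counts (assumed form).

    visit_counts maps each state to the number of times it was visited;
    both states are assumed to have been visited at least once.
    """
    gain = 1.0 / visit_counts[next_state] - 1.0 / visit_counts[state]
    return max(gain, 0.0)


# Hypothetical counts: "a" is well explored, "b" was just discovered.
counts = Counter({"a": 4, "b": 1})
bonus_outward = intrinsic_reward(counts, "a", "b")   # moving toward the frontier
bonus_backward = intrinsic_reward(counts, "b", "a")  # moving back into known territory
```

Under this assumed form, stepping toward the less-visited state earns a bonus, while stepping back earns none.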

FP-NAS: Fast Probabilistic Neural Architecture Search

no code implementations CVPR 2021 Zhicheng Yan, Xiaoliang Dai, Peizhao Zhang, Yuandong Tian, Bichen Wu, Matt Feiszli

Furthermore, to search fast in the multi-variate space, we propose a coarse-to-fine strategy by using a factorized distribution at the beginning which can reduce the number of architecture parameters by over an order of magnitude.

Neural Architecture Search

Multi-Agent Collaboration via Reward Attribution Decomposition

2 code implementations 16 Oct 2020 Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

In this work, we propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge and supports ad hoc team play.

Dota 2 Multi-agent Reinforcement Learning +2

Understanding Self-supervised Learning with Dual Deep Networks

2 code implementations 1 Oct 2020 Yuandong Tian, Lantao Yu, Xinlei Chen, Surya Ganguli

We propose a novel theoretical framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks (e.g., SimCLR).

Self-Supervised Learning

Joint Policy Search for Multi-agent Collaboration with Imperfect Information

1 code implementation NeurIPS 2020 Yuandong Tian, Qucheng Gong, Tina Jiang

Based on this, we propose Joint Policy Search (JPS) that iteratively improves joint policies of collaborative agents in imperfect information games, without re-evaluating the entire game.

Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search

2 code implementations NeurIPS 2020 Linnan Wang, Rodrigo Fonseca, Yuandong Tian

If the nonlinear partition function and the local model fit well with the ground-truth black-box function, then good partitions and candidates can be reached with much fewer samples.

Bayesian Optimization Neural Architecture Search

Towards Automated Neural Interaction Discovery for Click-Through Rate Prediction

no code implementations 29 Jun 2020 Qingquan Song, Dehua Cheng, Hanning Zhou, Jiyan Yang, Yuandong Tian, Xia Hu

Click-Through Rate (CTR) prediction is one of the most important machine learning tasks in recommender systems, driving personalized experience for billions of consumers.

Click-Through Rate Prediction Learning-To-Rank +2

Few-shot Neural Architecture Search

2 code implementations 11 Jun 2020 Yiyang Zhao, Linnan Wang, Yuandong Tian, Rodrigo Fonseca, Tian Guo

supernet, to approximate the performance of every architecture in the search space via weight-sharing.

Neural Architecture Search Transfer Learning

Student Specialization in Deep ReLU Networks With Finite Width and Input Dimension

1 code implementation 30 Sep 2019 Yuandong Tian

Under mild conditions on dataset and teacher network, we prove that when the gradient is small at every data sample, each teacher node is specialized by at least one student node at the lowest layer.

Data Augmentation Inductive Bias

A⋆MCTS: SEARCH WITH THEORETICAL GUARANTEE USING POLICY AND VALUE FUNCTIONS

no code implementations 25 Sep 2019 Xian Wu, Yuandong Tian, Lexing Ying

We apply our theoretical framework to different models for the noise distribution of the policy and value network as well as the distribution of rewards, and show that for these general models, the sample complexity is polynomial in D, where D is the depth of the search tree.

Board Games

Toward Understanding Generalization of Over-parameterized Deep ReLU network trained with SGD in Student-teacher Setting

no code implementations 25 Sep 2019 Yuandong Tian

Our analysis shows that over-parameterization plays two roles: (1) it is a necessary condition for alignment to happen at the critical points, and (2) in training dynamics, it helps student nodes cover more teacher nodes with fewer iterations.

Mean Field Models for Neural Networks in Teacher-student Setting

no code implementations 25 Sep 2019 Lexing Ying, Yuandong Tian

For the two-layer networks, we derive the necessary condition of the stationary distributions of the mean field equation and explain an empirical phenomenon concerning training speed differences using the Wasserstein flow description.

Neural Architecture Search by Learning Action Space for Monte Carlo Tree Search

no code implementations 25 Sep 2019 Linnan Wang, Saining Xie, Teng Li, Rodrigo Fonseca, Yuandong Tian

As a result, using manually designed action space to perform NAS often leads to sample-inefficient explorations of architectures and thus can be sub-optimal.

Bayesian Optimization Neural Architecture Search

Simple is Better: Training an End-to-end Contract Bridge Bidding Agent without Human Knowledge

no code implementations 25 Sep 2019 Qucheng Gong, Yu Jiang, Yuandong Tian

While playing is relatively easy for modern software, bidding is challenging and requires agents to learn a communication protocol to reach the optimal contract jointly, with their own private information.

All Simulations Are Not Equal: Simulation Reweighing for Imperfect Information Games

no code implementations 25 Sep 2019 Qucheng Gong, Yuandong Tian

We use simulation reweighing in the playing phase of the game contract bridge, and show that it outperforms previous state-of-the-art Monte Carlo simulation based methods, and achieves better play per decision.

Bayesian Relational Memory for Semantic Visual Navigation

1 code implementation ICCV 2019 Yi Wu, Yuxin Wu, Aviv Tamar, Stuart Russell, Georgia Gkioxari, Yuandong Tian

We introduce a new memory architecture, Bayesian Relational Memory (BRM), to improve the generalization ability for semantic visual navigation agents in unseen environments, where an agent is given a semantic target to navigate towards.

Navigate Visual Navigation

A Neural-based Program Decompiler

no code implementations 28 Jun 2019 Cheng Fu, Huili Chen, Haolan Liu, Xinyun Chen, Yuandong Tian, Farinaz Koushanfar, Jishen Zhao

Reverse engineering of binary executables is a critical problem in the computer security domain.

Computer Security Malware Detection

Sample-Efficient Neural Architecture Search by Learning Action Space

1 code implementation 17 Jun 2019 Linnan Wang, Saining Xie, Teng Li, Rodrigo Fonseca, Yuandong Tian

To improve the sample efficiency, this paper proposes Latent Action Neural Architecture Search (LaNAS), which learns actions to recursively partition the search space into good or bad regions that contain networks with similar performance metrics.

Evolutionary Algorithms Neural Architecture Search

One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers

2 code implementations NeurIPS 2019 Ari S. Morcos, Haonan Yu, Michela Paganini, Yuandong Tian

Perhaps surprisingly, we found that, within the natural images domain, winning ticket initializations generalized across a variety of datasets, including Fashion MNIST, SVHN, CIFAR-10/100, ImageNet, and Places365, often achieving performance close to that of winning tickets generated on the same dataset.

Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP

no code implementations ICLR 2020 Haonan Yu, Sergey Edunov, Yuandong Tian, Ari S. Morcos

The lottery ticket hypothesis proposes that over-parameterization of deep neural networks (DNNs) aids training by increasing the probability of a "lucky" sub-network initialization being present rather than by helping the optimization process (Frankle & Carbin, 2019).

Image Classification Reinforcement Learning (RL)

Hierarchical Decision Making by Generating and Following Natural Language Instructions

1 code implementation NeurIPS 2019 Hengyuan Hu, Denis Yarats, Qucheng Gong, Yuandong Tian, Mike Lewis

We explore using latent natural language instructions as an expressive and compositional representation of complex actions for hierarchical decision making.

Decision Making

Luck Matters: Understanding Training Dynamics of Deep ReLU Networks

1 code implementation 31 May 2019 Yuandong Tian, Tina Jiang, Qucheng Gong, Ari Morcos

We analyze the dynamics of training deep ReLU networks and their implications on generalization capability.

M^3RL: Mind-aware Multi-agent Management Reinforcement Learning

no code implementations ICLR 2019 Tianmin Shu, Yuandong Tian

Most of the prior work on multi-agent reinforcement learning (MARL) achieves optimal collaboration by directly controlling the agents to maximize a common reward.

Management Multi-agent Reinforcement Learning +2

A theoretical framework for deep and locally connected ReLU network

no code implementations ICLR 2019 Yuandong Tian

In this paper, we propose a novel theoretical framework for such networks with ReLU nonlinearity.

AlphaX: eXploring Neural Architectures with Deep Neural Networks and Monte Carlo Tree Search

1 code implementation 26 Mar 2019 Linnan Wang, Yiyang Zhao, Yuu Jinnai, Yuandong Tian, Rodrigo Fonseca

Neural Architecture Search (NAS) has shown great success in automating the design of neural networks, but the prohibitive amount of computations behind current NAS methods requires further investigations in improving the sample efficiency and the network evaluation cost to get better results in a shorter time.

Image Captioning Neural Architecture Search +4

ELF OpenGo: An Analysis and Open Reimplementation of AlphaZero

1 code implementation 12 Feb 2019 Yuandong Tian, Jerry Ma, Qucheng Gong, Shubho Sengupta, Zhuoyuan Chen, James Pinkerton, C. Lawrence Zitnick

The AlphaGo, AlphaGo Zero, and AlphaZero series of algorithms are remarkable demonstrations of deep reinforcement learning's capabilities, achieving superhuman performance in the complex game of Go with progressively increasing autonomy.

Game of Go

Sample-Efficient Neural Architecture Search by Learning Action Space for Monte Carlo Tree Search

1 code implementation 1 Jan 2019 Linnan Wang, Saining Xie, Teng Li, Rodrigo Fonseca, Yuandong Tian

To improve the sample efficiency, this paper proposes Latent Action Neural Architecture Search (LaNAS), which learns actions to recursively partition the search space into good or bad regions that contain networks with similar performance metrics.

Evolutionary Algorithms Image Classification +1

Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search

no code implementations ICLR 2019 Bichen Wu, Yanghan Wang, Peizhao Zhang, Yuandong Tian, Peter Vajda, Kurt Keutzer

Recent work in network quantization has substantially reduced the time and space complexity of neural network inference, enabling their deployment on embedded and mobile devices with limited computational and memory resources.

Neural Architecture Search Quantization

M$^3$RL: Mind-aware Multi-agent Management Reinforcement Learning

1 code implementation ICLR 2019 Tianmin Shu, Yuandong Tian

Most of the prior work on multi-agent reinforcement learning (MARL) achieves optimal collaboration by directly controlling the agents to maximize a common reward.

Management Multi-agent Reinforcement Learning +2

A theoretical framework for deep locally connected ReLU network

no code implementations 28 Sep 2018 Yuandong Tian

Understanding theoretical properties of deep and locally connected nonlinear network, such as deep convolutional neural network (DCNN), is still a hard problem despite its empirical success.

Learning and Planning with a Semantic Model

no code implementations ICLR 2019 Yi Wu, Yuxin Wu, Aviv Tamar, Stuart Russell, Georgia Gkioxari, Yuandong Tian

Building deep reinforcement learning agents that can generalize and adapt to unseen environments remains a fundamental challenge for AI.

Visual Navigation

Neural Architecture Search using Deep Neural Networks and Monte Carlo Tree Search

2 code implementations 18 May 2018 Linnan Wang, Yiyang Zhao, Yuu Jinnai, Yuandong Tian, Rodrigo Fonseca

Neural Architecture Search (NAS) has shown great success in automating the design of neural networks, but the prohibitive amount of computations behind current NAS methods requires further investigations in improving the sample efficiency and the network evaluation cost to get better results in a shorter time.

Image Captioning Neural Architecture Search +4

3D Interpreter Networks for Viewer-Centered Wireframe Modeling

no code implementations 3 Apr 2018 Jiajun Wu, Tianfan Xue, Joseph J. Lim, Yuandong Tian, Joshua B. Tenenbaum, Antonio Torralba, William T. Freeman

3D-INN is trained on real images to estimate 2D keypoint heatmaps from an input image; it then predicts 3D object structure from heatmaps using knowledge learned from synthetic 3D shapes.

Image Retrieval Keypoint Estimation +2

Building Generalizable Agents with a Realistic and Rich 3D Environment

5 code implementations ICLR 2018 Yi Wu, Yuxin Wu, Georgia Gkioxari, Yuandong Tian

To generalize to unseen environments, an agent needs to be robust to low-level variations (e.g. color, texture, object changes), and also high-level variations (e.g. layout changes of the environment).

Data Augmentation

Latent forward model for Real-time Strategy game planning with incomplete information

no code implementations ICLR 2018 Yuandong Tian, Qucheng Gong

Model-free deep reinforcement learning approaches have shown superhuman performance in simulated environments (e.g., Atari games, Go, etc.).

Atari Games Decision Making +2

Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima

no code implementations ICML 2018 Simon S. Du, Jason D. Lee, Yuandong Tian, Barnabas Poczos, Aarti Singh

We consider the problem of learning a one-hidden-layer neural network with non-overlapping convolutional layer and ReLU activation, i.e., $f(\mathbf{Z}, \mathbf{w}, \mathbf{a}) = \sum_j a_j\sigma(\mathbf{w}^T\mathbf{Z}_j)$, in which both the convolutional weights $\mathbf{w}$ and the output weights $\mathbf{a}$ are parameters to be learned.
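
As a rough illustration of the formula in this snippet, the forward pass can be sketched in NumPy. The function name, the column-per-patch layout of Z, and the sample values are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def relu(x):
    # sigma in the formula: elementwise max(x, 0)
    return np.maximum(x, 0.0)

def one_hidden_layer_cnn(Z, w, a):
    """Evaluate f(Z, w, a) = sum_j a_j * relu(w^T Z_j).

    Z: (p, k) array; column j is assumed to hold the patch Z_j.
    w: (p,) convolutional filter shared across all patches.
    a: (k,) output weights, one per patch.
    """
    hidden = relu(w @ Z)          # shape (k,): filter response per patch
    return float(a @ hidden)      # weighted sum over patches

# Hypothetical input with k = 3 non-overlapping patches of size p = 2.
Z = np.array([[1.0, -1.0, 2.0],
              [0.0,  3.0, 1.0]])
w = np.array([1.0, 1.0])
a = np.array([0.5, -1.0, 2.0])

value = one_hidden_layer_cnn(Z, w, a)
value_flipped = one_hidden_layer_cnn(Z, -w, a)  # all responses are clipped by ReLU
```

Because the patches do not overlap, the same filter w is applied to each column independently, which is why a single matrix-vector product covers the whole hidden layer.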

When is a Convolutional Filter Easy To Learn?

no code implementations ICLR 2018 Simon S. Du, Jason D. Lee, Yuandong Tian

We show that (stochastic) gradient descent with random initialization can learn the convolutional filter in polynomial time and the convergence rate depends on the smoothness of the input distribution and the closeness of patches.

ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games

2 code implementations NeurIPS 2017 Yuandong Tian, Qucheng Gong, Wenling Shang, Yuxin Wu, C. Lawrence Zitnick

In addition, our platform is flexible in terms of environment-agent communication topologies, choices of RL methods, changes in game parameters, and can host existing C/C++-based game environments like Arcade Learning Environment.

Atari Games reinforcement-learning +2

Channel-Recurrent Autoencoding for Image Modeling

no code implementations 12 Jun 2017 Wenling Shang, Kihyuk Sohn, Yuandong Tian

Despite recent successes in synthesizing faces and bedrooms, existing generative models struggle to capture more complex image types, potentially due to the oversimplification of their latent space constructions.

An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis

1 code implementation ICML 2017 Yuandong Tian

We train our network with gradient descent on $\mathbf{w}$ to mimic the output of a teacher network with the same architecture and fixed parameters $\mathbf{w}^*$.
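
The teacher-student setup described in this snippet can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes a single ReLU node, Gaussian inputs, a squared loss, and an empirical (sample-based) gradient in place of the population gradient, and every name and hyperparameter is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, lr, steps = 5, 2000, 0.1, 200    # illustrative sizes and step size

w_star = rng.normal(size=d)            # fixed teacher parameters w*
X = rng.normal(size=(n, d))            # assumed Gaussian input samples
teacher_out = np.maximum(X @ w_star, 0.0)

def loss_and_grad(w):
    """Squared loss between student and teacher outputs, and its gradient in w."""
    pre = X @ w
    residual = np.maximum(pre, 0.0) - teacher_out
    loss = 0.5 * np.mean(residual ** 2)
    # ReLU passes gradient only where the student's pre-activation is positive.
    grad = X.T @ (residual * (pre > 0)) / n
    return loss, grad

w = w_star + 0.5 * rng.normal(size=d)  # student starts as a perturbed teacher
initial_loss, _ = loss_and_grad(w)
for _ in range(steps):
    _, g = loss_and_grad(w)
    w -= lr * g                        # plain gradient descent on w
final_loss, _ = loss_and_grad(w)
```

Only w is updated; w_star stays fixed throughout, which matches the "fixed parameters" wording in the snippet.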

Single Image 3D Interpreter Network

1 code implementation 29 Apr 2016 Jiajun Wu, Tianfan Xue, Joseph J. Lim, Yuandong Tian, Joshua B. Tenenbaum, Antonio Torralba, William T. Freeman

In this work, we propose 3D INterpreter Network (3D-INN), an end-to-end framework which sequentially estimates 2D keypoint heatmaps and 3D object structure, trained on both real 2D-annotated images and synthetic 3D data.

Image Retrieval Keypoint Estimation +2

Better Computer Go Player with Neural Network and Long-term Prediction

3 code implementations 19 Nov 2015 Yuandong Tian, Yan Zhu

Against human players, the newest versions, darkfores2, achieve a stable 3d level on KGS Go Server as a ranked bot, a substantial improvement upon the estimated 4k-5k ranks for DCNN reported in Clark & Storkey (2015) based on games against other machine players.

Game of Go

Semantic Amodal Segmentation

2 code implementations CVPR 2017 Yan Zhu, Yuandong Tian, Dimitris Metaxas, Piotr Dollár

Specifically, we create an amodal segmentation of each image: the full extent of each region is marked, not just the visible pixels.

Object Detection +2

Convolutional networks and learning invariant to homogeneous multiplicative scalings

no code implementations 26 Jun 2015 Mark Tygert, Arthur Szlam, Soumith Chintala, Marc'Aurelio Ranzato, Yuandong Tian, Wojciech Zaremba

The conventional classification schemes -- notably multinomial logistic regression -- used in conjunction with convolutional networks (convnets) are classical in statistics, designed without consideration for the usual coupling with convnets, stochastic gradient descent, and backpropagation.

Classification General Classification +1
