no code implementations • ICLR 2019 • Xinyun Chen, Yuandong Tian
For problem solving, making reactive decisions based on the problem description is fast but inaccurate, while search-based planning using heuristics gives better solutions but can be exponentially slow.
no code implementations • ICML 2020 • Yuandong Tian
Under mild conditions on the dataset and teacher network, we prove that when the gradient is small at every data sample, each teacher node is \emph{specialized} by at least one student node \emph{at the lowest layer}.
1 code implementation • 24 Jul 2023 • Kevin Yang, Dan Klein, Asli Celikyilmaz, Nanyun Peng, Yuandong Tian
We propose Reinforcement Learning from Contrast Distillation (RLCD), a method for aligning language models to follow natural language principles without using human feedback.
1 code implementation • 18 Jul 2023 • Arman Zharmagambetov, Brandon Amos, Aaron Ferber, Taoan Huang, Bistra Dilkina, Yuandong Tian
The implicit approach may not require optimal solutions as labels and is capable of handling problem uncertainty; however, it is slow to train and deploy due to frequent calls to optimizer $\mathbf{g}$ during both training and testing.
1 code implementation • 27 Jun 2023 • Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
We present Position Interpolation (PI), which extends the context window sizes of RoPE-based pretrained LLMs such as LLaMA models to up to 32768 tokens with minimal fine-tuning (within 1000 steps), while demonstrating strong empirical results on various tasks that require long context, including passkey retrieval, language modeling, and long document summarization, on LLaMA models from 7B to 65B.
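The mechanism is simple enough to sketch: rescale position indices so that an extended sequence maps back into the pretrained position range before computing the rotary embeddings. Below is a minimal illustrative sketch in PyTorch; the function names, the dim=128 choice, and the 2048-token pretrained window are our assumptions for illustration, not details from the paper.

```python
import torch

def rope_angles(positions: torch.Tensor, dim: int, base: float = 10000.0) -> torch.Tensor:
    """Standard RoPE angles: one rotation frequency per pair of channels."""
    inv_freq = base ** (-torch.arange(0, dim, 2).float() / dim)  # (dim/2,)
    return positions[:, None].float() * inv_freq[None, :]        # (seq_len, dim/2)

def interpolated_positions(seq_len: int, pretrained_len: int = 2048) -> torch.Tensor:
    """Position Interpolation: linearly rescale position indices so an extended
    window maps back into the pretrained range [0, pretrained_len)."""
    positions = torch.arange(seq_len).float()
    if seq_len <= pretrained_len:
        return positions
    return positions * (pretrained_len / seq_len)

# An 8192-token sequence is squeezed into the original 2048-position range.
angles = rope_angles(interpolated_positions(8192), dim=128)
```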
1 code implementation • 24 Jun 2023 • Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen
Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens.
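As a rough sketch of such a policy (the 50/50 split between recent and heavy-hitter tokens is our simplifying assumption, not the paper's exact budget allocation):

```python
import torch

def h2o_evict(attn_scores: torch.Tensor, cache_len: int, budget: int) -> torch.Tensor:
    """Select which KV-cache positions to retain: part of the budget goes to
    the most recent tokens, the rest to 'heavy hitters' (tokens with the
    largest accumulated attention mass), in the spirit of H2O.

    attn_scores: (num_queries, cache_len) attention weights observed so far.
    Returns sorted indices of cache entries to keep."""
    recent_budget = budget // 2
    heavy_budget = budget - recent_budget
    recent = torch.arange(cache_len - recent_budget, cache_len)
    mass = attn_scores.sum(dim=0)        # accumulated attention per cached token
    mass[recent] = float("-inf")         # recent tokens are kept unconditionally
    heavy = mass.topk(heavy_budget).indices
    return torch.cat([heavy, recent]).sort().values
```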
no code implementations • 25 May 2023 • Yuandong Tian, Yiping Wang, Beidi Chen, Simon Du
More specifically, under the assumptions of (a) no positional encoding, (b) a long input sequence, and (c) a decoder layer that learns faster than the self-attention layer, we prove that self-attention acts as a \emph{discriminative scanning algorithm}: starting from uniform attention, it gradually attends more to distinct key tokens for a specific next token to be predicted, and pays less attention to common key tokens that occur across different next tokens.
1 code implementation • 3 May 2023 • Daochen Zha, Louis Feng, Liang Luo, Bhargav Bhushanam, Zirui Liu, Yusuo Hu, Jade Nie, Yuzhen Huang, Yuandong Tian, Arun Kejariwal, Xia Hu
In this work, we explore a "pre-train, and search" paradigm for efficient sharding.
1 code implementation • 24 Apr 2023 • Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari Morcos, Shashank Shekhar, Tom Goldstein, Florian Bordes, Adrien Bardes, Gregoire Mialon, Yuandong Tian, Avi Schwarzschild, Andrew Gordon Wilson, Jonas Geiping, Quentin Garrido, Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann Lecun, Micah Goldblum
Self-supervised learning, dubbed the dark matter of intelligence, is a promising path to advance machine learning.
no code implementations • 3 Feb 2023 • Taoan Huang, Aaron Ferber, Yuandong Tian, Bistra Dilkina, Benoit Steiner
Integer Linear Programs (ILPs) are powerful tools for modeling and solving a large number of combinatorial optimization problems.
no code implementations • 9 Jan 2023 • Youwei Liang, Kevin Stone, Ali Shameli, Chris Cummins, Mostafa Elhoushi, Jiadong Guo, Benoit Steiner, Xiaomeng Yang, Pengtao Xie, Hugh Leather, Yuandong Tian
Finding the optimal pass sequence of compilation can lead to a significant reduction in program size and/or improvement in program efficiency.
no code implementations • 6 Jan 2023 • Andrew Cohen, Weiping Dou, Jiang Zhu, Slawomir Koziel, Peter Renner, Jan-Ove Mattsson, Xiaomeng Yang, Beidi Chen, Kevin Stone, Yuandong Tian
Linear Partial Differential Equations (PDEs) govern the spatial-temporal dynamics of physical systems that are essential to building modern technology.
1 code implementation • 20 Dec 2022 • Kevin Yang, Dan Klein, Nanyun Peng, Yuandong Tian
In human evaluations of automatically generated stories, DOC substantially outperforms a strong Re3 baseline (Yang et al., 2022) on plot coherence (22.5% absolute gain), outline relevance (28.2%), and interestingness (20.7%).
no code implementations • 15 Dec 2022 • Taoan Huang, Aaron Ferber, Yuandong Tian, Bistra Dilkina, Benoit Steiner
LNS relies on heuristics to select neighborhoods to search in.
1 code implementation • 23 Nov 2022 • Minghao Xu, Yuanfan Guo, Yi Xu, Jian Tang, Xinlei Chen, Yuandong Tian
We study EurNets in two important domains: image modeling and protein structure modeling.
no code implementations • 22 Oct 2022 • Aaron Ferber, Taoan Huang, Daochen Zha, Martin Schubert, Benoit Steiner, Bistra Dilkina, Yuandong Tian
Optimization problems with nonlinear cost functions and combinatorial constraints appear in many real-world applications but remain challenging to solve efficiently compared to their linear counterparts.
1 code implementation • 13 Oct 2022 • Kevin Yang, Yuandong Tian, Nanyun Peng, Dan Klein
We consider the problem of automatically generating longer stories of over two thousand words.
1 code implementation • 5 Oct 2022 • Daochen Zha, Louis Feng, Qiaoyu Tan, Zirui Liu, Kwei-Herng Lai, Bhargav Bhushanam, Yuandong Tian, Arun Kejariwal, Xia Hu
Although prior work has explored learning-based approaches for the device placement of computational graphs, embedding table placement remains a challenging problem because of 1) the operation fusion of embedding tables, and 2) the generalizability requirement on unseen placement tasks with different numbers of tables and/or devices.
1 code implementation • 22 Aug 2022 • Zhengyao Jiang, Tianjun Zhang, Michael Janner, Yueying Li, Tim Rocktäschel, Edward Grefenstette, Yuandong Tian
Planning-based reinforcement learning has shown strong performance in tasks in discrete and low-dimensional continuous action spaces.
1 code implementation • 12 Aug 2022 • Daochen Zha, Louis Feng, Bhargav Bhushanam, Dhruv Choudhary, Jade Nie, Yuandong Tian, Jay Chae, Yinbin Ma, Arun Kejariwal, Xia Hu
This poses a significant design challenge in distributed systems, known as embedding table sharding: how should we partition the embedding tables to balance the costs across devices? The task is non-trivial because 1) it is hard to efficiently and precisely measure the cost, and 2) the partition problem is known to be NP-hard.
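Setting fusion effects aside, the load-balancing core of the problem can be sketched with a classical greedy heuristic (a toy baseline under our assumption of additive, pre-estimated per-table costs; real sharding must also model fusion and kernel-level interactions):

```python
import heapq

def greedy_shard(table_costs: list[float], num_devices: int) -> list[list[int]]:
    """Assign each embedding table (largest estimated cost first) to the
    currently least-loaded device: longest-processing-time scheduling."""
    heap = [(0.0, d) for d in range(num_devices)]          # (load, device)
    heapq.heapify(heap)
    placement = [[] for _ in range(num_devices)]
    for t in sorted(range(len(table_costs)), key=lambda t: -table_costs[t]):
        load, d = heapq.heappop(heap)
        placement[d].append(t)
        heapq.heappush(heap, (load + table_costs[t], d))
    return placement

# e.g. greedy_shard([5.0, 3.0, 3.0, 2.0, 1.0], num_devices=2) -> [[0, 3], [1, 2, 4]]
```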
1 code implementation • 30 Jun 2022 • Tongzhou Wang, Simon S. Du, Antonio Torralba, Phillip Isola, Amy Zhang, Yuandong Tian
The ability to separate signal from noise, and reason with clean abstractions, is critical to intelligence.
1 code implementation • 2 Jun 2022 • Yuandong Tian
First, the presence of nonlinearity can lead to many local optima even in the 1-layer setting, each corresponding to certain patterns from the data distribution, while with linear activation only one major pattern can be learned.
1 code implementation • CVPR 2022 • Xiao Wang, Haoqi Fan, Yuandong Tian, Daisuke Kihara, Xinlei Chen
Many recent self-supervised frameworks for visual representation learning are based on certain forms of Siamese networks.
1 code implementation • 11 Feb 2022 • Runlong Zhou, Yuandong Tian, Yi Wu, Simon S. Du
For a canonical online CO problem, Secretary Problem, we formally prove that distribution shift is reduced exponentially with curriculum learning even if the curriculum is randomly generated.
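For reference, the Secretary Problem mentioned above admits a classical 1/e stopping rule, which a few lines of simulation reproduce (our illustration of the online CO setting, not the paper's learned curriculum):

```python
import math
import random

def secretary_trial(n: int) -> bool:
    """Observe the first n/e candidates without committing, then accept the
    first candidate better than everything seen so far. Returns True iff
    the overall best candidate was selected."""
    values = [random.random() for _ in range(n)]
    cutoff = int(n / math.e)
    best_seen = max(values[:cutoff]) if cutoff > 0 else float("-inf")
    for v in values[cutoff:]:
        if v > best_seen:
            return v == max(values)
    return False

wins = sum(secretary_trial(100) for _ in range(10_000))
print(wins / 10_000)  # success rate ≈ 1/e ≈ 0.368
```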
1 code implementation • 29 Jan 2022 • Yuandong Tian
We show that Contrastive Learning (CL) under a broad family of loss functions (including InfoNCE) has a unified formulation of coordinate-wise optimization on the network parameter $\boldsymbol{\theta}$ and pairwise importance $\alpha$, where the \emph{max player} $\boldsymbol{\theta}$ learns representation for contrastiveness, and the \emph{min player} $\alpha$ puts more weights on pairs of distinct samples that share similar representations.
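For concreteness, the standard InfoNCE loss covered by this family can be written as follows (with temperature $\tau$ and similarity $s_{ij}$ between the representation of sample $i$ and the augmented view of sample $j$; the paper's unified objective additionally exposes the pairwise importance $\alpha$ as an explicit min player):

$$\mathcal{L}_{\mathrm{InfoNCE}}(\boldsymbol{\theta}) = -\sum_{i} \log \frac{\exp(s_{ii}/\tau)}{\exp(s_{ii}/\tau) + \sum_{j \neq i} \exp(s_{ij}/\tau)}$$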
1 code implementation • 16 Dec 2021 • Hui Shi, Sicun Gao, Yuandong Tian, Xinyun Chen, Jishen Zhao
With the forced decomposition, we show that the performance upper bounds of the LSTM and the Transformer in learning context-free languages (CFLs) are close: both can simulate a stack and perform stack operations along with state transitions.
1 code implementation • NeurIPS 2021 • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian
We analyze NovelD thoroughly in MiniGrid and find that, empirically, it helps the agent explore the environment more uniformly, with a focus on exploring beyond the boundary.
no code implementations • 19 Nov 2021 • Weilin Cong, Yanhong Wu, Yuandong Tian, Mengting Gu, Yinglong Xia, Chun-cheng Jason Chen, Mehrdad Mahdavi
To achieve efficient and scalable training, we propose temporal-union graph structure and its associated subgraph-based node sampling strategy.
1 code implementation • ICLR 2022 • Li Jing, Pascal Vincent, Yann Lecun, Yuandong Tian
It has been shown that non-contrastive methods suffer from a lesser collapse problem of a different nature: dimensional collapse, whereby the embedding vectors end up spanning a lower-dimensional subspace instead of the entire available embedding space.
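Dimensional collapse is straightforward to diagnose empirically; a minimal sketch (our own diagnostic, following the spirit of the paper's spectrum analysis):

```python
import torch

def embedding_spectrum(z: torch.Tensor) -> torch.Tensor:
    """Singular values of the centered embedding matrix. A long tail of
    near-zero values indicates the embeddings span only a low-dimensional
    subspace, i.e., dimensional collapse.

    z: (num_samples, embed_dim) embedding vectors."""
    z = z - z.mean(dim=0, keepdim=True)
    return torch.linalg.svdvals(z)

# e.g. plot embedding_spectrum(model(images)).log() and look for a sharp drop.
```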
1 code implementation • 11 Oct 2021 • Xiang Wang, Xinlei Chen, Simon S. Du, Yuandong Tian
Non-contrastive methods of self-supervised learning (such as BYOL and SimSiam) learn representations by minimizing the distance between two views of the same image.
1 code implementation • 7 Oct 2021 • Yiyang Zhao, Linnan Wang, Kevin Yang, Tianjun Zhang, Tian Guo, Yuandong Tian
In this paper, we propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focus on promising regions that are likely to contain a subset of the Pareto frontier.
no code implementations • ICLR 2022 • Yiyang Zhao, Linnan Wang, Kevin Yang, Tianjun Zhang, Tian Guo, Yuandong Tian
In this paper, we propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focus on promising regions that are likely to contain a subset of the Pareto frontier.
1 code implementation • ICLR 2022 • Chengyue Gong, Dilin Wang, Meng Li, Xinlei Chen, Zhicheng Yan, Yuandong Tian, Qiang Liu, Vikas Chandra
In this work, we observe that the poor performance is due to a gradient conflict issue: the gradients of different sub-networks conflict with those of the supernet more severely in ViTs than in CNNs, which leads to early saturation in training and inferior convergence.
Ranked #7 on Neural Architecture Search on ImageNet
1 code implementation • 17 Sep 2021 • Chris Cummins, Bram Wasti, Jiadong Guo, Brandon Cui, Jason Ansel, Sahir Gomez, Somya Jain, Jia Liu, Olivier Teytaud, Benoit Steiner, Yuandong Tian, Hugh Leather
What is needed is an easy, reusable experimental infrastructure for real world compiler optimization tasks that can serve as a common benchmark for comparing techniques, and as a platform to accelerate progress in the field.
1 code implementation • NeurIPS 2021 • Xinyun Chen, Dawn Song, Yuandong Tian
While recent works demonstrated limited success on domain-specific languages (DSLs), it remains highly challenging to apply them to real-world programming languages such as C. Due to complicated syntax and token variation, there are three major challenges: (1) unlike many DSLs, programs in languages like C need to be compiled first and are not executed via interpreters; (2) the program search space grows exponentially as the syntax and semantics of the programming language become more complex; and (3) collecting a large-scale dataset of real-world programs is non-trivial.
2 code implementations • NeurIPS 2021 • Kevin Yang, Tianjun Zhang, Chris Cummins, Brandon Cui, Benoit Steiner, Linnan Wang, Joseph E. Gonzalez, Dan Klein, Yuandong Tian
Path planning, the problem of efficiently discovering high-reward trajectories, often requires optimizing a high-dimensional and multimodal reward function.
1 code implementation • NeurIPS 2021 • Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao, Yuandong Tian, Joseph Gonzalez, Stuart Russell
As a proof of concept, we evaluate the new intrinsic reward on tabular examples across a variety of model-based and model-free algorithms, showing improvements over count-only exploration strategies.
no code implementations • NeurIPS 2021 • Xinyun Chen, Dawn Song, Yuandong Tian
Program synthesis from input-output (IO) examples has been a long-standing challenge.
no code implementations • 25 Feb 2021 • Zhuolin Yang, Zhaoxi Chen, Tiffany Cai, Xinyun Chen, Bo Li, Yuandong Tian
Extensive experiments show that student specialization correlates strongly with model robustness in different scenarios, including students trained via standard training, adversarial training, confidence-calibrated adversarial training, and training with a robust feature dataset.
no code implementations • 17 Feb 2021 • Yulai Zhao, Yuandong Tian, Jason D. Lee, Simon S. Du
Policy-based methods with function approximation are widely used for solving two-player zero-sum games with large state and/or action spaces.
4 code implementations • 12 Feb 2021 • Yuandong Tian, Xinlei Chen, Surya Ganguli
While contrastive approaches of self-supervised learning (SSL) learn representations by minimizing the distance between two augmented views of the same data point (positive pairs) and maximizing the distance between views from different data points (negative pairs), recent \emph{non-contrastive} SSL methods (e.g., BYOL and SimSiam) show remarkable performance {\it without} negative pairs, with an extra learnable predictor and a stop-gradient operation.
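A minimal sketch of the loss these non-contrastive methods optimize, assuming backbone and predictor are user-supplied modules (real pipelines add projection MLPs, augmentations, and a momentum encoder in the BYOL case):

```python
import torch.nn.functional as F

def simsiam_loss(backbone, predictor, x1, x2):
    """Negative cosine similarity with a stop-gradient: the predictor is
    applied to one branch while the other branch is detached, which is
    exactly the asymmetry analyzed in the paper."""
    z1, z2 = backbone(x1), backbone(x2)
    p1, p2 = predictor(z1), predictor(z2)

    def d(p, z):
        return -F.cosine_similarity(p, z.detach(), dim=-1).mean()

    return 0.5 * d(p1, z2) + 0.5 * d(p2, z1)
```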
1 code implementation • 1 Jan 2021 • Cheng Fu, Kunlin Yang, Xinyun Chen, Yuandong Tian, Jishen Zhao
In software development, decompilation aims to reverse engineer binary executables.
2 code implementations • 15 Dec 2020 • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian
In this paper, we analyze the pros and cons of each method and propose the regulated difference of inverse visitation counts as a simple but effective criterion for IR.
no code implementations • CVPR 2021 • Zhicheng Yan, Xiaoliang Dai, Peizhao Zhang, Yuandong Tian, Bichen Wu, Matt Feiszli
Furthermore, to search quickly in the multi-variate space, we propose a coarse-to-fine strategy that uses a factorized distribution at the beginning, which can reduce the number of architecture parameters by over an order of magnitude.
2 code implementations • 16 Oct 2020 • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian
In this work, we propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge and supports ad hoc team play.
2 code implementations • 1 Oct 2020 • Yuandong Tian, Lantao Yu, Xinlei Chen, Surya Ganguli
We propose a novel theoretical framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks (e.g., SimCLR).
no code implementations • 28 Aug 2020 • Hongzi Mao, Shannon Chen, Drew Dimmery, Shaun Singh, Drew Blaisdell, Yuandong Tian, Mohammad Alizadeh, Eytan Bakshy
Client-side video players employ adaptive bitrate (ABR) algorithms to optimize user quality of experience (QoE).
1 code implementation • NeurIPS 2020 • Yuandong Tian, Qucheng Gong, Tina Jiang
Based on this, we propose Joint Policy Search(JPS) that iteratively improves joint policies of collaborative agents in imperfect information games, without re-evaluating the entire game.
2 code implementations • NeurIPS 2020 • Linnan Wang, Rodrigo Fonseca, Yuandong Tian
If the nonlinear partition function and the local model fit the ground-truth black-box function well, then good partitions and candidates can be reached with far fewer samples.
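One simplified reading of such a partition step (our sketch using scikit-learn; the paper's actual procedure is embedded in a Monte Carlo tree search): cluster evaluated points into a "good" and a "bad" group by objective value, then learn a nonlinear boundary that defines the two child regions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def split_region(X: np.ndarray, y: np.ndarray) -> SVC:
    """X: (n, d) evaluated points; y: (n,) objective values (higher is better).
    Returns a classifier whose predict(x) == 1 marks the 'good' child region."""
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(np.c_[X, y])
    if y[labels == 1].mean() < y[labels == 0].mean():
        labels = 1 - labels                      # make label 1 the good cluster
    return SVC(kernel="rbf").fit(X, labels)      # nonlinear partition function
```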
no code implementations • 29 Jun 2020 • Qingquan Song, Dehua Cheng, Hanning Zhou, Jiyan Yang, Yuandong Tian, Xia Hu
Click-Through Rate (CTR) prediction is one of the most important machine learning tasks in recommender systems, driving personalized experience for billions of consumers.
2 code implementations • 11 Jun 2020 • Yiyang Zhao, Linnan Wang, Yuandong Tian, Rodrigo Fonseca, Tian Guo
One-shot methods use a single supernet to approximate the performance of every architecture in the search space via weight-sharing.
2 code implementations • CVPR 2021 • Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Bichen Wu, Zijian He, Zhen Wei, Kan Chen, Yuandong Tian, Matthew Yu, Peter Vajda, Joseph E. Gonzalez
To address this, we present Neural Architecture-Recipe Search (NARS) to search both (a) architectures and (b) their corresponding training recipes, simultaneously.
Ranked #5 on Neural Architecture Search on ImageNet
1 code implementation • ICLR 2020 • Hui Shi, Yang Zhang, Xinyun Chen, Yuandong Tian, Jishen Zhao
Deep symbolic superoptimization refers to the task of applying deep learning methods to simplify symbolic expressions.
1 code implementation • CVPR 2020 • Alvin Wan, Xiaoliang Dai, Peizhao Zhang, Zijian He, Yuandong Tian, Saining Xie, Bichen Wu, Matthew Yu, Tao Xu, Kan Chen, Peter Vajda, Joseph E. Gonzalez
We propose a masking mechanism for feature map reuse, so that memory and computational costs stay nearly constant as the search space expands.
Ranked #68 on Neural Architecture Search on ImageNet
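The masking idea above admits a compact sketch (our illustrative reconstruction of channel search with feature-map reuse, not the paper's exact formulation):

```python
import torch

def masked_channels(feature: torch.Tensor, channel_options, logits: torch.Tensor) -> torch.Tensor:
    """One shared feature map is scaled by a softmax-weighted sum of channel
    masks, so memory stays constant regardless of how many channel counts
    are being searched.

    feature: (batch, C_max, H, W); channel_options: e.g. [8, 16, 32];
    logits: (len(channel_options),) architecture parameters."""
    c_max = feature.shape[1]
    masks = torch.stack([(torch.arange(c_max) < c).float() for c in channel_options])
    weights = torch.softmax(logits, dim=0)            # (num_options,)
    mask = (weights[:, None] * masks).sum(dim=0)      # (C_max,)
    return feature * mask[None, :, None, None]
```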
no code implementations • NeurIPS 2019 • Cheng Fu, Huili Chen, Haolan Liu, Xinyun Chen, Yuandong Tian, Farinaz Koushanfar, Jishen Zhao
Furthermore, Coda outperforms the sequence-to-sequence model with attention by a margin of 70% in program accuracy.
1 code implementation • 30 Sep 2019 • Yuandong Tian
Under mild conditions on the dataset and teacher network, we prove that when the gradient is small at every data sample, each teacher node is \emph{specialized} by at least one student node \emph{at the lowest layer}.
no code implementations • 25 Sep 2019 • Xian Wu, Yuandong Tian, Lexing Ying
We apply our theoretical framework to different models for the noise distribution of the policy and value network as well as the distribution of rewards, and show that for these general models, the sample complexity is polynomial in D, where D is the depth of the search tree.
no code implementations • 25 Sep 2019 • Yuandong Tian
Our analysis shows that over-parameterization plays two roles: (1) it is a necessary condition for alignment to happen at the critical points, and (2) in training dynamics, it helps student nodes cover more teacher nodes with fewer iterations.
no code implementations • 25 Sep 2019 • Lexing Ying, Yuandong Tian
For the two-layer networks, we derive the necessary condition of the stationary distributions of the mean field equation and explain an empirical phenomenon concerning training speed differences using the Wasserstein flow description.
no code implementations • 25 Sep 2019 • Linnan Wang, Saining Xie, Teng Li, Rodrigo Fonseca, Yuandong Tian
As a result, using manually designed action space to perform NAS often leads to sample-inefficient explorations of architectures and thus can be sub-optimal.
no code implementations • 25 Sep 2019 • Qucheng Gong, Yu Jiang, Yuandong Tian
While playing is relatively easy for modern software, bidding is challenging and requires agents to learn a communication protocol to reach the optimal contract jointly, with their own private information.
no code implementations • 25 Sep 2019 • Qucheng Gong, Yuandong Tian
We use simulation reweighing in the playing phase of the game contract bridge, and show that it outperforms previous state-of-the-art Monte Carlo simulation based methods, and achieves better play per decision.
1 code implementation • ICCV 2019 • Yi Wu, Yuxin Wu, Aviv Tamar, Stuart Russell, Georgia Gkioxari, Yuandong Tian
We introduce a new memory architecture, Bayesian Relational Memory (BRM), to improve the generalization ability for semantic visual navigation agents in unseen environments, where an agent is given a semantic target to navigate towards.
no code implementations • 28 Jun 2019 • Cheng Fu, Huili Chen, Haolan Liu, Xinyun Chen, Yuandong Tian, Farinaz Koushanfar, Jishen Zhao
Reverse engineering of binary executables is a critical problem in the computer security domain.
1 code implementation • 17 Jun 2019 • Linnan Wang, Saining Xie, Teng Li, Rodrigo Fonseca, Yuandong Tian
To improve the sample efficiency, this paper proposes Latent Action Neural Architecture Search (LaNAS), which learns actions to recursively partition the search space into good or bad regions that contain networks with similar performance metrics.
no code implementations • ICLR 2020 • Haonan Yu, Sergey Edunov, Yuandong Tian, Ari S. Morcos
The lottery ticket hypothesis proposes that over-parameterization of deep neural networks (DNNs) aids training by increasing the probability of a "lucky" sub-network initialization being present rather than by helping the optimization process (Frankle & Carbin, 2019).
2 code implementations • NeurIPS 2019 • Ari S. Morcos, Haonan Yu, Michela Paganini, Yuandong Tian
Perhaps surprisingly, we found that, within the natural images domain, winning ticket initializations generalized across a variety of datasets, including Fashion MNIST, SVHN, CIFAR-10/100, ImageNet, and Places365, often achieving performance close to that of winning tickets generated on the same dataset.
1 code implementation • NeurIPS 2019 • Hengyuan Hu, Denis Yarats, Qucheng Gong, Yuandong Tian, Mike Lewis
We explore using latent natural language instructions as an expressive and compositional representation of complex actions for hierarchical decision making.
1 code implementation • 31 May 2019 • Yuandong Tian, Tina Jiang, Qucheng Gong, Ari Morcos
We analyze the dynamics of training deep ReLU networks and their implications for generalization capability.
no code implementations • ICLR 2019 • Yuandong Tian
In this paper, we propose a novel theoretical framework for such networks with ReLU nonlinearity.
no code implementations • ICLR 2019 • Tianmin Shu, Yuandong Tian
Most of the prior work on multi-agent reinforcement learning (MARL) achieves optimal collaboration by directly controlling the agents to maximize a common reward.
1 code implementation • 26 Mar 2019 • Linnan Wang, Yiyang Zhao, Yuu Jinnai, Yuandong Tian, Rodrigo Fonseca
Neural Architecture Search (NAS) has shown great success in automating the design of neural networks, but the prohibitive amount of computation behind current NAS methods requires further investigation into improving sample efficiency and reducing network evaluation cost in order to get better results in a shorter time.
Ranked #7 on Neural Architecture Search on CIFAR-10 Image Classification (Params metric)
1 code implementation • 12 Feb 2019 • Yuandong Tian, Jerry Ma, Qucheng Gong, Shubho Sengupta, Zhuoyuan Chen, James Pinkerton, C. Lawrence Zitnick
The AlphaGo, AlphaGo Zero, and AlphaZero series of algorithms are remarkable demonstrations of deep reinforcement learning's capabilities, achieving superhuman performance in the complex game of Go with progressively increasing autonomy.
1 code implementation • 1 Jan 2019 • Linnan Wang, Saining Xie, Teng Li, Rodrigo Fonseca, Yuandong Tian
To improve the sample efficiency, this paper proposes Latent Action Neural Architecture Search (LaNAS), which learns actions to recursively partition the search space into good or bad regions that contain networks with similar performance metrics.
Ranked #19 on Image Classification on CIFAR-10
5 code implementations • CVPR 2019 • Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, Kurt Keutzer
Due to this, previous neural architecture search (NAS) methods are computationally expensive.
Ranked #846 on Image Classification on ImageNet
no code implementations • ICLR 2019 • Bichen Wu, Yanghan Wang, Peizhao Zhang, Yuandong Tian, Peter Vajda, Kurt Keutzer
Recent work in network quantization has substantially reduced the time and space complexity of neural network inference, enabling their deployment on embedded and mobile devices with limited computational and memory resources.
1 code implementation • NeurIPS 2019 • Xinyun Chen, Yuandong Tian
Search-based methods for hard combinatorial optimization are often guided by heuristics.
1 code implementation • ICLR 2019 • Tianmin Shu, Yuandong Tian
Most of the prior work on multi-agent reinforcement learning (MARL) achieves optimal collaboration by directly controlling the agents to maximize a common reward.
no code implementations • ICLR 2019 • Yi Wu, Yuxin Wu, Aviv Tamar, Stuart Russell, Georgia Gkioxari, Yuandong Tian
Building deep reinforcement learning agents that can generalize and adapt to unseen environments remains a fundamental challenge for AI.
no code implementations • 28 Sep 2018 • Yuandong Tian
Understanding the theoretical properties of deep, locally connected nonlinear networks, such as deep convolutional neural networks (DCNNs), is still a hard problem despite their empirical success.
2 code implementations • ICLR 2019 • Yuping Luo, Huazhe Xu, Yuanzhi Li, Yuandong Tian, Trevor Darrell, Tengyu Ma
Model-based reinforcement learning (RL) is considered to be a promising approach to reduce the sample complexity that hinders model-free RL.
2 code implementations • 18 May 2018 • Linnan Wang, Yiyang Zhao, Yuu Jinnai, Yuandong Tian, Rodrigo Fonseca
Neural Architecture Search (NAS) has shown great success in automating the design of neural networks, but the prohibitive amount of computation behind current NAS methods requires further investigation into improving sample efficiency and reducing network evaluation cost in order to get better results in a shorter time.
no code implementations • 3 Apr 2018 • Jiajun Wu, Tianfan Xue, Joseph J. Lim, Yuandong Tian, Joshua B. Tenenbaum, Antonio Torralba, William T. Freeman
3D-INN is trained on real images to estimate 2D keypoint heatmaps from an input image; it then predicts 3D object structure from heatmaps using knowledge learned from synthetic 3D shapes.
5 code implementations • ICLR 2018 • Yi Wu, Yuxin Wu, Georgia Gkioxari, Yuandong Tian
To generalize to unseen environments, an agent needs to be robust to low-level variations (e.g., color, texture, object changes), and also high-level variations (e.g., layout changes of the environment).
no code implementations • ICLR 2018 • Yuandong Tian, Qucheng Gong
Model-free deep reinforcement learning approaches have shown superhuman performance in simulated environments (e.g., Atari games, Go, etc.).
2 code implementations • ACL 2019 • Jin-Hwa Kim, Nikita Kitaev, Xinlei Chen, Marcus Rohrbach, Byoung-Tak Zhang, Yuandong Tian, Dhruv Batra, Devi Parikh
The game involves two players: a Teller and a Drawer.
no code implementations • ICML 2018 • Simon S. Du, Jason D. Lee, Yuandong Tian, Barnabas Poczos, Aarti Singh
We consider the problem of learning a one-hidden-layer neural network with non-overlapping convolutional layer and ReLU activation, i.e., $f(\mathbf{Z}, \mathbf{w}, \mathbf{a}) = \sum_j a_j\sigma(\mathbf{w}^T\mathbf{Z}_j)$, in which both the convolutional weights $\mathbf{w}$ and the output weights $\mathbf{a}$ are parameters to be learned.
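In code, the model under study is just a patch-wise ReLU unit with a linear read-out; a NumPy sketch with our illustrative shapes:

```python
import numpy as np

def f(Z: np.ndarray, w: np.ndarray, a: np.ndarray) -> float:
    """f(Z, w, a) = sum_j a_j * relu(w^T Z_j) over non-overlapping patches.
    Z: (num_patches, patch_dim) stacked patches Z_j;
    w: (patch_dim,) shared convolutional filter; a: (num_patches,) output weights."""
    pre = Z @ w                          # w^T Z_j for every patch j
    return float(a @ np.maximum(pre, 0.0))
```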
no code implementations • ICLR 2018 • Simon S. Du, Jason D. Lee, Yuandong Tian
We show that (stochastic) gradient descent with random initialization can learn the convolutional filter in polynomial time and the convergence rate depends on the smoothness of the input distribution and the closeness of patches.
2 code implementations • NeurIPS 2017 • Yuandong Tian, Qucheng Gong, Wenling Shang, Yuxin Wu, C. Lawrence Zitnick
In addition, our platform is flexible in terms of environment-agent communication topologies, choices of RL methods, and changes in game parameters, and it can host existing C/C++-based game environments like the Arcade Learning Environment.
no code implementations • 12 Jun 2017 • Wenling Shang, Kihyuk Sohn, Yuandong Tian
Despite recent successes in synthesizing faces and bedrooms, existing generative models struggle to capture more complex image types, potentially due to the oversimplification of their latent space constructions.
1 code implementation • ICML 2017 • Yuandong Tian
We train our network with gradient descent on $\mathbf{w}$ to mimic the output of a teacher network with the same architecture and fixed parameters $\mathbf{w}^*$.
1 code implementation • 29 Apr 2016 • Jiajun Wu, Tianfan Xue, Joseph J. Lim, Yuandong Tian, Joshua B. Tenenbaum, Antonio Torralba, William T. Freeman
In this work, we propose 3D INterpreter Network (3D-INN), an end-to-end framework which sequentially estimates 2D keypoint heatmaps and 3D object structure, trained on both real 2D-annotated images and synthetic 3D data.
7 code implementations • 7 Dec 2015 • Bolei Zhou, Yuandong Tian, Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus
We describe a very simple bag-of-words baseline for visual question answering.
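The baseline is small enough to write out; a sketch in PyTorch with illustrative dimensions (the original concatenated bag-of-words question features with deep image features and fed them to a softmax classifier):

```python
import torch
import torch.nn as nn

class BoWVqaBaseline(nn.Module):
    """Bag-of-words VQA: average the question's word embeddings, concatenate
    with image features, and classify into a fixed answer vocabulary."""
    def __init__(self, vocab_size=10000, embed_dim=300,
                 image_dim=4096, num_answers=1000):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        self.classifier = nn.Linear(embed_dim + image_dim, num_answers)

    def forward(self, question_word_ids, image_features):
        q = self.embed(question_word_ids)                # (batch, embed_dim)
        return self.classifier(torch.cat([q, image_features], dim=-1))
```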
3 code implementations • 19 Nov 2015 • Yuandong Tian, Yan Zhu
Against human players, the newest version, darkfores2, achieves a stable 3-dan level on the KGS Go Server as a ranked bot, a substantial improvement upon the estimated 4-5 kyu ranks for DCNNs reported in Clark & Storkey (2015) based on games against other machine players.
2 code implementations • CVPR 2017 • Yan Zhu, Yuandong Tian, Dimitris Mexatas, Piotr Dollár
Specifically, we create an amodal segmentation of each image: the full extent of each region is marked, not just the visible pixels.
no code implementations • 26 Jun 2015 • Mark Tygert, Arthur Szlam, Soumith Chintala, Marc'Aurelio Ranzato, Yuandong Tian, Wojciech Zaremba
The conventional classification schemes -- notably multinomial logistic regression -- used in conjunction with convolutional networks (convnets) are classical in statistics, designed without consideration for the usual coupling with convnets, stochastic gradient descent, and backpropagation.