1 code implementation • 26 Apr 2022 • Linnan Wang, Chenhan Yu, Satish Salian, Slawomir Kierat, Szymon Migacz, Alex Fit Florea
This paper aims to expedite model customization with a model hub containing models optimized via Neural Architecture Search (NAS) and tiered by their inference latency.
Ranked #2 on Neural Architecture Search on ImageNet
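The entry above describes a model hub tiered by inference latency. Below is a minimal sketch of how such a hub might be queried: given a latency budget, return the most accurate model that still fits. The `MODEL_HUB` table and its numbers are hypothetical placeholders, not the paper's actual catalogue.

```python
# Minimal sketch of querying a latency-tiered model hub.
# The hub contents below are hypothetical placeholders, not the paper's actual models.
MODEL_HUB = [
    # (name, top-1 accuracy %, measured inference latency in ms)
    ("net-0.6ms", 78.9, 0.62),
    ("net-0.9ms", 80.5, 0.89),
    ("net-1.3ms", 81.9, 1.31),
    ("net-1.8ms", 82.8, 1.82),
]

def pick_model(latency_budget_ms: float):
    """Return the most accurate hub entry whose latency fits the budget."""
    feasible = [m for m in MODEL_HUB if m[2] <= latency_budget_ms]
    if not feasible:
        raise ValueError("no model in the hub meets the latency budget")
    return max(feasible, key=lambda m: m[1])

print(pick_model(1.0))  # -> ('net-0.9ms', 80.5, 0.89)
```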
no code implementations • CVPR 2022 • Linnan Wang, Chenhan Yu, Satish Salian, Slawomir Kierat, Szymon Migacz, Alex Fit Florea
To achieve this goal, we build a distributed NAS system to search over a novel search space that includes the prominent factors impacting latency and accuracy.
1 code implementation • 7 Oct 2021 • Yiyang Zhao, Linnan Wang, Kevin Yang, Tianjun Zhang, Tian Guo, Yuandong Tian
In this paper, we propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focuses on promising regions that are likely to contain a subset of the Pareto frontier.
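As a rough illustration of the partitioning idea (a sketch in the spirit of the abstract, not the authors' implementation), the snippet below labels observed points by how many other points dominate them, learns a classifier separating the less-dominated points, and keeps only candidates that fall in the learned promising region. The two-objective toy problem and all hyperparameters are made up.

```python
# Minimal sketch of learning a promising region from multi-objective observations.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def objectives(x):
    # Toy two-objective problem (both minimized); stands in for a real black box.
    return np.stack([x[:, 0], 1.0 - np.sqrt(np.clip(x[:, 0], 0, 1)) + x[:, 1]], axis=1)

X = rng.uniform(0.0, 1.0, size=(128, 2))
Y = objectives(X)

# Dominance number: how many other samples dominate each sample.
dom = np.array([np.sum(np.all(Y <= y, axis=1) & np.any(Y < y, axis=1)) for y in Y])

labels = (dom <= np.median(dom)).astype(int)        # 1 = promising, 0 = not
region = SVC(kernel="rbf", gamma="scale").fit(X, labels)

candidates = rng.uniform(0.0, 1.0, size=(4096, 2))
promising = candidates[region.predict(candidates) == 1]
print(f"{len(promising)} of {len(candidates)} candidates fall in the learned region")
```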
no code implementations • ICLR 2022 • Yiyang Zhao, Linnan Wang, Kevin Yang, Tianjun Zhang, Tian Guo, Yuandong Tian
In this paper, we propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focuses on promising regions that are likely to contain a subset of the Pareto frontier.
2 code implementations • NeurIPS 2021 • Kevin Yang, Tianjun Zhang, Chris Cummins, Brandon Cui, Benoit Steiner, Linnan Wang, Joseph E. Gonzalez, Dan Klein, Yuandong Tian
Path planning, the problem of efficiently discovering high-reward trajectories, often requires optimizing a high-dimensional and multimodal reward function.
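The toy example below only motivates the difficulty stated above: on a multimodal reward, purely local search frequently converges to an inferior mode. It is not the paper's planning algorithm; the 1-D reward and hill-climbing routine are illustrative stand-ins.

```python
# Toy illustration of why multimodal reward landscapes are hard for local search.
import numpy as np

def reward(x):
    # Two-mode reward over a 1-D "trajectory parameter": a low peak at -2
    # and a higher peak at +3.
    return 1.0 * np.exp(-0.5 * (x + 2.0) ** 2) + 1.5 * np.exp(-2.0 * (x - 3.0) ** 2)

def hill_climb(x, step=0.1, iters=200):
    # Greedy local search: move to a neighbour whenever it improves the reward.
    for _ in range(iters):
        for cand in (x - step, x + step):
            if reward(cand) > reward(x):
                x = cand
    return x

rng = np.random.default_rng(1)
starts = rng.uniform(-5, 5, size=10)
finals = np.array([hill_climb(s) for s in starts])
print("fraction of runs stuck on the inferior mode:",
      np.mean(reward(finals) < 1.4))
```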
1 code implementation • NeurIPS 2020 • Linnan Wang, Rodrigo Fonseca, Yuandong Tian
If the nonlinear partition function and the local model fit the ground-truth black-box function well, then good partitions and candidates can be reached with far fewer samples.
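The snippet below sketches one partitioning step in the spirit of that sentence: cluster the observed (x, f(x)) pairs into a better- and a worse-performing group, learn a nonlinear boundary between them, and propose new candidates only on the good side. The toy objective, clustering setup, and kernel choice are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of learning a nonlinear partition of a black-box search space.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def f(x):                                  # toy black-box function to maximize
    return -np.sum((x - 0.7) ** 2, axis=1)

X = rng.uniform(0, 1, size=(200, 5))
y = f(X)

# 1) Split samples into two performance clusters based on (x, f(x)).
feats = np.column_stack([X, y])
cluster = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(feats)
good_cluster = 0 if y[cluster == 0].mean() > y[cluster == 1].mean() else 1

# 2) Learn the nonlinear partition between good and bad samples.
boundary = SVC(kernel="rbf", gamma="scale").fit(X, cluster == good_cluster)

# 3) Propose candidates only inside the learned good region.
cands = rng.uniform(0, 1, size=(5000, 5))
in_good = cands[boundary.predict(cands)]
print("candidates kept in the good region:", len(in_good))
```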
2 code implementations • 11 Jun 2020 • Yiyang Zhao, Linnan Wang, Yuandong Tian, Rodrigo Fonseca, Tian Guo
One-shot NAS reduces the computation cost by training a single supernetwork, a.k.a. supernet, to approximate the performance of every architecture in the search space via weight-sharing.
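A minimal weight-sharing sketch of that idea is shown below: every layer holds all candidate operations, and each sampled sub-architecture reuses those shared weights instead of being trained from scratch. The layer set, channel count, and depth are illustrative, not the paper's supernet.

```python
# Minimal weight-sharing supernet sketch (illustrative, not the paper's code).
import random
import torch
import torch.nn as nn

class MixedLayer(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),                       # skip-connection candidate
        ])

    def forward(self, x, choice):
        return self.ops[choice](x)               # only the chosen op runs

class SuperNet(nn.Module):
    def __init__(self, channels=16, depth=4):
        super().__init__()
        self.layers = nn.ModuleList(MixedLayer(channels) for _ in range(depth))

    def forward(self, x, arch):
        for layer, choice in zip(self.layers, arch):
            x = layer(x, choice)
        return x

net = SuperNet()
x = torch.randn(2, 16, 32, 32)
arch = [random.randrange(3) for _ in range(4)]   # sample one sub-architecture
print(net(x, arch).shape)                        # evaluate it with shared weights
```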
no code implementations • 25 Sep 2019 • Linnan Wang, Saining Xie, Teng Li, Rodrigo Fonseca, Yuandong Tian
As a result, using a manually designed action space to perform NAS often leads to sample-inefficient exploration of architectures and thus can be sub-optimal.
1 code implementation • 17 Jun 2019 • Linnan Wang, Saining Xie, Teng Li, Rodrigo Fonseca, Yuandong Tian
To improve the sample efficiency, this paper proposes Latent Action Neural Architecture Search (LaNAS), which learns actions to recursively partition the search space into good or bad regions that contain networks with similar performance metrics.
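The sketch below illustrates the recursive-partitioning idea in that sentence (a simplified stand-in, not the released LaNAS code): at each tree node, fit a linear model from architecture encodings to performance and split the node's samples into a "good" and a "bad" child by the predicted value. The encodings and "accuracies" are synthetic.

```python
# Minimal sketch of recursively partitioning a search space by learned splits.
import numpy as np
from sklearn.linear_model import LinearRegression

def split(encodings, scores, depth=0, max_depth=3):
    if depth == max_depth or len(scores) < 8:
        return {"samples": len(scores), "mean": float(np.mean(scores))}
    model = LinearRegression().fit(encodings, scores)
    pred = model.predict(encodings)
    good = pred >= pred.mean()                   # the learned split ("action")
    return {
        "good": split(encodings[good], scores[good], depth + 1, max_depth),
        "bad":  split(encodings[~good], scores[~good], depth + 1, max_depth),
    }

rng = np.random.default_rng(0)
enc = rng.integers(0, 2, size=(256, 12)).astype(float)        # toy architecture encodings
acc = enc @ rng.uniform(0, 1, 12) + rng.normal(0, 0.1, 256)   # toy "accuracies"
tree = split(enc, acc)
print("best-region mean accuracy:", tree["good"]["good"]["good"]["mean"])
print("worst-region mean accuracy:", tree["bad"]["bad"]["bad"]["mean"])
```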
1 code implementation • 26 Mar 2019 • Linnan Wang, Yiyang Zhao, Yuu Jinnai, Yuandong Tian, Rodrigo Fonseca
Neural Architecture Search (NAS) has shown great success in automating the design of neural networks, but the prohibitive amount of computation behind current NAS methods requires further work on improving sample efficiency and reducing network evaluation cost, in order to get better results in a shorter time.
Ranked #7 on Neural Architecture Search on CIFAR-10 Image Classification (Params metric)
1 code implementation • 1 Jan 2019 • Linnan Wang, Saining Xie, Teng Li, Rodrigo Fonseca, Yuandong Tian
To improve the sample efficiency, this paper proposes Latent Action Neural Architecture Search (LaNAS), which learns actions to recursively partition the search space into good or bad regions that contain networks with similar performance metrics.
Ranked #15 on Image Classification on CIFAR-10
2 code implementations • 18 May 2018 • Linnan Wang, Yiyang Zhao, Yuu Jinnai, Yuandong Tian, Rodrigo Fonseca
Neural Architecture Search (NAS) has shown great success in automating the design of neural networks, but the prohibitive amount of computation behind current NAS methods requires further work on improving sample efficiency and reducing network evaluation cost, in order to get better results in a shorter time.
no code implementations • 13 Jan 2018 • Linnan Wang, Jinmian Ye, Yiyang Zhao, Wei Wu, Ang Li, Shuaiwen Leon Song, Zenglin Xu, Tim Kraska
Given the limited GPU DRAM, SuperNeurons not only provisions the necessary memory for training, but also dynamically allocates memory for convolution workspaces to achieve high performance.
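The snippet below is an illustrative sketch of the workspace decision described above, not SuperNeurons internals: after reserving the memory training must keep, spend the leftover DRAM on the fastest convolution algorithm whose workspace still fits. The algorithm names, workspace sizes, and speeds are hypothetical.

```python
# Illustrative sketch: pick the fastest conv algorithm whose workspace fits.
CONV_ALGOS = [
    # (name, workspace bytes needed, relative speed -- higher is faster)
    ("implicit_gemm",        0,              1.0),
    ("gemm_with_workspace",  256 * 2**20,    1.6),
    ("fft_tiling",           2048 * 2**20,   2.1),
]

def pick_conv_algo(free_bytes_after_tensors: int):
    feasible = [a for a in CONV_ALGOS if a[1] <= free_bytes_after_tensors]
    return max(feasible, key=lambda a: a[2])      # fastest algorithm that fits

total_dram   = 16 * 2**30          # 16 GB GPU
tensor_bytes = 15 * 2**30          # memory provisioned for training itself
print(pick_conv_algo(total_dram - tensor_bytes))  # -> ('gemm_with_workspace', ...)
```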
no code implementations • CVPR 2018 • Jinmian Ye, Linnan Wang, Guangxi Li, Di Chen, Shandian Zhe, Xinqi Chu, Zenglin Xu
On three challenging tasks, namely Action Recognition in Videos, Image Captioning, and Image Generation, BT-RNN outperforms TT-RNN and the standard RNN in terms of both prediction accuracy and convergence rate.
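To give a sense of why such factorizations help, the snippet below compares parameter counts for a dense input-to-hidden matrix against a block-term (sum-of-Tucker-blocks) factorization of the same matrix. The mode shapes, ranks, and block count are made-up examples, and the exact factorization in the paper may differ in details.

```python
# Illustrative parameter count: dense matrix vs. block-term factorization.
import math

def dense_params(n_in, n_out):
    return n_in * n_out

def block_term_params(in_modes, out_modes, ranks, num_blocks):
    # Each block: one core of size prod(ranks) plus one factor of size
    # (I_k * J_k) x R_k per tensor mode k.
    core = math.prod(ranks)
    factors = sum(i * j * r for i, j, r in zip(in_modes, out_modes, ranks))
    return num_blocks * (core + factors)

in_modes, out_modes = (8, 16, 16), (4, 8, 8)      # 2048-dim input, 256-dim hidden
ranks, num_blocks = (2, 2, 2), 4
print("dense:", dense_params(2048, 256))                                   # 524288
print("block-term:", block_term_params(in_modes, out_modes, ranks, num_blocks))  # 2336
```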
no code implementations • 11 Nov 2016 • Guangxi Li, Zenglin Xu, Linnan Wang, Jinmian Ye, Irwin King, Michael Lyu
Probabilistic Temporal Tensor Factorization (PTTF) is an effective algorithm to model the temporal tensor data.
no code implementations • 17 Mar 2016 • Linnan Wang, Yi Yang, Martin Renqiang Min, Srimat Chakradhar
We then study how the ISGD batch size relates to the learning rate, parallelism, synchronization cost, system saturation, and scalability.
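A toy cost model (not the paper's measurements) of one of these trade-offs is sketched below: larger batches amortize the per-iteration synchronization cost over more samples, while per-sample compute stays roughly constant once the device is saturated. All timing constants are invented for illustration.

```python
# Toy throughput model: batch size vs. synchronization-cost amortization.
def time_per_epoch(dataset_size, batch_size, per_sample_compute_s, sync_cost_s):
    iterations = dataset_size / batch_size
    return iterations * (batch_size * per_sample_compute_s + sync_cost_s)

for batch in (64, 256, 1024, 4096):
    t = time_per_epoch(dataset_size=1_000_000, batch_size=batch,
                       per_sample_compute_s=1e-4, sync_cost_s=5e-2)
    print(f"batch {batch:5d}: {t:7.1f} s/epoch")
```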
no code implementations • 13 Nov 2015 • Linnan Wang, Wei Wu, Jianxiong Xiao, Yang Yi
This paper describes a method for accelerating large-scale Artificial Neural Network (ANN) training on multiple GPUs by reducing the forward and backward passes to matrix multiplication.
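The core observation is sketched below for a fully connected layer: the forward pass and both gradients reduce to matrix-matrix multiplications (GEMMs), which is what makes BLAS-based multi-GPU acceleration applicable. The shapes are arbitrary illustrative choices.

```python
# A fully connected layer's forward and backward passes as three GEMMs.
import numpy as np

rng = np.random.default_rng(0)
batch, n_in, n_out = 128, 784, 256

X  = rng.standard_normal((batch, n_in))      # layer input
W  = rng.standard_normal((n_in, n_out)) * 0.01
dY = rng.standard_normal((batch, n_out))     # gradient arriving from the next layer

Y  = X @ W                                   # forward pass:    one GEMM
dW = X.T @ dY                                # weight gradient: one GEMM
dX = dY @ W.T                                # input gradient:  one GEMM

print(Y.shape, dW.shape, dX.shape)           # (128, 256) (784, 256) (128, 784)
```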
1 code implementation • 16 Oct 2015 • Linnan Wang, Wei Wu, Jianxiong Xiao, Yi Yang
Basic Linear Algebra Subprograms (BLAS) are a set of low-level linear algebra kernels widely adopted by deep learning and scientific computing applications.
Distributed, Parallel, and Cluster Computing
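For context, a Level-3 BLAS GEMM computes C = alpha * A @ B + beta * C; the snippet below calls that kernel directly through SciPy's BLAS bindings and checks it against plain NumPy. This only illustrates the kernel's interface, not the paper's multi-GPU design.

```python
# Calling the Level-3 BLAS GEMM kernel through SciPy and verifying the result.
import numpy as np
from scipy.linalg.blas import dgemm

rng = np.random.default_rng(0)
A = rng.standard_normal((512, 256))
B = rng.standard_normal((256, 384))

C = dgemm(alpha=1.0, a=A, b=B)               # single GEMM kernel call
print(np.allclose(C, A @ B))                 # True
```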