Search Results for author: Adams Wei Yu

Found 25 papers, 10 papers with code

Efficient Structured Matrix Rank Minimization

no code implementations • NeurIPS 2014 • Adams Wei Yu, Wanli Ma, Yaoliang Yu, Jaime Carbonell, Suvrit Sra

We study the problem of finding structured low-rank matrices using nuclear norm regularization where the structure is encoded by a linear map.
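
As a reference point, the generic template in this line of work couples a smooth data-fitting loss with a nuclear-norm penalty on the structured matrix; the notation below is assumed for illustration, not taken from the paper:

```latex
% x is the parameter vector; \mathcal{A} is the linear map that embeds x into a
% structured matrix (e.g., Hankel); the nuclear norm \|\cdot\|_* is the standard
% convex surrogate for rank.
\min_{x \in \mathbb{R}^d} \;\; \ell(x) \;+\; \lambda \,\bigl\lVert \mathcal{A}(x) \bigr\rVert_*
```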

Doubly Stochastic Primal-Dual Coordinate Method for Bilinear Saddle-Point Problem

no code implementations • 14 Aug 2015 • Adams Wei Yu, Qihang Lin, Tianbao Yang

We propose a doubly stochastic primal-dual coordinate optimization algorithm for empirical risk minimization, which can be formulated as a bilinear saddle-point problem.
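
A minimal sketch of the doubly stochastic idea, assuming simple quadratic regularizers on both sides (the paper handles general convex terms and maintains running quantities so each step is genuinely cheap):

```python
import numpy as np

def dspdc_sketch(A, x, y, tau=0.01, sigma=0.01, iters=10000, seed=0):
    """min_x max_y  y^T A x + ||x||^2/2 - ||y||^2/2, updating one randomly
    sampled primal and one dual coordinate per iteration."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    for _ in range(iters):
        i, j = rng.integers(n), rng.integers(d)
        y[i] = (y[i] + sigma * (A[i] @ x)) / (1.0 + sigma)   # dual coordinate ascent
        x[j] = (x[j] - tau * (y @ A[:, j])) / (1.0 + tau)    # primal coordinate descent
    return x, y
```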

AdaDelay: Delay Adaptive Distributed Stochastic Convex Optimization

no code implementations • 20 Aug 2015 • Suvrit Sra, Adams Wei Yu, Mu Li, Alexander J. Smola

We study distributed stochastic convex optimization under the delayed gradient model where the server nodes perform parameter updates, while the worker nodes compute stochastic gradients.
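
The core idea is a step size that adapts to the observed staleness of each gradient. A toy server loop, with an assumed (not the paper's exact) delay-adaptive schedule:

```python
import numpy as np

def adadelay_server_sketch(w, grad_stream, c=0.1, steps=1000):
    """grad_stream yields (gradient, delay) pairs from asynchronous workers;
    larger delays get smaller steps."""
    for t in range(1, steps + 1):
        g, tau = next(grad_stream)               # stochastic gradient and its staleness
        alpha = c / np.sqrt(t * (1.0 + tau))     # delay-adaptive step size (assumed form)
        w -= alpha * g
    return w
```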

On Computationally Tractable Selection of Experiments in Measurement-Constrained Regression Models

no code implementations • 9 Jan 2016 • Yining Wang, Adams Wei Yu, Aarti Singh

We derive computationally tractable methods to select a small subset of experiment settings from a large pool of given design points.

Combinatorial Optimization • Regression
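
One standard tractable baseline in this setting is to keep the design points with the highest statistical leverage; the sketch below illustrates that surrogate, not the paper's specific algorithms:

```python
import numpy as np

def select_by_leverage(X, k):
    """X: (n, d) pool of candidate design points; returns indices of the
    k rows with the largest leverage scores."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    leverage = (U ** 2).sum(axis=1)   # leverage score of each design point
    return np.argsort(leverage)[-k:]
```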

An Improved Gap-Dependency Analysis of the Noisy Power Method

no code implementations • 23 Feb 2016 • Maria Florina Balcan, Simon S. Du, Yining Wang, Adams Wei Yu

We consider the noisy power method algorithm, which has wide applications in machine learning and statistics, especially those related to principal component analysis (PCA) under resource (communication, memory or privacy) constraints.
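
The algorithm under analysis is simple to state; a compact sketch (the noise term models, e.g., privacy or communication perturbations):

```python
import numpy as np

def noisy_power_method(A, k, iters=50, noise_scale=1e-3, seed=0):
    """Block power iteration with an additive perturbation at each step;
    returns an approximate top-k invariant subspace of A."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    for _ in range(iters):
        G = noise_scale * rng.standard_normal((n, k))  # per-iteration noise
        Q, _ = np.linalg.qr(A @ Q + G)                 # multiply, perturb, re-orthonormalize
    return Q
```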

Learning to Skim Text

4 code implementations • ACL 2017 • Adams Wei Yu, Hongrae Lee, Quoc V. Le

Recurrent Neural Networks are showing much promise in many sub-areas of natural language processing, ranging from document classification to machine translation to automatic question answering.

Document Classification • General Classification • +4

Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network

2 code implementations • ICLR 2018 • Adams Wei Yu, Lei Huang, Qihang Lin, Ruslan Salakhutdinov, Jaime Carbonell

In this paper, we propose a generic and simple strategy for utilizing stochastic gradient information in optimization.
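
The recipe is to update each parameter block with its gradient direction rather than its raw gradient. A minimal PyTorch sketch of one such step, treating each parameter tensor as a block (the paper also studies momentum and Adam variants):

```python
import torch

def block_normalized_step(params, lr=0.01, eps=1e-8):
    """Divide each block's gradient by its own L2 norm before the update."""
    with torch.no_grad():
        for p in params:
            if p.grad is not None:
                p -= lr * p.grad / (p.grad.norm() + eps)
```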

Orthogonal Weight Normalization: Solution to Optimization over Multiple Dependent Stiefel Manifolds in Deep Neural Networks

1 code implementation • 16 Sep 2017 • Lei Huang, Xianglong Liu, Bo Lang, Adams Wei Yu, Yongliang Wang, Bo Li

In this paper, we generalize such square orthogonal matrices to orthogonal rectangular matrices and formulate this problem in feed-forward Neural Networks (FNNs) as Optimization over Multiple Dependent Stiefel Manifolds (OMDSM).

Image Classification
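
One common way to realize such a row-orthogonal reparameterization is the eigendecomposition-based map W = (V V^T)^{-1/2} V; the sketch below shows that construction, though the paper's exact parameterization and backprop treatment differ in details:

```python
import torch

def orthogonalize_rows(V, eps=1e-5):
    """Map an unconstrained (n, d) matrix V with n <= d to W with W W^T = I,
    i.e., a point on the Stiefel manifold."""
    S = V @ V.T + eps * torch.eye(V.shape[0])
    evals, evecs = torch.linalg.eigh(S)
    inv_sqrt = evecs @ torch.diag(evals.clamp_min(eps).rsqrt()) @ evecs.T
    return inv_sqrt @ V
```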

Exploring Neural Architecture Search for Language Tasks

no code implementations • ICLR 2018 • Minh-Thang Luong, David Dohan, Adams Wei Yu, Quoc V. Le, Barret Zoph, Vijay Vasudevan

Neural architecture search (NAS), the task of finding neural architectures automatically, has recently emerged as a promising approach for discovering models that outperform human-designed ones.

Language Modelling • Neural Architecture Search • +2

Detecting Nonlinear Causality in Multivariate Time Series with Sparse Additive Models

no code implementations • 11 Mar 2018 • Yingxiang Yang, Adams Wei Yu, Zhaoran Wang, Tuo Zhao

We propose a nonparametric method for detecting nonlinear causal relationships within a set of multidimensional discrete time series, using sparse additive models (SpAMs).

Additive Models • Model Selection • +2
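
Schematically, the model class is an additive autoregression with componentwise nonlinearities, where sparsity determines the causal graph (notation assumed here, not taken from the paper):

```latex
% Series i is declared a (nonlinear Granger) cause of series j iff some
% component function f_{j,i,l} is nonzero after sparse estimation.
x^{(j)}_t \;=\; \sum_{i=1}^{p} \sum_{l=1}^{L} f_{j,i,l}\!\left(x^{(i)}_{t-l}\right) + \varepsilon^{(j)}_t
```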

Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension

no code implementations • ICLR 2020 • Xinyun Chen, Chen Liang, Adams Wei Yu, Denny Zhou, Dawn Song, Quoc V. Le

Integrating distributed representations with symbolic operations is essential for reading comprehension requiring complex reasoning, such as counting, sorting, and arithmetic, but most existing approaches are hard to scale to more domains or more complex reasoning.

Data Augmentation • Math • +2
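
The flavor of the approach: a neural reader emits a compositional program over extracted values, and a symbolic executor evaluates it exactly. A toy executor with made-up operator names (not NeRd's actual operator set):

```python
def execute(program, env):
    """program: nested tuples like ("DIFF", ("VALUE", "a"), ("VALUE", "b"))."""
    op, *args = program
    if op == "VALUE":
        return env[args[0]]                      # look up an extracted value
    vals = [execute(a, env) for a in args]
    if op == "COUNT":
        return len(vals[0])
    if op == "DIFF":
        return vals[0] - vals[1]
    if op == "MAX":
        return max(vals[0])
    raise ValueError(f"unknown op {op}")

env = {"score_a": 24, "score_b": 17}
print(execute(("DIFF", ("VALUE", "score_a"), ("VALUE", "score_b")), env))  # 7
```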

AutoHAS: Efficient Hyperparameter and Architecture Search

no code implementations • 5 Jun 2020 • Xuanyi Dong, Mingxing Tan, Adams Wei Yu, Daiyi Peng, Bogdan Gabrys, Quoc V. Le

Efficient hyperparameter or architecture search methods have shown remarkable results, but each of them is only applicable to searching for either hyperparameters (HPs) or architectures.

Hyperparameter Optimization • Neural Architecture Search • +1

Compositional Generalization via Neural-Symbolic Stack Machines

no code implementations • NeurIPS 2020 • Xinyun Chen, Chen Liang, Adams Wei Yu, Dawn Song, Denny Zhou

Despite achieving tremendous success, existing deep learning models have exposed limitations in compositional generalization, the capability to learn compositional rules and apply them to unseen cases in a systematic manner.

Few-Shot Learning • Machine Translation • +1

SimVLM: Simple Visual Language Model Pretraining with Weak Supervision

2 code implementations • ICLR 2022 • Zirui Wang, Jiahui Yu, Adams Wei Yu, Zihang Dai, Yulia Tsvetkov, Yuan Cao

With recent progress in joint modeling of visual and textual representations, Vision-Language Pretraining (VLP) has achieved impressive performance on many multimodal downstream tasks.

Image Captioning • Language Modelling • +2

Finetuned Language Models Are Zero-Shot Learners

5 code implementations • ICLR 2022 • Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le

We show that instruction tuning -- finetuning language models on a collection of tasks described via instructions -- substantially improves zero-shot performance on unseen tasks.

Common Sense Reasoning • Coreference Resolution • +8
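
The mechanics are easy to sketch: cast many supervised datasets into natural-language instruction/response pairs and finetune a single model on the mixture. The template wording below is invented for illustration; FLAN uses many templates per task:

```python
TEMPLATES = {
    "sentiment": "Review: {text}\nIs this review positive or negative?",
    "nli": ("Premise: {premise}\nHypothesis: {hypothesis}\n"
            "Does the premise entail the hypothesis? yes, no, or maybe?"),
}

def to_instruction_example(task, fields, answer):
    """Turn one labeled example into an instruction-following pair."""
    return {"input": TEMPLATES[task].format(**fields), "target": answer}

ex = to_instruction_example("sentiment", {"text": "A delightful film."}, "positive")
# Finetune on many such pairs across task clusters, then evaluate zero-shot
# on task types held out of training.
```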

Towards Zero-Label Language Learning

no code implementations • 19 Sep 2021 • Zirui Wang, Adams Wei Yu, Orhan Firat, Yuan Cao

This paper explores zero-label learning in Natural Language Processing (NLP), whereby no human-annotated data is used anywhere during training and models are trained purely on synthetic data.

Data Augmentation
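
Schematically, the training set itself is synthesized by prompting a large model; `lm_generate` below is a stand-in for whichever generation interface is available, not a real API:

```python
def build_synthetic_dataset(task_description, labels, n_per_label, lm_generate):
    """No human labels anywhere: prompt an LM to write examples for each
    label, then train a task model on the resulting synthetic set."""
    data = []
    for label in labels:
        prompt = f"{task_description}\nWrite a new example whose label is '{label}':"
        data += [{"text": lm_generate(prompt), "label": label} for _ in range(n_per_label)]
    return data
```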

Combined Scaling for Zero-shot Transfer Learning

no code implementations • 19 Nov 2021 • Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V. Le

Second, while increasing the dataset size and the model size has been the de facto method to improve the performance of deep learning models like BASIC, the effect of a large contrastive batch size on such contrastive-trained image-text models is not well understood.

Classification • Contrastive Learning • +3
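
The objective whose batch-size behavior is being probed is the standard in-batch image-text contrastive loss; a reference implementation:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """CLIP-style symmetric loss: matched image-text pairs are positives
    (the diagonal), all other in-batch pairs are negatives."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.T / temperature          # (B, B) similarity matrix
    targets = torch.arange(img.shape[0])
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))
```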

DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection

1 code implementation • CVPR 2022 • Yingwei Li, Adams Wei Yu, Tianjian Meng, Ben Caine, Jiquan Ngiam, Daiyi Peng, Junyang Shen, Bo Wu, Yifeng Lu, Denny Zhou, Quoc V. Le, Alan Yuille, Mingxing Tan

In this paper, we propose two novel techniques: InverseAug, which inverts geometry-related augmentations (e.g., rotation) to enable accurate geometric alignment between lidar points and image pixels, and LearnableAlign, which leverages cross-attention to dynamically capture the correlations between image and lidar features during fusion.

3D Object Detection • Autonomous Driving • +2
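
A single-head sketch of the LearnableAlign idea: lidar features act as queries over image features, and the attended result is fused back residually. The dimensions and one-head design are simplifications, not the paper's exact module:

```python
import torch
import torch.nn as nn

class CrossModalAlign(nn.Module):
    def __init__(self, d_lidar, d_img, d_attn):
        super().__init__()
        self.q = nn.Linear(d_lidar, d_attn)   # queries from lidar features
        self.k = nn.Linear(d_img, d_attn)     # keys/values from image features
        self.v = nn.Linear(d_img, d_attn)
        self.out = nn.Linear(d_attn, d_lidar)

    def forward(self, lidar_feats, img_feats):
        # lidar_feats: (N, d_lidar); img_feats: (M, d_img)
        scores = self.q(lidar_feats) @ self.k(img_feats).T / self.q.out_features ** 0.5
        fused = self.out(torch.softmax(scores, dim=-1) @ self.v(img_feats))
        return lidar_feats + fused            # residual fusion
```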

Does GNN Pretraining Help Molecular Representation?

no code implementations • 13 Jul 2022 • Ruoxi Sun, Hanjun Dai, Adams Wei Yu

Extracting informative representations of molecules using graph neural networks (GNNs) is crucial in AI-driven drug discovery.

Drug Discovery • Molecular Representation

DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining

2 code implementations • NeurIPS 2023 • Sang Michael Xie, Hieu Pham, Xuanyi Dong, Nan Du, Hanxiao Liu, Yifeng Lu, Percy Liang, Quoc V. Le, Tengyu Ma, Adams Wei Yu

The mixture proportions of pretraining data domains (e.g., Wikipedia, books, web text) greatly affect language model (LM) performance.

Language Modelling
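
A simplified reading of the update at DoReMi's core: multiplicatively upweight domains where a small proxy model's loss most exceeds a reference model's, then renormalize:

```python
import numpy as np

def doremi_weight_update(weights, proxy_losses, ref_losses, lr=1.0):
    """weights, proxy_losses, ref_losses: arrays indexed by pretraining domain."""
    excess = np.maximum(proxy_losses - ref_losses, 0.0)  # clipped excess loss
    w = weights * np.exp(lr * excess)                    # exponentiated-gradient step
    return w / w.sum()                                   # renormalize to a distribution
```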

Large Language Models Cannot Self-Correct Reasoning Yet

1 code implementation • 3 Oct 2023 • Jie Huang, Xinyun Chen, Swaroop Mishra, Huaixiu Steven Zheng, Adams Wei Yu, Xinying Song, Denny Zhou

Large Language Models (LLMs) have emerged as a groundbreaking technology with their unparalleled text generation capabilities across various applications.

Text Generation
