no code implementations • NeurIPS 2018 • Catherine Wong, Neil Houlsby, Yifeng Lu, Andrea Gesmundo
We extend RL-based architecture search methods to support parallel training on multiple tasks and then transfer the search strategy to new tasks.
2 code implementations • 27 Jan 2020 • Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, Quoc V. Le
We present Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations.
no code implementations • NeurIPS 2020 • Daiyi Peng, Xuanyi Dong, Esteban Real, Mingxing Tan, Yifeng Lu, Hanxiao Liu, Gabriel Bender, Adam Kraft, Chen Liang, Quoc V. Le
As a result, AutoML can be reformulated as an automated process of symbolic manipulation.
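The idea of AutoML as symbolic manipulation can be illustrated with a minimal sketch: represent an ML program as a symbolic tree and derive variants by rewriting nodes. The tree layout, field names, and `mutate` helper below are hypothetical illustrations, not the paper's API.

```python
import copy
import random

# Hypothetical sketch: an ML program as a symbolic tree (nested dict),
# so AutoML becomes rewriting nodes of that tree rather than hand-editing code.
program = {
    "backbone": {"type": "conv", "filters": 64, "kernel": 3},
    "head": {"type": "dense", "units": 10},
}

def mutate(prog, path, candidates, rng=random):
    """Return a copy of `prog` with the leaf at `path` replaced by a
    randomly chosen candidate value (one symbolic rewrite step)."""
    new = copy.deepcopy(prog)
    node = new
    for key in path[:-1]:
        node = node[key]
    node[path[-1]] = rng.choice(candidates)
    return new

# Search over filter widths by symbolic rewriting; the original is untouched.
variant = mutate(program, ["backbone", "filters"], [32, 64, 128])
```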
1 code implementation • CVPR 2022 • Yingwei Li, Adams Wei Yu, Tianjian Meng, Ben Caine, Jiquan Ngiam, Daiyi Peng, Junyang Shen, Bo Wu, Yifeng Lu, Denny Zhou, Quoc V. Le, Alan Yuille, Mingxing Tan
In this paper, we propose two novel techniques: InverseAug, which inverts geometry-related augmentations (e.g., rotation) to enable accurate geometric alignment between lidar points and image pixels, and LearnableAlign, which leverages cross-attention to dynamically capture the correlations between image and lidar features during fusion.
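The core of the InverseAug idea can be sketched in a few lines: a geometric augmentation (here a 2D rotation about the z-axis, chosen for brevity) is applied to lidar points during training, and the inverse transform recovers the original coordinates so they can be aligned with camera pixels. This is an illustrative toy, not the paper's implementation.

```python
import math

def rotate_z(p, theta):
    """Rotate a 3D point about the z-axis by angle theta (radians)."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

theta = math.radians(30)
original = (1.0, 0.0, 2.0)
augmented = rotate_z(original, theta)    # geometry-altering augmentation
recovered = rotate_z(augmented, -theta)  # InverseAug: undo the rotation
# `recovered` matches `original`, so projection to pixels stays accurate.
```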
1 code implementation • 15 Apr 2022 • Chengrun Yang, Gabriel Bender, Hanxiao Liu, Pieter-Jan Kindermans, Madeleine Udell, Yifeng Lu, Quoc Le, Da Huang
The best neural architecture for a given machine learning problem depends on many factors: not only the complexity and structure of the dataset, but also on resource constraints including latency, compute, energy consumption, etc.
1 code implementation • 3 Feb 2023 • Daiyi Peng, Xuanyi Dong, Esteban Real, Yifeng Lu, Quoc V. Le
We also perform a case study of a large codebase where PyGlove led to an 80% reduction in the number of lines of code.
no code implementations • NeurIPS 2023 • Xiangning Chen, Chen Liang, Da Huang, Esteban Real, Kaiyuan Wang, Yao Liu, Hieu Pham, Xuanyi Dong, Thang Luong, Cho-Jui Hsieh, Yifeng Lu, Quoc V. Le
On diffusion models, Lion outperforms Adam by achieving a better FID score and reducing the training compute by up to 2.3x.
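The Lion update discovered by this program search is compact: take the sign of an interpolation of momentum and gradient, apply decoupled weight decay, then update the momentum. A scalar sketch (default hyperparameters here are illustrative):

```python
def lion_step(w, g, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.01):
    """One Lion update for a scalar weight w, gradient g, momentum m.
    The update direction is the sign of a beta1-interpolation of momentum
    and gradient; weight decay is decoupled; momentum uses beta2."""
    sign = lambda x: (x > 0) - (x < 0)
    update = sign(beta1 * m + (1 - beta1) * g)
    w = w - lr * (update + wd * w)
    m = beta2 * m + (1 - beta2) * g
    return w, m

w, m = 1.0, 0.0
w, m = lion_step(w, g=0.5, m=m)  # sign(0.05) = +1, so w decreases
```

Because the update is a sign, Lion's per-parameter step magnitude is uniform, which is part of why it tends to want a smaller learning rate than Adam.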
no code implementations • 7 Mar 2023 • Jerry Wei, Jason Wei, Yi Tay, Dustin Tran, Albert Webson, Yifeng Lu, Xinyun Chen, Hanxiao Liu, Da Huang, Denny Zhou, Tengyu Ma
We next study semantically-unrelated label ICL (SUL-ICL), in which labels are semantically unrelated to their inputs (e.g., foo/bar instead of negative/positive), thereby forcing language models to learn the input-label mappings shown in in-context exemplars in order to perform the task.
no code implementations • 15 May 2023 • Jerry Wei, Le Hou, Andrew Lampinen, Xiangning Chen, Da Huang, Yi Tay, Xinyun Chen, Yifeng Lu, Denny Zhou, Tengyu Ma, Quoc V. Le
We present symbol tuning - finetuning language models on in-context input-label pairs where natural language labels (e.g., "positive/negative sentiment") are replaced with arbitrary symbols (e.g., "foo/bar").
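The data transformation behind symbol tuning is simple to sketch: swap each natural-language label in the in-context exemplars for an arbitrary symbol, so the model can only succeed by learning the input-label mapping rather than relying on label semantics. The mapping and exemplars below are illustrative, not the paper's dataset.

```python
# Arbitrary symbols replacing semantically meaningful labels.
SYMBOL_MAP = {"positive": "foo", "negative": "bar"}

def symbolize(exemplars):
    """Replace natural-language labels with arbitrary symbols."""
    return [(text, SYMBOL_MAP[label]) for text, label in exemplars]

exemplars = [("great movie", "positive"), ("boring plot", "negative")]
tuned = symbolize(exemplars)
# tuned == [('great movie', 'foo'), ('boring plot', 'bar')]
```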
2 code implementations • NeurIPS 2023 • Sang Michael Xie, Hieu Pham, Xuanyi Dong, Nan Du, Hanxiao Liu, Yifeng Lu, Percy Liang, Quoc V. Le, Tengyu Ma, Adams Wei Yu
The mixture proportions of pretraining data domains (e.g., Wikipedia, books, web text) greatly affect language model (LM) performance.
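Concretely, mixture proportions determine how often each domain is drawn from during pretraining. A minimal sampling sketch (the proportions below are made up for illustration; the paper's contribution is how to choose them):

```python
import random

# Illustrative mixture proportions over pretraining domains.
proportions = {"wikipedia": 0.2, "books": 0.3, "web": 0.5}

def sample_domain(rng=random):
    """Draw one domain according to the mixture proportions."""
    domains, weights = zip(*proportions.items())
    return rng.choices(domains, weights=weights, k=1)[0]

# Empirical draw frequencies track the proportions.
rng = random.Random(0)
counts = {d: 0 for d in proportions}
for _ in range(10000):
    counts[sample_domain(rng)] += 1
```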
no code implementations • 29 May 2023 • Yanqi Zhou, Nan Du, Yanping Huang, Daiyi Peng, Chang Lan, Da Huang, Siamak Shakeri, David So, Andrew Dai, Yifeng Lu, Zhifeng Chen, Quoc Le, Claire Cui, James Laundon, Jeff Dean
Using this insight, we develop a complex block, named Brainformer, that consists of a diverse set of layers such as sparsely gated feed-forward layers, dense feed-forward layers, attention layers, and various forms of layer normalization and activation functions.
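The structural idea, a block that interleaves heterogeneous layer types rather than repeating a fixed attention-plus-FFN pair, can be shown with a toy sketch. Every layer here is a deliberately simplified stand-in, and the ordering is illustrative, not the paper's searched block.

```python
# Toy stand-ins for the heterogeneous layer types in a Brainformer-like block.

def dense_ffn(x):       # dense feed-forward (toy: affine + ReLU)
    return [max(0.0, 2.0 * v + 0.1) for v in x]

def sparse_ffn(x):      # sparsely gated FFN (toy: route by sign of input)
    return [v * 3.0 if v > 0 else v * 0.5 for v in x]

def attention(x):       # toy "attention": mix each value with the mean
    mean = sum(x) / len(x)
    return [0.5 * v + 0.5 * mean for v in x]

def layer_norm(x):      # toy normalization: center the activations
    mean = sum(x) / len(x)
    return [v - mean for v in x]

# A non-uniform stack of diverse layers, applied in sequence.
BLOCK = [sparse_ffn, layer_norm, attention, dense_ffn, layer_norm]

def brainformer_block(x):
    for layer in BLOCK:
        x = layer(x)
    return x
```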
1 code implementation • 7 Aug 2023 • Jerry Wei, Da Huang, Yifeng Lu, Denny Zhou, Quoc V. Le
Adding these data in a lightweight finetuning step can significantly reduce sycophantic behavior on held-out prompts.
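The recipe can be sketched as follows: pair a claim of known truth with a user opinion that may or may not agree with it, and keep the ground truth as the finetuning target, so the model learns that the answer is independent of the user's stated view. The claims and prompt template below are hypothetical, not the paper's data.

```python
import random

# Toy claims with known truth values.
claims = [("1 + 1 = 2", True), ("1 + 1 = 3", False)]

def make_example(claim, truth, rng=random):
    """Build one (prompt, target) pair; the target ignores the user opinion."""
    opinion = rng.choice(["agree", "disagree"])
    prompt = f"I {opinion} with the claim: {claim}. Is the claim true?"
    return prompt, "Yes" if truth else "No"

rng = random.Random(0)
data = [make_example(c, t, rng) for c, t in claims]
# Targets are ['Yes', 'No'] regardless of the sampled opinions.
```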
2 code implementations • 7 Sep 2023 • Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, Xinyun Chen
Optimization is ubiquitous.
1 code implementation • 27 Mar 2024 • Jerry Wei, Chengrun Yang, Xinying Song, Yifeng Lu, Nathan Hu, Dustin Tran, Daiyi Peng, Ruibo Liu, Da Huang, Cosmo Du, Quoc V. Le
Empirically, we demonstrate that LLM agents can achieve superhuman rating performance - on a set of ~16k individual facts, SAFE agrees with crowdsourced human annotators 72% of the time, and on a random subset of 100 disagreement cases, SAFE wins 76% of the time.