no code implementations • 15 Feb 2025 • Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Man Luo, Sungduk Yu, Chendi Xue, Vasudev Lal
DeepSeek-R1, the largest open-source Mixture-of-Experts (MoE) model, has demonstrated reasoning capabilities comparable to proprietary frontier models.
no code implementations • 8 Dec 2024 • Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Shao-Yen Tseng, Vasudev Lal
Although capable of generating creative text, Large Language Models (LLMs) are poor judges of what constitutes "creativity".
no code implementations • 4 Dec 2024 • Neale Ratzlaff, Man Luo, Xin Su, Vasudev Lal, Phillip Howard
In this work, we explore the effects of multimodal instruction tuning on language reasoning performance.
no code implementations • 15 Nov 2024 • Neale Ratzlaff, Matthew Lyle Olson, Musashi Hinck, Estelle Aflalo, Shao-Yen Tseng, Vasudev Lal, Phillip Howard
Large Multi-Modal Models (LMMs) have demonstrated impressive capabilities as general-purpose chatbots that can engage in conversations about a provided input, such as an image.
no code implementations • 17 Oct 2024 • Neale Ratzlaff, Matthew Lyle Olson, Musashi Hinck, Shao-Yen Tseng, Vasudev Lal, Phillip Howard
Large Vision Language Models (LVLMs) such as LLaVA have demonstrated impressive capabilities as general-purpose chatbots that can engage in conversations about a provided input image.
no code implementations • 18 Jan 2023 • Megan M. Baker, Alexander New, Mario Aguilar-Simon, Ziad Al-Halah, Sébastien M. R. Arnold, Ese Ben-Iwhiwhu, Andrew P. Brna, Ethan Brooks, Ryan C. Brown, Zachary Daniels, Anurag Daram, Fabien Delattre, Ryan Dellana, Eric Eaton, Haotian Fu, Kristen Grauman, Jesse Hostetler, Shariq Iqbal, Cassandra Kent, Nicholas Ketz, Soheil Kolouri, George Konidaris, Dhireesha Kudithipudi, Erik Learned-Miller, Seungwon Lee, Michael L. Littman, Sandeep Madireddy, Jorge A. Mendez, Eric Q. Nguyen, Christine D. Piatko, Praveen K. Pilly, Aswin Raghavan, Abrar Rahman, Santhosh Kumar Ramakrishnan, Neale Ratzlaff, Andrea Soltoggio, Peter Stone, Indranil Sur, Zhipeng Tang, Saket Tiwari, Kyle Vedder, Felix Wang, Zifan Xu, Angel Yanguas-Gil, Harel Yedidsion, Shangqun Yu, Gautam K. Vallabha
Despite recent advances in machine learning, state-of-the-art systems lack robustness to "real world" events. The input distributions and tasks encountered by deployed systems will not be limited to the original training context; systems will instead need to adapt to novel distributions and tasks while deployed.
no code implementations • 29 Sep 2021 • Matthew Lyle Olson, Neale Ratzlaff, Weng-Keen Wong
The proposed Beta loss is a proper composite loss with a Beta weight function.
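For context, here is what "a proper composite loss with a Beta weight function" means in the standard weight-function parameterization of proper losses (Buja et al.; Reid & Williamson); this is a general sketch of that framework, not necessarily this paper's exact construction:

$$
w(c) = c^{\alpha-1}(1-c)^{\beta-1}, \qquad
\ell_1(\hat\eta) = \int_{\hat\eta}^{1} (1-c)\,w(c)\,dc, \qquad
\ell_0(\hat\eta) = \int_{0}^{\hat\eta} c\,w(c)\,dc,
$$

with the composite loss formed by precomposing the partial losses with an invertible link, $\ell(y, v) = \ell_y(\psi^{-1}(v))$. The Beta weight recovers log loss at $\alpha=\beta=0$ and squared error at $\alpha=\beta=1$.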
no code implementations • 18 Aug 2021 • Matthew L. Olson, Thuy-Vy Nguyen, Gaurav Dixit, Neale Ratzlaff, Weng-Keen Wong, Minsuk Kahng
Identifying covariate shift is crucial for making machine learning systems robust in the real world and for detecting training data biases that are not reflected in test data.
no code implementations • 1 Mar 2021 • Neale Ratzlaff, Qinxun Bai, Li Fuxin, Wei Xu
Recently, particle-based variational inference (ParVI) methods have gained interest because they can avoid arbitrary parametric assumptions that are common in variational inference.
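The canonical ParVI method is Stein variational gradient descent (SVGD). As a point of reference only (this paper's contribution is a different, generative approach to ParVI), here is a minimal NumPy sketch of the SVGD update, which moves a particle set toward a target using only its score function and a kernel, with no parametric family assumed:

```python
import numpy as np

def svgd_step(particles, score_fn, bandwidth=1.0, step_size=0.1):
    """One SVGD update. particles: (n, d) array; score_fn returns
    grad log p(x) for each particle. The particle set itself
    represents the approximate posterior."""
    diffs = particles[:, None, :] - particles[None, :, :]   # (n, n, d)
    sq_dists = np.sum(diffs ** 2, axis=-1)
    k = np.exp(-sq_dists / (2 * bandwidth ** 2))             # RBF kernel
    grad_k = -diffs / bandwidth ** 2 * k[..., None]          # d k / d x_j
    # phi(x_i) = mean_j [ k(x_j, x_i) * score(x_j) + grad_{x_j} k(x_j, x_i) ]
    phi = (k @ score_fn(particles) + grad_k.sum(axis=0)) / len(particles)
    return particles + step_size * phi

# Toy target: standard Gaussian, whose score is simply -x.
x = np.random.randn(100, 2) + 5.0
for _ in range(500):
    x = svgd_step(x, lambda p: -p)
print(x.mean(axis=0))  # drifts toward [0, 0]
```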
2 code implementations • NeurIPS 2020 • Alexander Matt Turner, Neale Ratzlaff, Prasad Tadepalli
By preserving optimal value for a single randomly generated reward function, AUP incurs modest overhead while leading the agent to complete the specified task and avoid many side effects.
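A hedged sketch of the AUP penalty as the abstract describes it: the task reward is shaped by how much an action changes the agent's attainable value under one randomly generated auxiliary reward function, measured against a no-op baseline. Names such as `q_aux` are hypothetical, and the exact normalization is our reading; consult the paper for the precise form:

```python
def aup_reward(task_reward, q_aux, state, action, noop, lam=0.1):
    """AUP-style shaped reward (sketch). q_aux is a hypothetical Q-value
    estimator for a single randomly generated auxiliary reward function;
    lam trades task progress against the side-effect penalty."""
    penalty = abs(q_aux(state, action) - q_aux(state, noop))
    # Normalize by attainable auxiliary value at the no-op baseline so the
    # penalty is scale-free; guard against division by zero.
    scale = max(abs(q_aux(state, noop)), 1e-8)
    return task_reward - lam * penalty / scale
```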
no code implementations • ICML 2020 • Neale Ratzlaff, Qinxun Bai, Li Fuxin, Wei Xu
Each random draw from our generative model is a neural network that instantiates the dynamics function; multiple draws therefore approximate the posterior, and the variance in future predictions under this posterior is used as an intrinsic reward for exploration.
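A minimal sketch of the exploration bonus this describes, assuming a hypothetical `generator.sample()` that returns one dynamics network (one posterior draw): the bonus is the disagreement, across draws, in the predicted next state.

```python
import torch

def intrinsic_reward(generator, state, action, n_draws=8):
    """Exploration bonus (sketch): variance of next-state predictions over
    dynamics networks sampled from a generative model of parameters.
    High disagreement among posterior draws marks poorly modeled regions."""
    with torch.no_grad():
        preds = torch.stack([generator.sample()(state, action)
                             for _ in range(n_draws)])  # (n_draws, state_dim)
    return preds.var(dim=0).mean().item()
```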
1 code implementation • 30 Jan 2019 • Neale Ratzlaff, Li Fuxin
We introduce HyperGAN, a new generative model for learning a distribution of neural network parameters.
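A minimal PyTorch sketch of the core idea, with the caveat that HyperGAN itself uses a mixer and per-layer generators rather than the single MLP collapsed here: a latent draw is mapped to the full parameter set of a small target classifier, so every draw is a distinct network and many draws form an ensemble.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightGenerator(nn.Module):
    """Map a latent code to all parameters of a two-layer MLP classifier
    (sketch of the HyperGAN idea, not the paper's exact architecture)."""

    def __init__(self, latent_dim=64, in_dim=784, hidden=128, out_dim=10):
        super().__init__()
        self.shapes = [(hidden, in_dim), (hidden,), (out_dim, hidden), (out_dim,)]
        n_params = sum(math.prod(s) for s in self.shapes)
        self.net = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(),
                                 nn.Linear(512, n_params))

    def forward(self, z):
        flat, params, i = self.net(z), [], 0
        for s in self.shapes:
            n = math.prod(s)
            params.append(flat[i:i + n].view(*s))
            i += n
        return params  # [W1, b1, W2, b2] of one sampled classifier

def sampled_classifier(params, x):
    w1, b1, w2, b2 = params
    return F.linear(F.relu(F.linear(x, w1, b1)), w2, b2)

gen = WeightGenerator()
logits = sampled_classifier(gen(torch.randn(64)), torch.randn(1, 784))
```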
no code implementations • 27 Sep 2018 • Neale Ratzlaff, Li Fuxin
We introduce HyperGAN, a generative network that learns to generate all the weight parameters of deep neural networks.
no code implementations • 5 Apr 2018 • Neale Ratzlaff, Li Fuxin
To evaluate against an adversary with complete knowledge of our defense, we adapt the bilateral filter as a trainable layer in a neural network and show that adding this layer makes classification of ImageNet images significantly more robust to attacks.
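A sketch of what such a trainable bilateral-filter layer could look like in PyTorch (an illustration of the technique, not this paper's exact implementation): each output pixel is a weighted mean of its neighborhood, combining a spatial Gaussian with a range Gaussian on intensity differences, and both bandwidths are learnable so white-box attacks can differentiate through the defense.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BilateralLayer(nn.Module):
    """Differentiable bilateral filter with trainable spatial/range
    bandwidths (sketch)."""

    def __init__(self, kernel_size=5, sigma_spatial=1.0, sigma_range=0.1):
        super().__init__()
        self.k = kernel_size
        self.log_ss = nn.Parameter(torch.tensor(sigma_spatial).log())
        self.log_sr = nn.Parameter(torch.tensor(sigma_range).log())
        r = torch.arange(kernel_size) - kernel_size // 2
        yy, xx = torch.meshgrid(r, r, indexing="ij")
        self.register_buffer("sq_dist", (yy**2 + xx**2).flatten().float())

    def forward(self, x):                                   # x: (B, C, H, W)
        b, c, h, w = x.shape
        patches = F.unfold(x, self.k, padding=self.k // 2)  # (B, C*k*k, H*W)
        patches = patches.view(b, c, self.k * self.k, h * w)
        center = x.view(b, c, 1, h * w)
        spatial = torch.exp(-self.sq_dist / (2 * self.log_ss.exp() ** 2))
        rng = torch.exp(-(patches - center) ** 2 / (2 * self.log_sr.exp() ** 2))
        weights = spatial.view(1, 1, -1, 1) * rng           # spatial x range
        out = (weights * patches).sum(2) / weights.sum(2).clamp_min(1e-8)
        return out.view(b, c, h, w)
```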