no code implementations • 7 Jul 2024 • Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla, Lan Hoang, Danny Barnett, I-Hsin Chung, Apoorve Mohan, Ming-Hung Chen, Lixiang Luo, Robert Walkup, Constantinos Evangelinos, Shweta Salaria, Marc Dombrowa, Yoonho Park, Apo Kayi, Liran Schour, Alim Alim, Ali Sydney, Pavlos Maniotis, Laurent Schares, Bernard Metzler, Bengi Karacali-Akyamac, Sophia Wen, Tatsuhiro Chiba, Sunyanan Choochotkaew, Takeshi Yoshimura, Claudia Misale, Tonia Elengikal, Kevin O Connor, Zhuoran Liu, Richard Molina, Lars Schneidenbach, James Caden, Christopher Laibinis, Carlos Fonseca, Vasily Tarasov, Swaminathan Sundararaman, Frank Schmuck, Scott Guthridge, Jeremy Cohn, Marc Eshel, Paul Muench, Runyu Liu, William Pointer, Drew Wyskida, Bob Krull, Ray Rose, Brent Wolfe, William Cornejo, John Walter, Colm Malone, Clifford Perucci, Frank Franco, Nigel Hinds, Bob Calio, Pavel Druyan, Robert Kilduff, John Kienle, Connor McStay, Andrew Figueroa, Matthew Connolly, Edie Fost, Gina Roma, Jake Fonseca, Ido Levy, Michele Payne, Ryan Schenkel, Amir Malki, Lion Schneider, Aniruddha Narkhede, Shekeba Moshref, Alexandra Kisin, Olga Dodin, Bill Rippon, Henry Wrieth, John Ganci, Johnny Colino, Donna Habeger-Rose, Rakesh Pandey, Aditya Gidh, Dennis Patterson, Samsuddin Salmani, Rambilas Varma, Rumana Rumana, Shubham Sharma, Aditya Gaur, Mayank Mishra, Rameswar Panda, Aditya Prasad, Matt Stallone, Gaoyuan Zhang, Yikang Shen, David Cox, Ruchir Puri, Dakshi Agrawal, Drew Thorstensen, Joel Belog, Brent Tang, Saurabh Kumar Gupta, Amitabha Biswas, Anup Maheshwari, Eran Gampel, Jason Van Patten, Matthew Runion, Sai Kaki, Yigal Bogin, Brian Reitz, Steve Pritko, Shahan Najam, Surya Nambala, Radhika Chirra, Rick Welp, Frank DiMitri, Felipe Telles, Amilcar Arvelo, King Chu, Ed Seminaro, Andrew Schram, Felix Eickhoff, William Hanson, Eric Mckeever, Dinakaran Joseph, Piyush Chaudhary, Piyush Shivam, Puneet Chaudhary, Wesley Jones, Robert Guthrie, Chris Bostic, Rezaul Islam, Steve Duersch, Wayne Sawdon, John Lewars, Matthew Klos, Michael Spriggs, Bill McMillan, George Gao, Ashish Kamra, Gaurav Singh, Marc Curry, Tushar Katarki, Joe Talerico, Zenghui Shi, Sai Sindhur Malleni, Erwan Gallen
This infrastructure includes (1) Vela: an AI-optimized supercomputing capability directly integrated into the IBM Cloud, delivering scalable, dynamic, multi-tenant, and geographically distributed infrastructure for large-scale model training and other AI workflow steps, and (2) Blue Vela: a large-scale, purpose-built, on-premises hosting environment optimized to support our largest and most ambitious AI model training tasks.
no code implementations • 17 Jun 2024 • Junmo Kang, Leonid Karlinsky, Hongyin Luo, Zhen Wang, Jacob Hansen, James Glass, David Cox, Rameswar Panda, Rogerio Feris, Alan Ritter
Our findings highlight the critical role of modularity, the applicability of Self-MoE to multiple base LLMs, and the potential of self-improvement in achieving efficient, scalable, and adaptable systems.
no code implementations • 27 May 2024 • Runqian Wang, Soumya Ghosh, David Cox, Diego Antognini, Aude Oliva, Rogerio Feris, Leonid Karlinsky
Our approach relies on synthetic data to transfer LoRA modules.
no code implementations • 11 Oct 2023 • Cheng-I Jeff Lai, Freda Shi, Puyuan Peng, Yoon Kim, Kevin Gimpel, Shiyu Chang, Yung-Sung Chuang, Saurabhchand Bhati, David Cox, David Harwath, Yang Zhang, Karen Livescu, James Glass
We study phrase structure induction from visually-grounded speech.
1 code implementation • 9 Oct 2023 • Zhiqing Sun, Yikang Shen, Hongxin Zhang, Qinhong Zhou, Zhenfang Chen, David Cox, Yiming Yang, Chuang Gan
Supervised Fine-Tuning (SFT) on response demonstrations combined with Reinforcement Learning from Human Feedback (RLHF) constitutes a powerful paradigm for aligning LLM-based AI agents.
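As a rough illustration of the SFT half of this recipe, the sketch below fine-tunes a Hugging Face causal LM on a single instruction-response demonstration; the checkpoint name and the demonstration text are placeholders, not the paper's setup.

```python
# A minimal sketch of the SFT step, assuming a Hugging Face causal LM;
# the model name and the single demonstration below are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM checkpoint works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

demo = "### Instruction: Summarize the text.\n### Response: ..."
batch = tokenizer(demo, return_tensors="pt")

# Standard next-token cross-entropy on the demonstration; the labels are
# the input ids, which the model shifts internally.
outputs = model(**batch, labels=batch["input_ids"])
outputs.loss.backward()
```

RLHF would then further adjust the SFT model against a learned reward; that stage is omitted here.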
no code implementations • 29 Sep 2023 • Junmo Kang, Hongyin Luo, Yada Zhu, Jacob Hansen, James Glass, David Cox, Alan Ritter, Rogerio Feris, Leonid Karlinsky
Recent works have demonstrated the effectiveness of self-alignment, in which a large language model is aligned to follow general instructions using instructional data generated from the model itself, starting from a handful of human-written seeds.
1 code implementation • NeurIPS 2023 • Zhiqing Sun, Yikang Shen, Qinhong Zhou, Hongxin Zhang, Zhenfang Chen, David Cox, Yiming Yang, Chuang Gan
Recent AI-assistant agents, such as ChatGPT, predominantly rely on supervised fine-tuning (SFT) with human annotations and reinforcement learning from human feedback (RLHF) to align the output of large language models (LLMs) with human intentions, ensuring they are helpful, ethical, and reliable.
no code implementations • 12 Feb 2023 • Lee Martie, Jessie Rosenberg, Veronique Demers, Gaoyuan Zhang, Onkar Bhardwaj, John Henning, Aditya Prasad, Matt Stallone, Ja Young Lee, Lucy Yip, Damilola Adesina, Elahe Paikari, Oscar Resendiz, Sarah Shaw, David Cox
Compositional AI systems, which combine multiple artificial intelligence components with other application components to solve a larger problem, have no known pattern of development and are often approached in a bespoke and ad hoc style.
1 code implementation • CVPR 2023 • James Seale Smith, Paola Cascante-Bonilla, Assaf Arbelle, Donghyun Kim, Rameswar Panda, David Cox, Diyi Yang, Zsolt Kira, Rogerio Feris, Leonid Karlinsky
This leads to reasoning mistakes, which need to be corrected as they occur by teaching VL models the missing SVLC skills; often this must be done using private data where the issue was found, which naturally leads to a data-free continual (no task-id) VL learning setting.
1 code implementation • CVPR 2022 • Yi Li, Rameswar Panda, Yoon Kim, Chun-Fu Chen, Rogerio Feris, David Cox, Nuno Vasconcelos
In particular, given a source sentence, an autoregressive hallucination transformer is used to predict a discrete visual representation from the input text, and the combined text and hallucinated representations are utilized to obtain the target translation.
1 code implementation • 20 Apr 2022 • Kaizhi Qian, Yang Zhang, Heting Gao, Junrui Ni, Cheng-I Lai, David Cox, Mark Hasegawa-Johnson, Shiyu Chang
Self-supervised learning in speech involves training a speech representation network on a large-scale unannotated speech corpus, and then applying the learned representations to downstream tasks.
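To make the two-stage recipe concrete, here is a minimal sketch that reuses a pretrained SSL model as a frozen feature extractor for a downstream probe; it assumes torchaudio's bundled wav2vec 2.0 weights, which are a stand-in for the checkpoints studied in the paper.

```python
# A minimal sketch of the "pretrain once, reuse everywhere" recipe using
# torchaudio's bundled wav2vec 2.0 weights (an assumption; the papers'
# own checkpoints differ). A linear probe stands in for the downstream task.
import torch
import torchaudio

bundle = torchaudio.pipelines.WAV2VEC2_BASE
ssl_model = bundle.get_model().eval()

waveform = torch.randn(1, 16000)  # 1 s of dummy audio at the 16 kHz rate
with torch.no_grad():
    features, _ = ssl_model.extract_features(waveform)

# features is a list of per-layer tensors of shape (batch, frames, dim);
# a downstream head is trained on top while the SSL network stays frozen.
probe = torch.nn.Linear(features[-1].shape[-1], 10)  # e.g., 10 keyword classes
logits = probe(features[-1]).mean(dim=1)
```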
1 code implementation • NeurIPS 2021 • Yonggan Fu, Qixuan Yu, Yang Zhang, Shang Wu, Xu Ouyang, David Cox, Yingyan Lin
Deep Neural Networks (DNNs) are known to be vulnerable to adversarial attacks, i.e., an imperceptible perturbation to the input can mislead DNNs trained on clean images into making erroneous predictions.
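For readers unfamiliar with the attack model, the classic fast gradient sign method (FGSM) below shows how such a perturbation is crafted; this is a standard baseline attack, not the defense proposed in the paper.

```python
# A minimal FGSM sketch illustrating the attack model the paper defends
# against (this is the classic one-step attack, not the paper's method).
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    """One-step L-infinity attack: move each pixel by +/- eps along the
    sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()  # keep the result a valid image
```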
no code implementations • 4 Oct 2021 • Cheng-I Jeff Lai, Erica Cooper, Yang Zhang, Shiyu Chang, Kaizhi Qian, Yi-Lun Liao, Yung-Sung Chuang, Alexander H. Liu, Junichi Yamagishi, David Cox, James Glass
Are end-to-end text-to-speech (TTS) models over-parametrized?
1 code implementation • 16 Jun 2021 • Kaizhi Qian, Yang Zhang, Shiyu Chang, JinJun Xiong, Chuang Gan, David Cox, Mark Hasegawa-Johnson
In this paper, we propose AutoPST, which can disentangle global prosody style from speech without relying on any text transcriptions.
1 code implementation • 11 Jun 2021 • Yonggan Fu, Yongan Zhang, Yang Zhang, David Cox, Yingyan Lin
The key challenges include (1) the dilemma of either exploding the memory consumption due to the huge joint space or settling for sub-optimal designs, (2) the discrete nature of the accelerator design space, which is coupled with yet different from that of the networks and bitwidths, and (3) the chicken-and-egg problem of network-accelerator co-search, i.e., co-search requires operation-wise hardware costs, yet these are unavailable during search because the optimal accelerator depends on the whole network, which is itself still unknown.
no code implementations • NeurIPS 2021 • Cheng-I Jeff Lai, Yang Zhang, Alexander H. Liu, Shiyu Chang, Yi-Lun Liao, Yung-Sung Chuang, Kaizhi Qian, Sameer Khurana, David Cox, James Glass
We investigate the existence of sparse subnetworks in pre-trained speech SSL models that achieve even better low-resource ASR results.
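The basic tool for exposing such subnetworks is magnitude pruning; the sketch below shows a one-shot global variant, a simplification of the iterative lottery-ticket procedure used in this line of work.

```python
# A minimal sketch of one-shot global magnitude pruning, the basic tool
# behind lottery-ticket-style subnetwork discovery (the paper's procedure
# is more involved, e.g. iterative pruning with rewinding).
import torch

def magnitude_masks(model, sparsity=0.9):
    """Return a {name: 0/1 mask} dict keeping the largest-magnitude weights."""
    weights = torch.cat([p.detach().abs().flatten()
                         for p in model.parameters() if p.dim() > 1])
    k = max(1, int(sparsity * weights.numel()))
    threshold = weights.kthvalue(k).values  # global magnitude cutoff
    return {name: (p.detach().abs() > threshold).float()
            for name, p in model.named_parameters() if p.dim() > 1}
```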
1 code implementation • NeurIPS 2020 • Joel Dapello, Tiago Marques, Martin Schrimpf, Franziska Geiger, David Cox, James J. DiCarlo
Current state-of-the-art object recognition models are largely based on convolutional neural network (CNN) architectures, which are loosely inspired by the primate visual system.
no code implementations • 2 Sep 2020 • Wang Zhou, Shiyu Chang, Norma Sosa, Hendrik Hamann, David Cox
Recent advances in object detection have benefited significantly from rapid developments in deep neural networks.
1 code implementation • 9 Jul 2020 • Chuang Gan, Jeremy Schwartz, Seth Alter, Damian Mrowca, Martin Schrimpf, James Traer, Julian De Freitas, Jonas Kubilius, Abhishek Bhandwaldar, Nick Haber, Megumi Sano, Kuno Kim, Elias Wang, Michael Lingelbach, Aidan Curtis, Kevin Feigelis, Daniel M. Bear, Dan Gutfreund, David Cox, Antonio Torralba, James J. DiCarlo, Joshua B. Tenenbaum, Josh H. McDermott, Daniel L. K. Yamins
We introduce ThreeDWorld (TDW), a platform for interactive multi-modal physical simulation.
6 code implementations • ICML 2020 • Kaizhi Qian, Yang Zhang, Shiyu Chang, David Cox, Mark Hasegawa-Johnson
Speech information can be roughly decomposed into four components: language content, timbre, pitch, and rhythm.
no code implementations • 29 Jan 2020 • Rosaura G. VidalMata, Walter J. Scheirer, Anna Kukleva, David Cox, Hilde Kuehne
Understanding the structure of complex activities in untrimmed videos is a challenging task in the area of action recognition.
1 code implementation • NeurIPS 2019 • Quanfu Fan, Chun-Fu Chen, Hilde Kuehne, Marco Pistoia, David Cox
Current state-of-the-art models for video action recognition are mostly based on expensive 3D ConvNets.
no code implementations • ICCV 2019 • Chuang Gan, Hang Zhao, Peihao Chen, David Cox, Antonio Torralba
At test time, the stereo-sound student network can work independently to perform object localization using just stereo audio and camera metadata, without any visual input.
1 code implementation • NeurIPS 2019 • Xiangyi Chen, Sijia Liu, Kaidi Xu, Xingguo Li, Xue Lin, Mingyi Hong, David Cox
In this paper, we propose a zeroth-order AdaMM (ZO-AdaMM) algorithm, that generalizes AdaMM to the gradient-free regime.
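The core idea, estimating gradients from function evaluations alone and feeding them into an AdaMM/AMSGrad-style update, can be sketched in a few lines of numpy; the single-probe estimator and the constants below are simplifications, not the paper's exact algorithm.

```python
# A minimal numpy sketch of the ZO-AdaMM idea: replace the true gradient in
# an AdaMM/AMSGrad-style update with a random-direction finite-difference
# estimate. The single-probe estimator and constants are simplifications.
import numpy as np

def zo_adamm(f, x, steps=500, lr=0.01, mu=1e-3, b1=0.9, b2=0.99, eps=1e-8):
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    v_hat = np.zeros_like(x)
    for t in range(1, steps + 1):
        u = np.random.randn(*x.shape)            # random probe direction
        g = (f(x + mu * u) - f(x)) / mu * u      # zeroth-order gradient estimate
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        v_hat = np.maximum(v_hat, v)             # AMSGrad-style max for stability
        x = x - lr * m / (np.sqrt(v_hat) + eps)
    return x

# Example: minimize a quadratic using only function evaluations.
x_star = zo_adamm(lambda x: np.sum((x - 3.0) ** 2), np.zeros(5))
```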
1 code implementation • MAPL 2019 • Philippe Tillet, H. T. Kung, David Cox
The validation and deployment of novel research ideas in the field of Deep Learning is often limited by the availability of efficient compute kernels for certain basic primitives.
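This paper introduced Triton. For flavor, here is the canonical vector-add kernel in Triton's present-day Python DSL; note that the API shown postdates the MAPL 2019 paper, so treat it as an illustrative sketch.

```python
# A vector-add kernel in Triton's current Python DSL; the API shown here
# postdates the MAPL 2019 paper, so treat it as an illustrative sketch.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)  # each program handles one tile
    mask = offs < n                           # guard the ragged final tile
    x = tl.load(x_ptr + offs, mask=mask)
    y = tl.load(y_ptr + offs, mask=mask)
    tl.store(out_ptr + offs, x + y, mask=mask)

x = torch.randn(4096, device="cuda")
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
add_kernel[(triton.cdiv(x.numel(), 1024),)](x, y, out, x.numel(), BLOCK=1024)
```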
1 code implementation • 21 Jul 2018 • Eric Wu, Kevin Wu, David Cox, William Lotter
Deep learning approaches to breast cancer detection in mammograms have recently shown promising results.
no code implementations • 28 May 2018 • William Lotter, Gabriel Kreiman, David Cox
Interestingly, recent work has shown that deep convolutional neural networks (CNNs) trained on large-scale image recognition tasks can serve as strikingly good models for predicting the responses of neurons in visual cortex to visual stimuli, suggesting that analogies between artificial and biological neural networks may be more than superficial.
no code implementations • 24 May 2018 • German Abrevaya, Irina Rish, Aleksandr Y. Aravkin, Guillermo Cecchi, James Kozloski, Pablo Polosecki, Peng Zheng, Silvina Ponce Dawson, Juliana Rhee, David Cox
Many real-world data sets, especially in biology, are produced by complex nonlinear dynamical systems.
no code implementations • 21 Jul 2017 • William Lotter, Greg Sorensen, David Cox
Screening mammography is an important front-line tool for the early detection of breast cancer, and some 39 million exams are conducted each year in the United States alone.
1 code implementation • 7 Jun 2017 • Hanlin Tang, Martin Schrimpf, Bill Lotter, Charlotte Moerman, Ana Paredes, Josue Ortega Caro, Walter Hardesty, David Cox, Gabriel Kreiman
First, subjects robustly recognized objects even when rendered <15% visible, but recognition was largely impaired when processing was interrupted by backward masking.
no code implementations • 16 Mar 2017 • Ruth Fong, Walter Scheirer, David Cox
The effectiveness of this approach points to a path forward for a new class of hybrid machine learning algorithms which take both inspiration and direct constraints from neuronal data.
1 code implementation • NeurIPS 2016 • Chuan-Yung Tsai, Andrew Saxe, David Cox
We present a novel neural network algorithm, the Tensor Switching (TS) network, which generalizes the Rectified Linear Unit (ReLU) nonlinearity to tensor-valued hidden units.
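The switching idea is compact enough to sketch directly: where a ReLU unit that is "on" passes a scalar, a TS unit that is "on" passes the entire input vector. The toy implementation below follows that description loosely and is not the paper's full architecture.

```python
# A minimal sketch of the core tensor-switching operation: where standard
# ReLU passes the scalar max(w.x, 0), a TS unit that is "on" passes the
# whole input vector instead, yielding a tensor-valued hidden layer.
import torch

def ts_relu(x, W):
    """x: (batch, d_in); W: (d_hidden, d_in) -> (batch, d_hidden, d_in)."""
    gates = (x @ W.t() > 0).float()              # which hidden units are active
    return gates.unsqueeze(-1) * x.unsqueeze(1)  # active units emit the full input

x = torch.randn(8, 16)
W = torch.randn(32, 16)
h = ts_relu(x, W)  # tensor-valued hidden layer: shape (8, 32, 16)
```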
no code implementations • 8 Aug 2016 • David Cox
Delta Epsilon Alpha Star is a minimal coverage, real-time robotic search algorithm that yields a moderately aggressive search path with minimal backtracking.
1 code implementation • 8 Aug 2016 • David Cox
We present a self-contained system for constructing natural language models for use in text compression.
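As a toy illustration of the model-based compression idea (not the paper's actual system), the sketch below uses a character bigram model to re-encode text as prediction ranks, small integers that a generic entropy coder would then compress well.

```python
# A toy sketch of model-based text compression: a character bigram model
# ranks candidate next characters, and the text is stored as a sequence of
# ranks. Real systems pair the model with arithmetic coding instead.
from collections import Counter, defaultdict

def build_model(corpus):
    counts = defaultdict(Counter)
    for a, b in zip(corpus, corpus[1:]):
        counts[a][b] += 1
    return counts

def encode(text, counts):
    ranks = []
    for a, b in zip(text, text[1:]):
        ordering = [c for c, _ in counts[a].most_common()]
        ranks.append(ordering.index(b))  # well-predicted chars get small ranks
    return text[0], ranks

def decode(first, ranks, counts):
    out = [first]
    for r in ranks:
        ordering = [c for c, _ in counts[out[-1]].most_common()]
        out.append(ordering[r])
    return "".join(out)

corpus = "the theme of the thesis"
model = build_model(corpus)
first, ranks = encode(corpus, model)
assert decode(first, ranks, model) == corpus
```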
17 code implementations • 25 May 2016 • William Lotter, Gabriel Kreiman, David Cox
Here, we explore prediction of future frames in a video sequence as an unsupervised learning rule for learning about the structure of the visual world.
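The training signal itself is simple to write down: predict the next frame and penalize the pixel error. The sketch below illustrates just that objective with a single conv net; PredNet itself is a multi-layer recurrent architecture.

```python
# A minimal sketch of the training signal: predict frame t+1 from frame t
# and penalize the pixel error. PredNet itself is a multi-layer recurrent
# architecture; this single conv net only illustrates the objective.
import torch
import torch.nn as nn

predictor = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)

video = torch.rand(4, 10, 3, 64, 64)  # (batch, time, C, H, W) dummy clip
for t in range(video.shape[1] - 1):
    pred = predictor(video[:, t])                         # predict the next frame...
    loss = nn.functional.l1_loss(pred, video[:, t + 1])   # ...and score it
    opt.zero_grad()
    loss.backward()
    opt.step()
```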
2 code implementations • 19 Nov 2015 • William Lotter, Gabriel Kreiman, David Cox
The ability to predict future states of the environment is a central pillar of intelligence.
no code implementations • CVPR 2014 • Eleonora Vig, Michael Dorr, David Cox
Our models outperform the state of the art on MIT1003, on which features and classifiers are learned.