2 code implementations • 22 Feb 2024 • Zechun Liu, Changsheng Zhao, Forrest Iandola, Chen Lai, Yuandong Tian, Igor Fedorov, Yunyang Xiong, Ernie Chang, Yangyang Shi, Raghuraman Krishnamoorthi, Liangzhen Lai, Vikas Chandra
The resultant models, denoted as MobileLLM-LS, demonstrate a further accuracy enhancement of 0. 7%/0. 8% than MobileLLM 125M/350M.
no code implementations • CVPR 2024 • Peihao Wang, Dejia Xu, Zhiwen Fan, Dilin Wang, Sreyas Mohan, Forrest Iandola, Rakesh Ranjan, Yilei Li, Qiang Liu, Zhangyang Wang, Vikas Chandra
In this paper, we reveal that the existing score distillation-based text-to-3D generation frameworks degenerate to maximal likelihood seeking on each view independently and thus suffer from the mode collapse problem, manifesting as the Janus artifact in practice.
no code implementations • 31 Dec 2023 • Peihao Wang, Zhiwen Fan, Dejia Xu, Dilin Wang, Sreyas Mohan, Forrest Iandola, Rakesh Ranjan, Yilei Li, Qiang Liu, Zhangyang Wang, Vikas Chandra
In this paper, we reveal that the gradient estimation in score distillation is inherent to high variance.
no code implementations • 11 Dec 2023 • Balakrishnan Varadarajan, Bilge Soran, Forrest Iandola, Xiaoyu Xiang, Yunyang Xiong, Lemeng Wu, Chenchen Zhu, Raghuraman Krishnamoorthi, Vikas Chandra
A common user expectation is that a click on a specific part of an object will result in the segmentation of the entire object.
1 code implementation • CVPR 2024 • Yunyang Xiong, Bala Varadarajan, Lemeng Wu, Xiaoyu Xiang, Fanyi Xiao, Chenchen Zhu, Xiaoliang Dai, Dilin Wang, Fei Sun, Forrest Iandola, Raghuraman Krishnamoorthi, Vikas Chandra
On segment anything task such as zero-shot instance segmentation, our EfficientSAMs with SAMI-pretrained lightweight image encoders perform favorably with a significant gain (e. g., ~4 AP on COCO/LVIS) over other fast SAM models.
Ranked #3 on Zero-Shot Instance Segmentation on LVIS v1.0 val
no code implementations • 1 Nov 2023 • Ernie Chang, Sidd Srinivasan, Mahi Luthra, Pin-Jie Lin, Varun Nagaraja, Forrest Iandola, Zechun Liu, Zhaoheng Ni, Changsheng Zhao, Yangyang Shi, Vikas Chandra
Text-to-audio generation (TTA) produces audio from a text description, learning from pairs of audio samples and hand-annotated text.
no code implementations • 1 Nov 2023 • Ernie Chang, Pin-Jie Lin, Yang Li, Sidd Srinivasan, Gael Le Lan, David Kant, Yangyang Shi, Forrest Iandola, Vikas Chandra
We show that the framework enhanced the audio quality across the set of collected user prompts, which were edited with reference to the training captions as exemplars.
1 code implementation • 24 Oct 2023 • Jay Zhangjie Wu, Xiuyu Li, Difei Gao, Zhen Dong, Jinbin Bai, Aishani Singh, Xiaoyu Xiang, Youzeng Li, Zuwei Huang, Yuanxi Sun, Rui He, Feng Hu, Junhua Hu, Hai Huang, Hanyu Zhu, Xu Cheng, Jie Tang, Mike Zheng Shou, Kurt Keutzer, Forrest Iandola
In this paper we present a retrospective on the competition and describe the winning method.
no code implementations • 15 Sep 2023 • Yangyang Shi, Gael Le Lan, Varun Nagaraja, Zhaoheng Ni, Xinhao Mei, Ernie Chang, Forrest Iandola, Yang Liu, Vikas Chandra
This paper presents an innovative approach to enhance control over audio generation by emphasizing the alignment between audio and text representations during model training.
no code implementations • 15 Sep 2023 • Gael Le Lan, Varun Nagaraja, Ernie Chang, David Kant, Zhaoheng Ni, Yangyang Shi, Forrest Iandola, Vikas Chandra
In language modeling based music generation, a generated waveform is represented by a sequence of hierarchical token stacks that can be decoded either in an auto-regressive manner or in parallel, depending on the codebook patterns.
1 code implementation • 5 Aug 2019 • Albert Shaw, Daniel Hunter, Forrest Iandola, Sammy Sidhu
For real time applications utilizing Deep Neural Networks (DNNs), it is critical that the models achieve high-accuracy on the target task and low-latency inference on the target computing platform.
Ranked #68 on Semantic Segmentation on Cityscapes val (using extra training data)
no code implementations • 17 Nov 2018 • Paden Tomasello, Sammy Sidhu, Anting Shen, Matthew W. Moskewicz, Nobie Redmon, Gayatri Joshi, Romi Phadte, Paras Jain, Forrest Iandola
Recently, autonomous vehicles have created a demand for depth information, which is often obtained using hardware sensors such as Light detection and ranging (LIDAR).
no code implementations • 7 Oct 2017 • Forrest Iandola, Kurt Keutzer
Over the last five years Deep Neural Nets have offered more accurate solutions to many problems in speech recognition, and computer vision, and these solutions have surpassed a threshold of acceptability for many applications.
no code implementations • 20 Dec 2016 • Forrest Iandola
Exploring the design space of CNN architectures.
13 code implementations • 4 Dec 2016 • Bichen Wu, Alvin Wan, Forrest Iandola, Peter H. Jin, Kurt Keutzer
In addition to requiring high accuracy to ensure safety, object detection for autonomous driving also requires real-time inference speed to guarantee prompt vehicle control, as well as small model size and energy efficiency to enable embedded system deployment.
no code implementations • 14 Nov 2016 • Peter H. Jin, Qiaochu Yuan, Forrest Iandola, Kurt Keutzer
Training time on large datasets for deep neural networks is the principal workflow bottleneck in a number of important applications of deep learning, such as object classification and detection in automatic driver assistance systems (ADAS).
1 code implementation • 1 Jun 2016 • Matthew Moskewicz, Forrest Iandola, Kurt Keutzer
Results are presented for a case study of targeting the Qualcomm Snapdragon 820 mobile computing platform for CNN deployment.
1 code implementation • ITSC 2015 • Forrest Iandola, Matthew Moskewicz, Kurt Keutzer
Histogram of Oriented Gradients (HOG) features are the underlying representation in automotive computer vision applications such as collision avoidance and lane keeping.
1 code implementation • CVPR 2015 • Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng, Piotr Dollár, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John C. Platt, C. Lawrence Zitnick, Geoffrey Zweig
The language model learns from a set of over 400, 000 image descriptions to capture the statistics of word usage.
Ranked #1 on Image Captioning on COCO Captions test
1 code implementation • CVPR 2015 • Ross Girshick, Forrest Iandola, Trevor Darrell, Jitendra Malik
Deformable part models (DPMs) and convolutional neural networks (CNNs) are two widely used tools for visual recognition.
Ranked #27 on Object Detection on PASCAL VOC 2007
2 code implementations • 7 Apr 2014 • Forrest Iandola, Matt Moskewicz, Sergey Karayev, Ross Girshick, Trevor Darrell, Kurt Keutzer
Convolutional Neural Networks (CNNs) can provide accurate object classification.
no code implementations • ICCV 2013 • Ning Zhang, Ryan Farrell, Forrest Iandola, Trevor Darrell
Recognizing objects in fine-grained domains can be extremely challenging due to the subtle differences between subcategories.
Ranked #26 on Fine-Grained Image Classification on CUB-200-2011