Search Results for author: Shital Shah

Found 15 papers, 7 papers with code

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

no code implementations • 22 Apr 2024 • Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Qin Cai, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Yen-Chun Chen, Yi-Ling Chen, Parul Chopra, Xiyang Dai, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Victor Fragoso, Dan Iter, Mei Gao, Min Gao, Jianfeng Gao, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Yunsheng Li, Chen Liang, Lars Liden, Ce Liu, Mengchen Liu, Weishung Liu, Eric Lin, Zeqi Lin, Chong Luo, Piyush Madan, Matt Mazzola, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Swadheen Shukla, Xia Song, Masahiro Tanaka, Andrea Tupini, Xin Wang, Lijuan Wang, Chunyu Wang, Yu Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Haiping Wu, Michael Wyatt, Bin Xiao, Can Xu, Jiahang Xu, Weijian Xu, Sonali Yadav, Fan Yang, Jianwei Yang, ZiYi Yang, Yifan Yang, Donghan Yu, Lu Yuan, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.

Language Modelling

Textbooks Are All You Need

no code implementations • 20 Jun 2023 • Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee, Yuanzhi Li

Despite this small scale, phi-1 attains pass@1 accuracy 50.6% on HumanEval and 55.5% on MBPP.

Ranked #42 on Code Generation on HumanEval

Code Generation Language Modelling +1

Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints

no code implementations • 6 Oct 2022 • Ganesh Jawahar, Subhabrata Mukherjee, Debadeepta Dey, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Caio Cesar Teodoro Mendes, Gustavo Henrique de Rosa, Shital Shah

In this work, we study the more challenging open-domain setting consisting of low-frequency user prompt patterns (or broad prompts, e.g., a prompt about the 93rd Academy Awards) and demonstrate the effectiveness of character-based language models.

Inductive Bias

One Network Doesn't Rule Them All: Moving Beyond Handcrafted Architectures in Self-Supervised Learning

no code implementations • 15 Mar 2022 • Sharath Girish, Debadeepta Dey, Neel Joshi, Vibhav Vineet, Shital Shah, Caio Cesar Teodoro Mendes, Abhinav Shrivastava, Yale Song

We conduct a large-scale study with over 100 variants of ResNet and MobileNet architectures and evaluate them across 11 downstream scenarios in the SSL setting.

Image Classification Self-Supervised Learning

LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models

1 code implementation • 4 Mar 2022 • Mojan Javaheripi, Gustavo H. de Rosa, Subhabrata Mukherjee, Shital Shah, Tomasz L. Religa, Caio C. T. Mendes, Sebastien Bubeck, Farinaz Koushanfar, Debadeepta Dey

Results show that the perplexity of 16-layer GPT-2 and Transformer-XL can be achieved with up to 1.5x, 2.5x faster runtime and 1.2x, 2.0x lower peak memory utilization.

Decoder Language Modelling +1

Ranking Convolutional Architectures by their Feature Extraction Capabilities

no code implementations • 29 Sep 2021 • Debadeepta Dey, Shital Shah, Sebastien Bubeck

We propose a simple but powerful method which we call FEAR, for ranking architectures in any search space.

Neural Architecture Search

FEAR: A Simple Lightweight Method to Rank Architectures

1 code implementation • 7 Jun 2021 • Debadeepta Dey, Shital Shah, Sebastien Bubeck

We propose a simple but powerful method which we call FEAR, for ranking architectures in any search space.

Neural Architecture Search

Ranking Architectures by Feature Extraction Capabilities

no code implementations • ICML Workshop AutoML 2021 • Debadeepta Dey, Shital Shah, Sebastien Bubeck

By training different architectures in the search space to the same training or validation error and subsequently comparing the usefulness of the features extracted on the task-dataset of interest by freezing most of the architecture, we obtain quick estimates of the relative performance.

Neural Architecture Search

Understanding Failures of Deep Networks via Robust Feature Extraction

1 code implementation • CVPR 2021 • Sahil Singla, Besmira Nushi, Shital Shah, Ece Kamar, Eric Horvitz

Traditional evaluation metrics for learned models that report aggregate scores over a test set are insufficient for surfacing important and informative patterns of failure over features and instances.

An Empirical Analysis of Backward Compatibility in Machine Learning Systems

no code implementations • 11 Aug 2020 • Megha Srivastava, Besmira Nushi, Ece Kamar, Shital Shah, Eric Horvitz

In many applications of machine learning (ML), updates are performed with the goal of enhancing model performance.

BIG-bench Machine Learning

Safe Reinforcement Learning via Curriculum Induction

1 code implementation • NeurIPS 2020 • Matteo Turchetta, Andrey Kolobov, Shital Shah, Andreas Krause, Alekh Agarwal

In safety-critical applications, autonomous agents may need to learn in an environment where mistakes can be very costly.

Autonomous Driving reinforcement-learning +2

A System for Real-Time Interactive Analysis of Deep Learning Training

1 code implementation • 5 Jan 2020 • Shital Shah, Roland Fernandez, Steven Drucker

To achieve this, we model various exploratory inspection and diagnostic tasks for deep learning training processes as specifications for streams using a map-reduce paradigm with which many data scientists are already familiar.

Ranked #1 on 3D Action Recognition on 100 sleep nights of 8 caregivers

3D Action Recognition

A High-Fidelity Open Embodied Avatar with Lip Syncing and Expression Capabilities

1 code implementation • 19 Sep 2019 • Deepali Aneja, Daniel McDuff, Shital Shah

Embodied avatars as virtual agents have many applications and provide benefits over disembodied agents, allowing non-verbal social and interactional cues to be leveraged, in a similar manner to how humans interact with each other.

AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles

25 code implementations • 15 May 2017 • Shital Shah, Debadeepta Dey, Chris Lovett, Ashish Kapoor

Developing and testing algorithms for autonomous vehicles in the real world is an expensive and time-consuming process.

Autonomous Vehicles

Submodular Trajectory Optimization for Aerial 3D Scanning

no code implementations • ICCV 2017 • Mike Roberts, Debadeepta Dey, Anh Truong, Sudipta Sinha, Shital Shah, Ashish Kapoor, Pat Hanrahan, Neel Joshi

Drones equipped with cameras are emerging as a powerful tool for large-scale aerial 3D scanning, but existing automatic flight planners do not exploit all available information about the scene, and can therefore produce inaccurate and incomplete 3D models.

Trajectory Planning
