Search Results for author: Ahmed Awadallah

Found 24 papers, 6 papers with code

AgentInstruct: Toward Generative Teaching with Agentic Flows

no code implementations • 3 Jul 2024 • Arindam Mitra, Luciano del Corro, Guoqing Zheng, Shweti Mahajan, Dany Rouhana, Andres Codas, Yadong Lu, Wei-Ge Chen, Olga Vrousgos, Corby Rosset, Fillipe Silva, Hamed Khanpour, Yash Lara, Ahmed Awadallah

We focus on using synthetic data for post-training, specifically on having powerful models create data that teaches a new skill or behavior to another model; we refer to this setting as Generative Teaching.

GSM8K · MMLU · +1

Assessing and Verifying Task Utility in LLM-Powered Applications

no code implementations • 3 May 2024 • Negar Arabzadeh, Siqing Huo, Nikhil Mehta, Qingyun Wu, Chi Wang, Ahmed Awadallah, Charles L. A. Clarke, Julia Kiseleva

The rapid development of Large Language Models (LLMs) has led to a surge in applications that facilitate collaboration among multiple agents, assisting humans in their daily tasks.

Math

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

no code implementations • 22 Apr 2024 • Marah Abdin, Jyoti Aneja, Hany Awadalla, Ahmed Awadallah, Ammar Ahmad Awan, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Qin Cai, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Weizhu Chen, Yen-Chun Chen, Yi-Ling Chen, Hao Cheng, Parul Chopra, Xiyang Dai, Matthew Dixon, Ronen Eldan, Victor Fragoso, Jianfeng Gao, Mei Gao, Min Gao, Amit Garg, Allie Del Giorno, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Wenxiang Hu, Jamie Huynh, Dan Iter, Sam Ade Jacobs, Mojan Javaheripi, Xin Jin, Nikos Karampatziakis, Piero Kauffmann, Mahoud Khademi, Dongwoo Kim, Young Jin Kim, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Yunsheng Li, Chen Liang, Lars Liden, Xihui Lin, Zeqi Lin, Ce Liu, Liyuan Liu, Mengchen Liu, Weishung Liu, Xiaodong Liu, Chong Luo, Piyush Madan, Ali Mahmoudzadeh, David Majercak, Matt Mazzola, Caio César Teodoro Mendes, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Liliang Ren, Gustavo de Rosa, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Yelong Shen, Swadheen Shukla, Xia Song, Masahiro Tanaka, Andrea Tupini, Praneetha Vaddamanu, Chunyu Wang, Guanhua Wang, Lijuan Wang, Shuohang Wang, Xin Wang, Yu Wang, Rachel Ward, Wen Wen, Philipp Witte, Haiping Wu, Xiaoxia Wu, Michael Wyatt, Bin Xiao, Can Xu, Jiahang Xu, Weijian Xu, Jilong Xue, Sonali Yadav, Fan Yang, Jianwei Yang, Yifan Yang, ZiYi Yang, Donghan Yu, Lu Yuan, Chenruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.

Ranked #5 on MMR total on MRR-Benchmark (using extra training data)

Language Modelling · Math · +2

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

no code implementations • 4 Apr 2024 • Corby Rosset, Ching-An Cheng, Arindam Mitra, Michael Santacroce, Ahmed Awadallah, Tengyang Xie

In this paper, we introduce Direct Nash Optimization (DNO), a provable and scalable algorithm that marries the simplicity and stability of contrastive learning with the theoretical generality of optimizing general preferences.

Contrastive Learning
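As a rough illustration of the contrastive ingredient named above: below is a minimal sketch written as a DPO-style pairwise loss, with the pair's winner assumed to come from a general preference oracle. The function names, the beta parameter, and the simplified objective are assumptions for illustration, not DNO's published algorithm.

```python
import torch.nn.functional as F

def contrastive_pair_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Contrastive loss on a (winner, loser) pair of sampled responses.

    logp_* are summed token log-probabilities of each response under the
    current policy; ref_logp_* are the same under a frozen reference
    policy. Which response "wins" is assumed to be decided by a general
    preference oracle (e.g., a stronger model judging the pair); the
    objective then pushes the policy toward the preferred response.
    DNO's iterative, batched procedure and its theoretical guarantees
    are not captured by this single-step sketch.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -F.logsigmoid(margin).mean()
```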

Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents

no code implementations • 27 Feb 2024 • Corby Rosset, Ho-Lam Chung, Guanghui Qin, Ethan C. Chau, Zhuo Feng, Ahmed Awadallah, Jennifer Neville, Nikhil Rao

We show that users spend a lot of "effort" on these questions in terms of signals like clicks and session length, and that they are also challenging for GPT-4.

Known Unknowns · Question Answering · +1

Orca-Math: Unlocking the potential of SLMs in Grade School Math

no code implementations • 16 Feb 2024 • Arindam Mitra, Hamed Khanpour, Corby Rosset, Ahmed Awadallah

Ensembling provides a substantial boost in accuracy but at a significant cost increase with multiple calls to the model (e.g., Phi-GSM uses top-48 to boost the performance from 68.2 to 81.5).

Ranked #35 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning · GSM8K · +1
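For context, the ensembling baseline cited above amounts to sampling many solutions per question and majority-voting over the final answers. A minimal sketch, where `generate` and `extract_answer` are assumed user-supplied callables and k=48 mirrors the top-48 setting; accuracy rises with k, but each question now costs k model calls, which is the trade-off the abstract points out.

```python
from collections import Counter

def majority_vote(generate, extract_answer, question, k=48):
    """Sample k completions and return the most frequent final answer.

    generate(question) -> str produces one model completion, and
    extract_answer(text) -> str parses the final answer from it; both
    are assumed helpers. Ties are broken arbitrarily by Counter.
    """
    answers = [extract_answer(generate(question)) for _ in range(k)]
    return Counter(answers).most_common(1)[0][0]
```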

Towards better Human-Agent Alignment: Assessing Task Utility in LLM-Powered Applications

no code implementations • 14 Feb 2024 • Negar Arabzadeh, Julia Kiseleva, Qingyun Wu, Chi Wang, Ahmed Awadallah, Victor Dibia, Adam Fourney, Charles Clarke

The rapid development in the field of Large Language Models (LLMs) has led to a surge in applications that facilitate collaboration among multiple agents to assist humans in their daily tasks.

Math

Axiomatic Preference Modeling for Longform Question Answering

no code implementations • 2 Dec 2023 • Corby Rosset, Guoqing Zheng, Victor Dibia, Ahmed Awadallah, Paul Bennett

The remarkable abilities of large language models (LLMs) like GPT-4 partially stem from post-training processes like Reinforcement Learning from Human Feedback (RLHF) involving human preferences encoded in a reward model.

Question Answering

Teaching Language Models to Hallucinate Less with Synthetic Tasks

no code implementations • 10 Oct 2023 • Erik Jones, Hamid Palangi, Clarisse Simões, Varun Chandrasekaran, Subhabrata Mukherjee, Arindam Mitra, Ahmed Awadallah, Ece Kamar

We also find that optimizing the system message rather than the model weights can be critical; fine-tuning the entire model on the synthetic task can counterintuitively increase hallucination.

Abstractive Text Summarization · Hallucination · +3

SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference

no code implementations • 5 Jul 2023 • Luciano del Corro, Allie Del Giorno, Sahaj Agarwal, Bin Yu, Ahmed Awadallah, Subhabrata Mukherjee

While existing token-level early exit methods show promising results for online inference, they cannot be readily applied to batch inference and key-value (KV) caching.

Text Generation
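To make the batching problem concrete: if every sequence chooses its own exit layer per token, sequences in a batch desynchronize, and KV caches for skipped layers are missing when later tokens need them. A minimal sketch of a batch-friendly alternative in the spirit of the paper, where the exit layer is a fixed, monotonically decreasing function of token position shared by the whole batch; the linear ramp and the parameter names are assumptions, not the paper's exact schedule.

```python
def exit_layer(position, max_position, num_layers, min_exit=4):
    """Deterministic exit layer for a token at a given position.

    Every sequence in the batch uses the same exit layer at the same
    position, so batched execution stays aligned. Because the schedule
    only decreases, any layer active at position t was also active at
    all earlier positions, so its KV cache is always populated. Later
    tokens get less compute (a linear ramp here), on the intuition
    that they are easier to decode given more context.
    """
    frac = position / max(1, max_position)
    return max(min_exit, round(num_layers * (1.0 - frac)))
```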

Orca: Progressive Learning from Complex Explanation Traces of GPT-4

4 code implementations • 5 Jun 2023 • Subhabrata Mukherjee, Arindam Mitra, Ganesh Jawahar, Sahaj Agarwal, Hamid Palangi, Ahmed Awadallah

To address these challenges, we develop Orca, a 13-billion-parameter model that learns to imitate the reasoning process of LFMs. (We are working with our legal team to publicly release a diff of the model weights, in accordance with LLaMA's release policy, at https://aka.ms/orca-lm.)

Imitation Learning · Knowledge Distillation

ADMoE: Anomaly Detection with Mixture-of-Experts from Noisy Labels

1 code implementation • 24 Aug 2022 • Yue Zhao, Guoqing Zheng, Subhabrata Mukherjee, Robert McCann, Ahmed Awadallah

In this work, we propose a method to leverage weak/noisy labels (e.g., risk scores generated by machine rules for detecting malware) that are cheaper to obtain for anomaly detection.

Anomaly Detection

The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy

1 code implementation • CVPR 2022 • Tianlong Chen, Zhenyu Zhang, Yu Cheng, Ahmed Awadallah, Zhangyang Wang

However, a "head-to-toe assessment" of the extent of redundancy in ViTs, and of how much we could gain by thoroughly mitigating it, has been absent from this field.

Diversity

Uncertainty-aware Self-training for Few-shot Text Classification

no code implementations • NeurIPS 2020 • Subhabrata Mukherjee, Ahmed Awadallah

Recent success of pre-trained language models crucially hinges on fine-tuning them on large amounts of labeled data for the downstream task, which are typically expensive to acquire or difficult to access for many applications.

Few-Shot Text Classification · General Classification · +1

Adversarial Training for Community Question Answer Selection Based on Multi-scale Matching

no code implementations • 22 Apr 2018 • Xiao Yang, Miaosen Wang, Wei Wang, Madian Khabsa, Ahmed Awadallah, Daniel Kifer, C. Lee Giles

We frame this task as a binary (relevant/irrelevant) classification problem and present an adversarial training framework to alleviate the label imbalance issue.

Answer Selection · General Classification
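One common way an adversarial framework can counter label imbalance in answer selection is to sample hard negatives adversarially instead of uniformly, so the classifier is not swamped by easy irrelevant pairs. A minimal sketch under that assumption; the `scorer` matching model, the sampling temperature, and the loss are illustrative and do not reproduce the paper's multi-scale matching architecture.

```python
import torch
import torch.nn.functional as F

def adversarial_step(scorer, q, pos, neg_pool, tau=1.0):
    """One training step with an adversarially sampled hard negative.

    scorer(q, a) -> relevance logit (0-dim tensor) is an assumed
    matching model. Negatives are sampled in proportion to their
    current scores, focusing training on confusable answers rather
    than the many trivially irrelevant ones.
    """
    with torch.no_grad():
        neg_logits = torch.stack([scorer(q, a) for a in neg_pool])
        probs = F.softmax(neg_logits / tau, dim=0)
    hard_neg = neg_pool[torch.multinomial(probs, 1).item()]
    # Binary (relevant/irrelevant) objective, matching the abstract's framing.
    pos_logit, neg_logit = scorer(q, pos), scorer(q, hard_neg)
    loss = F.binary_cross_entropy_with_logits(pos_logit, torch.ones_like(pos_logit)) \
         + F.binary_cross_entropy_with_logits(neg_logit, torch.zeros_like(neg_logit))
    return loss
```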
