Search Results for author: Nikos Karampatziakis

Found 24 papers, 7 papers with code

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

no code implementations 22 Apr 2024 Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, ZiYi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.

Language Modelling

LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

1 code implementation 12 Oct 2023 Yixiao Li, Yifan Yu, Chen Liang, Pengcheng He, Nikos Karampatziakis, Weizhu Chen, Tuo Zhao

Quantization is an indispensable technique for serving Large Language Models (LLMs) and has recently found its way into LoRA fine-tuning.

Natural Language Understanding, Quantization

AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning

2 code implementations 18 Mar 2023 Qingru Zhang, Minshuo Chen, Alexander Bukharin, Nikos Karampatziakis, Pengcheng He, Yu Cheng, Weizhu Chen, Tuo Zhao

Therefore, many fine-tuning methods are proposed to learn incremental updates of pre-trained weights in a parameter-efficient way, e.g., low-rank increments.

Question Answering, Text Generation

Anytime-valid off-policy inference for contextual bandits

1 code implementation 19 Oct 2022 Ian Waudby-Smith, Lili Wu, Aaditya Ramdas, Nikos Karampatziakis, Paul Mineiro

Importantly, our methods can be employed while the original experiment is still running (that is, not necessarily post-hoc), when the logging policy may be itself changing (due to learning), and even if the context distributions are a highly dependent time-series (such as if they are drifting over time).

counterfactual, Multi-Armed Bandits

Contextual Bandit Applications in Customer Support Bot

no code implementations 6 Dec 2021 Sandra Sajeev, Jade Huang, Nikos Karampatziakis, Matthew Hall, Sebastian Kochman, Weizhu Chen

We do, however, have access to partial feedback provided by the user (clicks, surveys, and other events) which can be leveraged to improve the user experience.

Multi-Armed Bandits

Off-policy Confidence Sequences

no code implementations 18 Feb 2021 Nikos Karampatziakis, Paul Mineiro, Aaditya Ramdas

We develop confidence bounds that hold uniformly over time for off-policy evaluation in the contextual bandit setting.

Off-policy evaluation, valid

Empirical Likelihood for Contextual Bandits

1 code implementation NeurIPS 2020 Nikos Karampatziakis, John Langford, Paul Mineiro

We propose an estimator and confidence interval for computing the value of a policy from off-policy data in the contextual bandit setting.

Multi-Armed Bandits

Lessons from Contextual Bandit Learning in a Customer Support Bot

no code implementations 6 May 2019 Nikos Karampatziakis, Sebastian Kochman, Jade Huang, Paul Mineiro, Kathy Osborne, Weizhu Chen

In this work, we describe practical lessons we have learned from successfully using contextual bandits (CBs) to improve key business metrics of the Microsoft Virtual Agent for customer support.

Information Retrieval, Multi-Armed Bandits

Gradient Coding

2 code implementations 10 Dec 2016 Rashish Tandon, Qi Lei, Alexandros G. Dimakis, Nikos Karampatziakis

We propose a novel coding theoretic framework for mitigating stragglers in distributed learning.

Log-time and Log-space Extreme Classification

1 code implementation 7 Nov 2016 Kalina Jasinska, Nikos Karampatziakis

We present LTLS, a technique for multiclass and multilabel prediction that can perform training and inference in logarithmic time and space.

Classification, General Classification

Logarithmic Time One-Against-Some

no code implementations ICML 2017 Hal Daume III, Nikos Karampatziakis, John Langford, Paul Mineiro

Compared to previous approaches, we obtain substantially better statistical performance for two reasons: first, we prove a tighter and more complete boosting theorem, and second, we translate the results more directly into an algorithm.

Binary Classification, Classification

Active Information Acquisition

no code implementations 5 Feb 2016 He He, Paul Mineiro, Nikos Karampatziakis

We propose a general framework for sequential and dynamic acquisition of useful information in order to solve a particular task.

General Reinforcement Learning, Reinforcement Learning (RL)

A Hierarchical Spectral Method for Extreme Classification

no code implementations 10 Nov 2015 Paul Mineiro, Nikos Karampatziakis

Extreme classification problems are multiclass and multilabel classification problems where the number of outputs is so large that straightforward strategies are neither statistically nor computationally viable.

Classification, General Classification

Fast Label Embeddings for Extremely Large Output Spaces

no code implementations 30 Mar 2015 Paul Mineiro, Nikos Karampatziakis

Many modern multiclass and multilabel problems are characterized by increasingly large output spaces.

Scalable Multilabel Prediction via Randomized Methods

1 code implementation 9 Feb 2015 Nikos Karampatziakis, Paul Mineiro

In this work we show that a generic regularized nonlinearity mapping independent predictions to joint predictions is sufficient to achieve state-of-the-art performance on a variety of benchmark problems.

General Classification

Fast Label Embeddings via Randomized Linear Algebra

no code implementations 19 Dec 2014 Paul Mineiro, Nikos Karampatziakis

Many modern multiclass and multilabel problems are characterized by increasingly large output spaces.

A Randomized Algorithm for CCA

no code implementations 13 Nov 2014 Paul Mineiro, Nikos Karampatziakis

We present RandomizedCCA, a randomized algorithm for computing canonical analysis, suitable for large datasets stored either out of core or on a distributed file system.

Efficient Online Bootstrapping for Large Scale Learning

no code implementations 18 Dec 2013 Zhen Qin, Vaclav Petricek, Nikos Karampatziakis, Lihong Li, John Langford

Bootstrapping is a useful technique for estimating the uncertainty of a predictor, for example, confidence intervals for prediction.

Combining Structured and Unstructured Randomness in Large Scale PCA

no code implementations 23 Oct 2013 Nikos Karampatziakis, Paul Mineiro

Principal Component Analysis (PCA) is a ubiquitous tool with many applications in machine learning including feature construction, subspace embedding, and outlier detection.

BIG-bench Machine Learning, Outlier Detection

Discriminative Features via Generalized Eigenvectors

no code implementations 7 Oct 2013 Nikos Karampatziakis, Paul Mineiro

Representing examples in a way that is compatible with the underlying classifier can greatly enhance the performance of a learning system.

General Classification

Least Squares Revisited: Scalable Approaches for Multi-class Prediction

no code implementations 7 Oct 2013 Alekh Agarwal, Sham M. Kakade, Nikos Karampatziakis, Le Song, Gregory Valiant

This work provides simple algorithms for multi-class (and multi-label) prediction in settings where both the number of examples n and the data dimension d are relatively large.

Loss-Proportional Subsampling for Subsequent ERM

no code implementations 7 Jun 2013 Paul Mineiro, Nikos Karampatziakis

We propose a sampling scheme suitable for reducing a data set prior to selecting a hypothesis with minimum empirical risk.

Static Analysis of Binary Executables Using Structural SVMs

no code implementations NeurIPS 2010 Nikos Karampatziakis

We cast the problem of identifying basic blocks of code in a binary executable as learning a mapping from a byte sequence to a segmentation of the sequence.

