Search Results for author: Sercan O. Arik

Found 38 papers, 15 papers with code

Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG

no code implementations • 8 Oct 2024 • Bowen Jin, Jinsung Yoon, Jiawei Han, Sercan O. Arik

Retrieval-augmented generation (RAG) empowers large language models (LLMs) to utilize external knowledge sources.

RAG · Retrieval
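
The excerpt above describes the standard retrieve-then-generate pattern that RAG is built on. Below is a minimal sketch of that loop; the toy `embed` function and the caller-supplied `generate` callable are illustrative placeholders, not the paper's implementation.

```python
# Minimal retrieve-then-generate loop (illustrative sketch, not the paper's code).
import numpy as np

def embed(texts):
    # Placeholder embedding: hashed bag-of-words vectors, L2-normalized.
    vecs = np.zeros((len(texts), 256))
    for i, t in enumerate(texts):
        for tok in t.lower().split():
            vecs[i, hash(tok) % 256] += 1.0
    return vecs / (np.linalg.norm(vecs, axis=1, keepdims=True) + 1e-9)

def retrieve(query, corpus, k=3):
    scores = embed(corpus) @ embed([query])[0]   # cosine similarity
    return [corpus[i] for i in np.argsort(-scores)[:k]]

def rag_answer(query, corpus, generate):
    # `generate` is any LLM call supplied by the caller.
    context = "\n".join(retrieve(query, corpus))
    prompt = f"Answer using the context.\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)
```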

CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL

no code implementations • 2 Oct 2024 • Mohammadreza Pourreza, Hailong Li, Ruoxi Sun, Yeounoh Chung, Shayan Talaei, Gaurav Tarlok Kakkar, Yu Gan, Amin Saberi, Fatma Ozcan, Sercan O. Arik

In tackling the challenges of large language model (LLM) performance for Text-to-SQL tasks, we introduce CHASE-SQL, a new framework that employs innovative strategies, using test-time compute in multi-agent modeling to improve candidate generation and selection.

Large Language Model · Text-To-SQL
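
The excerpt mentions using test-time compute for candidate generation and selection; the sketch below shows that generic generate-then-select pattern for Text-to-SQL. `generate_sql` and `score_candidate` are hypothetical callables, not the CHASE-SQL API.

```python
# Generate-then-select pattern for Text-to-SQL (illustrative sketch only).
def best_sql(question, schema, generate_sql, score_candidate, n_candidates=8):
    # Sample several candidate queries, e.g. via different prompts or temperatures.
    candidates = [generate_sql(question, schema, seed=i) for i in range(n_candidates)]
    # Keep the candidate preferred by a separate selection/scoring model.
    return max(candidates, key=lambda sql: score_candidate(question, schema, sql))
```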

SQL-GEN: Bridging the Dialect Gap for Text-to-SQL Via Synthetic Data And Model Merging

no code implementations • 22 Aug 2024 • Mohammadreza Pourreza, Ruoxi Sun, Hailong Li, Lesly Miculicich, Tomas Pfister, Sercan O. Arik

This leads to a versatile model optimized for multiple SQL dialects, outperforming single-dialect models and significantly enhancing overall performance.

Diversity · Natural Language Queries (+1)

CROME: Cross-Modal Adapters for Efficient Multimodal LLM

no code implementations • 13 Aug 2024 • Sayna Ebrahimi, Sercan O. Arik, Tejas Nama, Tomas Pfister

Multimodal Large Language Models (MLLMs) demonstrate remarkable image-language capabilities, but their widespread use faces challenges in cost-effective training and adaptation.

Instruction Following · Language Modelling (+2)

BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval

no code implementations • 16 Jul 2024 • Hongjin Su, Howard Yen, Mengzhou Xia, Weijia Shi, Niklas Muennighoff, Han-yu Wang, Haisu Liu, Quan Shi, Zachary S. Siegel, Michael Tang, Ruoxi Sun, Jinsung Yoon, Sercan O. Arik, Danqi Chen, Tao Yu

To better benchmark retrieval on such challenging queries, we introduce BRIGHT, the first text retrieval benchmark that requires intensive reasoning to retrieve relevant documents.

Question Answering · Text Retrieval

Teach Better or Show Smarter? On Instructions and Exemplars in Automatic Prompt Optimization

no code implementations • 22 Jun 2024 • Xingchen Wan, Ruoxi Sun, Hootan Nakhost, Sercan O. Arik

We conclude that studying exemplar optimization, both as a standalone method and in optimal combination with instruction optimization, remains a crucial aspect of APO and deserves greater consideration in future research, even in the era of highly capable instruction-following models.

Instruction Following · Prompt Engineering

PAITS: Pretraining and Augmentation for Irregularly-Sampled Time Series

1 code implementation • 25 Aug 2023 • Nicasia Beebe-Wang, Sayna Ebrahimi, Jinsung Yoon, Sercan O. Arik, Tomas Pfister

In this paper, we present PAITS (Pretraining and Augmentation for Irregularly-sampled Time Series), a framework for identifying suitable pretraining strategies for sparse and irregularly sampled time series datasets.

Time Series

Business Metric-Aware Forecasting for Inventory Management

no code implementations • 24 Aug 2023 • Helen Zhou, Sercan O. Arik, Jingtao Wang

We explore a wide range of plausible cost trade-off scenarios, and empirically demonstrate that end-to-end optimization often outperforms optimization of standard business-agnostic forecasting metrics (by up to 45.7% for a simple scaling model, and up to 54.0% for an LSTM encoder-decoder model).

Decoder · Management (+1)
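
The excerpt contrasts end-to-end optimization of a business metric with standard forecasting metrics; a sketch of what such an objective could look like for inventory management is below. The cost coefficients and the asymmetric holding/stockout form are assumptions for illustration, not the paper's exact formulation.

```python
# Illustrative business-metric-aware training objective (a sketch, not the
# paper's formulation): over-forecasts incur a holding cost, under-forecasts
# a stockout cost, instead of a symmetric error such as MSE.
import torch

def inventory_cost_loss(forecast, demand, holding_cost=1.0, stockout_cost=5.0):
    over = torch.clamp(forecast - demand, min=0.0)   # excess inventory
    under = torch.clamp(demand - forecast, min=0.0)  # unmet demand
    return (holding_cost * over + stockout_cost * under).mean()

# Usage in a training step: loss = inventory_cost_loss(model(x), y); loss.backward()
```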

LANISTR: Multimodal Learning from Structured and Unstructured Data

1 code implementation • 26 May 2023 • Sayna Ebrahimi, Sercan O. Arik, Yihe Dong, Tomas Pfister

To bridge this gap, we propose LANISTR, an attention-based framework to learn from LANguage, Image, and STRuctured data.

Time Series

Universal Self-Adaptive Prompting

no code implementations • 24 May 2023 • Xingchen Wan, Ruoxi Sun, Hootan Nakhost, Hanjun Dai, Julian Martin Eisenschlos, Sercan O. Arik, Tomas Pfister

A hallmark of modern large language models (LLMs) is their impressive general zero-shot and few-shot abilities, often elicited through in-context learning (ICL) via prompting.

In-Context Learning · Natural Language Understanding (+2)

Better Zero-Shot Reasoning with Self-Adaptive Prompting

no code implementations • 23 May 2023 • Xingchen Wan, Ruoxi Sun, Hanjun Dai, Sercan O. Arik, Tomas Pfister

Modern large language models (LLMs) have demonstrated impressive capabilities at sophisticated tasks, often through step-by-step reasoning similar to humans.

SLM: End-to-end Feature Selection via Sparse Learnable Masks

no code implementations • 6 Apr 2023 • Yihe Dong, Sercan O. Arik

Feature selection has been widely used to alleviate compute requirements during training, elucidate model interpretability, and improve model generalizability.

feature selection
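
The excerpt describes feature selection learned jointly with the model; a simplified learnable-mask sketch is below. It uses a plain sigmoid mask with an L1 penalty, which is a generic stand-in rather than SLM's actual sparse-mask mechanism.

```python
# Generic learnable feature mask with a sparsity penalty (simplified sketch;
# not the SLM method itself).
import torch
import torch.nn as nn

class MaskedModel(nn.Module):
    def __init__(self, n_features, n_classes):
        super().__init__()
        self.mask_logits = nn.Parameter(torch.zeros(n_features))
        self.head = nn.Linear(n_features, n_classes)

    def forward(self, x):
        mask = torch.sigmoid(self.mask_logits)   # soft per-feature mask in [0, 1]
        return self.head(x * mask), mask

def loss_fn(logits, targets, mask, l1_weight=1e-3):
    task = nn.functional.cross_entropy(logits, targets)
    return task + l1_weight * mask.sum()         # L1 term pushes the mask toward sparsity
```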

TSMixer: An All-MLP Architecture for Time Series Forecasting

4 code implementations • 10 Mar 2023 • Si-An Chen, Chun-Liang Li, Nate Yoder, Sercan O. Arik, Tomas Pfister

Extending them, in this paper, we investigate the capabilities of linear models for time-series forecasting and present Time-Series Mixer (TSMixer), a novel architecture designed by stacking multi-layer perceptrons (MLPs).

Deep Learning · Time Series (+1)
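
The excerpt describes TSMixer as stacked MLPs; a minimal block in that spirit is sketched below, alternating an MLP along the time axis with an MLP along the feature axis. Hidden sizes and the residual layout are assumptions, not the released architecture.

```python
# TSMixer-flavoured mixing block (a sketch of the idea, not the released model).
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    def __init__(self, seq_len, n_features, hidden=64):
        super().__init__()
        self.time_mlp = nn.Sequential(nn.Linear(seq_len, hidden), nn.ReLU(),
                                      nn.Linear(hidden, seq_len))
        self.feat_mlp = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                      nn.Linear(hidden, n_features))

    def forward(self, x):                                          # x: (batch, time, features)
        x = x + self.time_mlp(x.transpose(1, 2)).transpose(1, 2)   # mix across time steps
        return x + self.feat_mlp(x)                                # mix across features

x = torch.randn(8, 96, 7)                # e.g. 96 past steps of 7 series
print(MixerBlock(96, 7)(x).shape)        # torch.Size([8, 96, 7])
```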

Neural Spline Search for Quantile Probabilistic Modeling

no code implementations • 12 Jan 2023 • Ruoxi Sun, Chun-Liang Li, Sercan O. Arik, Michael W. Dusenberry, Chen-Yu Lee, Tomas Pfister

Accurate estimation of output quantiles is crucial in many use cases, where it is desired to model the range of possibility.

Attribute · quantile regression (+2)
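
The excerpt is about estimating output quantiles; for background, the standard pinball (quantile) loss used to train quantile estimators is shown below. The paper's neural-spline parameterization is not reproduced here.

```python
# Standard pinball (quantile) loss for training quantile estimators
# (general background; not the neural-spline method from the paper).
import torch

def pinball_loss(pred, target, q):
    err = target - pred
    return torch.mean(torch.maximum(q * err, (q - 1) * err))

# e.g. three output heads trained for the 10th, 50th and 90th percentiles:
# loss = sum(pinball_loss(p, y, q) for p, q in zip(preds, (0.1, 0.5, 0.9)))
```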

Provable Membership Inference Privacy

no code implementations • 12 Nov 2022 • Zachary Izzo, Jinsung Yoon, Sercan O. Arik, James Zou

However, DP's strong theoretical guarantees often come at the cost of a large drop in its utility for machine learning, and DP guarantees themselves can be difficult to interpret.

Test-Time Adaptation for Visual Document Understanding

no code implementations • 15 Jun 2022 • Sayna Ebrahimi, Sercan O. Arik, Tomas Pfister

For visual document understanding (VDU), self-supervised pretraining has been shown to successfully generate transferable representations, yet effective adaptation of such representations to distribution shifts at test time remains an unexplored area.

document understanding · Language Modelling (+5)

Self-Adaptive Forecasting for Improved Deep Learning on Non-Stationary Time-Series

no code implementations • 4 Feb 2022 • Sercan O. Arik, Nathanael C. Yoder, Tomas Pfister

Real-world time-series datasets often violate the assumptions of standard supervised learning for forecasting -- their distributions evolve over time, rendering the conventional training and model selection procedures suboptimal.

Decoder · Model Selection (+3)

Controlling Neural Networks with Rule Representations

1 code implementation • NeurIPS 2021 • Sungyong Seo, Sercan O. Arik, Jinsung Yoon, Xiang Zhang, Kihyuk Sohn, Tomas Pfister

The key aspect of DeepCTRL is that it does not require retraining to adapt the rule strength -- at inference, the user can adjust it based on the desired operation point on accuracy vs. rule verification ratio.

Decision Making
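
The excerpt highlights adjusting rule strength at inference without retraining; the sketch below shows one simple way such a control knob can look, blending a data-driven branch and a rule-driven branch with a user-chosen coefficient. This is a simplification; DeepCTRL couples the two representations during training rather than just averaging outputs.

```python
# Inference-time rule-strength control in the spirit of DeepCTRL (simplified sketch).
import torch.nn as nn

class ControllableModel(nn.Module):
    def __init__(self, data_branch: nn.Module, rule_branch: nn.Module):
        super().__init__()
        self.data_branch = data_branch   # optimized for the task objective
        self.rule_branch = rule_branch   # optimized for a rule-consistency objective

    def forward(self, x, alpha=0.5):
        # alpha is set by the user at inference: 0 = data only, 1 = rule only.
        return (1 - alpha) * self.data_branch(x) + alpha * self.rule_branch(x)
```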

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding

6 code implementations • 26 May 2021 • Zizhao Zhang, Han Zhang, Long Zhao, Ting Chen, Sercan O. Arik, Tomas Pfister

Hierarchical structures are popular in recent vision transformers, however, they require sophisticated designs and massive datasets to work well.

Decoder · Image Classification (+1)

Explaining Deep Neural Networks using Unsupervised Clustering

no code implementations • 15 Jul 2020 • Yu-Han Liu, Sercan O. Arik

We propose a novel method to explain trained deep neural networks (DNNs), by distilling them into surrogate models using unsupervised clustering.

Clustering
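
The excerpt describes distilling a trained DNN into a surrogate via unsupervised clustering; a generic version of that pipeline is sketched below using k-means on hidden activations. This is an illustration of the general idea, not the paper's exact distillation procedure.

```python
# Surrogate-by-clustering sketch: cluster hidden activations of a trained DNN,
# then explain an input by its cluster and nearest in-cluster training examples.
import numpy as np
from sklearn.cluster import KMeans

def build_surrogate(hidden_acts, n_clusters=10):
    # hidden_acts: (n_samples, n_hidden) activations from one layer of the DNN.
    return KMeans(n_clusters=n_clusters, n_init=10).fit(hidden_acts)

def explain(km, hidden_acts, query_act, top_k=3):
    cluster = int(km.predict(query_act[None])[0])
    members = np.where(km.labels_ == cluster)[0]
    dists = np.linalg.norm(hidden_acts[members] - query_act, axis=1)
    return cluster, members[np.argsort(dists)[:top_k]]
```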

Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting

34 code implementations • 19 Dec 2019 • Bryan Lim, Sercan O. Arik, Nicolas Loeff, Tomas Pfister

Multi-horizon forecasting problems often contain a complex mix of inputs -- including static (i.e. time-invariant) covariates, known future inputs, and other exogenous time series that are only observed historically -- without any prior information on how they interact with the target.

Interpretable Machine Learning · Time Series (+1)
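
The excerpt distinguishes static covariates, known future inputs, and past-observed series; the snippet below sketches how a multi-horizon forecasting batch with those three input types is typically laid out. Shapes and names are illustrative, not the TFT interface.

```python
# Illustrative layout of a multi-horizon forecasting batch with the three
# input types mentioned above (not the TFT API).
import torch

batch = {
    "static":       torch.randn(32, 4),        # time-invariant covariates
    "past_inputs":  torch.randn(32, 168, 10),  # observed history: 168 steps, 10 features
    "known_future": torch.randn(32, 24, 3),    # inputs known in advance (e.g. calendar features)
}
targets = torch.randn(32, 24)                  # 24-step forecast horizon
```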

On Completeness-aware Concept-Based Explanations in Deep Neural Networks

2 code implementations • NeurIPS 2020 • Chih-Kuan Yeh, Been Kim, Sercan O. Arik, Chun-Liang Li, Tomas Pfister, Pradeep Ravikumar

Next, we propose a concept discovery method that aims to infer a complete set of concepts that are additionally encouraged to be interpretable, which addresses the limitations of existing methods on concept explanations.

Consistency-based Semi-supervised Active Learning: Towards Minimizing Labeling Cost

no code implementations • ECCV 2020 • Mingfei Gao, Zizhao Zhang, Guo Yu, Sercan O. Arik, Larry S. Davis, Tomas Pfister

Active learning (AL) combines data labeling and model training to minimize the labeling cost by prioritizing the selection of high value data that can best improve model performance.

Active Learning · Image Classification (+1)
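
The excerpt describes prioritizing high-value unlabeled data; a sketch of a consistency-style acquisition score is below, ranking pool samples by how much their predictions vary under random augmentations. The exact criterion and augmentations in the paper differ; this is illustrative only.

```python
# Consistency-based acquisition sketch: samples whose predictions vary most
# across augmented views are prioritized for labeling (illustrative only).
import torch

def consistency_scores(model, pool_x, augment, n_views=5):
    model.eval()
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(augment(pool_x)), dim=-1)
                             for _ in range(n_views)])   # (views, N, classes)
    return probs.var(dim=0).sum(dim=-1)                  # high variance = inconsistent

# Query the k most inconsistent pool samples:
# query_idx = consistency_scores(model, pool_x, augment).topk(k).indices
```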

Distilling Effective Supervision from Severe Label Noise

2 code implementations • CVPR 2020 • Zizhao Zhang, Han Zhang, Sercan O. Arik, Honglak Lee, Tomas Pfister

For instance, on CIFAR100 with a $40\%$ uniform noise ratio and only 10 trusted labeled data per class, our method achieves $80.2{\pm}0.3\%$ classification accuracy, where the error rate is only $1.4\%$ higher than a neural network trained without label noise.

Image Classification

Data Valuation using Reinforcement Learning

2 code implementations • ICML 2020 • Jinsung Yoon, Sercan O. Arik, Tomas Pfister

To adaptively learn data values jointly with the target task predictor model, we propose a meta learning framework which we name Data Valuation using Reinforcement Learning (DVRL).

Data Valuation · Domain Adaptation (+5)

Consistency-Based Semi-Supervised Active Learning: Towards Minimizing Labeling Budget

no code implementations • 25 Sep 2019 • Mingfei Gao, Zizhao Zhang, Guo Yu, Sercan O. Arik, Larry S. Davis, Tomas Pfister

Active learning (AL) aims to integrate data labeling and model training in a unified way, and to minimize the labeling budget by prioritizing the selection of high value data that can best improve model performance.

Active Learning · Representation Learning

Learning to Transfer Learn: Reinforcement Learning-Based Selection for Adaptive Transfer Learning

no code implementations • ECCV 2020 • Linchao Zhu, Sercan O. Arik, Yi Yang, Tomas Pfister

We propose a novel adaptive transfer learning framework, learning to transfer learn (L2TL), to improve performance on a target dataset by careful extraction of the related information from a source dataset.

reinforcement-learning · Reinforcement Learning (+2)

TabNet: Attentive Interpretable Tabular Learning

19 code implementations • 20 Aug 2019 • Sercan O. Arik, Tomas Pfister

We propose a novel high-performance and interpretable canonical deep tabular data learning architecture, TabNet.

Decision Making · Poker Hand Classification (+2)
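
The excerpt calls TabNet an attentive, interpretable tabular architecture; below is a toy single-step feature-attention layer in that spirit. It is heavily simplified (softmax instead of sparsemax, one step instead of several sequential decision steps) and is not the TabNet implementation.

```python
# Toy attentive feature-masking step in the spirit of TabNet (much simplified).
import torch
import torch.nn as nn

class AttentiveStep(nn.Module):
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.attn = nn.Linear(n_features, n_features)   # per-feature attention logits
        self.proc = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())

    def forward(self, x):
        mask = torch.softmax(self.attn(x), dim=-1)      # softmax stands in for sparsemax
        return self.proc(x * mask), mask                # mask doubles as feature importance

x = torch.randn(16, 20)
features, mask = AttentiveStep(20)(x)                   # mask: (16, 20), rows sum to 1
```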

ProtoAttend: Attention-Based Prototypical Learning

4 code implementations • 17 Feb 2019 • Sercan O. Arik, Tomas Pfister

We propose a novel inherently interpretable machine learning method that bases decisions on few relevant examples that we call prototypes.

Decision Making · General Classification (+1)

Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks

no code implementations • 20 Aug 2018 • Sercan O. Arik, Heewoo Jun, Gregory Diamos

We propose the multi-head convolutional neural network (MCNN) architecture for waveform synthesis from spectrograms.

speech-recognition · Speech Recognition (+1)

Neural Voice Cloning with a Few Samples

2 code implementations • NeurIPS 2018 • Sercan O. Arik, Jitong Chen, Kainan Peng, Wei Ping, Yanqi Zhou

Speaker adaptation is based on fine-tuning a multi-speaker generative model with a few cloning samples.

Speech Synthesis · Voice Cloning
