Search Results for author: Sercan Ö. Arik

Found 11 papers, 3 papers with code

Chain of Agents: Large Language Models Collaborating on Long-Context Tasks

no code implementations • 4 Jun 2024 • Yusen Zhang, Ruoxi Sun, Yanfei Chen, Tomas Pfister, Rui Zhang, Sercan Ö. Arik

Addressing the challenge of effectively processing long contexts has become a critical issue for Large Language Models (LLMs).

Code Completion, Question Answering, +1

Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training

no code implementations • 31 May 2024 • Maximillian Chen, Ruoxi Sun, Sercan Ö. Arik, Tomas Pfister

Large language models (LLMs) aligned through reinforcement learning from human feedback (RLHF) have quickly become one of the dominant paradigms for building intelligent conversational assistant agents.

Machine Reading Comprehension, Question Answering, +1

Mitigating Object Hallucination via Data Augmented Contrastive Tuning

no code implementations • 28 May 2024 • Pritam Sarkar, Sayna Ebrahimi, Ali Etemad, Ahmad Beirami, Sercan Ö. Arik, Tomas Pfister

For a given factual token, we create a hallucinated token through generative data augmentation by selectively altering the ground-truth information.

Data Augmentation, Hallucination, +1

Effective Large Language Model Adaptation for Improved Grounding and Citation Generation

no code implementations • 16 Nov 2023 • Xi Ye, Ruoxi Sun, Sercan Ö. Arik, Tomas Pfister

Our framework tunes LLMs to self-ground the claims in their responses and provide accurate citations to retrieved documents.

Language Modelling, Large Language Model, +2

COSTAR: Improved Temporal Counterfactual Estimation with Self-Supervised Learning

1 code implementation • 1 Nov 2023 • Chuizheng Meng, Yihe Dong, Sercan Ö. Arik, Yan Liu, Tomas Pfister

Estimation of temporal counterfactual outcomes from observed history is crucial for decision-making in many domains such as healthcare and e-commerce, particularly when randomized controlled trials (RCTs) suffer from high cost or impracticality.

Counterfactual, Decision Making, +2

SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended)

no code implementations • 26 May 2023 • Ruoxi Sun, Sercan Ö. Arik, Alex Muzio, Lesly Miculicich, Satya Gundabathula, Pengcheng Yin, Hanjun Dai, Hootan Nakhost, Rajarishi Sinha, Zifeng Wang, Tomas Pfister

Text-to-SQL, the process of translating natural language into Structured Query Language (SQL), represents a transformative application of large language models (LLMs), potentially revolutionizing how humans interact with data.

Data Augmentation, In-Context Learning, +3

Koopman Neural Forecaster for Time Series with Temporal Distribution Shifts

1 code implementation • 7 Oct 2022 • Rui Wang, Yihe Dong, Sercan Ö. Arik, Rose Yu

Temporal distributional shifts, with underlying dynamics changing over time, frequently occur in real-world time series and pose a fundamental challenge for deep neural networks (DNNs).

Time Series, Time Series Forecasting

Self-Supervised Learning with an Information Maximization Criterion

1 code implementation • 16 Sep 2022 • Serdar Ozsoy, Shadi Hamdan, Sercan Ö. Arik, Deniz Yuret, Alper T. Erdogan

In this article, we argue that a straightforward application of information maximization among alternative latent representations of the same input naturally solves the collapse problem and achieves competitive empirical results.

Self-Supervised Learning

Invariant Structure Learning for Better Generalization and Causal Explainability

no code implementations • 13 Jun 2022 • Yunhao Ge, Sercan Ö. Arik, Jinsung Yoon, Ao Xu, Laurent Itti, Tomas Pfister

ISL splits the data into different environments, and learns a structure that is invariant to the target across different environments by imposing a consistency constraint.

Self-Supervised Learning

Interpretable Mixture of Experts

no code implementations • 5 Jun 2022 • Aya Abdelsalam Ismail, Sercan Ö. Arik, Jinsung Yoon, Ankur Taly, Soheil Feizi, Tomas Pfister

In addition to constituting a standalone, inherently interpretable architecture, IME has the promise of being integrated with existing DNNs to offer interpretability to a subset of samples while maintaining the accuracy of the DNNs.

Decision Making, Time Series
