Search Results for author: Alborz Geramifard

Found 24 papers, 7 papers with code

DialogStitch: Synthetic Deeper and Multi-Context Task-Oriented Dialogs

1 code implementation SIGDIAL (ACL) 2021 Satwik Kottur, Chinnadhurai Sankar, Zhou Yu, Alborz Geramifard

Real-world conversational agents must effectively handle long conversations that span multiple contexts.

An Analysis of State-of-the-Art Models for Situated Interactive MultiModal Conversations (SIMMC)

no code implementations SIGDIAL (ACL) 2021 Satwik Kottur, Paul Crook, Seungwhan Moon, Ahmad Beirami, Eunjoon Cho, Rajen Subba, Alborz Geramifard

There is a growing interest in virtual assistants with multimodal capabilities, e.g., inferring the context of a conversation through scene understanding.

Scene Understanding

Sequential Decision-Making for Inline Text Autocomplete

no code implementations 21 Mar 2024 Rohan Chitnis, Shentao Yang, Alborz Geramifard

In particular, we hypothesize that the objectives under which sequential decision-making can improve autocomplete systems are not tailored solely to text entry speed, but more broadly to metrics such as user satisfaction and convenience.

Decision Making Language Modelling

SMORE: Score Models for Offline Goal-Conditioned Reinforcement Learning

no code implementations 3 Nov 2023 Harshit Sikchi, Rohan Chitnis, Ahmed Touati, Alborz Geramifard, Amy Zhang, Scott Niekum

Offline Goal-Conditioned Reinforcement Learning (GCRL) is tasked with learning to achieve multiple goals in an environment purely from offline datasets using sparse reward functions.

Contrastive Learning Reinforcement Learning +1

When should we prefer Decision Transformers for Offline Reinforcement Learning?

1 code implementation 23 May 2023 Prajjwal Bhargava, Rohan Chitnis, Alborz Geramifard, Shagun Sodhani, Amy Zhang

Three popular algorithms for offline RL are Conservative Q-Learning (CQL), Behavior Cloning (BC), and Decision Transformer (DT), from the class of Q-Learning, Imitation Learning, and Sequence Modeling respectively.

D4RL Imitation Learning +5

Curriculum Script Distillation for Multilingual Visual Question Answering

no code implementations 17 Jan 2023 Khyathi Raghavi Chandu, Alborz Geramifard

Pre-trained models with dual and cross encoders have shown remarkable success in propelling the landscape of several tasks in vision and language in Visual Question Answering (VQA).

Question Answering Visual Question Answering

Navigating Connected Memories with a Task-oriented Dialog System

1 code implementation 15 Nov 2022 Seungwhan Moon, Satwik Kottur, Alborz Geramifard, Babak Damavandi

Recent years have seen an increasing trend in the volume of personal media captured by users, thanks to the advent of smartphones and smart glasses, resulting in large media collections.


Tell Your Story: Task-Oriented Dialogs for Interactive Content Creation

no code implementations 8 Nov 2022 Satwik Kottur, Seungwhan Moon, Aram H. Markosyan, Hardik Shah, Babak Damavandi, Alborz Geramifard

We collect a new dataset C3 (Conversational Content Creation), comprising 10k dialogs conditioned on media montages simulated from a large media collection.

Benchmarking Retrieval

Multilingual Multimodality: A Taxonomical Survey of Datasets, Techniques, Challenges and Opportunities

no code implementations 30 Oct 2022 Khyathi Raghavi Chandu, Alborz Geramifard

To this end, we review the languages studied, gold or silver data with parallel annotations, and understand how these modalities and languages interact in modeling.

Database Search Results Disambiguation for Task-Oriented Dialog Systems

no code implementations NAACL 2022 Kun Qian, Ahmad Beirami, Satwik Kottur, Shahin Shayandeh, Paul Crook, Alborz Geramifard, Zhou Yu, Chinnadhurai Sankar

We find that training on our augmented dialog data improves the model's ability to deal with ambiguous scenarios, without sacrificing performance on unmodified turns.

Multi-Task Learning

Robustness through Data Augmentation Loss Consistency

1 code implementation 21 Oct 2021 Tianjian Huang, Shaunak Halbe, Chinnadhurai Sankar, Pooyan Amini, Satwik Kottur, Alborz Geramifard, Meisam Razaviyayn, Ahmad Beirami

Our experiments show that DAIR consistently outperforms ERM and DA-ERM with little marginal computational cost and sets new state-of-the-art results in several benchmarks involving covariant data augmentation.

Multi-domain Dialogue State Tracking Visual Question Answering

Annotation Inconsistency and Entity Bias in MultiWOZ

no code implementations SIGDIAL (ACL) 2021 Kun Qian, Ahmad Beirami, Zhouhan Lin, Ankita De, Alborz Geramifard, Zhou Yu, Chinnadhurai Sankar

In this work, we identify an overlooked issue with dialog state annotation inconsistencies in the dataset, where a slot type is tagged inconsistently across similar dialogs leading to confusion for DST modeling.

Dialog State Tracking Memorization +1

SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations

1 code implementation EMNLP 2021 Satwik Kottur, Seungwhan Moon, Alborz Geramifard, Babak Damavandi

Next generation task-oriented dialog systems need to understand conversational contexts with their perceived surroundings, to effectively help users in the real-world multimodal environment.

Language Modelling

DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue

1 code implementation ACL 2021 Hung Le, Chinnadhurai Sankar, Seungwhan Moon, Ahmad Beirami, Alborz Geramifard, Satwik Kottur

A video-grounded dialogue system is required to understand both dialogue, which contains semantic dependencies from turn to turn, and video, which contains visual cues of spatial and temporal scene variations.

Object Tracking Visual Reasoning

Situated and Interactive Multimodal Conversations

2 code implementations COLING 2020 Seungwhan Moon, Satwik Kottur, Paul A. Crook, Ankita De, Shivani Poddar, Theodore Levin, David Whitney, Daniel Difranco, Ahmad Beirami, Eunjoon Cho, Rajen Subba, Alborz Geramifard

Next generation virtual assistants are envisioned to handle multimodal inputs (e.g., vision, memories of previous interactions, in addition to the user's utterances), and perform multimodal actions (e.g., displaying a route in addition to generating the system's utterance).

Response Generation

SIMMC: Situated Interactive Multi-Modal Conversational Data Collection And Evaluation Platform

no code implementations 7 Nov 2019 Paul A. Crook, Shivani Poddar, Ankita De, Semir Shafi, David Whitney, Alborz Geramifard, Rajen Subba

To this end, we introduce SIMMC, an extension to ParlAI for multi-modal conversational data collection and system evaluation.


Batch-iFDD for Representation Expansion in Large MDPs

no code implementations 26 Sep 2013 Alborz Geramifard, Thomas J. Walsh, Nicholas Roy, Jonathan How

Matching pursuit (MP) methods are a promising class of feature construction algorithms for value function approximation.

Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping

no code implementations 13 Jun 2012 Richard S. Sutton, Csaba Szepesvari, Alborz Geramifard, Michael P. Bowling

Our main results are to prove that linear Dyna-style planning converges to a unique solution independent of the generating distribution, under natural conditions.
