Search Results for author: Paul A. Crook

Found 8 papers, 2 papers with code

SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM

no code implementations • 7 Mar 2024 • JieLin Qiu, Andrea Madotto, Zhaojiang Lin, Paul A. Crook, Yifan Ethan Xu, Xin Luna Dong, Christos Faloutsos, Lei Li, Babak Damavandi, Seungwhan Moon

We have developed the SnapNTell Dataset, distinct from traditional VQA datasets: (1) it encompasses a wide range of categorized entities, each represented by images and explicitly named in the answers; (2) it features QA pairs that require extensive knowledge for accurate responses.

Tasks: Question Answering, Retrieval +1

Large Language Models as Zero-shot Dialogue State Tracker through Function Calling

1 code implementation • 16 Feb 2024 • Zekun Li, Zhiyu Zoey Chen, Mike Ross, Patrick Huber, Seungwhan Moon, Zhaojiang Lin, Xin Luna Dong, Adithya Sagar, Xifeng Yan, Paul A. Crook

We also show that by fine-tuning on a small collection of diverse task-oriented dialogues, we can equip modestly sized models, specifically a 13B-parameter LLaMA2-Chat model, with function-calling capabilities and DST performance comparable to ChatGPT, while preserving their chat capabilities.

Tasks: Dialogue State Tracking +1
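The core idea of the paper above, treating dialogue state tracking as function calling, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the tool schema, slot names, and the simulated model output are all hypothetical.

```python
import json

# Hypothetical function (tool) schema the LLM would be prompted with;
# the name and slots are illustrative, not taken from the paper.
TRACK_RESTAURANT = {
    "name": "track_restaurant_booking",
    "parameters": {"area": "string", "food": "string", "book_time": "string"},
}

def update_state(state: dict, function_call_json: str) -> dict:
    """Merge slot values from a model-emitted function call into the
    accumulated dialogue state (the core DST update)."""
    call = json.loads(function_call_json)
    assert call["name"] == TRACK_RESTAURANT["name"]
    for slot, value in call["arguments"].items():
        if slot in TRACK_RESTAURANT["parameters"] and value:
            state[slot] = value
    return state

# Simulated model output for the user turn
# "I'd like Italian food in the centre."
state = update_state({}, json.dumps({
    "name": "track_restaurant_booking",
    "arguments": {"area": "centre", "food": "italian"},
}))
print(state)  # {'area': 'centre', 'food': 'italian'}
```

Because each turn's function call only mentions slots the user actually filled, accumulating calls this way yields the full dialogue state without any task-specific tracking head.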

Teaching Models new APIs: Domain-Agnostic Simulators for Task Oriented Dialogue

no code implementations • 13 Oct 2021 • Moya Chen, Paul A. Crook, Stephen Roller

We demonstrate that large language models are able to simulate Task Oriented Dialogues in novel domains, provided only with an API implementation and a list of goals.

Situated and Interactive Multimodal Conversations

2 code implementations • COLING 2020 • Seungwhan Moon, Satwik Kottur, Paul A. Crook, Ankita De, Shivani Poddar, Theodore Levin, David Whitney, Daniel Difranco, Ahmad Beirami, Eunjoon Cho, Rajen Subba, Alborz Geramifard

Next-generation virtual assistants are envisioned to handle multimodal inputs (e.g., vision and memories of previous interactions, in addition to the user's utterances) and perform multimodal actions (e.g., displaying a route in addition to generating the system's utterance).

Tasks: Response Generation

On the Linear Belief Compression of POMDPs: A re-examination of current methods

no code implementations • 5 Aug 2015 • Zhuoran Wang, Paul A. Crook, Wenshuo Tang, Oliver Lemon

Belief compression improves the tractability of large-scale partially observable Markov decision processes (POMDPs) by finding projections from high-dimensional belief space onto low-dimensional approximations, where solving to obtain action selection policies requires fewer computations.
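The linear belief compression described above can be sketched with a generic PCA-style projection. This is an assumption-laden toy example, not the specific method the paper re-examines: the 6-state POMDP, the sampled beliefs, and the choice of SVD as the projection are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Sample beliefs (points on the probability simplex) for a toy 6-state POMDP.
beliefs = rng.dirichlet(np.ones(6), size=200)    # shape (200, 6)
mean = beliefs.mean(axis=0)

# Linear compression: project centred beliefs onto the top-k singular
# directions, giving a low-dimensional approximation of belief space.
k = 2
_, _, Vt = np.linalg.svd(beliefs - mean, full_matrices=False)
E = Vt[:k].T                                     # projection matrix, (6, k)

b = beliefs[0]
b_low = (b - mean) @ E                           # compressed belief, dim k
b_hat = mean + b_low @ E.T                       # reconstruction, dim 6

print(b_low.shape, b_hat.shape)                  # (2,) (6,)
```

Policies are then solved in the k-dimensional compressed space, which is where the tractability gain comes from; the reconstruction error `b - b_hat` is what belief-compression methods trade off against that gain.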
