Characterizing Tradeoffs in Language Model Decoding with Informational Interpretations

no code implementations • 16 Nov 2023 • Chung-Ching Chang, William W. Cohen, Yun-Hsuan Sung

We propose a theoretical framework for formulating language model decoder algorithms with dynamic programming and information theory.

Language Modelling

Memory Augmented Language Models through Mixture of Word Experts

no code implementations • 15 Nov 2023 • Cicero Nogueira dos Santos, James Lee-Thorp, Isaac Noble, Chung-Ching Chang, David Uthus

We demonstrate that MoWE performs significantly better than the T5 family of models with a similar number of FLOPs in a variety of NLP tasks.

Hallucination Augmented Recitations for Language Models

no code implementations • 13 Nov 2023 • Abdullatif Köksal, Renat Aksitov, Chung-Ching Chang

For open book QA as a case study, we demonstrate that models finetuned with our counterfactual datasets improve text grounding, leading to better open book QA performance, with up to an 8.0% increase in F1 score.

Counterfactual, Hallucination

KL-Divergence Guided Temperature Sampling

2 code implementations • 2 Jun 2023 • Chung-Ching Chang, David Reitter, Renat Aksitov, Yun-Hsuan Sung

One common approach to mitigating hallucinations is to provide source/grounding documents and train the model to produce predictions that bind to and are attributable to the provided source.

Conversational Question Answering, Language Modelling

Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models

no code implementations • 11 Feb 2023 • Renat Aksitov, Chung-Ching Chang, David Reitter, Siamak Shakeri, YunHsuan Sung

One common solution to this is augmenting LLMs with a retrieval system and making sure that the generated output is attributable to the retrieved information.


