1 code implementation • 31 Oct 2024 • Sina Rismanchian, Yasaman Razeghi, Sameer Singh, Shayan Doroudi
TurtleBench stands as one of the few benchmarks to evaluate the integration of visual understanding and code generation capabilities in LMMs, setting the stage for future research.
no code implementations • 11 Oct 2024 • Yu Fei, Yasaman Razeghi, Sameer Singh
Building on this insight, nudging employs a small aligned model to generate nudging tokens to guide the base model's output during decoding when the base model's uncertainty is high.
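As a rough illustration of that idea (not the authors' implementation), the Python sketch below interleaves a base model and a small aligned model using Hugging Face transformers. The model names, the top-token-probability threshold used as the uncertainty signal, the nudge length, and the assumption of a shared tokenizer are all placeholders chosen for the example.

# Minimal sketch of nudging-style decoding. Assumptions (not from the paper):
# greedy decoding, a shared tokenizer between the two models, and uncertainty
# measured as the base model's top-token probability falling below a threshold.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_NAME = "base-model-placeholder"        # hypothetical unaligned base model
ALIGNED_NAME = "small-aligned-placeholder"  # hypothetical small aligned model

tok = AutoTokenizer.from_pretrained(BASE_NAME)
base = AutoModelForCausalLM.from_pretrained(BASE_NAME)
aligned = AutoModelForCausalLM.from_pretrained(ALIGNED_NAME)

def nudged_generate(prompt, max_steps=128, threshold=0.4, nudge_len=8):
    ids = tok(prompt, return_tensors="pt").input_ids
    for _ in range(max_steps):
        with torch.no_grad():
            logits = base(ids).logits[:, -1, :]
        probs = torch.softmax(logits, dim=-1)
        top_prob, top_id = probs.max(dim=-1)
        if top_prob.item() >= threshold:
            # Base model is confident: keep its own greedy token.
            ids = torch.cat([ids, top_id.unsqueeze(0)], dim=-1)
        else:
            # Base model is uncertain: let the small aligned model emit a short
            # run of nudging tokens, then hand control back to the base model.
            with torch.no_grad():
                ids = aligned.generate(ids, max_new_tokens=nudge_len, do_sample=False)
        if ids[0, -1].item() == tok.eos_token_id:
            break
    return tok.decode(ids[0], skip_special_tokens=True)

Calling nudged_generate("...") returns the decoded continuation; in practice the threshold, nudge length, and tokenizer handling would need to be tuned to the actual model pair.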
1 code implementation • 1 May 2024 • Catarina G Belém, Preethi Seshadri, Yasaman Razeghi, Sameer Singh
A key observation in prior work is that models reinforce stereotypes as a consequence of the gendered correlations that are present in the training data.
1 code implementation • 16 Sep 2023 • Rajasekhar Reddy Mekala, Yasaman Razeghi, Sameer Singh
On average, EchoPrompt improves the Zero-shot-CoT performance of code-davinci-002 by 5% in numerical tasks and 13% in reading comprehension tasks.
no code implementations • 21 Jul 2023 • Kolby Nottingham, Yasaman Razeghi, KyungMin Kim, JB Lanier, Pierre Baldi, Roy Fox, Sameer Singh
Large language models (LLMs) are being applied as actors for sequential decision-making tasks in domains such as robotics and games, utilizing their general world knowledge and planning abilities.
no code implementations • 23 Mar 2022 • Henrique Santos, Ke Shen, Alice M. Mulvehill, Yasaman Razeghi, Deborah L. McGuinness, Mayank Kejriwal
Preliminary results suggest that the benchmark is challenging even for advanced language representation models designed for discriminative CSR question answering tasks.
no code implementations • 15 Feb 2022 • Yasaman Razeghi, Robert L. Logan IV, Matt Gardner, Sameer Singh
Pretrained Language Models (LMs) have demonstrated the ability to perform numerical reasoning by extrapolating from a few examples in few-shot settings.
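To make that setup concrete, here is a minimal Python sketch of such a few-shot numerical prompt; the exemplars and query are invented for illustration and are not drawn from the paper.

# Illustrative few-shot arithmetic prompt (exemplars and query are made up).
exemplars = [
    ("What is 12 times 7?", "84"),
    ("What is 23 plus 58?", "81"),
    ("What is 96 minus 39?", "57"),
]
query = "What is 34 times 6?"

# The prompt strings the exemplars together and ends mid-pattern, so the model
# is expected to extrapolate the Q/A format and complete the final answer
# ("204") without any parameter updates.
prompt = "".join(f"Q: {q}\nA: {a}\n\n" for q, a in exemplars) + f"Q: {query}\nA:"
print(prompt)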
5 code implementations • EMNLP 2020 • Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, Sameer Singh
The remarkable success of pretrained language models has motivated the study of what kinds of knowledge these models learn during pretraining.