Search Results for author: Abulhair Saparov

Found 11 papers, 8 papers with code

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

1 code implementation • 15 Apr 2024 • Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Yoshua Bengio, Danqi Chen, Samuel Albanie, Tegan Maharaj, Jakob Foerster, Florian Tramer, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger

This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs).

Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?

no code implementations • 31 Jan 2024 • Andreas Opedal, Alessandro Stolfo, Haruki Shirakami, Ying Jiao, Ryan Cotterell, Bernhard Schölkopf, Abulhair Saparov, Mrinmaya Sachan

We find evidence that LLMs, with and without instruction-tuning, exhibit human-like biases in both the text-comprehension and the solution-planning steps of the solving process, but not during the final step which relies on the problem's arithmetic expressions (solution execution).

Reading Comprehension

Noisy Exemplars Make Large Language Models More Robust: A Domain-Agnostic Behavioral Analysis

1 code implementation • 1 Nov 2023 • Hongyi Zheng, Abulhair Saparov

Recent advances in prompt engineering enable large language models (LLMs) to solve multi-hop logical reasoning problems with impressive accuracy.

Logical Reasoning • Prompt Engineering

Personas as a Way to Model Truthfulness in Language Models

no code implementations • 27 Oct 2023 • Nitish Joshi, Javier Rando, Abulhair Saparov, Najoung Kim, He He

This allows the model to separate truth from falsehoods and controls the truthfulness of its generation.

Retrieval-Augmented Chain-of-Thought in Semi-structured Domains

no code implementations • 22 Oct 2023 • Vaibhav Mavi, Abulhair Saparov, Chen Zhao

Applying existing question answering (QA) systems to specialized domains like law and finance presents challenges that necessitate domain expertise.

In-Context Learning • Question Answering • +1

World Models for Math Story Problems

1 code implementation • 7 Jun 2023 • Andreas Opedal, Niklas Stoehr, Abulhair Saparov, Mrinmaya Sachan

In this paper, we consolidate previous work on categorizing and representing math story problems and develop MathWorld, which is a graph-based semantic formalism specific for the domain of math story problems.

Math

Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples

1 code implementation • NeurIPS 2023 • Abulhair Saparov, Richard Yuanzhe Pang, Vishakh Padmakumar, Nitish Joshi, Seyed Mehran Kazemi, Najoung Kim, He He

Given the intractably large size of the space of proofs, any model that is capable of general deductive reasoning must generalize to proofs of greater complexity.

Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought

1 code implementation • 3 Oct 2022 • Abulhair Saparov, He He

Large language models (LLMs) have shown remarkable reasoning capabilities given chain-of-thought prompts (examples with intermediate reasoning steps).

Mathematical Reasoning • Question Answering • +1

Towards General Natural Language Understanding with Probabilistic Worldbuilding

2 code implementations • 6 May 2021 • Abulhair Saparov, Tom M. Mitchell

We derive and implement an inference algorithm that reads sentences by parsing and abducing updates to its latent world model that capture the semantics of those sentences, and evaluate it on two out-of-domain question-answering datasets: (1) ProofWriter and (2) a new dataset we call FictionalGeoQA, designed to be more representative of real language but still simple enough to focus on evaluating reasoning ability, while being robust against heuristics.

Natural Language Understanding • Question Answering • +1

Jelly Bean World: A Testbed for Never-Ending Learning

3 code implementations • ICLR 2020 • Emmanouil Antonios Platanios, Abulhair Saparov, Tom Mitchell

Never-ending learning is a machine learning paradigm that aims to bridge this gap, with the goal of encouraging researchers to design machine learning systems that can learn to perform a wider variety of inter-related tasks in more complex environments.

BIG-bench Machine Learning • Navigate

A Probabilistic Generative Grammar for Semantic Parsing

2 code implementations • CoNLL 2017 • Abulhair Saparov

The work relies on a novel application of hierarchical Dirichlet processes (HDPs) for structured prediction, which we also present in this manuscript.

Natural Language Understanding • Semantic Parsing • +2
