Arithmetic Reasoning
70 papers with code • 2 benchmarks • 3 datasets
Most implemented papers
Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data
However, most CoT studies rely on carefully designed human-annotated rationale chains to prompt LLMs, posing challenges for real-world applications where labeled data is available without rationale chains.
Sparks of Artificial General Intelligence: Early experiments with GPT-4
We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models.
LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models
The success of large language models (LLMs), like GPT-4 and ChatGPT, has led to the development of numerous cost-effective and accessible alternatives that are created by finetuning open-access LLMs with task-specific data (e.g., ChatDoctor) or instruction data (e.g., Alpaca).
Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL
We identify a previously overlooked objective of query dependency in such optimization and elucidate two ensuing challenges that impede the successful and economical design of prompt optimization techniques.
ReFT: Representation Finetuning for Language Models
LoReFT is a drop-in replacement for existing PEFTs and learns interventions that are 10x-50x more parameter-efficient than prior state-of-the-art PEFTs.
Learning to Reason for Text Generation from Scientific Tables
In this paper, we introduce SciGen, a new challenge dataset for the task of reasoning-aware data-to-text generation consisting of tables from scientific articles and their corresponding descriptions.
Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning
We further propose a novel geometry solving approach with formal language and symbolic reasoning, called Interpretable Geometry Problem Solver (Inter-GPS).
IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning
Also, we develop a strong IconQA baseline Patch-TRM that applies a pyramid cross-modal Transformer with input diagram embeddings pre-trained on the icon dataset.
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging results on complex reasoning tasks.
UL2: Unifying Language Learning Paradigms
Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.