Search Results for author: Andrew Chi-Chih Yao

Found 7 papers, 7 papers with code

Tensor Product Attention Is All You Need

1 code implementation • 11 Jan 2025 • Yifan Zhang, Yifeng Liu, Huizhuo Yuan, Zhen Qin, Yang Yuan, Quanquan Gu, Andrew Chi-Chih Yao

Scaling language models to handle longer input sequences typically necessitates large key-value (KV) caches, resulting in substantial memory overhead during inference.
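The memory overhead the abstract refers to can be made concrete with a back-of-the-envelope estimate. The sketch below computes KV cache size for an illustrative transformer configuration; all dimensions (32 layers, 32 KV heads of dim 128, fp16) are assumptions for the example, not the paper's setup.

```python
# Rough KV cache size estimate for transformer inference.
# The model dimensions used below are illustrative assumptions.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch, dtype_bytes=2):
    # Factor of 2 covers both keys and values; one cached entry
    # per token, per layer, per KV head.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * dtype_bytes

# Hypothetical 7B-scale model: 32 layers, 32 KV heads of dim 128,
# fp16 (2 bytes), batch size 1, 32K-token context.
size = kv_cache_bytes(32, 32, 128, 32_768, 1)
print(size / 2**30, "GiB")  # cache grows linearly with sequence length
```

At these assumed dimensions the cache alone occupies 16 GiB at a 32K context, which is why methods that compress or restructure the KV cache matter for long-sequence inference.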

Language Modeling +1

On the Diagram of Thought

1 code implementation • 16 Sep 2024 • Yifan Zhang, Yang Yuan, Andrew Chi-Chih Yao

We introduce Diagram of Thought (DoT), a framework that models iterative reasoning in large language models (LLMs) as the construction of a directed acyclic graph (DAG) within a single model.
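The general idea of organizing reasoning as a DAG can be sketched with Python's standard library; the node names and dependency structure below are invented for illustration and are not the paper's implementation.

```python
# Reasoning steps as a directed acyclic graph: each node maps to the
# set of steps it depends on. Node names here are hypothetical.
from graphlib import TopologicalSorter

steps = {
    "conclusion": {"lemma", "critique"},
    "critique": {"lemma"},
    "lemma": {"premise"},
    "premise": set(),
}

# A topological order guarantees every step is visited only after
# all of its prerequisites, mirroring dependency-respecting reasoning.
order = list(TopologicalSorter(steps).static_order())
print(order)
```

Because the structure is acyclic, a topological sort always exists, so every proposition can be derived after its premises and critiques have been processed.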

Autonomous Data Selection with Language Models for Mathematical Texts

2 code implementations • 12 Feb 2024 • Yifan Zhang, Yifan Luo, Yang Yuan, Andrew Chi-Chih Yao

Our method achieves a 2x increase in pretraining token efficiency over state-of-the-art baselines, underscoring its potential to enhance models' mathematical reasoning capabilities.

Continual Pretraining GSM8K +4

Augmenting Math Word Problems via Iterative Question Composing

1 code implementation • 17 Jan 2024 • Haoxiong Liu, Yifan Zhang, Yifan Luo, Andrew Chi-Chih Yao

Despite the advancements in large language models (LLMs) for mathematical reasoning, solving competition-level math problems remains a significant challenge, especially for open-source LLMs without external tools.

Ranked #66 on Math Word Problem Solving on MATH (using extra training data)

Math · Math Word Problem Solving

Meta Prompting for AI Systems

1 code implementation • 20 Nov 2023 • Yifan Zhang, Yang Yuan, Andrew Chi-Chih Yao

In this work, we present a comprehensive study of Meta Prompting (MP), a technique that reshapes how language models (LMs) and AI systems are used in problem-solving and data interaction.

Data Interaction GSM8K +3

Cumulative Reasoning with Large Language Models

1 code implementation • 8 Aug 2023 • Yifan Zhang, Jingqin Yang, Yang Yuan, Andrew Chi-Chih Yao

We demonstrate CR's superiority through several complex reasoning tasks: it outperforms existing methods in logical inference tasks with up to a 9.3% improvement, achieving 98.04% accuracy on the curated FOLIO wiki dataset.

Decision Making Logical Reasoning +2

FedCM: Federated Learning with Client-level Momentum

2 code implementations • 21 Jun 2021 • Jing Xu, Sen Wang, LiWei Wang, Andrew Chi-Chih Yao

Federated Learning is a distributed machine learning approach that enables model training without data sharing.
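The core mechanism suggested by the title, mixing each client's local gradient with a shared momentum term before averaging, can be sketched on scalar "models" for clarity. The update rule, hyperparameters, and values below are illustrative assumptions, not the paper's exact algorithm.

```python
# Sketch of client-level momentum in federated averaging, using
# scalars in place of model weight tensors. Hyperparameters (lr,
# alpha) and gradient values are illustrative assumptions.

def momentum_fed_round(global_w, momentum, client_grads, lr=0.1, alpha=0.9):
    # Each client blends the shared momentum with its local gradient,
    # keeping client updates aligned with the global direction.
    updates = [alpha * momentum + (1 - alpha) * g for g in client_grads]
    avg_update = sum(updates) / len(updates)
    # Server applies the averaged update and carries it forward as
    # the next round's momentum.
    return global_w - lr * avg_update, avg_update

w, m = 1.0, 0.0
for round_grads in [[0.5, 0.7], [0.4, 0.6]]:  # two rounds, two clients
    w, m = momentum_fed_round(w, m, round_grads)
print(round(w, 4), round(m, 4))
```

The shared momentum term reduces the drift between clients' local updates, which is the intuition behind client-level momentum in heterogeneous federated settings.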

Federated Learning
