Search Results for author: Yin Tat Lee

Found 47 papers, 11 papers with code

Leverage Score Sampling for Faster Accelerated Regression and ERM

no code implementations • 22 Nov 2017 • Naman Agarwal, Sham Kakade, Rahul Kidambi, Yin Tat Lee, Praneeth Netrapalli, Aaron Sidford

Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a vector $b \in\mathbb{R}^{n}$, we show how to compute an $\epsilon$-approximate solution to the regression problem $ \min_{x\in\mathbb{R}^{d}}\frac{1}{2} \|\mathbf{A} x - b\|_{2}^{2} $ in time $ \tilde{O} ((n+\sqrt{d\cdot\kappa_{\text{sum}}})\cdot s\cdot\log\epsilon^{-1}) $ where $\kappa_{\text{sum}}=\mathrm{tr}\left(\mathbf{A}^{\top}\mathbf{A}\right)/\lambda_{\min}(\mathbf{A}^{\top}\mathbf{A})$ and $s$ is the maximum number of non-zero entries in a row of $\mathbf{A}$.

regression

Convergence Rate of Riemannian Hamiltonian Monte Carlo and Faster Polytope Volume Computation

no code implementations • 17 Oct 2017 • Yin Tat Lee, Santosh S. Vempala

A key ingredient of our analysis is a proof of an analog of the KLS conjecture for Gibbs distributions over manifolds.

An SDP-Based Algorithm for Linear-Sized Spectral Sparsification

no code implementations • 27 Feb 2017 • Yin Tat Lee, He Sun

Noticing that $\Omega(m)$ time is needed for any algorithm to construct a spectral sparsifier and a spectral sparsifier of $G$ requires $\Omega(n)$ edges, a natural question is to investigate, for any constant $\varepsilon$, if a $(1+\varepsilon)$-spectral sparsifier of $G$ with $O(n)$ edges can be constructed in $\tilde{O}(m)$ time, where the $\tilde{O}$ notation suppresses polylogarithmic factors.

Kernel-based methods for bandit convex optimization

no code implementations • 11 Jul 2016 • Sébastien Bubeck, Ronen Eldan, Yin Tat Lee

We consider the adversarial convex bandit problem and we build the first $\mathrm{poly}(T)$-time algorithm with $\mathrm{poly}(n) \sqrt{T}$-regret for this problem.

A geometric alternative to Nesterov's accelerated gradient descent

no code implementations • 26 Jun 2015 • Sébastien Bubeck, Yin Tat Lee, Mohit Singh

The new algorithm has a simple geometric interpretation, loosely inspired by the ellipsoid method.

Uniform Sampling for Matrix Approximation

no code implementations • 21 Aug 2014 • Michael B. Cohen, Yin Tat Lee, Cameron Musco, Christopher Musco, Richard Peng, Aaron Sidford

In addition to an improved understanding of uniform sampling, our main proof introduces a structural result of independent interest: we show that every matrix can be made to have low coherence by reweighting a small subset of its rows.

regression

Adversarial Examples from Cryptographic Pseudo-Random Generators

no code implementations • 15 Nov 2018 • Sébastien Bubeck, Yin Tat Lee, Eric Price, Ilya Razenshteyn

In our recent work (Bubeck, Price, Razenshteyn, arXiv:1805.10204) we argued that adversarial examples in machine learning might be due to an inherent computational hardness of the problem.

Binary Classification, General Classification

Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

no code implementations • NeurIPS 2018 • Kevin Scaman, Francis Bach, Sébastien Bubeck, Yin Tat Lee, Laurent Massoulié

Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.

Optimization and Control

Algorithmic Theory of ODEs and Sampling from Well-conditioned Logconcave Densities

no code implementations • 15 Dec 2018 • Yin Tat Lee, Zhao Song, Santosh S. Vempala

We apply this to the sampling problem to obtain a nearly linear implementation of HMC for a broad class of smooth, strongly logconcave densities, with the number of iterations (parallel depth) and gradient evaluations being polylogarithmic in the dimension (rather than polynomial as in previous work).

Solving Empirical Risk Minimization in the Current Matrix Multiplication Time

no code implementations • 11 May 2019 • Yin Tat Lee, Zhao Song, Qiuyi Zhang

Our result generalizes the very recent result of solving linear programs in the current matrix multiplication time [Cohen, Lee, Song'19] to a broader class of problems.

Complexity of Highly Parallel Non-Smooth Convex Optimization

no code implementations • NeurIPS 2019 • Sébastien Bubeck, Qijia Jiang, Yin Tat Lee, Yuanzhi Li, Aaron Sidford

Namely we consider optimization algorithms interacting with a highly parallel gradient oracle, that is one that can answer $\mathrm{poly}(d)$ gradient queries in parallel.

The Randomized Midpoint Method for Log-Concave Sampling

no code implementations • NeurIPS 2019 • Ruoqi Shen, Yin Tat Lee

To solve the sampling problem, we propose a new framework to discretize stochastic differential equations.

Logsmooth Gradient Concentration and Tighter Runtimes for Metropolized Hamiltonian Monte Carlo

no code implementations • 10 Feb 2020 • Yin Tat Lee, Ruoqi Shen, Kevin Tian

We show that the gradient norm $\|\nabla f(x)\|$ for $x \sim \exp(-f(x))$, where $f$ is strongly convex and smooth, concentrates tightly around its mean.

Art Analysis

An Improved Cutting Plane Method for Convex Optimization, Convex-Concave Games and its Applications

no code implementations • 8 Apr 2020 • Haotian Jiang, Yin Tat Lee, Zhao Song, Sam Chiu-wai Wong

We propose a new cutting plane algorithm that uses an optimal $O(n \log (\kappa))$ evaluations of the oracle and an additional $O(n^2)$ time per evaluation, where $\kappa = nR/\epsilon$.

Network size and weights size for memorization with two-layers neural networks

no code implementations • 4 Jun 2020 • Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Dan Mikulincer

In contrast we propose a new training procedure for ReLU networks, based on complex (as opposed to real) recombination of the neurons, for which we show approximate memorization with both $O\left(\frac{n}{d} \cdot \frac{\log(1/\epsilon)}{\epsilon}\right)$ neurons, as well as nearly-optimal size of the weights.

Memorization

Composite Logconcave Sampling with a Restricted Gaussian Oracle

no code implementations • 10 Jun 2020 • Ruoqi Shen, Kevin Tian, Yin Tat Lee

We consider sampling from composite densities on $\mathbb{R}^d$ of the form $d\pi(x) \propto \exp(-f(x) - g(x))dx$ for well-conditioned $f$ and convex (but possibly non-smooth) $g$, a family generalizing restrictions to a convex set, through the abstraction of a restricted Gaussian oracle.

Fast Differentially Private-SGD via JL Projections

no code implementations • 1 Jan 2021 • Zhiqi Bu, Sivakanth Gopi, Janardhan Kulkarni, Yin Tat Lee, Uthaipon Tantipongpipat

Differentially Private-SGD (DP-SGD) of Abadi et al. (2016) and its variations are the only known algorithms for private training of large scale neural networks.

Structured Logconcave Sampling with a Restricted Gaussian Oracle

no code implementations • 7 Oct 2020 • Yin Tat Lee, Ruoqi Shen, Kevin Tian

For composite densities $\exp(-f(x) - g(x))$, where $f$ has condition number $\kappa$ and convex (but possibly non-smooth) $g$ admits an RGO, we obtain a mixing time of $O(\kappa d \log^3\frac{\kappa d}{\epsilon})$, matching the state-of-the-art non-composite bound; no composite samplers with better mixing than general-purpose logconcave samplers were previously known.

Network size and size of the weights in memorization with two-layers neural networks

no code implementations • NeurIPS 2020 • Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Dan Mikulincer

In contrast we propose a new training procedure for ReLU networks, based on complex (as opposed to real) recombination of the neurons, for which we show approximate memorization with both $O\left(\frac{n}{d} \cdot \frac{\log(1/\epsilon)}{\epsilon}\right)$ neurons, as well as nearly-optimal size of the weights.

Memorization

Minimum Cost Flows, MDPs, and $\ell_1$-Regression in Nearly Linear Time for Dense Instances

no code implementations • 14 Jan 2021 • Jan van den Brand, Yin Tat Lee, Yang P. Liu, Thatchaphol Saranurak, Aaron Sidford, Zhao Song, Di Wang

In the special case of the minimum cost flow problem on $n$-vertex $m$-edge graphs with integer polynomially-bounded costs and capacities we obtain a randomized method which solves the problem in $\tilde{O}(m+n^{1.5})$ time.

Data Structures and Algorithms, Optimization and Control

Fast and Memory Efficient Differentially Private-SGD via JL Projections

no code implementations • NeurIPS 2021 • Zhiqi Bu, Sivakanth Gopi, Janardhan Kulkarni, Yin Tat Lee, Judy Hanwen Shen, Uthaipon Tantipongpipat

Unlike previous attempts to make DP-SGD faster, which work only on a subset of network architectures or use compiler techniques, we propose an algorithmic solution that works for any network in a black-box manner; this is the main contribution of this paper.

Private Non-smooth Empirical Risk Minimization and Stochastic Convex Optimization in Subquadratic Steps

no code implementations • 29 Mar 2021 • Janardhan Kulkarni, Yin Tat Lee, Daogao Liu

More precisely, our differentially private algorithm requires $O(\frac{N^{3/2}}{d^{1/8}}+ \frac{N^2}{d})$ gradient queries for optimal excess empirical risk, which is achieved with the help of subsampling and smoothing the function via convolution.

Lower Bounds on Metropolized Sampling Methods for Well-Conditioned Distributions

no code implementations • NeurIPS 2021 • Yin Tat Lee, Ruoqi Shen, Kevin Tian

We give lower bounds on the performance of two of the most popular sampling methods in practice, the Metropolis-adjusted Langevin algorithm (MALA) and multi-step Hamiltonian Monte Carlo (HMC) with a leapfrog integrator, when applied to well-conditioned distributions.

Open-Ended Question Answering

Private Non-smooth ERM and SCO in Subquadratic Steps

no code implementations • NeurIPS 2021 • Janardhan Kulkarni, Yin Tat Lee, Daogao Liu

We study the differentially private Empirical Risk Minimization (ERM) and Stochastic Convex Optimization (SCO) problems for non-smooth convex functions.

Private Convex Optimization via Exponential Mechanism

no code implementations • 1 Mar 2022 • Sivakanth Gopi, Yin Tat Lee, Daogao Liu

Furthermore, we show how to implement this mechanism using $\widetilde{O}(n \min(d, n))$ queries to $f_i(x)$ for the DP-SCO where $n$ is the number of samples/users and $d$ is the ambient dimension.

Private Convex Optimization in General Norms

no code implementations • 18 Jul 2022 • Sivakanth Gopi, Yin Tat Lee, Daogao Liu, Ruoqi Shen, Kevin Tian

We propose a new framework for differentially private optimization of convex functions which are Lipschitz in an arbitrary norm $\|\cdot\|$.

Decomposable Non-Smooth Convex Optimization with Nearly-Linear Gradient Oracle Complexity

no code implementations • 7 Aug 2022 • Sally Dong, Haotian Jiang, Yin Tat Lee, Swati Padmanabhan, Guanghao Ye

In this work, we give an algorithm that minimizes the above convex formulation to $\epsilon$-accuracy in $\widetilde{O}(\sum_{i=1}^n d_i \log (1 /\epsilon))$ gradient computations, with no assumptions on the condition number.

Condition-number-independent convergence rate of Riemannian Hamiltonian Monte Carlo with numerical integrators

no code implementations • 13 Oct 2022 • Yunbum Kook, Yin Tat Lee, Ruoqi Shen, Santosh S. Vempala

We show that for distributions in the form of $e^{-\alpha^{\top}x}$ on a polytope with $m$ constraints, the convergence rate of a family of commonly-used integrators is independent of $\left\Vert \alpha\right\Vert _{2}$ and the geometry of the polytope.

Exploring the Limits of Differentially Private Deep Learning with Group-wise Clipping

no code implementations • 3 Dec 2022 • Jiyan He, Xuechen Li, Da Yu, Huishuai Zhang, Janardhan Kulkarni, Yin Tat Lee, Arturs Backurs, Nenghai Yu, Jiang Bian

To reduce the compute time overhead of private learning, we show that per-layer clipping, where the gradient of each neural network layer is clipped separately, allows clipping to be performed in conjunction with backpropagation in differentially private optimization.

Computational Efficiency

Learning threshold neurons via the "edge of stability"

no code implementations • 14 Dec 2022 • Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Yin Tat Lee, Felipe Suarez, Yi Zhang

For these models, we provably establish the edge of stability phenomenon and discover a sharp phase transition for the step size below which the neural network fails to learn "threshold-like" neurons (i.e., neurons with a non-zero first-layer bias).

Inductive Bias

ReSQueing Parallel and Private Stochastic Convex Optimization

no code implementations • 1 Jan 2023 • Yair Carmon, Arun Jambulapati, Yujia Jin, Yin Tat Lee, Daogao Liu, Aaron Sidford, Kevin Tian

We give a parallel algorithm obtaining optimization error $\epsilon_{\text{opt}}$ with $d^{1/3}\epsilon_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}\epsilon_{\text{opt}}^{-2/3} + \epsilon_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator.

Algorithmic Aspects of the Log-Laplace Transform and a Non-Euclidean Proximal Sampler

no code implementations • 13 Feb 2023 • Sivakanth Gopi, Yin Tat Lee, Daogao Liu, Ruoqi Shen, Kevin Tian

The development of efficient sampling algorithms catering to non-Euclidean geometries has been a challenging endeavor, as discretization techniques which succeed in the Euclidean setting do not readily carry over to more general settings.

$k$NN-Adapter: Efficient Domain Adaptation for Black-Box Language Models

no code implementations • 21 Feb 2023 • Yangsibo Huang, Daogao Liu, Zexuan Zhong, Weijia Shi, Yin Tat Lee

Fine-tuning a language model on a new domain is standard practice for domain adaptation.

Domain Adaptation, Language Modelling +1

Textbooks Are All You Need

no code implementations • 20 Jun 2023 • Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee, Yuanzhi Li

Despite this small scale, phi-1 attains pass@1 accuracy 50.6% on HumanEval and 55.5% on MBPP.

Ranked #42 on Code Generation on HumanEval

Code Generation, Language Modelling +1

Positional Description Matters for Transformers Arithmetic

no code implementations • 22 Nov 2023 • Ruoqi Shen, Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Yuanzhi Li, Yi Zhang

For (i) we train a small model on a small dataset (100M parameters and 300k samples) with remarkable aptitude in (direct, no scratchpad) 15 digits multiplication and essentially perfect up to 12 digits, while usual training in this context would give a model failing at 4 digits multiplication.

Memorization

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

no code implementations • 22 Apr 2024 • Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, ZiYi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.

Language Modelling

Optimal algorithms for smooth and strongly convex distributed optimization in networks

1 code implementation • ICML 2017 • Kevin Scaman, Francis Bach, Sébastien Bubeck, Yin Tat Lee, Laurent Massoulié

For centralized (i.e. master/slave) algorithms, we show that distributing Nesterov's accelerated gradient descent is optimal and achieves a precision $\varepsilon > 0$ in time $O(\sqrt{\kappa_g}(1+\Delta\tau)\ln(1/\varepsilon))$, where $\kappa_g$ is the condition number of the (global) function to optimize, $\Delta$ is the diameter of the network, and $\tau$ (resp.

Distributed Optimization, regression

Textbooks Are All You Need II: phi-1.5 technical report

1 code implementation • 11 Sep 2023 • Yuanzhi Li, Sébastien Bubeck, Ronen Eldan, Allie Del Giorno, Suriya Gunasekar, Yin Tat Lee

We continue the investigation into the power of smaller Transformer-based language models as initiated by TinyStories -- a 10 million parameter model that can produce coherent English -- and the follow-up work on phi-1, a 1.3 billion parameter model with Python coding performance close to the state-of-the-art.

Ranked #12 on Question Answering on SIQA

Code Generation, Common Sense Reasoning +3

Sampling with Riemannian Hamiltonian Monte Carlo in a Constrained Space

1 code implementation • 3 Feb 2022 • Yunbum Kook, Yin Tat Lee, Ruoqi Shen, Santosh S. Vempala

We demonstrate for the first time that ill-conditioned, non-smooth, constrained distributions in very high dimension, upwards of 100,000, can be sampled efficiently in practice.

Differentially Private Synthetic Data via Foundation Model APIs 2: Text

1 code implementation • 4 Mar 2024 • Chulin Xie, Zinan Lin, Arturs Backurs, Sivakanth Gopi, Da Yu, Huseyin A Inan, Harsha Nori, Haotian Jiang, Huishuai Zhang, Yin Tat Lee, Bo Li, Sergey Yekhanin

Lin et al. (2024) recently introduced the Private Evolution (PE) algorithm to generate DP synthetic images with only API access to diffusion models.

Privacy Preserving

Differentially Private Fine-tuning of Language Models

2 code implementations • ICLR 2022 • Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

For example, on the MNLI dataset we achieve an accuracy of $87.8\%$ using RoBERTa-Large and $83.5\%$ using RoBERTa-Base with a privacy budget of $\epsilon = 6.7$.

Text Generation

Numerical Composition of Differential Privacy

1 code implementation • NeurIPS 2021 • Sivakanth Gopi, Yin Tat Lee, Lukas Wutschitz

We give a fast algorithm to optimally compose privacy guarantees of differentially private (DP) algorithms to arbitrary accuracy.

When Does Differentially Private Learning Not Suffer in High Dimensions?

1 code implementation • 1 Jul 2022 • Xuechen Li, Daogao Liu, Tatsunori Hashimoto, Huseyin A. Inan, Janardhan Kulkarni, Yin Tat Lee, Abhradeep Guha Thakurta

Large pretrained models can be privately fine-tuned to achieve performance approaching that of non-private models.

Vocal Bursts Intensity Prediction

Automatic Prompt Optimization with "Gradient Descent" and Beam Search

4 code implementations • 4 May 2023 • Reid Pryzant, Dan Iter, Jerry Li, Yin Tat Lee, Chenguang Zhu, Michael Zeng

Large Language Models (LLMs) have shown impressive performance as general purpose agents, but their abilities remain highly dependent on prompts which are hand written with onerous trial-and-error effort.

Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine

1 code implementation • 28 Nov 2023 • Harsha Nori, Yin Tat Lee, Sheng Zhang, Dean Carignan, Richard Edgar, Nicolo Fusi, Nicholas King, Jonathan Larson, Yuanzhi Li, Weishung Liu, Renqian Luo, Scott Mayer McKinney, Robert Osazuwa Ness, Hoifung Poon, Tao Qin, Naoto Usuyama, Chris White, Eric Horvitz

We find that prompting innovation can unlock deeper specialist capabilities and show that GPT-4 easily tops prior leading results for medical benchmarks.

Ranked #1 on Question Answering on MedQA

Electrical Engineering, Experimental Design +3

Sparks of Artificial General Intelligence: Early experiments with GPT-4

2 code implementations • 22 Mar 2023 • Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang

We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models.

Ranked #33 on Arithmetic Reasoning on GSM8K

Arithmetic Reasoning, Math Word Problem Solving

An Empirical Study on Challenging Math Problem Solving with GPT-4

1 code implementation • 2 Jun 2023 • Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang

Employing Large Language Models (LLMs) to address mathematical problems is an intriguing research endeavor, considering the abundance of math problems expressed in natural language across numerous science and engineering fields.

Elementary Mathematics, Math
