Search Results for author: Michael Shieh

Found 2 papers, 1 paper with code

Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning

no code implementations • 1 May 2024 • Yuxi Xie, Anirudh Goyal, Wenyue Zheng, Min-Yen Kan, Timothy P. Lillicrap, Kenji Kawaguchi, Michael Shieh

We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process inspired by the successful strategy employed by AlphaZero.

Accelerating Greedy Coordinate Gradient via Probe Sampling

1 code implementation • 2 Mar 2024 • Yiran Zhao, Wenyue Zheng, Tianle Cai, Xuan Long Do, Kenji Kawaguchi, Anirudh Goyal, Michael Shieh

Safety of Large Language Models (LLMs) has become a central issue given their rapid progress and wide applications.
