Search Results for author: Tian Liang

Found 7 papers, 7 papers with code

How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments

1 code implementation • 18 Mar 2024 • Jen-tse Huang, Eric John Li, Man Ho Lam, Tian Liang, Wenxuan Wang, Youliang Yuan, Wenxiang Jiao, Xing Wang, Zhaopeng Tu, Michael R. Lyu

Additionally, we conduct evaluations across various LLMs and find that GPT-4 outperforms other models on GAMA-Bench, achieving a score of 72.5.

Decision Making
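
As a rough illustration only (not from the paper): if each game in such a benchmark yields a score on a 0-100 scale, the quoted overall figure could be obtained by averaging them. The game names and numbers below are invented placeholders.

```python
from statistics import mean

# Hypothetical per-game scores on a 0-100 scale for a single model.
# Game names and values are placeholders, not results reported by the paper.
per_game_scores = {
    "game_1": 80.0,
    "game_2": 65.0,
    "game_3": 72.5,
}

# Assumed aggregation: the overall benchmark score is the mean of per-game scores.
overall_score = mean(per_game_scores.values())
print(f"Overall score: {overall_score:.1f}")
```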

CriticBench: Benchmarking LLMs for Critique-Correct Reasoning

1 code implementation • 22 Feb 2024 • Zicheng Lin, Zhibin Gou, Tian Liang, Ruilin Luo, Haowei Liu, Yujiu Yang

Utilizing CriticBench, we evaluate and dissect the performance of 17 LLMs in generation, critique, and correction reasoning, i.e., GQC reasoning.

Benchmarking
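
A minimal sketch of the generation-critique-correction (GQC) loop described above, assuming a generic `query_llm` helper; the prompt wording is illustrative, not CriticBench's actual templates.

```python
def query_llm(prompt: str) -> str:
    """Placeholder for a call to any LLM API; not part of CriticBench itself."""
    raise NotImplementedError

def gqc_round(question: str) -> dict:
    # Generation: the model produces an answer to the question.
    generation = query_llm(f"Question: {question}\nAnswer step by step.")
    # Critique: the model judges whether the proposed answer is correct.
    critique = query_llm(
        f"Question: {question}\nProposed answer: {generation}\n"
        "Is this answer correct? Point out any mistakes."
    )
    # Correction: the model revises the answer in light of the critique.
    correction = query_llm(
        f"Question: {question}\nPrevious answer: {generation}\n"
        f"Critique: {critique}\nWrite a corrected answer."
    )
    return {"generation": generation, "critique": critique, "correction": correction}
```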

Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models

1 code implementation • 31 Oct 2023 • Tian Liang, Zhiwei He, Jen-tse Huang, Wenxuan Wang, Wenxiang Jiao, Rui Wang, Yujiu Yang, Zhaopeng Tu, Shuming Shi, Xing Wang

Ideally, an advanced agent should possess the ability to accurately describe a given word using an aggressive description while concurrently maximizing confusion in the conservative description, enhancing its participation in the game.
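
A minimal sketch of the two description modes mentioned above, assuming a generic `query_llm` helper; the prompts are illustrative, not the paper's actual game templates.

```python
def query_llm(prompt: str) -> str:
    """Placeholder for any LLM API call; prompts below are illustrative only."""
    raise NotImplementedError

def describe_word(word: str, mode: str) -> str:
    if mode == "aggressive":
        # Aggressive description: as precise as possible, so a teammate can guess the word.
        prompt = f"Describe the word '{word}' so that a teammate can identify it exactly."
    else:
        # Conservative description: truthful but deliberately vague, to confuse opponents.
        prompt = (
            f"Describe the word '{word}' truthfully but as ambiguously as possible, "
            "so that listeners cannot easily guess it."
        )
    return query_llm(prompt)
```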

Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate

1 code implementation • 30 May 2023 • Tian Liang, Zhiwei He, Wenxiang Jiao, Xing Wang, Yan Wang, Rui Wang, Yujiu Yang, Zhaopeng Tu, Shuming Shi

To address the DoT problem, we propose a Multi-Agent Debate (MAD) framework, in which multiple agents express their arguments in the state of "tit for tat" and a judge manages the debate process to obtain a final solution.

Arithmetic Reasoning • Machine Translation
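
A minimal sketch of such a debate loop, assuming a generic `query_llm` helper and a fixed number of rounds; the prompts and round count are assumptions, not the authors' implementation.

```python
def query_llm(prompt: str) -> str:
    """Placeholder for any LLM chat API; not the authors' implementation."""
    raise NotImplementedError

def multi_agent_debate(question: str, rounds: int = 3) -> str:
    affirmative, negative = "", ""
    for _ in range(rounds):
        # Each debater rebuts the other's latest argument ("tit for tat").
        affirmative = query_llm(
            f"Question: {question}\nOpponent's argument: {negative}\n"
            "Give your answer and rebut the opponent."
        )
        negative = query_llm(
            f"Question: {question}\nOpponent's argument: {affirmative}\n"
            "Disagree where you can and give your own answer with reasons."
        )
    # A judge reviews both sides and settles on a final solution.
    return query_llm(
        f"Question: {question}\nDebater A: {affirmative}\nDebater B: {negative}\n"
        "As the judge, decide the final answer."
    )
```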

Exploring Human-Like Translation Strategy with Large Language Models

2 code implementations • 6 May 2023 • Zhiwei He, Tian Liang, Wenxiang Jiao, Zhuosheng Zhang, Yujiu Yang, Rui Wang, Zhaopeng Tu, Shuming Shi, Xing Wang

Compared to typical machine translation that focuses solely on source-to-target mapping, LLM-based translation can potentially mimic the human translation process which might take preparatory steps to ensure high-quality translation.

Hallucination • Machine Translation • +2
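
A loose sketch of translating with a preparatory step, assuming a generic `query_llm` helper; the choice of keyword extraction as the preparation and the prompt wording are assumptions based on the snippet above, not the paper's exact method.

```python
def query_llm(prompt: str) -> str:
    """Placeholder for any LLM API call; all prompts here are illustrative."""
    raise NotImplementedError

def translate_with_preparation(source: str, src_lang: str, tgt_lang: str) -> str:
    # Assumed preparatory step: gather knowledge about the sentence first
    # (here, key terms), roughly mirroring how a human translator prepares.
    keywords = query_llm(
        f"List the key terms in this {src_lang} sentence and give {tgt_lang} equivalents:\n"
        f"{source}"
    )
    # Final step: translate conditioned on the prepared knowledge.
    return query_llm(
        f"Keyword pairs:\n{keywords}\n\n"
        f"Using them, translate this {src_lang} sentence into {tgt_lang}:\n{source}"
    )
```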

ParroT: Translating during Chat using Large Language Models tuned with Human Translation and Feedback

1 code implementation • 5 Apr 2023 • Wenxiang Jiao, Jen-tse Huang, Wenxuan Wang, Zhiwei He, Tian Liang, Xing Wang, Shuming Shi, Zhaopeng Tu

Therefore, we propose ParroT, a framework to enhance and regulate the translation abilities during chat based on open-source LLMs (e.g., LLaMA), human-written translation and feedback data.

Instruction Following • Machine Translation • +1
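
A loose sketch of how human translation and feedback data might be packed into instruction-tuning records, assuming a simple JSON-lines layout; the field names, language pair, and hint wording are assumptions, not ParroT's actual data schema.

```python
import json
from typing import Optional

def make_training_example(src: str, ref: str, hint: Optional[str] = None) -> str:
    """Build one instruction-tuning record from a human translation pair,
    optionally carrying feedback as a hint. Field names are assumptions."""
    instruction = "Translate the following sentence from English to Chinese."
    if hint:
        # Feedback data: the hint steers the model away from a known error type.
        instruction += f" Note: {hint}"
    return json.dumps({"instruction": instruction, "input": src, "output": ref},
                      ensure_ascii=False)

print(make_training_example("Hello, world.", "你好，世界。",
                            hint="avoid word-by-word literal translation"))
```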
