Search Results for author: Tony Xia

Found 4 papers, 4 papers with code

MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts

1 code implementation • 3 Oct 2023 • Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, Jianfeng Gao

We present MathVista, a benchmark designed to combine challenges from diverse mathematical and visual tasks.

Chatbot, Image Captioning, +5 more



TheoremQA: A Theorem-driven Question Answering dataset

1 code implementation • 21 May 2023 • Wenhu Chen, Ming Yin, Max Ku, Pan Lu, Yixin Wan, Xueguang Ma, Jianyu Xu, Xinyi Wang, Tony Xia

We evaluate a wide spectrum of 16 large language and code models with different prompting strategies like Chain-of-Thoughts and Program-of-Thoughts.

Ranked #1 on Natural Questions on TheoremQA

GPT-4, Math, +1 more



Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

1 code implementation • 20 Sep 2022 • Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, Ashwin Kalyan

We further design language models to learn to generate lectures and explanations as the chain of thought (CoT) to mimic the multi-hop reasoning process when answering ScienceQA questions.

Ranked #5 on Science Question Answering on ScienceQA

Multimodal Deep Learning, Multimodal Reasoning, +5 more



IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning

1 code implementation • 25 Oct 2021 • Pan Lu, Liang Qiu, Jiaqi Chen, Tony Xia, Yizhou Zhao, Wei Zhang, Zhou Yu, Xiaodan Liang, Song-Chun Zhu

We develop a strong IconQA baseline, Patch-TRM, that applies a pyramid cross-modal Transformer with input diagram embeddings pre-trained on the icon dataset.

Ranked #1 on Visual Question Answering (VQA) on IconQA

Arithmetic Reasoning, Math Word Problem Solving, +2 more

