ToolQA

ToolQA is a question answering benchmark for Large Language Models (LLMs) which is designed to faithfully evaluate LLMs' ability to use external tools for question answering. The development of ToolQA involved a scalable, automated process for dataset curation, along with 13 specialized tools designed for interaction with external knowledge in order to answer questions.

Papers


Paper Code Results Date Stars

Dataset Loaders


No data loaders found. You can submit your data loader here.

Tasks


Similar Datasets


Modalities


Languages