JEEBench is a challenging benchmark dataset for evaluating the problem-solving abilities of large language models (LLMs). It contains 515 pre-engineering mathematics, physics, and chemistry problems curated from the IIT JEE-Advanced exam. Solving these problems requires long-horizon reasoning on top of deep in-domain knowledge.

Source: "Have LLMs Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models"

License


  • Unknown
