TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Emotional Intelligence	Emotional Intelligence	OpenAI gpt-4-0613	EQ-Bench Score	62.52	# 1
Emotional Intelligence	Emotional Intelligence	migtissera/SynthIA-70B-v1.5	EQ-Bench Score	54.83	# 2
Emotional Intelligence	Emotional Intelligence	OpenAI gpt-4-0314	EQ-Bench Score	53.39	# 3
Emotional Intelligence	Emotional Intelligence	Qwen/Qwen-72B-Chat	EQ-Bench Score	52.44	# 4
Emotional Intelligence	Emotional Intelligence	Anthropic Claude2	EQ-Bench Score	52.14	# 5
Emotional Intelligence	Emotional Intelligence	meta-llama/Llama-2-70b-chat-hf	EQ-Bench Score	51.56	# 6
Emotional Intelligence	Emotional Intelligence	01-ai/Yi-34B-Chat	EQ-Bench Score	51.03	# 7
Emotional Intelligence	Emotional Intelligence	OpenAI gpt-3.5-0613	EQ-Bench Score	49.17	# 8
Emotional Intelligence	Emotional Intelligence	OpenAI gpt-3.5-turbo-0301	EQ-Bench Score	47.61	# 9
Emotional Intelligence	Emotional Intelligence	Open-Orca/Mistral-7B-OpenOrca	EQ-Bench Score	44.40	# 10
Emotional Intelligence	Emotional Intelligence	Qwen/Qwen-14B-Chat	EQ-Bench Score	43.76	# 11
Emotional Intelligence	Emotional Intelligence	OpenAI text-davinci-003	EQ-Bench Score	43.73	# 12
Emotional Intelligence	Emotional Intelligence	Intel/neural-chat-7b-v3-1	EQ-Bench Score	43.61	# 13
Emotional Intelligence	Emotional Intelligence	OpenAI text-davinci-002	EQ-Bench Score	39.44	# 14
Emotional Intelligence	Emotional Intelligence	openchat/openchat 3.5	EQ-Bench Score	37.08	# 15
Emotional Intelligence	Emotional Intelligence	lmsys/vicuna-33b-v1.3	EQ-Bench Score	36.52	# 16
Emotional Intelligence	Emotional Intelligence	meta-llama/Llama-2-13b-chat-hf	EQ-Bench Score	33.02	# 17
Emotional Intelligence	Emotional Intelligence	lmsys/vicuna-13b-v1.1	EQ-Bench Score	32.85	# 18
Emotional Intelligence	Emotional Intelligence	meta-llama/Llama-2-7b-chat-hf	EQ-Bench Score	25.43	# 19
Emotional Intelligence	Emotional Intelligence	Koala 13B	EQ-Bench Score	24.92	# 20
Emotional Intelligence	Emotional Intelligence	lmsys/vicuna-7b-v1.1	EQ-Bench Score	22.24	# 21
Emotional Intelligence	Emotional Intelligence	OpenAI text-davinci-001	EQ-Bench Score	15.19	# 22
Emotional Intelligence	Emotional Intelligence	OpenAI ADA	EQ-Bench Score	2.25	# 23

EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models

11 Dec 2023 · Samuel J. Paech

We introduce EQ-Bench, a novel benchmark designed to evaluate aspects of emotional intelligence in Large Language Models (LLMs). We assess the ability of LLMs to understand complex emotions and social interactions by asking them to predict the intensity of emotional states of characters in a dialogue. The benchmark is able to discriminate effectively between a wide range of models. We find that EQ-Bench correlates strongly with comprehensive multi-domain benchmarks like MMLU (Hendrycks et al., 2020) (r=0.97), indicating that we may be capturing similar aspects of broad intelligence. Our benchmark produces highly repeatable results using a set of 60 English-language questions. We also provide open-source code for an automated benchmarking pipeline at https://github.com/EQ-bench/EQ-Bench and a leaderboard at https://eqbench.com
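The correlation figure quoted in the abstract (r=0.97) is a Pearson correlation coefficient between per-model scores on two benchmarks. As a minimal sketch of how such a coefficient is computed, the snippet below uses a hand-written `pearson_r` helper (a hypothetical name, not taken from the EQ-Bench code) and placeholder score lists that are invented for illustration and are not results from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder scores for five models, NOT taken from the paper.
benchmark_a = [20.0, 35.0, 45.0, 50.0, 60.0]
benchmark_b = [30.0, 45.0, 55.0, 65.0, 80.0]

r = pearson_r(benchmark_a, benchmark_b)
print(f"r = {r:.2f}")
```

A value near 1 means the two benchmarks score models in close to the same linear relationship. It does not by itself show that they measure the same underlying ability, which is why the abstract hedges with "may be capturing".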

Code

eq-bench/eq-bench (official, 126 stars)

Tasks

Benchmarking

Emotional Intelligence

Datasets

Introduced in the Paper:

Emotional Intelligence

Used in the Paper:

MMLU

HellaSwag

TruthfulQA

MT-Bench

Results from the Paper

Ranked #1 on Emotional Intelligence on Emotional Intelligence
