Claude 3.5 Sonnet Model Card Addendum

Preprint 2024 · Anthropic

This addendum to our Claude 3 Model Card describes Claude 3.5 Sonnet, a new model which outperforms our previous most capable model, Claude 3 Opus, while operating faster and at a lower cost. Claude 3.5 Sonnet offers improved capabilities, including better coding and visual processing. Since it is an evolution of the Claude 3 model family, we are providing an addendum rather than a new model card. We provide updated key evaluations and results from our safety testing.


Results from the Paper


 Ranked #1 on Multi-task Language Understanding on MMLU (using extra training data)

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Code Generation | HumanEval | GPT-4o (0-shot) | Pass@1 | 90.2 | #14 |
| Multi-task Language Understanding | MMLU | Claude 3.5 Sonnet (5-shot) | Average (%) | 88.7 | #1 |
| Visual Question Answering | MM-Vet | Claude 3.5 Sonnet (claude-3-5-sonnet-20240620) | GPT-4 score | 74.2±0.2 | #5 |
| Visual Question Answering | MM-Vet v2 | Claude 3.5 Sonnet (claude-3-5-sonnet-20240620) | GPT-4 score | 71.8±0.2 | #3 |
| MMR total | MRR-Benchmark | Claude 3.5 Sonnet | Total Column Score | 463 | #1 |
