20 Dec 2022 • Alex Tamkin, Kunal Handa, Avash Shrestha, Noah Goodman
We investigate how both humans and models behave in the face of task ambiguity by proposing AmbiBench, a new benchmark of six ambiguously specified classification tasks.