1 code implementation • 19 Oct 2023 • Juan Rocamonde, Victoriano Montesinos, Elvis Nava, Ethan Perez, David Lindner
We find that VLM-RMs are remarkably robust as long as the VLM is large enough.
Prompt Engineering • Reinforcement Learning