1 code implementation • 17 Jun 2024 • Jieyu Zhang, Weikai Huang, Zixian Ma, Oscar Michel, Dong He, Tanmay Gupta, Wei-Chiu Ma, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna
As a result, when a developer wants to identify which models to use for their application, they are overwhelmed by the number of benchmarks and remain uncertain about which benchmark's results are most reflective of their specific use case.
1 code implementation • 17 Mar 2024 • Zixian Ma, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna
With m&m's, we evaluate 6 popular LLMs with 2 planning strategies (multi-step vs. step-by-step planning), 2 plan formats (JSON vs. code), and 3 types of feedback (parsing/verification/execution).
1 code implementation • NeurIPS 2023 • Cheng-Yu Hsieh, Jieyu Zhang, Zixian Ma, Aniruddha Kembhavi, Ranjay Krishna
In the last year alone, a surge of new benchmarks to measure compositional understanding of vision-language models have permeated the machine learning ecosystem.
1 code implementation • 6 Mar 2023 • Michelle S. Lam, Zixian Ma, Anne Li, Izequiel Freitas, Dakuo Wang, James A. Landay, Michael S. Bernstein
Machine learning practitioners often end up tunneling on low-level technical details like model architectures and performance metrics.
1 code implementation • CVPR 2023 • Zixian Ma, Jerry Hong, Mustafa Omer Gul, Mona Gandhi, Irena Gao, Ranjay Krishna
To measure systematicity, CREPE consists of a test dataset containing over $370K$ image-text pairs and three different seen-unseen splits.
1 code implementation • 9 Oct 2022 • Zixian Ma, Rose Wang, Li Fei-Fei, Michael Bernstein, Ranjay Krishna
These results identify tasks where expectation alignment is a more useful strategy than curiosity-driven exploration for multi-agent coordination, enabling agents to do zero-shot coordination.
no code implementations • 11 Jan 2022 • Xin Liu, Yuntao Wang, Sinan Xie, XiaoYu Zhang, Zixian Ma, Daniel McDuff, Shwetak Patel
Camera-based contactless photoplethysmography refers to a set of popular techniques for contactless physiological measurement.
1 code implementation • ACL 2021 • Guoyang Zeng, Fanchao Qi, Qianrui Zhou, Tingji Zhang, Zixian Ma, Bairu Hou, Yuan Zang, Zhiyuan Liu, Maosong Sun
Textual adversarial attacking has received wide and increasing attention in recent years.