no code implementations • 3 Jun 2024 • Robert Osazuwa Ness, Katie Matton, Hayden Helm, Sheng Zhang, Junaid Bajwa, Carey E. Priebe, Eric Horvitz
Medical question-answering benchmarks rely on assumptions consistent with quantifying LLM performance but that may not hold in the open world of the clinic.
no code implementations • 21 Mar 2024 • Ricardo Cannizzaro, Michael Groom, Jonathan Routley, Robert Osazuwa Ness, Lars Kunze
Robot object manipulation in real-world environments is challenging because robot operation must be robust to a range of sensing, estimation, and actuation uncertainties to avoid potentially unsafe and costly mistakes that are a barrier to their adoption.
2 code implementations • 28 Nov 2023 • Harsha Nori, Yin Tat Lee, Sheng Zhang, Dean Carignan, Richard Edgar, Nicolo Fusi, Nicholas King, Jonathan Larson, Yuanzhi Li, Weishung Liu, Renqian Luo, Scott Mayer McKinney, Robert Osazuwa Ness, Hoifung Poon, Tao Qin, Naoto Usuyama, Chris White, Eric Horvitz
We find that prompting innovation can unlock deeper specialist capabilities and show that GPT-4 easily tops prior leading results for medical benchmarks.
Ranked #2 on Question Answering on MedQA (using extra training data)
1 code implementation • NeurIPS 2019 • Robert Osazuwa Ness, Kaushal Paneri, Olga Vitek
In contrast, structural causal models support counterfactual inference, but do not identify the mechanisms.