1 code implementation • 23 Feb 2024 • Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
To address this, we present Farsight, a novel in situ interactive tool that helps people identify potential harms from the AI applications they are prototyping.
no code implementations • 16 Feb 2024 • Minsuk Kahng, Ian Tenney, Mahima Pushkarna, Michael Xieyang Liu, James Wexler, Emily Reif, Krystal Kallarackal, Minsuk Chang, Michael Terry, Lucas Dixon
Automatic side-by-side evaluation has emerged as a promising approach to evaluating the quality of responses from large language models (LLMs).
no code implementations • 24 Oct 2023 • Savvas Petridis, Michael Terry, Carrie J. Cai
Prototyping AI applications is notoriously difficult.
no code implementations • 24 Oct 2023 • Savvas Petridis, Ben Wedin, James Wexler, Aaron Donsbach, Mahima Pushkarna, Nitesh Goyal, Carrie J. Cai, Michael Terry
Inspired by these findings, we developed ConstitutionMaker, an interactive tool for converting user feedback into principles, to steer LLM-based chatbots.
no code implementations • 23 Oct 2023 • Michael Terry, Chinmay Kulkarni, Martin Wattenberg, Lucas Dixon, Meredith Ringel Morris
AI alignment considers the overall problem of ensuring an AI system produces desired outcomes without undesirable side effects.
no code implementations • 15 Apr 2023 • Meredith Ringel Morris, Carrie J. Cai, Jess Holbrook, Chinmay Kulkarni, Michael Terry
Card et al.'s classic paper "The Design Space of Input Devices" established the value of design spaces as a tool for HCI analysis and invention.
no code implementations • 26 Jan 2022 • Eldon Schoop, Ben Wedin, Andrei Kapishnikov, Tolga Bolukbasi, Michael Terry
Developing a suitable Deep Neural Network (DNN) often requires significant iteration, where different model versions are evaluated and compared.
no code implementations • 4 Oct 2021 • Tongshuang Wu, Michael Terry, Carrie J. Cai
Although large language models (LLMs) have demonstrated impressive potential on simple tasks, their breadth of scope, lack of transparency, and insufficient controllability can make them less effective when assisting humans on more complex tasks.
1 code implementation • 16 Aug 2021 • Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, Charles Sutton
Our largest models, even without finetuning on a code dataset, can synthesize solutions to 59.6 percent of the problems from MBPP using few-shot learning with a well-designed prompt.
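Few-shot prompting for program synthesis works by prepending a handful of (task description, solution) pairs before the target task, so the model continues the pattern. A minimal sketch of how such a prompt might be assembled — the comment-based format and function names here are illustrative, not the paper's exact MBPP template:

```python
def build_few_shot_prompt(examples, task):
    """Assemble a few-shot prompt from (description, solution) pairs
    followed by the unsolved target task. Format is illustrative."""
    parts = []
    for desc, code in examples:
        parts.append(f"# Task: {desc}\n{code}\n")
    # The target task ends the prompt; the model is expected to
    # continue with a matching solution.
    parts.append(f"# Task: {task}\n")
    return "\n".join(parts)

prompt = build_few_shot_prompt(
    [("Return the sum of a list.", "def total(xs):\n    return sum(xs)")],
    "Return the product of a list.",
)
```

The quality of the demonstrations and the consistency of the formatting matter considerably; the paper's "well-designed prompt" refers to exactly this kind of careful template choice.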
1 code implementation • CVPR 2021 • Andrei Kapishnikov, Subhashini Venugopalan, Besim Avci, Ben Wedin, Michael Terry, Tolga Bolukbasi
To minimize the effect of this source of noise, we propose adapting the attribution path itself: conditioning the path not just on the image but also on the model being explained.
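The method above builds on Integrated Gradients, which attributes a prediction to input features by integrating the model's gradient along a straight-line path from a baseline to the input. A minimal NumPy sketch of the baseline technique (the straight path that Guided IG then adapts); the function names and the toy model are illustrative:

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=200):
    """Trapezoid approximation of the path integral of gradients
    along the straight line from baseline to x."""
    alphas = np.linspace(0.0, 1.0, steps + 1)
    grads = np.stack([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    avg_grads = (grads[:-1] + grads[1:]).mean(axis=0) / 2.0  # trapezoid rule
    return (x - baseline) * avg_grads

# Toy model: f(x) = sum(x**2), so the gradient is 2x.
grad_fn = lambda v: 2.0 * v
x = np.array([1.0, 2.0])
baseline = np.zeros_like(x)
attributions = integrated_gradients(grad_fn, x, baseline)
# Completeness: attributions sum to f(x) - f(baseline) = 5.0
```

Guided IG keeps this integral structure but chooses the path adaptively based on the model's gradients, rather than fixing the straight line, which reduces noise from saturated or irrelevant gradient directions.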
2 code implementations • ICCV 2019 • Andrei Kapishnikov, Tolga Bolukbasi, Fernanda Viégas, Michael Terry
Saliency methods can aid understanding of deep neural networks.
no code implementations • 30 Jan 2019 • Narayan Hegde, Jason D. Hipp, Yun Liu, Michael E. Buck, Emily Reif, Daniel Smilkov, Michael Terry, Carrie J. Cai, Mahul B. Amin, Craig H. Mermel, Phil Q. Nelson, Lily H. Peng, Greg S. Corrado, Martin C. Stumpe
SMILY may be a useful general-purpose tool in the pathologist's arsenal, improving the efficiency of searching large archives of histopathology images without the need to develop and implement specific tools for each application.
no code implementations • 16 Jan 2019 • Daniel Smilkov, Nikhil Thorat, Yannick Assogba, Ann Yuan, Nick Kreeger, Ping Yu, Kangyi Zhang, Shanqing Cai, Eric Nielsen, David Soergel, Stan Bileschi, Michael Terry, Charles Nicholson, Sandeep N. Gupta, Sarah Sirajuddin, D. Sculley, Rajat Monga, Greg Corrado, Fernanda B. Viégas, Martin Wattenberg
TensorFlow.js is a library for building and executing machine learning algorithms in JavaScript.
no code implementations • 28 Nov 2016 • Brian Patton, Yannis Agiomyrgiannakis, Michael Terry, Kevin Wilson, Rif A. Saurous, D. Sculley
Developers of text-to-speech (TTS) synthesizers often make use of human raters to assess the quality of synthesized speech.