no code implementations • 2 Aug 2024 • Pedro Freire, Egor Manuylovich, Jaroslaw E. Prilepsky, Sergei K. Turitsyn
This tutorial-review on applications of artificial neural networks in photonics targets a broad audience, ranging from optical research and engineering communities to computer science and applied mathematics.
no code implementations • 19 Feb 2024 • Pedro Freire, ChengCheng Tan, Adam Gleave, Dan Hendrycks, Scott Emmons
Do language models implicitly learn a concept of human wellbeing?
no code implementations • 27 Jul 2023 • Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Wang, Samuel Marks, Charbel-Raphaël Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals.
no code implementations • 24 Jun 2022 • Pedro Freire, Sasipim Srivallapanondh, Antonio Napoli, Jaroslaw E. Prilepsky, Sergei K. Turitsyn
We provide and link four software-to-hardware complexity measures, defining how the different complexity metrics relate to the layers' hyper-parameters.
no code implementations • 1 Jan 2021 • Rohin Shah, Pedro Freire, Neel Alex, Rachel Freedman, Dmitrii Krasheninnikov, Lawrence Chan, Michael D Dennis, Pieter Abbeel, Anca Dragan, Stuart Russell
By merging reward learning and control, assistive agents can reason about the impact of control actions on reward learning, leading to several advantages over agents based on reward learning.
2 code implementations • 2 Dec 2020 • Pedro Freire, Adam Gleave, Sam Toyer, Stuart Russell
We evaluate a range of common reward and imitation learning algorithms on our tasks.