Search Results for author: David Wu

Found 15 papers, 8 papers with code

The Virtues of Pessimism in Inverse Reinforcement Learning

no code implementations • 4 Feb 2024 • David Wu, Gokul Swamy, J. Andrew Bagnell, Zhiwei Steven Wu, Sanjiban Choudhury

Inverse Reinforcement Learning (IRL) is a powerful framework for learning complex behaviors from expert demonstrations.

Offline RL, reinforcement-learning, +1

Accelerating Inverse Reinforcement Learning with Expert Bootstrapping

no code implementations • 4 Feb 2024 • David Wu, Sanjiban Choudhury

Existing inverse reinforcement learning methods (e.g., MaxEntIRL, $f$-IRL) search over candidate reward functions and solve a reinforcement learning problem in the inner loop.

Imitation Learning, reinforcement-learning
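
The snippet above describes the nested structure shared by these methods: an outer search over reward functions, with a full reinforcement learning problem solved in the inner loop under the current reward. The following is a minimal sketch of that bi-level loop for a linear, feature-matching reward; featurize and solve_rl are hypothetical placeholders (a trajectory featurizer and an RL solver), and none of this is taken from the paper's implementation.

    import numpy as np

    def irl_outer_loop(expert_trajs, featurize, solve_rl, n_iters=50, lr=0.1):
        """Sketch of the bi-level structure of MaxEntIRL-style methods:
        an outer loop over reward parameters, with an entire RL problem
        solved in the inner loop. featurize and solve_rl are assumed
        placeholders, not part of the paper."""
        dim = featurize(expert_trajs[0]).shape[0]
        theta = np.zeros(dim)                    # linear reward weights
        mu_expert = np.mean([featurize(t) for t in expert_trajs], axis=0)

        for _ in range(n_iters):
            # Inner loop: solve a full RL problem under the current reward.
            reward_fn = lambda traj: featurize(traj) @ theta
            learner_trajs = solve_rl(reward_fn)

            # Outer loop: nudge the reward so expert trajectories score
            # higher than the learner's (feature-matching gradient step).
            mu_learner = np.mean([featurize(t) for t in learner_trajs], axis=0)
            theta += lr * (mu_expert - mu_learner)

        return theta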

CryptOpt: Automatic Optimization of Straightline Code

1 code implementation • 31 May 2023 • Joel Kuepper, Andres Erbsen, Jason Gross, Owen Conoly, Chuyue Sun, Samuel Tian, David Wu, Adam Chlipala, Chitchanok Chuengsatiansup, Daniel Genkin, Markus Wagner, Yuval Yarom

Manual engineering of high-performance implementations typically consumes many resources and requires in-depth knowledge of the hardware.

Robust Risk-Aware Option Hedging

no code implementations • 27 Mar 2023 • David Wu, Sebastian Jaimungal

The objectives of option hedging/trading extend beyond mere protection against downside risks, with a desire to seek gains also driving agents' strategies.

Reinforcement Learning (RL)
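
As an illustration of the risk-aware objectives the snippet alludes to, the sketch below scores a distribution of hedging P&L by its mean minus a Conditional Value-at-Risk penalty. The functional form, the alpha level, and the risk_weight trade-off are assumptions made for illustration, not the robust risk-aware criterion developed in the paper.

    import numpy as np

    def mean_cvar_objective(pnl, alpha=0.9, risk_weight=0.5):
        """Generic risk-aware score for a hedging strategy: reward expected
        gains while penalising the Conditional Value-at-Risk (CVaR) of
        losses. Illustrative only; not the paper's robust criterion."""
        pnl = np.asarray(pnl, dtype=float)
        losses = -pnl                            # losses are negated P&L
        var = np.quantile(losses, alpha)         # Value-at-Risk at level alpha
        cvar = losses[losses >= var].mean()      # average loss beyond the VaR
        return pnl.mean() - risk_weight * cvar   # seek gains, limit tail risk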

Improving Chess Commentaries by Combining Language Models with Symbolic Reasoning Engines

no code implementations • 15 Dec 2022 • Andrew Lee, David Wu, Emily Dinan, Mike Lewis

Despite many recent advancements in language modeling, state-of-the-art language models lack grounding in the real world and struggle with tasks involving complex reasoning.

Language Modelling

CryptOpt: Verified Compilation with Randomized Program Search for Cryptographic Primitives (full version)

1 code implementation • 19 Nov 2022 • Joel Kuepper, Andres Erbsen, Jason Gross, Owen Conoly, Chuyue Sun, Samuel Tian, David Wu, Adam Chlipala, Chitchanok Chuengsatiansup, Daniel Genkin, Markus Wagner, Yuval Yarom

Most software domains rely on compilers to translate high-level code to multiple different machine languages, with performance not too much worse than what developers would have the patience to write directly in assembly language.

Benchmarking, C++ code

Self-Explaining Deviations for Coordination

no code implementations • 13 Jul 2022 • Hengyuan Hu, Samuel Sokota, David Wu, Anton Bakhtin, Andrei Lupu, Brandon Cui, Jakob N. Foerster

Fully cooperative, partially observable multi-agent problems are ubiquitous in the real world.

$AIR^2$ for Interaction Prediction

1 code implementation • 16 Nov 2021 • David Wu, Yunnan Wu

The 2021 Waymo Interaction Prediction Challenge introduced a problem of predicting the future trajectories and confidences of two interacting agents jointly.

motion prediction
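
The joint formulation described in the snippet means each predicted mode covers both interacting agents and carries a single shared confidence, rather than scoring each agent independently. The shapes and names below are illustrative assumptions about such an output, not the paper's architecture.

    import numpy as np

    # Illustrative joint-prediction output: K joint modes, each holding the
    # future (x, y) trajectories of BOTH interacting agents over T timesteps.
    K, T = 6, 80
    joint_trajs = np.random.randn(K, 2, T, 2)   # [mode, agent, timestep, xy]
    logits = np.random.randn(K)                 # one score per joint mode

    # A single softmax over joint modes gives one confidence per pair of
    # trajectories, so the two agents' futures are predicted jointly.
    confidences = np.exp(logits - logits.max())
    confidences /= confidences.sum()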

QK Iteration: A Self-Supervised Representation Learning Algorithm for Image Similarity

no code implementations • 15 Nov 2021 • David Wu, Yunnan Wu

Previous work in contrastive self-supervised learning has identified the importance of being able to optimize representations while "pushing" against a large number of negative examples.

Copy Detection, Image Retrieval, +2
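
The principle the snippet refers to is commonly written as an InfoNCE-style loss, in which a query embedding is contrasted against one positive and a large bank of negatives. The sketch below shows that generic loss; it is not the QK Iteration algorithm itself, and the function and argument names are assumptions.

    import numpy as np

    def info_nce_loss(query, positive, negatives, temperature=0.1):
        """Minimal InfoNCE-style contrastive loss: pull the query toward its
        positive embedding while "pushing" against a bank of negatives.
        Illustrates the general principle only, not QK Iteration."""
        def normalize(x):
            return x / np.linalg.norm(x, axis=-1, keepdims=True)

        q, p, n = normalize(query), normalize(positive), normalize(negatives)
        pos_logit = q @ p / temperature          # similarity to the positive
        neg_logits = n @ q / temperature         # one logit per negative
        logits = np.concatenate([[pos_logit], neg_logits])
        # Cross-entropy with the positive treated as the correct class.
        return -pos_logit + np.log(np.exp(logits).sum())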

No-Press Diplomacy from Scratch

1 code implementation • NeurIPS 2021 • Anton Bakhtin, David Wu, Adam Lerer, Noam Brown

Additionally, we extend our methods to full-scale no-press Diplomacy and for the first time train an agent from scratch with no human data.

Starcraft

Likelihood-based estimation and prediction for a measles outbreak in Samoa

2 code implementations • 30 Mar 2021 • David Wu, Helen Petousis-Harris, Janine Paynter, Vinod Suresh, Oliver J. Maclaren

Stochastic models can help with misspecification but are even more expensive to simulate and perform inference with.

Uncertainty Quantification
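
As context for the likelihood-based approach named in the title, here is a generic sketch: a deterministic SIR model integrated with scipy's ODE solver, a Poisson observation model on weekly case counts, and maximum-likelihood fitting. The model structure, observation process, parameter bounds, and starting values are illustrative assumptions, not the model actually fitted to the Samoa outbreak.

    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import minimize
    from scipy.stats import poisson

    def fit_sir_by_likelihood(weekly_cases, population):
        """Maximum-likelihood fit of an illustrative SIR model to weekly
        case counts via a Poisson observation model. Not the paper's model."""
        weeks = np.arange(len(weekly_cases) + 1)

        def expected_incidence(params):
            beta, gamma, i0 = params
            def rhs(t, y):
                s, i, r = y
                return [-beta * s * i / population,
                        beta * s * i / population - gamma * i,
                        gamma * i]
            sol = solve_ivp(rhs, (0, weeks[-1]),
                            [population - i0, i0, 0.0], t_eval=weeks)
            s = sol.y[0]
            return np.maximum(s[:-1] - s[1:], 1e-9)   # new infections per week

        def neg_log_lik(params):
            return -poisson.logpmf(weekly_cases, expected_incidence(params)).sum()

        return minimize(neg_log_lik, x0=[1.5, 1.0, 1.0],
                        bounds=[(0.1, 10), (0.1, 10), (0.1, 100)],
                        method="L-BFGS-B")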

Off-Belief Learning

5 code implementations • 6 Mar 2021 • Hengyuan Hu, Adam Lerer, Brandon Cui, David Wu, Luis Pineda, Noam Brown, Jakob Foerster

Policies learned through self-play may adopt arbitrary conventions and implicitly rely on multi-step reasoning based on fragile assumptions about other agents' actions and thus fail when paired with humans or independently trained agents at test time.

Maximum a Posteriori Inference of Random Dot Product Graphs via Conic Programming

no code implementations • 6 Jan 2021 • David Wu, David R. Palmer, Daryl R. Deford

We present a convex cone program to infer the latent probability matrix of a random dot product graph (RDPG).

Bayesian Inference
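
A random dot product graph has P = X X^T, so the latent probability matrix is positive semidefinite with entries in [0, 1]. The cvxpy sketch below maximises the Bernoulli log-likelihood of an observed adjacency matrix over that set; it is a generic convex formulation for illustration, not necessarily the exact cone program or MAP objective derived in the paper.

    import cvxpy as cp
    import numpy as np

    def infer_rdpg_prob_matrix(adjacency):
        """Infer a latent probability matrix P for an RDPG by maximising the
        Bernoulli log-likelihood over PSD matrices with entries in (0, 1).
        Illustrative formulation only; not necessarily the paper's program."""
        A = np.asarray(adjacency, dtype=float)
        n = A.shape[0]
        P = cp.Variable((n, n), PSD=True)       # RDPG structure: P = X X^T is PSD

        log_lik = cp.sum(cp.multiply(A, cp.log(P)) +
                         cp.multiply(1 - A, cp.log(1 - P)))
        problem = cp.Problem(cp.Maximize(log_lik),
                             [P >= 1e-6, P <= 1 - 1e-6])
        problem.solve(solver=cp.SCS)
        return P.value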
