no code implementations • 22 Mar 2024 • Erik Miehling, Manish Nagireddy, Prasanna Sattigeri, Elizabeth M. Daly, David Piorkowski, John T. Richards
Modern language models, while sophisticated, exhibit some inherent shortcomings, particularly in conversational settings.
no code implementations • 9 Mar 2024 • Swapnaja Achintalwar, Adriana Alvarado Garcia, Ateret Anaby-Tavor, Ioana Baldini, Sara E. Berger, Bishwaranjan Bhattacharjee, Djallel Bouneffouf, Subhajit Chaudhury, Pin-Yu Chen, Lamogha Chiazor, Elizabeth M. Daly, Rogério Abreu de Paula, Pierre Dognin, Eitan Farchi, Soumya Ghosh, Michael Hind, Raya Horesh, George Kour, Ja Young Lee, Erik Miehling, Keerthiram Murugesan, Manish Nagireddy, Inkit Padhi, David Piorkowski, Ambrish Rawat, Orna Raz, Prasanna Sattigeri, Hendrik Strobelt, Sarathkrishna Swaminathan, Christoph Tillmann, Aashka Trivedi, Kush R. Varshney, Dennis Wei, Shalisha Witherspooon, Marcel Zalmanovici
Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output to biased and toxic generations.
no code implementations • 9 Sep 2020 • Muhammad Aneeq uz Zaman, Kaiqing Zhang, Erik Miehling, Tamer Başar
We propose an actor-critic algorithm to iteratively compute the mean-field equilibrium (MFE) of the LQ-MFG.
Multi-agent Reinforcement Learning • Reinforcement Learning
1 code implementation • 2 Apr 2020 • Weichao Mao, Kaiqing Zhang, Erik Miehling, Tamer Başar
To enable the development of tractable algorithms, we introduce the concept of an information state embedding that serves to compress agents' histories.
Multi-agent Reinforcement Learning • Reinforcement Learning
no code implementations • NeurIPS 2019 • Xiangyuan Zhang, Kaiqing Zhang, Erik Miehling, Tamer Başar
Through interacting with the more informed player, the less informed player attempts to both infer, and act according to, the true objective function.
no code implementations • 6 Aug 2019 • Kaiqing Zhang, Erik Miehling, Tamer Başar
To demonstrate the applicability of the model, we propose a novel collaborative intrusion response model, where multiple agents (defenders) possessing asymmetric information aim to collaboratively defend a computer network.