no code implementations • 11 Apr 2024 • Richard Fang, Rohan Bindu, Akul Gupta, Daniel Kang
In this work, we show that LLM agents can autonomously exploit one-day vulnerabilities in real-world systems.
no code implementations • 6 Feb 2024 • Richard Fang, Rohan Bindu, Akul Gupta, Qiusi Zhan, Daniel Kang
However, little is known about the offensive capabilities of LLM agents.
no code implementations • 9 Nov 2023 • Qiusi Zhan, Richard Fang, Rohan Bindu, Akul Gupta, Tatsunori Hashimoto, Daniel Kang
In tandem, LLM vendors have been increasingly enabling fine-tuning of their most powerful models.