no code implementations • EMNLP (ClinicalNLP) 2020 • Wenjie Wang, Youngja Park, Taesung Lee, Ian Molloy, Pengfei Tang, Li Xiong
Among the modalities of medical data, clinical summaries are at higher risk of attack because they are generated by third-party companies.
no code implementations • 31 Jan 2025 • Mrinank Sharma, Meg Tong, Jesse Mu, Jerry Wei, Jorrit Kruthoff, Scott Goodfriend, Euan Ong, Alwin Peng, Raj Agarwal, Cem Anil, Amanda Askell, Nathan Bailey, Joe Benton, Emma Bluemke, Samuel R. Bowman, Eric Christiansen, Hoagy Cunningham, Andy Dau, Anjali Gopal, Rob Gilson, Logan Graham, Logan Howard, Nimit Kalra, Taesung Lee, Kevin Lin, Peter Lofgren, Francesco Mosconi, Clare O'Hara, Catherine Olsson, Linda Petrini, Samir Rajani, Nikhil Saxena, Alex Silverstein, Tanya Singh, Theodore Sumers, Leonard Tang, Kevin K. Troy, Constantin Weisser, Ruiqi Zhong, Giulio Zhou, Jan Leike, Jared Kaplan, Ethan Perez
Large language models (LLMs) are vulnerable to universal jailbreaks: prompting strategies that systematically bypass model safeguards and enable users to carry out harmful processes that require many model interactions, such as manufacturing illegal substances at scale.
no code implementations • 30 Jan 2024 • Shrayani Mondal, Rishabh Garodia, Arbaaz Qureshi, Taesung Lee, Youngja Park
We leverage the potential of generative language models to discover human-interpretable descriptors present in a dataset and use an unsupervised approach to explain neurons with these descriptors.
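A minimal sketch of the general idea, not the paper's actual method: once candidate descriptors have been discovered, one unsupervised way to attach a descriptor to a neuron is to measure how strongly the neuron's activations correlate with the descriptor's presence across the dataset. The function name and scoring choice below are illustrative assumptions.

```python
# Illustrative sketch (assumed, not the paper's exact scoring): align a neuron
# with a descriptor by correlating activations with descriptor occurrence.
import numpy as np

def descriptor_alignment(activations, descriptor_mask):
    """activations: (n_examples,) neuron activations as a float array.
    descriptor_mask: (n_examples,) 1.0 if the descriptor applies, else 0.0."""
    a = (activations - activations.mean()) / (activations.std() + 1e-8)
    d = (descriptor_mask - descriptor_mask.mean()) / (descriptor_mask.std() + 1e-8)
    return float(np.mean(a * d))  # Pearson correlation in [-1, 1]

# The descriptor with the highest alignment score would serve as the
# human-interpretable explanation for that neuron.
```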
1 code implementation • 3 Aug 2023 • Kevin Eykholt, Taesung Lee, Douglas Schales, Jiyong Jang, Ian Molloy, Masha Zorin
In this work, we propose a new framework to enable the generation of adversarial inputs irrespective of the input type and task domain.
1 code implementation • 15 Jun 2023 • Myles Foley, Ambrish Rawat, Taesung Lee, Yufang Hou, Gabriele Picco, Giulio Zizzo
The wide applicability and adaptability of generative large language models (LLMs) have enabled their rapid adoption.
no code implementations • 24 Jun 2022 • Shriya Atmakuri, Tejas Chheda, Dinesh Kandula, Nishant Yadav, Taesung Lee, Hessel Tuinhof
Explanation methods have emerged as an important tool to highlight the features responsible for the predictions of neural networks.
no code implementations • 14 Dec 2020 • Shiqi Wang, Kevin Eykholt, Taesung Lee, Jiyong Jang, Ian Molloy
On CIFAR10, a non-robust LeNet model has a 21.63% error rate, while a model created using verifiable training and an L-infinity robustness criterion of 8/255 has an error rate of 57.10%.
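For context on what the 8/255 budget means, here is a hedged sketch of measuring error under an L-infinity perturbation using a standard one-step FGSM attack; this is an illustration of the threat model only, an assumption on my part, and not the paper's verifiable-training or certification procedure.

```python
# Sketch (assumed setup): empirical error rate under an L-infinity budget of
# eps = 8/255, using one-step FGSM. Inputs are assumed scaled to [0, 1].
import torch
import torch.nn.functional as F

def fgsm_error_rate(model, loader, eps=8/255):
    model.eval()
    wrong, total = 0, 0
    for x, y in loader:
        x = x.clone().requires_grad_(True)
        F.cross_entropy(model(x), y).backward()
        # Move each pixel by at most eps in the direction that raises the loss.
        x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()
        with torch.no_grad():
            wrong += (model(x_adv).argmax(dim=1) != y).sum().item()
        total += y.numel()
    return wrong / total  # empirical (lower-bound) robust error
```

Note that an empirical attack like FGSM only lower-bounds the true robust error, whereas verifiable training certifies an upper bound.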
no code implementations • 11 Jun 2020 • Kathrin Grosse, Taesung Lee, Battista Biggio, Youngja Park, Michael Backes, Ian Molloy
Backdoor attacks mislead machine-learning models to output an attacker-specified class when presented with a specific trigger at test time.
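To make the trigger mechanism concrete, here is a minimal sketch of the classic data-poisoning form of a backdoor; the patch shape, position, and function names are illustrative assumptions, not this paper's setup.

```python
# Sketch of a poisoned training sample for a backdoor attack (assumed form):
# stamp a small patch onto the image and relabel it to the attacker's target.
import numpy as np

def apply_trigger(image, target_label, patch_value=1.0, size=3):
    """image: (H, W, C) float array in [0, 1]; returns a poisoned copy."""
    poisoned = image.copy()
    poisoned[-size:, -size:, :] = patch_value  # bottom-right square trigger
    return poisoned, target_label  # attacker-specified label, not the true one

# A model trained on enough such samples behaves normally on clean inputs but
# predicts target_label whenever the trigger patch appears at test time.
```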
no code implementations • IJCNLP 2019 • Arpita Roy, Youngja Park, Taesung Lee, Shimei Pan
We propose a novel supervised open information extraction (Open IE) framework that leverages an ensemble of unsupervised Open IE systems and a small amount of labeled data to improve system performance.
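One simple way to picture the ensemble component (a hedged sketch under assumptions; the paper's actual supervised aggregation is more involved) is vote-based agreement over the triples that the unsupervised Open IE systems extract:

```python
# Illustrative sketch (not the paper's exact aggregation): keep a
# (subject, relation, object) triple if enough base systems agree on it.
from collections import Counter

def ensemble_extract(extractions_per_system, min_votes=2):
    """extractions_per_system: list of iterables of (subj, rel, obj) triples,
    one iterable per unsupervised Open IE system."""
    votes = Counter(t for triples in extractions_per_system for t in set(triples))
    return {t for t, v in votes.items() if v >= min_votes}
```

In the paper's framework, a small amount of labeled data supervises how the ensemble's signals are combined rather than using a fixed vote threshold like this.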
1 code implementation • 9 Nov 2018 • Bryant Chen, Wilka Carvalho, Nathalie Baracaldo, Heiko Ludwig, Benjamin Edwards, Taesung Lee, Ian Molloy, Biplav Srivastava
While machine learning (ML) models are increasingly trusted to make decisions in a wide variety of areas, the safety of systems using such models has become an increasing concern.
no code implementations • 31 May 2018 • Taesung Lee, Benjamin Edwards, Ian Molloy, Dong Su
Machine learning models are vulnerable to simple model stealing attacks if the adversary can obtain output labels for chosen inputs.
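The threat model described here is easy to sketch: the attacker picks inputs, observes only the victim's output labels, and fits a surrogate to imitate it. The query distribution, surrogate family, and names below are illustrative assumptions, not the paper's construction.

```python
# Sketch of label-only model stealing (assumed setup): query the victim on
# chosen inputs, then train a surrogate on the returned labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

def steal_model(victim_predict, n_queries=1000, dim=20, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_queries, dim))  # attacker-chosen query inputs
    y = victim_predict(X)                  # victim exposes only output labels
    surrogate = LogisticRegression(max_iter=1000).fit(X, y)
    return surrogate  # approximates the victim's decision boundary
```

Defenses in this line of work typically perturb or restrict the returned labels so that such a surrogate converges slowly or inaccurately.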
no code implementations • ICLR 2018 • Taesung Lee, Youngja Park
We present a new unsupervised method for learning general-purpose sentence embeddings.
no code implementations • COLING 2016 • Taesung Lee, Seung-won Hwang, Zhongyuan Wang
Besides providing relevant information, amusing users has been an important function of the web.