Search Results for author: Junlin Wu

Found 7 papers, 3 papers with code

Preference Poisoning Attacks on Reward Model Learning

no code implementations • 2 Feb 2024 • Junlin Wu, Jiongxiao Wang, Chaowei Xiao, Chenguang Wang, Ning Zhang, Yevgeniy Vorobeychik

In addition, we observe that the simpler and more scalable rank-by-distance approaches are often competitive with the best, and on occasion significantly outperform gradient-based methods.

On the Exploitability of Reinforcement Learning with Human Feedback for Large Language Models

no code implementations • 16 Nov 2023 • Jiongxiao Wang, Junlin Wu, Muhao Chen, Yevgeniy Vorobeychik, Chaowei Xiao

Reinforcement Learning with Human Feedback (RLHF) is a methodology designed to align Large Language Models (LLMs) with human preferences, and it plays an important role in LLM alignment.

Backdoor Attack Data Poisoning

Neural Lyapunov Control for Discrete-Time Systems

1 code implementation • NeurIPS 2023 • Junlin Wu, Andrew Clark, Yiannis Kantaros, Yevgeniy Vorobeychik

However, finding Lyapunov functions for general nonlinear systems is a challenging task.

Certifying Safety in Reinforcement Learning under Adversarial Perturbation Attacks

no code implementations • 28 Dec 2022 • Junlin Wu, Hussein Sibai, Yevgeniy Vorobeychik

Our experiments demonstrate both the efficacy of the proposed approach for certifying safety in adversarial environments, and the value of the PSRL framework coupled with adversarial training in improving certified safety while preserving high nominal reward and high-quality predictions of true state.

Reinforcement Learning (RL)

Robust Deep Reinforcement Learning through Bootstrapped Opportunistic Curriculum

1 code implementation • 21 Jun 2022 • Junlin Wu, Yevgeniy Vorobeychik

Despite considerable advances in deep reinforcement learning, it has been shown to be highly vulnerable to adversarial perturbations to state observations.

Adversarial Robustness reinforcement-learning +1

Learning Generative Deception Strategies in Combinatorial Masking Games

no code implementations • 23 Sep 2021 • Junlin Wu, Charles Kamhoua, Murat Kantarcioglu, Yevgeniy Vorobeychik

Next, we present a novel highly scalable approach for approximately solving such games by representing the strategies of both players as neural networks.

CARER: Contextualized Affect Representations for Emotion Recognition

1 code implementation • EMNLP 2018 • Elvis Saravia, Hsien-Chi Toby Liu, Yen-Hao Huang, Junlin Wu, Yi-Shin Chen

Emotions are expressed in nuanced ways, which vary by collective or individual experiences, knowledge, and beliefs.

Emotion Recognition Semantic Textual Similarity +1
