Search Results for author: Qianli Shen

Found 9 papers, 1 papers with code

Deep Reinforcement Learning with Smooth Policy

no code implementations ICML 2020 Qianli Shen, Yan Li, Haoming Jiang, Zhaoran Wang, Tuo Zhao

In contrast to policy parameterized by linear/reproducing kernel functions, where simple regularization techniques suffice to control smoothness, for neural network based reinforcement learning algorithms, there is no readily available solution to learn a smooth policy.

Deep Reinforcement Learning reinforcement-learning +1

FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models

no code implementations28 May 2024 Yang Zhang, Yawei Li, Xinpeng Wang, Qianli Shen, Barbara Plank, Bernd Bischl, Mina Rezaei, Kenji Kawaguchi

Overparametrized transformer networks are the state-of-the-art architecture for Large Language Models (LLMs).

Decoder

The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline

no code implementations7 Jan 2024 Haonan Wang, Qianli Shen, Yao Tong, Yang Zhang, Kenji Kawaguchi

Our method strategically embeds connections between pieces of copyrighted information and text references in poisoning data while carefully dispersing that information, making the poisoning data inconspicuous when integrated into a clean dataset.

Backdoor Attack Data Poisoning +1

VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models

1 code implementation CVPR 2024 Xiang Li, Qianli Shen, Kenji Kawaguchi

The booming use of text-to-image generative models has raised concerns about their high risk of producing copyright-infringing content.

State-Aware Proximal Pessimistic Algorithms for Offline Reinforcement Learning

no code implementations28 Nov 2022 Chen Chen, Hongyao Tang, Yi Ma, Chao Wang, Qianli Shen, Dong Li, Jianye Hao

The key idea of SA-PP is leveraging discounted stationary state distribution ratios between the learning policy and the offline dataset to modulate the degree of behavior regularization in a state-wise manner, so that pessimism can be implemented in a more appropriate way.

Offline RL Q-Learning +3

GFlowOut: Dropout with Generative Flow Networks

no code implementations24 Oct 2022 Dianbo Liu, Moksh Jain, Bonaventure Dossou, Qianli Shen, Salem Lahlou, Anirudh Goyal, Nikolay Malkin, Chris Emezue, Dinghuai Zhang, Nadhir Hassen, Xu Ji, Kenji Kawaguchi, Yoshua Bengio

These methods face two important challenges: (a) the posterior distribution over masks can be highly multi-modal which can be difficult to approximate with standard variational inference and (b) it is not trivial to fully utilize sample-dependent information and correlation among dropout masks to improve posterior estimation.

Bayesian Inference Variational Inference

Cannot find the paper you are looking for? You can Submit a new open access paper.