Search Results for author: Tuomas Oikarinen

Found 7 papers, 3 papers with code

Describe-and-Dissect: Interpreting Neurons in Vision Networks with Language Models

no code implementations · 20 Mar 2024 · Nicholas Bai, Rahul A. Iyer, Tuomas Oikarinen, Tsui-Wei Weng

In this paper, we propose Describe-and-Dissect (DnD), a novel method to describe the roles of hidden neurons in vision networks.

Multimodal Deep Learning

Corrupting Neuron Explanations of Deep Visual Features

no code implementations · ICCV 2023 · Divyansh Srivastava, Tuomas Oikarinen, Tsui-Wei Weng

The inability of DNNs to explain their black-box behavior has led to a recent surge of explainability methods.

Fairness

The Importance of Prompt Tuning for Automated Neuron Explanations

no code implementations · 9 Oct 2023 · Justin Lee, Tuomas Oikarinen, Arjun Chatha, Keng-Chi Chang, Yilan Chen, Tsui-Wei Weng

Recent advances have greatly increased the capabilities of large language models (LLMs), but our understanding of the models and their safety has not progressed as fast.

Language Modelling

Concept-Monitor: Understanding DNN training through individual neurons

no code implementations · 26 Apr 2023 · Mohammad Ali Khan, Tuomas Oikarinen, Tsui-Wei Weng

In this work, we propose a general framework called Concept-Monitor to help demystify the black-box DNN training processes automatically using a novel unified embedding space and concept diversity metric.

Network Pruning

Label-Free Concept Bottleneck Models

1 code implementation · 12 Apr 2023 · Tuomas Oikarinen, Subhro Das, Lam M. Nguyen, Tsui-Wei Weng

Motivated by these challenges, we propose Label-free CBM, a novel framework that transforms any neural network into an interpretable CBM without labeled concept data, while retaining high accuracy.
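To illustrate the general idea of a concept bottleneck model (a minimal generic sketch, not the authors' Label-free CBM code; the arrays `features`, `concept_embeds`, `W`, and `b` are hypothetical stand-ins): the backbone's features are projected onto a small set of interpretable concept scores, and the final prediction is a linear map over those scores, so each logit decomposes into per-concept contributions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_concepts, feat_dim, n_classes = 5, 8, 3
features = rng.normal(size=feat_dim)                      # backbone features for one image
concept_embeds = rng.normal(size=(n_concepts, feat_dim))  # one row per concept direction

# Bottleneck: project features onto interpretable concept scores.
concept_scores = concept_embeds @ features                # shape (n_concepts,)

# Final layer is linear over concept scores, so each logit is a sum of
# per-concept contributions W[c, k] * concept_scores[k].
W = rng.normal(size=(n_classes, n_concepts))
b = np.zeros(n_classes)
logits = W @ concept_scores + b
print(logits.shape)  # (3,)
```

In a label-free setting the concept directions would come from somewhere other than concept-annotated data (e.g. a language-vision model), rather than being hand-labeled per image.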

CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks

1 code implementation · 23 Apr 2022 · Tuomas Oikarinen, Tsui-Wei Weng

Finally, CLIP-Dissect is computationally efficient: it can label all neurons from five layers of ResNet-50 in just 4 minutes, more than 10 times faster than existing methods.
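The underlying matching idea can be sketched generically (this is a toy illustration, not the CLIP-Dissect implementation; the synthetic `concept_scores` stand in for CLIP image-text similarities over a probe image set): a neuron is labeled with the concept whose per-image score profile best correlates with the neuron's activation profile.

```python
import numpy as np

rng = np.random.default_rng(1)

n_images, n_concepts = 100, 4
concepts = ["dog", "stripes", "sky", "wheel"]  # hypothetical probe concepts

# Stand-ins: per-image concept scores (in CLIP-Dissect these would come from
# a vision-language model) and one neuron's activations on the same images.
concept_scores = rng.normal(size=(n_images, n_concepts))
neuron_acts = concept_scores[:, 1] + 0.1 * rng.normal(size=n_images)

def zscore(x):
    return (x - x.mean(0)) / x.std(0)

# Pearson correlation between each concept's score profile and the neuron.
sim = zscore(concept_scores).T @ zscore(neuron_acts) / n_images
best = concepts[int(np.argmax(sim))]
print(best)  # "stripes" for this synthetic neuron
```

Real similarity functions can be more elaborate than plain correlation, but the structure is the same: one score per (neuron, concept) pair, take the argmax.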

Robust Deep Reinforcement Learning through Adversarial Loss

2 code implementations · NeurIPS 2021 · Tuomas Oikarinen, Wang Zhang, Alexandre Megretski, Luca Daniel, Tsui-Wei Weng

To address this issue, we propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against $l_p$-norm bounded adversarial attacks.
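For context, an $l_p$-norm bounded attack on an RL agent perturbs the observation within a small norm ball to degrade the policy's chosen action. A minimal $l_\infty$ sketch (not RADIAL-RL itself; the linear "Q-network" `W` and observation `s` are toy stand-ins) steps against the gradient of the chosen action's value:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy linear "Q-network": Q(s) = W @ s, so the gradient of Q[a] w.r.t. s
# is simply the row W[a].
W = rng.normal(size=(4, 6))   # 4 actions, 6-dim observation
s = rng.normal(size=6)

def q_values(obs):
    return W @ obs

a = int(np.argmax(q_values(s)))          # agent's greedy action on clean input

# FGSM-style l_inf-bounded perturbation (radius eps) that lowers Q[a]:
eps = 0.1
s_adv = s - eps * np.sign(W[a])

assert np.max(np.abs(s_adv - s)) <= eps + 1e-12   # stays inside the l_inf ball
print(q_values(s_adv)[a] <= q_values(s)[a])       # True: chosen action's value drops
```

Robust training frameworks in this vein optimize the policy so that such bounded perturbations cannot change which action looks best.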

Adversarial Attack · Atari Games · +3
