Search Results for author: Daniel Filan

Found 8 papers, 3 papers with code

Quantifying Local Specialization in Deep Neural Networks

3 code implementations • 13 Oct 2021 • Shlomi Hod, Daniel Filan, Stephen Casper, Andrew Critch, Stuart Russell

These results suggest that graph-based partitioning can reveal local specialization, and that statistical methods can be used to automatically screen for sets of neurons that can be understood abstractly.
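The "statistical screening" step can be pictured as comparing a candidate grouping of neurons against randomly shuffled groupings of the same sizes. The sketch below is an illustrative assumption about how such a test might look (toy weight matrix, a simple within-group weight-mass statistic, empirical p-value), not the paper's exact procedure.

```python
# Hypothetical sketch: screen a candidate neuron grouping by comparing a
# within-group weight-mass statistic against randomly permuted groupings.
import numpy as np

rng = np.random.default_rng(0)

# Toy "network": one weight matrix between 20 input and 20 output neurons,
# with extra weight mass planted inside two groups.
W = np.abs(rng.normal(size=(20, 20)))
W[:10, :10] += 2.0
W[10:, 10:] += 2.0

def within_group_mass(weights, in_groups, out_groups):
    """Fraction of total |weight| connecting neurons assigned to the same group."""
    same = in_groups[:, None] == out_groups[None, :]
    return weights[same].sum() / weights.sum()

candidate_in = np.repeat([0, 1], 10)   # candidate partition of input neurons
candidate_out = np.repeat([0, 1], 10)  # candidate partition of output neurons
observed = within_group_mass(W, candidate_in, candidate_out)

# Null distribution: same group sizes, random assignment of neurons to groups.
null = np.array([
    within_group_mass(W, rng.permutation(candidate_in), rng.permutation(candidate_out))
    for _ in range(1000)
])
p_value = (null >= observed).mean()
print(f"observed={observed:.3f}  p-value={p_value:.3f}")
```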

Detecting Modularity in Deep Neural Networks

no code implementations • 29 Sep 2021 • Shlomi Hod, Stephen Casper, Daniel Filan, Cody Wild, Andrew Critch, Stuart Russell

These results suggest that graph-based partitioning can reveal modularity and help us understand how deep neural networks function.

Clusterability in Neural Networks

2 code implementations • 4 Mar 2021 • Daniel Filan, Stephen Casper, Shlomi Hod, Cody Wild, Andrew Critch, Stuart Russell

We also exhibit novel methods to promote clusterability in neural network training, and find that in multi-layer perceptrons they lead to more clusterable networks with little reduction in accuracy.
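One way to picture "promoting clusterability" is to penalize weights that cross a fixed assignment of neurons to clusters during training. The sketch below is a hypothetical illustration of that idea in PyTorch; the cluster assignment, penalty form, and coefficient are assumptions for illustration, not the method the paper actually uses.

```python
# Toy illustration (not the paper's method): an L1 penalty on weights that
# cross an arbitrary, fixed assignment of neurons to clusters.
import torch

def cross_cluster_penalty(weight, in_clusters, out_clusters):
    """L1 mass of weights connecting neurons assigned to different clusters.

    weight:       (out_dim, in_dim) parameter of a linear layer
    in_clusters:  (in_dim,) integer cluster labels for input neurons
    out_clusters: (out_dim,) integer cluster labels for output neurons
    """
    cross = out_clusters[:, None] != in_clusters[None, :]
    return weight.abs()[cross].sum()

# Usage inside a training step (sketch): add the penalty to the task loss.
layer = torch.nn.Linear(64, 64)
in_c = torch.randint(0, 4, (64,))    # arbitrary 4-way cluster assignment
out_c = torch.randint(0, 4, (64,))
loss = cross_cluster_penalty(layer.weight, in_c, out_c) * 1e-3
loss.backward()  # gradients flow into layer.weight as usual
```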

Importance and Coherence: Methods for Evaluating Modularity in Neural Networks

no code implementations • 1 Jan 2021 • Shlomi Hod, Stephen Casper, Daniel Filan, Cody Wild, Andrew Critch, Stuart Russell

We apply these methods on partitionings generated by a spectral clustering algorithm which uses a graph representation of the network's neurons and weights.

Clustering
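The graph representation described above can be sketched as follows: neurons are nodes, absolute weight magnitudes are edge weights, and a spectral clustering routine partitions the resulting adjacency matrix. This is a minimal sketch using scikit-learn and a toy MLP, not the authors' implementation.

```python
# Minimal sketch: build a neuron graph from absolute MLP weights and partition
# it with off-the-shelf spectral clustering.
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)

# Toy MLP weights: layer sizes 8 -> 6 -> 4, so 18 neurons in total.
weights = [np.abs(rng.normal(size=(6, 8))), np.abs(rng.normal(size=(4, 6)))]
sizes = [8, 6, 4]
n = sum(sizes)
offsets = np.cumsum([0] + sizes)

# Symmetric adjacency matrix over all neurons; edges only between adjacent layers.
A = np.zeros((n, n))
for k, W in enumerate(weights):
    rows = slice(offsets[k + 1], offsets[k + 2])  # layer k+1 neurons
    cols = slice(offsets[k], offsets[k + 1])      # layer k neurons
    A[rows, cols] = W
    A[cols, rows] = W.T

labels = SpectralClustering(
    n_clusters=3, affinity="precomputed", random_state=0
).fit_predict(A)
print(labels)  # cluster label for each of the 18 neurons
```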

Pruned Neural Networks are Surprisingly Modular

1 code implementation • 10 Mar 2020 • Daniel Filan, Shlomi Hod, Cody Wild, Andrew Critch, Stuart Russell

To discern structure in these weights, we introduce a measurable notion of modularity for multi-layer perceptrons (MLPs), and investigate the modular structure of MLPs trained on datasets of small images.

Clustering • Graph Clustering
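A common graph quantity for judging how good a given neuron partition is, is the normalized cut: the edge weight leaving each cluster divided by that cluster's total degree, summed over clusters. Treating n-cut as the modularity measure here is an assumption for illustration; the paper defines its own measurable notion.

```python
# Sketch: score a given partition of the neuron graph by its normalized cut.
# Lower n-cut than for shuffled partitions suggests modular structure.
import numpy as np

def normalized_cut(A, labels):
    """Sum over clusters of (edge weight leaving the cluster) / (cluster degree)."""
    degrees = A.sum(axis=1)
    total = 0.0
    for c in np.unique(labels):
        inside = labels == c
        cut = A[inside][:, ~inside].sum()   # edges leaving cluster c
        volume = degrees[inside].sum()      # total degree of cluster c
        total += cut / volume
    return total

rng = np.random.default_rng(0)
A = np.abs(rng.normal(size=(10, 10)))
A = (A + A.T) / 2                           # symmetric toy adjacency
labels = np.repeat([0, 1], 5)
print(normalized_cut(A, labels))
```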

Exploring Hierarchy-Aware Inverse Reinforcement Learning

no code implementations • 13 Jul 2018 • Chris Cundy, Daniel Filan

We introduce a new generative model for human planning under the Bayesian Inverse Reinforcement Learning (BIRL) framework which takes into account the fact that humans often plan using hierarchical strategies.

BIRL • reinforcement-learning • +1
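BIRL commonly models the demonstrator as Boltzmann-rational, with P(a | s, R) proportional to exp(β Q_R(s, a)). The sketch below shows that flat observation model for a tabular Q-function; the paper's hierarchical generative model of planning over sub-goals is not reproduced here.

```python
# Sketch of the standard Boltzmann-rational observation model used in BIRL.
import numpy as np

def action_likelihood(q_values, beta=2.0):
    """P(a | s, R) ∝ exp(beta * Q_R(s, a)) for one state's action values."""
    logits = beta * np.asarray(q_values, dtype=float)
    logits -= logits.max()                 # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

print(action_likelihood([1.0, 0.5, -0.2]))  # higher-Q actions are more likely
```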

Self-Modification of Policy and Utility Function in Rational Agents

no code implementations • 10 May 2016 • Tom Everitt, Daniel Filan, Mayank Daswani, Marcus Hutter

As we continue to create more and more intelligent agents, the chance increases that they will learn about this ability to modify their own policies and utility functions.

General Reinforcement Learning

Loss Bounds and Time Complexity for Speed Priors

no code implementations • 12 Apr 2016 • Daniel Filan, Marcus Hutter, Jan Leike

On a polynomial-time computable sequence, our speed prior is computable in exponential time.
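For context, a Kt-style speed prior is often written as below; the paper's exact definitions (monotone machine, normalization) may differ in detail.

```latex
% A common Kt-based formulation of a speed prior; the paper's precise
% definitions may differ in details.
\[
  Kt(x) \;=\; \min_{p \,:\, U(p) = x} \bigl(\ell(p) + \log t(p)\bigr),
  \qquad
  S_{Kt}(x) \;=\; 2^{-Kt(x)},
\]
where $\ell(p)$ is the length of program $p$ and $t(p)$ is its running time on
the universal machine $U$.
```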
