1 code implementation • CVPR 2023 • Paul Micaelli, Arash Vahdat, Hongxu Yin, Jan Kautz, Pavlo Molchanov
Our Landmark DEQ (LDEQ) achieves state-of-the-art performance on the challenging WFLW facial landmark dataset, reaching $3.92$ NME with fewer parameters and a training memory cost of $\mathcal{O}(1)$ in the number of recurrent modules.
Ranked #2 on Face Alignment on WFLW
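The $\mathcal{O}(1)$ training-memory claim comes from the deep-equilibrium formulation: the recurrent module is iterated to a fixed point without storing the solver's intermediate states, and gradients are recovered via the implicit function theorem. Below is a minimal PyTorch sketch of a generic DEQ layer, not LDEQ's landmark-specific architecture; the cell `f`, the zero initialization, and the solver settings are illustrative assumptions:

```python
import torch

class DEQLayer(torch.nn.Module):
    """Generic deep-equilibrium layer: output z* satisfies z* = f(z*, x).

    The solver loop runs under no_grad, so training memory is O(1) in
    the number of solver iterations; the backward pass inverts
    (I - J_f^T) implicitly instead of unrolling the solver.
    """
    def __init__(self, f, max_iter=50, tol=1e-4):
        super().__init__()
        self.f = f                  # any module with signature f(z, x)
        self.max_iter = max_iter
        self.tol = tol

    def forward(self, x):
        # 1) Find the fixed point without building an autograd graph.
        z = torch.zeros_like(x)
        with torch.no_grad():
            for _ in range(self.max_iter):
                z_new = self.f(z, x)
                if (z_new - z).norm() < self.tol:
                    z = z_new
                    break
                z = z_new
        # 2) One differentiable call re-attaches z* to f's parameters.
        z = self.f(z, x)
        if not z.requires_grad:     # e.g. inference under no_grad
            return z
        # 3) Implicit backward: solve g = grad + J_f(z*)^T g by fixed-point
        #    iteration, giving g = (I - J_f^T)^{-1} grad.
        z0 = z.detach().requires_grad_()
        f0 = self.f(z0, x)
        def backward_hook(grad):
            g = grad
            for _ in range(self.max_iter):
                g = torch.autograd.grad(f0, z0, g, retain_graph=True)[0] + grad
            return g
        z.register_hook(backward_hook)
        return z
```

Because only one differentiable call to `f` is retained, the memory cost does not grow with the number of solver iterations, which is what makes the $\mathcal{O}(1)$ scaling possible.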
no code implementations • 28 Sep 2020 • Paul Micaelli, Amos Storkey
We demonstrate that the hyperparameters of this optimizer can be learned non-greedily without gradient degradation over $\sim 10^4$ inner gradient steps, by only requiring $\sim 10$ outer gradient steps.
1 code implementation • NeurIPS 2021 • Paul Micaelli, Amos Storkey
Gradient-based hyperparameter optimization has gained widespread popularity in the context of few-shot meta-learning, but remains broadly impractical for tasks with long horizons (many gradient steps) due to memory scaling and gradient degradation issues.
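One standard way to obtain long-horizon hypergradients with constant memory is forward-mode differentiation: a tangent $\mathrm{d}w_t/\mathrm{d}\eta$ is carried alongside the weights, so no unrolled graph is ever stored. The sketch below illustrates that general idea for a scalar learning rate; `loss_fn`, `batches`, and `val_batch` are hypothetical stand-ins, and the paper's actual method adds further machinery on top of plain forward-mode differentiation:

```python
import torch

def lr_hypergradient(loss_fn, w, lr, batches, val_batch):
    """Forward-mode hypergradient of the final validation loss w.r.t. a
    scalar learning rate, with O(1) memory in the horizon length.

    A sketch of the forward-mode idea only; all names are illustrative.
    """
    dw = torch.zeros_like(w)                 # tangent dw_t / d lr
    for x, y in batches:
        w = w.detach().requires_grad_()
        g, = torch.autograd.grad(loss_fn(w, x, y), w, create_graph=True)
        # Hessian-vector product H_t @ dw via double backward.
        hvp, = torch.autograd.grad(g, w, grad_outputs=dw)
        # SGD step w' = w - lr*g and its tangent through the update:
        # dw' = dw - g - lr * H @ dw   (product rule).
        dw = dw - g.detach() - lr * hvp
        w = w - lr * g.detach()
    # Chain rule: dL_val/d lr = (dL_val/dw_T) . (dw_T/d lr).
    w = w.detach().requires_grad_()
    xv, yv = val_batch
    gv, = torch.autograd.grad(loss_fn(w, xv, yv), w)
    return (gv * dw).sum()                   # scalar hypergradient
```

Each inner step frees its graph immediately after the Hessian-vector product, so the memory footprint stays constant however many inner steps (e.g. $\sim 10^4$) are taken.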
1 code implementation • 11 Apr 2020 • Timothy Hospedales, Antreas Antoniou, Paul Micaelli, Amos Storkey
We survey promising applications and successes of meta-learning such as few-shot learning and reinforcement learning.
7 code implementations • NeurIPS 2019 • Paul Micaelli, Amos Storkey
Finally, we also propose a metric to quantify the degree of belief matching between teacher and student in the vicinity of decision boundaries, and observe a significantly higher match between our zero-shot student and the teacher than between the teacher and a student distilled with real data.
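A rough illustration of such a boundary-probing metric: perturb an input so the student is pushed toward another class, and measure how closely the teacher's belief in that class tracks the student's along the path. The sketch below is a hedged reconstruction, not the paper's exact metric; all names, the step rule, and the averaging are illustrative:

```python
import torch
import torch.nn.functional as F

def belief_match(teacher, student, x, num_classes, steps=20, step_size=1.0):
    """Average teacher/student belief agreement along student-driven
    paths toward each class boundary (illustrative, not the paper's metric)."""
    teacher.eval(); student.eval()
    with torch.no_grad():
        base = student(x).argmax(dim=1)      # current student predictions
    scores = []
    for j in range(num_classes):             # probe toward each class j
        # (in practice one would skip j when it equals the current class)
        xj = x.clone()
        match = 0.0
        for _ in range(steps):
            xj = xj.detach().requires_grad_(True)
            # Gradient step that increases the student's belief in class j.
            loss = F.cross_entropy(student(xj), torch.full_like(base, j))
            grad, = torch.autograd.grad(loss, xj)
            with torch.no_grad():
                xj = xj - step_size * grad
                ps = F.softmax(student(xj), dim=1)[:, j]
                pt = F.softmax(teacher(xj), dim=1)[:, j]
                match += (1.0 - (ps - pt).abs()).mean().item()
        scores.append(match / steps)
    return sum(scores) / num_classes         # 1.0 = perfect belief match
```

A score near 1.0 means the teacher's confidence moves in lockstep with the student's as the input crosses decision boundaries, which is the sense of "belief match" the excerpt describes.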