no code implementations • 10 Feb 2025 • Pedro Vélez, Luisa F. Polanía, Yi Yang, Chuhan Zhang, Rishab Kabra, Anurag Arnab, Mehdi S. M. Sajjadi
Diffusion models have revolutionized generative modeling, enabling unprecedented realism in image and video synthesis.
no code implementations • 4 Feb 2025 • Yuan Tian, Chuhan Zhang, Xiaotong Wang, Sitong Pan, Weiwei Cui, Haidong Zhang, Dazhen Deng, Yingcai Wu
Creating data reports is time-consuming, as it requires iterative exploration and understanding of data, followed by summarizing the insights.
no code implementations • 31 Jan 2025 • Chuhan Zhang, Cong Wang, Wei Pan, Cosimo Della Santina
Inspired by the dynamic coupling of moto-neurons and physical elasticity in animals, this work explores the possibility of generating locomotion gaits by utilizing physical oscillations in a soft snake by means of a low-level spiking neural mechanism.
no code implementations • 19 Dec 2024 • João Carreira, Dilara Gokay, Michael King, Chuhan Zhang, Ignacio Rocco, Aravindh Mahendran, Thomas Albert Keck, Joseph Heyward, Skanda Koppula, Etienne Pot, Goker Erdogan, Yana Hasson, Yi Yang, Klaus Greff, Guillaume Le Moing, Sjoerd van Steenkiste, Daniel Zoran, Drew A. Hudson, Pedro Vélez, Luisa Polanía, Luke Friedman, Chris Duvarney, Ross Goroshin, Kelsey Allen, Jacob Walker, Rishabh Kabra, Eric Aboussouan, Jennifer Sun, Thomas Kipf, Carl Doersch, Viorica Pătrăucean, Dima Damen, Pauline Luc, Mehdi S. M. Sajjadi, Andrew Zisserman
Scaling has not yet been convincingly demonstrated for pure self-supervised learning from video.
no code implementations • 18 Dec 2024 • Viorica Pătrăucean, Xu Owen He, Joseph Heyward, Chuhan Zhang, Mehdi S. M. Sajjadi, George-Cristian Muraru, Artem Zholus, Mahdi Karami, Ross Goroshin, Yutian Chen, Simon Osindero, João Carreira, Razvan Pascanu
We propose a novel block for video modelling.
1 code implementation • 25 Apr 2024 • Olivia Wiles, Chuhan Zhang, Isabela Albuquerque, Ivana Kajić, Su Wang, Emanuele Bugliarello, Yasumasa Onoe, Chris Knutsen, Cyrus Rashtchian, Jordi Pont-Tuset, Aida Nematzadeh
Human-rated prompt sets are generally small and the reliability of the ratings -- and thereby the prompt set used to compare models -- is not evaluated.
no code implementations • 9 Dec 2023 • Chuhan Zhang, Wei Pan, Cosimo Della Santina
Motor imagery, an important category in electroencephalogram (EEG) research, often intersects with scenarios demanding low energy consumption, such as portable medical devices and isolated environment operations.
1 code implementation • ICCV 2023 • Chuhan Zhang, Ankush Gupta, Andrew Zisserman
We demonstrate the performance of the object-aware representations learnt by our model, by: (i) evaluating it for strong transfer, i.e. through zero-shot testing, on a number of downstream video-text retrieval and classification benchmarks; and (ii) by using the representations learned as input for long-term video understanding tasks (e.g. Episodic Memory in Ego4D).
no code implementations • 3 May 2023 • Chuhan Zhang, Antoine Miech, Jiajun Shen, Jean-Baptiste Alayrac, Pauline Luc
Large-scale visual language models are widely used as pre-trained models and then adapted for various downstream tasks.
no code implementations • 20 Jul 2022 • Chuhan Zhang, Ankush Gupta, Andrew Zisserman
The model learns a set of object-centric summary vectors for the video, and uses these vectors to fuse the visual and spatio-temporal trajectory 'modalities' of the video clip.
no code implementations • CVPR 2021 • Chuhan Zhang, Ankush Gupta, Andrew Zisserman
It attends to relevant segments for each query with a temporal attention mechanism, and can be trained using only the labels for each query.
Ranked #13 on Action Recognition on Diving-48
no code implementations • ECCV 2020 • Chuhan Zhang, Ankush Gupta, Andrew Zisserman
In this work, our objective is to address the problems of generalization and flexibility for text recognition in documents.