no code implementations • 18 Sep 2024 • Joakim Bruslund Haurum, Sergio Escalera, Graham W. Taylor, Thomas B. Moeslund
We present Agglomerative Token Clustering (ATC), a novel token merging method that consistently outperforms previous token merging and pruning methods across image classification, image synthesis, and object detection & segmentation tasks.
no code implementations • 21 Jun 2024 • Akshita Gupta, Aditya Arora, Sanath Narayan, Salman Khan, Fahad Shahbaz Khan, Graham W. Taylor
Open-Vocabulary Temporal Action Localization (OVTAL) enables a model to recognize any desired action category in videos without the need to explicitly curate training data for all categories.
1 code implementation • 18 Jun 2024 • Zahra Gharaee, Scott C. Lowe, ZeMing Gong, Pablo Millan Arias, Nicholas Pellegrino, Austin T. Wang, Joakim Bruslund Haurum, Iuliia Zarubiieva, Lila Kari, Dirk Steinke, Graham W. Taylor, Paul Fieguth, Angel X. Chang
We propose three benchmark experiments to demonstrate the impact of the multi-modal data types on the classification and clustering accuracy.
2 code implementations • 4 Jun 2024 • Scott C. Lowe, Joakim Bruslund Haurum, Sageev Oore, Thomas B. Moeslund, Graham W. Taylor
Our suite of benchmarking experiments uses encoders pretrained solely on ImageNet-1k with either supervised or self-supervised training techniques, deployed on image datasets that were not seen during training, and clustered with conventional clustering algorithms.
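As a rough illustration of the clustering step above (this is not code from the paper; the 2-D "embeddings" and the minimal k-means routine are hypothetical stand-ins for pretrained-encoder features and a conventional clustering algorithm):

```python
def kmeans(points, k, iters=20):
    """Minimal k-means on a list of equal-length vectors.
    Centers are initialized from the first k points for determinism."""
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: each center becomes the mean of its assigned points.
        for c in range(k):
            members = [p for i, p in enumerate(points) if assign[i] == c]
            if members:
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assign

# Toy stand-in for image embeddings: two well-separated groups in 2-D.
embeddings = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
              [5.0, 5.1], [5.1, 5.0], [4.9, 5.05]]
labels = kmeans(embeddings, k=2)
```

In the paper's setting the vectors would be encoder outputs on unseen datasets rather than toy points, and a library implementation (e.g., scikit-learn) would replace this sketch.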
no code implementations • 3 Jun 2024 • Kevin Kasa, ZhiYu Zhang, Heng Yang, Graham W. Taylor
Conformal prediction (CP) enables machine learning models to output prediction sets with a guaranteed coverage rate, assuming exchangeable data.
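A minimal sketch of the split conformal procedure behind that coverage guarantee (not the paper's code; the calibration scores and class probabilities below are made up): compute nonconformity scores on a held-out calibration set, take the ceil((n+1)(1-alpha))/n empirical quantile, and include every class whose score falls within that threshold.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Split conformal: the ceil((n+1)(1-alpha))-th smallest calibration
    nonconformity score (clamped to the largest score for small n)."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the quantile
    return sorted(cal_scores)[min(k, n) - 1]

def prediction_set(class_probs, threshold):
    """Include every class whose nonconformity score (1 - prob) is within the threshold."""
    return {c for c, p in class_probs.items() if 1.0 - p <= threshold}

# Hypothetical calibration scores (1 - softmax prob of the true class).
cal = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.90]
q = conformal_threshold(cal, alpha=0.2)
covered = prediction_set({"cat": 0.7, "dog": 0.25, "bird": 0.05}, q)
```

Under exchangeability, sets built this way contain the true class with probability at least 1 - alpha; the paper studies what happens when that assumption breaks.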
3 code implementations • 27 May 2024 • ZeMing Gong, Austin T. Wang, Xiaoliang Huo, Joakim Bruslund Haurum, Scott C. Lowe, Graham W. Taylor, Angel X. Chang
Measuring biodiversity is crucial for understanding ecosystem health.
no code implementations • 1 Apr 2024 • Akshita Gupta, Gaurav Mittal, Ahmed Magooda, Ye Yu, Graham W. Taylor, Mei Chen
Temporal Action Localization (TAL) involves localizing and classifying action snippets in an untrimmed video.
2 code implementations • 4 Nov 2023 • Pablo Millan Arias, Niousha Sadjadi, Monireh Safari, ZeMing Gong, Austin T. Wang, Scott C. Lowe, Joakim Bruslund Haurum, Iuliia Zarubiieva, Dirk Steinke, Lila Kari, Angel X. Chang, Graham W. Taylor
Understanding biodiversity is a global challenge, in which DNA barcodes - short snippets of DNA that cluster by species - play a pivotal role.
no code implementations • 31 Oct 2023 • Michal Lisicki, Mihai Nica, Graham W. Taylor
We introduce a novel approach for batch selection in Stochastic Gradient Descent (SGD) training, leveraging combinatorial bandit algorithms.
1 code implementation • 9 Aug 2023 • Joakim Bruslund Haurum, Sergio Escalera, Graham W. Taylor, Thomas B. Moeslund
While different methods have been explored to achieve this goal, we still lack understanding of the resulting reduction patterns and how those patterns differ across token reduction methods and datasets.
1 code implementation • NeurIPS 2023 • Zahra Gharaee, ZeMing Gong, Nicholas Pellegrino, Iuliia Zarubiieva, Joakim Bruslund Haurum, Scott C. Lowe, Jaclyn T. A. McKeown, Chris C. Y. Ho, Joschka McLeod, Yi-Yun C Wei, Jireh Agda, Sujeevan Ratnasingham, Dirk Steinke, Angel X. Chang, Graham W. Taylor, Paul Fieguth
In an effort to catalog insect biodiversity, we propose a new large dataset of hand-labelled insect images, the BIOSCAN-Insect Dataset.
Ranked #1 on Classification on BIOSCAN_1M_Insect Dataset
no code implementations • 3 Jul 2023 • Kevin Kasa, Graham W. Taylor
Here, we characterize the performance of several post-hoc and training-based conformal prediction methods under these settings, providing the first empirical evaluation on large-scale datasets and models.
1 code implementation • CVPR 2023 • Cong Wei, Brendan Duke, Ruowei Jiang, Parham Aarabi, Graham W. Taylor, Florian Shkurti
Equipped with the learned unstructured attention pattern, sparse attention ViT (Sparsifiner) produces a superior Pareto-optimal trade-off between FLOPs and top-1 accuracy on ImageNet compared to token sparsity.
no code implementations • 10 Feb 2023 • Mingjie Wang, Yande Li, Jun Zhou, Graham W. Taylor, Minglun Gong
The class-agnostic counting (CAC) problem has attracted increasing attention recently due to its wide societal applications and arduous challenges.
2 code implementations • 19 Jan 2023 • Juan Carrasquilla, Mohamed Hibat-Allah, Estelle Inack, Alireza Makhzani, Kirill Neklyudov, Graham W. Taylor, Giacomo Torlai
Binary neural networks, i.e., neural networks whose parameters and activations are constrained to only two possible values, offer a compelling avenue for the deployment of deep learning models on energy- and memory-limited devices.
no code implementations • 19 Jul 2022 • Angus Galloway, Anna Golubeva, Mahmoud Salem, Mihai Nica, Yani Ioannou, Graham W. Taylor
Estimating the Generalization Error (GE) of Deep Neural Networks (DNNs) is an important task that often relies on the availability of held-out data.
no code implementations • 27 Jun 2022 • Mohammed Adnan, Yani Ioannou, Chuan-Yung Tsai, Angus Galloway, H. R. Tizhoosh, Graham W. Taylor
The failure of deep neural networks to generalize to out-of-distribution data is a well-known problem and raises concerns about the deployment of trained networks in safety-critical domains such as healthcare, finance and autonomous vehicles.
no code implementations • 29 Apr 2022 • Eu Wern Teh, Graham W. Taylor
Our experiments show that patch classification performance can be improved by manipulating both the image and input resolution in annotation-scarce and annotation-rich environments.
no code implementations • 29 Jan 2022 • Chuan-Yung Tsai, Graham W. Taylor
Although machine learning (ML) has been successful in automating various software engineering needs, software testing still remains a highly challenging topic.
1 code implementation • ICLR 2022 • Rylee Thompson, Boris Knyazev, Elahe Ghalebi, Jungtaek Kim, Graham W. Taylor
While we focus on applying these metrics to GGM evaluation, in practice this enables the ability to easily compute the dissimilarity between any two sets of graphs regardless of domain.
no code implementations • 7 Jan 2022 • Eu Wern Teh, Graham W. Taylor
Furthermore, we show that models trained with scribble labels yield the same performance boost as full pixel-wise segmentation labels despite being significantly easier and faster to collect.
no code implementations • 23 Nov 2021 • Mohammed Adnan, Yani A. Ioannou, Chuan-Yung Tsai, Graham W. Taylor
Recent advancements in self-supervised learning have reduced the gap between supervised and unsupervised representation learning.
2 code implementations • 5 Nov 2021 • Michal Lisicki, Arash Afkanpour, Graham W. Taylor
We consider policies based on a GP and a Student's t-process (TP).
no code implementations • NeurIPS 2021 • Hyunsoo Chung, Jungtaek Kim, Boris Knyazev, Jinhwi Lee, Graham W. Taylor, Jaesik Park, Minsu Cho
Discovering a solution in a combinatorial space is prevalent in many real-world problems but it is also challenging due to diverse complex constraints and the vast number of possible combinations.
1 code implementation • NeurIPS 2021 • Boris Knyazev, Michal Drozdzal, Graham W. Taylor, Adriana Romero-Soriano
We introduce a large-scale dataset of diverse computational graphs of neural architectures - DeepNets-1M - and use it to explore parameter prediction on CIFAR-10 and ImageNet.
Ranked #1 on Parameter Prediction on CIFAR10
no code implementations • NeurIPS Workshop SVRHM 2021 • Shashank Shekhar, Graham W. Taylor
Our framework uses (1) a multi-task visual relationship encoder to extract constituent concepts from raw visual input in the source domain, and (2) a neural module net analogy inference engine to reason compositionally about the inferred relation in the target domain.
1 code implementation • ICCV 2021 • Terrance DeVries, Miguel Angel Bautista, Nitish Srivastava, Graham W. Taylor, Joshua M. Susskind
In this paper, we introduce Generative Scene Networks (GSN), which learns to decompose scenes into a collection of many local radiance fields that can be rendered from a free moving camera.
Ranked #1 on Scene Generation on VizDoom
no code implementations • 31 Mar 2021 • Eu Wern Teh, Terrance DeVries, Brendan Duke, Ruowei Jiang, Parham Aarabi, Graham W. Taylor
We further show that GIST and RIST can be combined with existing semi-supervised learning methods to boost performance.
1 code implementation • CVPR 2021 • Rohit Saha, Brendan Duke, Florian Shkurti, Graham W. Taylor, Parham Aarabi
Therefore, we propose Latent Optimization of Hairstyles via Orthogonalization (LOHO), an optimization-based approach using GAN inversion to infill missing hair structure details in latent space during hairstyle transfer.
1 code implementation • CVPR 2021 • Brendan Duke, Abdalla Ahmed, Christian Wolf, Parham Aarabi, Graham W. Taylor
SST extracts per-pixel representations for each object in a video using sparse attention over spatiotemporal features.
1 code implementation • ICCV 2021 • Yichao Lu, Himanshu Rai, Jason Chang, Boris Knyazev, Guangwei Yu, Shashank Shekhar, Graham W. Taylor, Maksims Volkovs
In this task, the model needs to detect objects and predict visual relationships between them.
1 code implementation • 21 Dec 2020 • Rylee Thompson, Elahe Ghalebi, Terrance DeVries, Graham W. Taylor
Generative models are now used to create a variety of high-quality digital artifacts.
no code implementations • NeurIPS Workshop LMCA 2020 • Michal Lisicki, Arash Afkanpour, Graham W. Taylor
Neural combinatorial optimization (NCO) aims at designing problem-independent and efficient neural network-based strategies for solving combinatorial problems.
no code implementations • NeurIPS Workshop SVRHM 2020 • Nolan S. Dey, J. Eric Taylor, Bryan P. Tripp, Alexander Wong, Graham W. Taylor
While researchers have attempted to manually identify an analogue to these tuning dimensions in deep neural networks, we are unaware of an automatic way to discover them.
2 code implementations • NeurIPS 2020 • Terrance DeVries, Michal Drozdzal, Graham W. Taylor
By refining the empirical data distribution before training, we redirect model capacity towards high-density regions, which ultimately improves sample fidelity, lowers model capacity requirements, and significantly reduces training time.
Ranked #2 on Conditional Image Generation on ImageNet 64x64
1 code implementation • ICCV 2021 • Boris Knyazev, Harm de Vries, Cătălina Cangea, Graham W. Taylor, Aaron Courville, Eugene Belilovsky
However, test images might contain zero- and few-shot compositions of objects and relationships, e.g., <cup, on, surfboard>.
no code implementations • 30 Jun 2020 • Vithursan Thangarasa, Thomas Miconi, Graham W. Taylor
Continual learning is the problem of sequentially learning new tasks or knowledge while protecting previously acquired knowledge.
1 code implementation • 17 May 2020 • Boris Knyazev, Harm de Vries, Cătălina Cangea, Graham W. Taylor, Aaron Courville, Eugene Belilovsky
We show that such models can suffer the most in their ability to generalize to rare compositions, evaluating two different models on the Visual Genome dataset and its more recent, improved version, GQA.
no code implementations • 28 Apr 2020 • Katya Kudashkina, Valliappa Chockalingam, Graham W. Taylor, Michael Bowling
Human-computer interactive systems that rely on machine learning are becoming paramount to the lives of millions of people who use digital assistants on a daily basis.
1 code implementation • ECCV 2020 • Eu Wern Teh, Terrance DeVries, Graham W. Taylor
Additionally, our proposed fast-moving proxies also address the small-gradient issue of proxies, and this component synergizes well with low temperature scaling and Global Max Pooling.
Ranked #2 on Image Retrieval on CARS196
2 code implementations • 27 Nov 2019 • Eu Wern Teh, Graham W. Taylor
In Digital Pathology (DP), labeled data is generally very scarce due to the requirement that medical experts provide annotations.
no code implementations • 28 Oct 2019 • Alaaeldin El-Nouby, Shuangfei Zhai, Graham W. Taylor, Joshua M. Susskind
Deep neural networks require collecting and annotating large amounts of data to train successfully.
Ranked #44 on Self-Supervised Action Recognition on UCF101
no code implementations • 11 Oct 2019 • Elahe Ghalebi, Hamidreza Mahyar, Radu Grosu, Graham W. Taylor, Sinead A. Williamson
As the availability and importance of temporal interaction data, such as email communication, increases, it becomes ever more important to understand the structure that underpins these interactions.
no code implementations • 25 Sep 2019 • Vithursan Thangarasa, Thomas Miconi, Graham W. Taylor
Continual learning is the problem of sequentially learning new tasks or knowledge while protecting previously acquired knowledge.
1 code implementation • 23 Sep 2019 • Boris Knyazev, Carolyn Augusta, Graham W. Taylor
We consider a common case in which edges can be short-term interactions (e.g., messaging) or long-term structural connections (e.g., friendship).
1 code implementation • 21 Jul 2019 • Boris Knyazev, Xiao Lin, Mohamed R. Amer, Graham W. Taylor
Graph Convolutional Networks (GCNs) are a class of general models that can learn from graph structured data.
1 code implementation • 11 Jul 2019 • Terrance DeVries, Adriana Romero, Luis Pineda, Graham W. Taylor, Michal Drozdzal
We show that FJD can be used as a promising single metric for cGAN benchmarking and model selection.
no code implementations • 28 May 2019 • Elahe Ghalebi, Hamidreza Mahyar, Radu Grosu, Graham W. Taylor, Sinead A. Williamson
Interaction graphs, such as those recording emails between individuals or transactions between institutions, tend to be sparse yet structured, and often grow in an unbounded manner.
2 code implementations • NeurIPS 2019 • Boris Knyazev, Graham W. Taylor, Mohamed R. Amer
We aim to better understand attention over nodes in graph neural networks (GNNs) and identify factors influencing its effectiveness.
Ranked #25 on Graph Classification on D&D
no code implementations • 6 May 2019 • Angus Galloway, Anna Golubeva, Thomas Tanay, Medhat Moussa, Graham W. Taylor
Batch normalization (batch norm) is often used in an attempt to stabilize and accelerate training in deep neural networks.
no code implementations • 21 Feb 2019 • Stefan Schneider, Graham W. Taylor, Stefan Linquist, Stefan C. Kremer
Without any species-specific modifications, our results demonstrate that similarity comparison networks can reach a performance level beyond that of humans for the task of animal re-identification.
no code implementations • 15 Jan 2019 • Vignesh Sankar, Devinder Kumar, David A. Clausi, Graham W. Taylor, Alexander Wong
Conclusion: The SISC radiomic sequencer is able to achieve state-of-the-art results in lung cancer prediction, and also offers prediction interpretability in the form of critical response maps.
1 code implementation • 30 Nov 2018 • Angus Galloway, Anna Golubeva, Graham W. Taylor
We analyze the adversarial examples problem in terms of a model's fault tolerance with respect to its input.
3 code implementations • ICCV 2019 • Alaaeldin El-Nouby, Shikhar Sharma, Hannes Schulz, Devon Hjelm, Layla El Asri, Samira Ebrahimi Kahou, Yoshua Bengio, Graham W. Taylor
Conditional text-to-image generation is an active area of research, with many possible applications.
Ranked #2 on Text-to-Image Generation on GeNeVA (i-CLEVR)
1 code implementation • 23 Nov 2018 • Boris Knyazev, Xiao Lin, Mohamed R. Amer, Graham W. Taylor
Spectral Graph Convolutional Networks (GCNs) are a generalization of convolutional networks to learning on graph-structured data.
Ranked #12 on Graph Classification on NCI109
no code implementations • 19 Nov 2018 • Stefan Schneider, Graham W. Taylor, Stefan S. Linquist, Stefan C. Kremer
The ability of a researcher to re-identify (re-ID) an individual animal upon re-encounter is fundamental for addressing a broad range of questions in the study of ecosystem function, community and population dynamics, and behavioural ecology.
no code implementations • 27 Sep 2018 • Angus Galloway, Anna Golubeva, Graham W. Taylor
The generalization ability of deep neural networks (DNNs) is intertwined with model complexity, robustness, and capacity.
1 code implementation • 24 Jul 2018 • Vithursan Thangarasa, Graham W. Taylor
Selecting the most appropriate data examples to present a deep neural network (DNN) at different stages of training is an unsolved challenge.
no code implementations • 3 Jul 2018 • Griffin Lacey, Graham W. Taylor, Shawki Areibi
Low precision weights, activations, and gradients have been proposed as a way to improve the computational efficiency and memory footprint of deep neural networks.
no code implementations • 2 Jul 2018 • Terrance DeVries, Graham W. Taylor
The first is producing spatial uncertainty maps, from which a clinician can observe where and why a system thinks it is failing.
2 code implementations • 10 Apr 2018 • Angus Galloway, Thomas Tanay, Graham W. Taylor
Performance-critical machine learning models should be robust to input perturbations not seen during training.
no code implementations • 28 Mar 2018 • Stefan Schneider, Graham W. Taylor, Stefan C. Kremer
Recent advances in the field of deep learning for object detection show promise towards automating the analysis of camera trap images.
no code implementations • 26 Mar 2018 • Brendan Duke, Graham W. Taylor
We propose a generalized class of multimodal fusion operators for the task of visual question answering (VQA).
no code implementations • 23 Feb 2018 • Alaaeldin El-Nouby, Graham W. Taylor
Finally, for better network initialization, we transfer from the task of action recognition to action detection by pre-training our framework using the recently released large-scale Kinetics dataset.
1 code implementation • CVPR 2018 • Fabien Baradel, Christian Wolf, Julien Mille, Graham W. Taylor
No spatial coherence is forced on the glimpse locations, which gives the module liberty to explore different points at each frame and better optimize the process of scrutinizing visual information.
Ranked #21 on Skeleton Based Action Recognition on N-UCLA
no code implementations • 13 Feb 2018 • Angus Galloway, Graham W. Taylor, Medhat Moussa
It has been suggested that adversarial examples cause deep learning models to make incorrect predictions with high confidence.
5 code implementations • 13 Feb 2018 • Terrance DeVries, Graham W. Taylor
Modern neural networks are very powerful predictive models, but they are often incapable of recognizing when their predictions may be wrong.
no code implementations • ICLR 2018 • Mohamed Amer, Aswin Raghavan, Graham W. Taylor, Sek Chai
Our key idea is to control the expressive power of the network by dynamically quantizing the range and set of values that the parameters can take.
no code implementations • ICLR 2018 • Vithursan Thangarasa, Graham W. Taylor
The "student" CNN classifier dynamically selects samples to form a mini-batch based on the "easiness" from cross-entropy losses and "true diverseness" of examples from the representation space sculpted by the "embedding" CNN.
1 code implementation • ICLR 2018 • Angus Galloway, Graham W. Taylor, Medhat Moussa
Neural networks with low-precision weights and activations offer compelling efficiency advantages over their full-precision equivalents.
no code implementations • 29 Oct 2017 • Devinder Kumar, Graham W. Taylor, Alexander Wong
Conclusion: We demonstrate the effectiveness and utility of the proposed CLEAR-DR system of enhancing the interpretability of diagnostic grading results for the application of diabetic retinopathy grading.
no code implementations • 5 Sep 2017 • Devinder Kumar, Graham W. Taylor, Alexander Wong
However, current deep learning algorithms have been criticized as uninterpretable "black-boxes" which cannot explain their decision making processes.
28 code implementations • 15 Aug 2017 • Terrance DeVries, Graham W. Taylor
Convolutional neural networks are capable of learning powerful representational spaces, which are necessary for tackling complex learning tasks.
Ranked #1 on Out-of-Distribution Generalization on ImageNet-W
no code implementations • 3 Jul 2017 • Dhanesh Ramachandram, Michal Lisicki, Timothy J. Shields, Mohamed R. Amer, Graham W. Taylor
A popular testbed for deep learning has been multimodal recognition of human activity or gesture involving diverse inputs such as video, audio, skeletal pose and depth images.
no code implementations • 13 Apr 2017 • Devinder Kumar, Alexander Wong, Graham W. Taylor
In this work, we propose CLass-Enhanced Attentive Response (CLEAR): an approach to visualize and understand the decisions made by deep neural networks (DNNs) given a specific input.
no code implementations • 18 Feb 2017 • Angus Galloway, Graham W. Taylor, Aaron Ramsay, Medhat Moussa
An original dataset for semantic segmentation, Ciona17, is introduced, which, to the best of the authors' knowledge, is the first dataset of its kind with pixel-level annotations pertaining to invasive species in a marine environment.
3 code implementations • 17 Feb 2017 • Terrance DeVries, Graham W. Taylor
Our main insight is to perform the transformation not in input space, but in a learned feature space.
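The feature-space transformation described above can be sketched as interpolation (or, with a coefficient greater than one, extrapolation) between learned feature vectors of same-class examples; the feature values here are hypothetical toy numbers, not the authors' implementation:

```python
def augment_in_feature_space(features, j, k, lam=0.5):
    """Create a synthetic training representation by moving from feature
    vector j toward (lam < 1) or past (lam > 1) feature vector k."""
    return [fj + lam * (fk - fj) for fj, fk in zip(features[j], features[k])]

# Hypothetical learned features for two same-class examples.
feats = [[1.0, 2.0], [3.0, 4.0]]
midpoint = augment_in_feature_space(feats, 0, 1, lam=0.5)
```

The synthetic vector can then be fed to the downstream classifier as an extra training example, which is what makes augmenting in a learned space attractive when input-space transformations (rotations, crops) do not apply to the domain.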
no code implementations • 7 Feb 2017 • Matthew Veres, Medhat Moussa, Graham W. Taylor
Deep learning is an established framework for learning hierarchical data representations.
no code implementations • 11 Jan 2017 • Matthew Veres, Medhat Moussa, Graham W. Taylor
Grasping is a complex process involving knowledge of the object, the surroundings, and of oneself.
no code implementations • 19 Nov 2016 • Devinder Kumar, Vlado Menkovski, Graham W. Taylor, Alexander Wong
One of the main challenges for broad adoption of deep learning based models such as convolutional neural networks (CNN), is the lack of understanding of their decisions.
1 code implementation • 11 Jul 2016 • Daniel Jiwoong Im, Graham W. Taylor
To extend its applicability outside of image-based domains, we propose to learn a metric which captures perceptual similarity.
1 code implementation • 26 May 2016 • He Ma, Fei Mao, Graham W. Taylor
We develop a scalable and extendable training framework that can utilize GPUs across nodes in a cluster and accelerate the training of deep learning models based on data parallelism.
no code implementations • 13 Feb 2016 • Griffin Lacey, Graham W. Taylor, Shawki Areibi
The rapid growth of data size and accessibility in recent years has instigated a shift of philosophy in algorithm design for artificial intelligence.
no code implementations • 31 Dec 2014 • Natalia Neverova, Christian Wolf, Graham W. Taylor, Florian Nebout
We present a method for gesture detection and localisation based on multi-scale and multi-modal deep learning.
no code implementations • 20 Dec 2014 • Daniel Jiwoong Im, Graham W. Taylor
In this work, we apply a dynamical systems view to GAEs, deriving a scoring function, and drawing connections to Restricted Boltzmann Machines.
no code implementations • 20 Dec 2014 • Jan Rudy, Weiguang Ding, Daniel Jiwoong Im, Graham W. Taylor
Regularization is essential when training large neural networks.
1 code implementation • 20 Dec 2014 • Daniel Jiwoong Im, Ethan Buchman, Graham W. Taylor
Here we propose a more general form for the sampling dynamics in MPF, and explore the consequences of different choices for these dynamics for training RBMs.
no code implementations • 11 Jun 2014 • Weiguang Ding, Graham W. Taylor
The human visual system is able to recognize objects despite transformations that can drastically alter their appearance.
1 code implementation • 27 Dec 2013 • Arjun Jain, Jonathan Tompson, Mykhaylo Andriluka, Graham W. Taylor, Christoph Bregler
This paper introduces a new architecture for human pose estimation using a multi-layer convolutional network and a modified learning technique that learns low-level features and higher-level weak spatial models.
no code implementations • NeurIPS 2011 • Matthew D. Zeiler, Graham W. Taylor, Leonid Sigal, Iain Matthews, Rob Fergus
We present a type of Temporal Restricted Boltzmann Machine that defines a probability distribution over an output sequence conditional on an input sequence.
no code implementations • NeurIPS 2010 • Graham W. Taylor, Rob Fergus, George Williams, Ian Spiro, Christoph Bregler
We apply our method to challenging real-world data and show that it can generalize beyond hand localization to infer a more general notion of body pose.
no code implementations • NeurIPS 2008 • Ilya Sutskever, Geoffrey E. Hinton, Graham W. Taylor
The Temporal Restricted Boltzmann Machine (TRBM) is a probabilistic model for sequences that is able to successfully model (i.e., generate nice-looking samples of) several very high-dimensional sequences, such as motion capture data and the pixels of low-resolution videos of balls bouncing in a box.