Search Results for author: Tom Drummond

Found 56 papers, 22 papers with code

Pre-training Cross-lingual Open Domain Question Answering with Large-scale Synthetic Supervision

1 code implementation 26 Feb 2024 Fan Jiang, Tom Drummond, Trevor Cohn

Cross-lingual question answering (CLQA) is a complex problem, comprising cross-lingual retrieval from a multilingual knowledge base, followed by answer generation either in English or the query language.

Answer Generation Cross-Lingual Question Answering +4

Perceiving Longer Sequences With Bi-Directional Cross-Attention Transformers

no code implementations 19 Feb 2024 Markus Hiller, Krista A. Ehinger, Tom Drummond

We present a novel bi-directional Transformer architecture (BiXT) which scales linearly with input size in terms of computational cost and memory consumption, but does not suffer the drop in performance or limitation to only one input modality seen with other efficient Transformer-based approaches.

Image Classification Image Segmentation +1

Answering from Sure to Uncertain: Uncertainty-Aware Curriculum Learning for Video Question Answering

no code implementations 3 Jan 2024 Haopeng Li, Qiuhong Ke, Mingming Gong, Tom Drummond

While significant advancements have been made in video question answering (VideoQA), the potential benefits of enhancing model generalization through tailored difficulty scheduling have been largely overlooked in existing research.

Question Answering Scheduling +1

Noisy Self-Training with Synthetic Queries for Dense Retrieval

1 code implementation 27 Nov 2023 Fan Jiang, Tom Drummond, Trevor Cohn

Although existing neural retrieval models reveal promising results when training data is abundant and the performance keeps improving as training data increases, collecting high-quality annotated data is prohibitively costly.

Retrieval

Boot and Switch: Alternating Distillation for Zero-Shot Dense Retrieval

1 code implementation 27 Nov 2023 Fan Jiang, Qiongkai Xu, Tom Drummond, Trevor Cohn

Experimental results demonstrate that our unsupervised $\texttt{ABEL}$ model outperforms both leading supervised and unsupervised retrievers on the BEIR benchmark.

Passage Retrieval Retrieval

Improving Denoising Diffusion Models via Simultaneous Estimation of Image and Noise

no code implementations 26 Oct 2023 Zhenkai Zhang, Krista A. Ehinger, Tom Drummond

This paper introduces two key contributions aimed at improving the speed and quality of images generated through inverse diffusion processes.

Denoising

Hey That's Mine Imperceptible Watermarks are Preserved in Diffusion Generated Outputs

no code implementations 22 Aug 2023 Luke Ditria, Tom Drummond

We show that a generative Diffusion model trained on data that has been imperceptibly watermarked will generate new images with these watermarks present.

Long-Term Prediction of Natural Video Sequences with Robust Video Predictors

no code implementations 21 Aug 2023 Luke Ditria, Tom Drummond

Predicting high dimensional video sequences is a curiously difficult problem.

Knowledge Combination to Learn Rotated Detection Without Rotated Annotation

1 code implementation CVPR 2023 Tianyu Zhu, Bryce Ferenczi, Pulak Purkait, Tom Drummond, Hamid Rezatofighi, Anton Van Den Hengel

Annotating rotated bounding boxes is such a laborious process that they are not provided in many detection datasets where axis-aligned annotations are used instead.

Multimorbidity Content-Based Medical Image Retrieval Using Proxies

no code implementations 22 Nov 2022 Yunyan Xing, Benjamin J. Meyer, Mehrtash Harandi, Tom Drummond, ZongYuan Ge

Medical imaging data, such as radiology images, are often multimorbidity; a single sample may have more than one pathology present.

Content-Based Image Retrieval Decision Making +3

A Differentiable Distance Approximation for Fairer Image Classification

1 code implementation 9 Oct 2022 Nicholas Rosa, Tom Drummond, Mehrtash Harandi

We demonstrate that our approach improves the fairness of AI models in varied task and dataset scenarios, whilst still maintaining a high level of classification accuracy.

Classification Fairness +1

Deep Laparoscopic Stereo Matching with Transformers

1 code implementation 25 Jul 2022 Xuelian Cheng, Yiran Zhong, Mehrtash Harandi, Tom Drummond, Zhiyong Wang, ZongYuan Ge

The self-attention mechanism, successfully employed with the transformer structure, has shown promise in many computer vision tasks, including image recognition and object detection.

Object Detection +2

Rethinking Generalization in Few-Shot Classification

1 code implementation 15 Jun 2022 Markus Hiller, Rongkai Ma, Mehrtash Harandi, Tom Drummond

Single image-level annotations only correctly describe an often small subset of an image's content, particularly when complex real-world scenes are depicted.

Classification Few-Shot Image Classification +1

On Enforcing Better Conditioned Meta-Learning for Rapid Few-Shot Adaptation

no code implementations 15 Jun 2022 Markus Hiller, Mehrtash Harandi, Tom Drummond

Inspired by the concept of preconditioning, we propose a novel method to increase adaptation speed for gradient-based meta-learning methods without incurring extra parameters.

Meta-Learning

Implicit Motion Handling for Video Camouflaged Object Detection

1 code implementation CVPR 2022 Xuelian Cheng, Huan Xiong, Deng-Ping Fan, Yiran Zhong, Mehrtash Harandi, Tom Drummond, ZongYuan Ge

We propose a new video camouflaged object detection (VCOD) framework that can exploit both short-term dynamics and long-term temporal consistency to detect camouflaged objects from video frames.

Camouflaged Object Segmentation Motion Estimation +4

Progressive Video Summarization via Multimodal Self-supervised Learning

no code implementations 7 Jan 2022 Li Haopeng, Ke Qiuhong, Gong Mingming, Tom Drummond

Considering that the annotation of large-scale datasets is time-consuming, we propose a multimodal self-supervised learning framework to obtain semantic representations of videos, which benefits the video summarization task.

Self-Supervised Learning Video Classification +1

Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning

1 code implementation 7 Dec 2021 Rongkai Ma, Pengfei Fang, Gil Avraham, Yan Zuo, Tianyu Zhu, Tom Drummond, Mehrtash Harandi

A principled way of achieving few-shot learning is to realize a model that can rapidly adapt to the context of a given task.

Few-Shot Learning Novel Concepts

Adaptive Poincaré Point to Set Distance for Few-Shot Classification

no code implementations 3 Dec 2021 Rongkai Ma, Pengfei Fang, Tom Drummond, Mehrtash Harandi

To this end, we formulate the metric as a weighted sum on the tangent bundle of the hyperbolic space and develop a mechanism to obtain the weights adaptively and based on the constellation of the points.

Few-Shot Learning

Learning Online for Unified Segmentation and Tracking Models

no code implementations 12 Nov 2021 Tianyu Zhu, Rongkai Ma, Mehrtash Harandi, Tom Drummond

A segmentation model cannot easily learn from prior information given in the visual tracking scenario.

Meta-Learning Visual Tracking

Identifying Concealed Objects from Videos

no code implementations 29 Sep 2021 Xuelian Cheng, Huan Xiong, Deng-Ping Fan, Yiran Zhong, Mehrtash Harandi, Tom Drummond, ZongYuan Ge

The proposed SLT-Net leverages both short-term dynamics and long-term temporal consistency to detect concealed objects in continuous video frames.

Object Detection

Relational Subsets Knowledge Distillation for Long-tailed Retinal Diseases Recognition

no code implementations 22 Apr 2021 Lie Ju, Xin Wang, Lin Wang, Tongliang Liu, Xin Zhao, Tom Drummond, Dwarikanath Mahapatra, ZongYuan Ge

For example, more than 40 different kinds of retinal diseases with variable morbidity are estimated to exist; however, more than 30 of these conditions are very rare in global patient cohorts, which results in a typical long-tailed learning problem for deep learning-based screening models.

Knowledge Distillation

Looking Beyond Two Frames: End-to-End Multi-Object Tracking Using Spatial and Temporal Transformers

1 code implementation 27 Mar 2021 Tianyu Zhu, Markus Hiller, Mahsa Ehsanpour, Rongkai Ma, Tom Drummond, Ian Reid, Hamid Rezatofighi

Tracking a time-varying indefinite number of objects in a video sequence over time remains a challenge despite recent advances in the field.

Multi-Object Tracking Object +1

Improving Medical Image Classification with Label Noise Using Dual-uncertainty Estimation

no code implementations 28 Feb 2021 Lie Ju, Xin Wang, Lin Wang, Dwarikanath Mahapatra, Xin Zhao, Mehrtash Harandi, Tom Drummond, Tongliang Liu, ZongYuan Ge

In this paper, we systematically discuss and define the two common types of label noise in medical images: disagreement label noise from inconsistent expert opinions and single-target label noise from wrong diagnosis records.

Benchmarking General Classification +3

Leveraging Regular Fundus Images for Training UWF Fundus Diagnosis Models via Adversarial Learning and Pseudo-Labeling

no code implementations 27 Nov 2020 Lie Ju, Xin Wang, Xin Zhao, Paul Bonnington, Tom Drummond, ZongYuan Ge

We propose the use of a modified cycle generative adversarial network (CycleGAN) model to bridge the gap between regular and UWF fundus and generate additional UWF fundus images for training.

Generative Adversarial Network Lesion Detection

Localising In Complex Scenes Using Balanced Adversarial Adaptation

no code implementations 9 Nov 2020 Gil Avraham, Yan Zuo, Tom Drummond

Domain adaptation and generative modelling have collectively mitigated the expensive nature of data collection and labelling by leveraging the rich abundance of accurate, labelled data in simulation environments.

Domain Adaptation

Residual Likelihood Forests

no code implementations 4 Nov 2020 Yan Zuo, Tom Drummond

This paper presents a novel ensemble learning approach called Residual Likelihood Forests (RLF).

Ensemble Learning

Hierarchical Neural Architecture Search for Deep Stereo Matching

1 code implementation NeurIPS 2020 Xuelian Cheng, Yiran Zhong, Mehrtash Harandi, Yuchao Dai, Xiaojun Chang, Tom Drummond, Hongdong Li, ZongYuan Ge

To reduce the human efforts in neural network design, Neural Architecture Search (NAS) has been applied with remarkable success to various high-level vision tasks such as classification and semantic segmentation.

Neural Architecture Search Semantic Segmentation +3

Driving among Flatmobiles: Bird-Eye-View occupancy grids from a monocular camera for holistic trajectory planning

no code implementations 10 Aug 2020 Abdelhak Loukkal, Yves Grandvalet, Tom Drummond, You Li

Camera-based end-to-end driving neural networks bring the promise of a low-cost system that maps camera images to driving control commands.

Motion Forecasting Trajectory Planning

Supportive Actions for Manipulation in Human-Robot Coworker Teams

no code implementations 2 May 2020 Shray Bansal, Rhys Newbury, Wesley Chan, Akansel Cosgun, Aimee Allen, Dana Kulić, Tom Drummond, Charles Isbell

We compare two robot modes in a shared table pick-and-place task: (1) Task-oriented: the robot only takes actions to further its own task objective and (2) Supportive: the robot sometimes prefers supportive actions to task-oriented ones when they reduce future goal-conflicts.

Reducing the Sim-to-Real Gap for Event Cameras

1 code implementation ECCV 2020 Timo Stoffregen, Cedric Scheerlinck, Davide Scaramuzza, Tom Drummond, Nick Barnes, Lindsay Kleeman, Robert Mahony

We present strategies for improving training data for event-based CNNs that result in a 20-40% boost in performance of existing state-of-the-art (SOTA) video reconstruction networks retrained with our method, and up to 15% for optic flow networks.

Event-Based Video Reconstruction Video Reconstruction

OpenGAN: Open Set Generative Adversarial Networks

2 code implementations 18 Mar 2020 Luke Ditria, Benjamin J. Meyer, Tom Drummond

Using a state-of-the-art metric learning model that encodes both class-level and fine-grained semantic information, we are able to generate samples that are semantically similar to a given source image.

Data Augmentation Metric Learning

Switchable Precision Neural Networks

no code implementations 7 Feb 2020 Luis Guerra, Bohan Zhuang, Ian Reid, Tom Drummond

Instantaneous, on-demand accuracy-efficiency trade-offs have recently been explored in the context of neural network slimming.

Quantization

Automatic Pruning for Quantized Neural Networks

no code implementations 3 Feb 2020 Luis Guerra, Bohan Zhuang, Ian Reid, Tom Drummond

In particular, for ResNet-18 on ImageNet, we prune 26.12% of the model size with Binarized Neural Network quantization, achieving a top-1 classification accuracy of 47.32% in a model of 2.47 MB and 59.30% with a 2-bit DoReFa-Net in 4.36 MB.

Bayesian Optimization Quantization

Adversarial Pulmonary Pathology Translation for Pairwise Chest X-ray Data Augmentation

1 code implementation 11 Oct 2019 Yunyan Xing, ZongYuan Ge, Rui Zeng, Dwarikanath Mahapatra, Jarrel Seah, Meng Law, Tom Drummond

We demonstrate the effectiveness of our model on two tasks: (i) we invite certified radiologists to assess the quality of the generated synthetic images against real and other state-of-the-art generative models, and (ii) data augmentation to improve the performance of disease localisation.

Data Augmentation Image-to-Image Translation +1

EMPNet: Neural Localisation and Mapping Using Embedded Memory Points

1 code implementation ICCV 2019 Gil Avraham, Yan Zuo, Thanuja Dharmasiri, Tom Drummond

Continuously estimating an agent's state space and a representation of its surroundings has proven vital towards full autonomy.

Event-Based Motion Segmentation by Motion Compensation

1 code implementation ICCV 2019 Timo Stoffregen, Guillermo Gallego, Tom Drummond, Lindsay Kleeman, Davide Scaramuzza

In contrast to traditional cameras, whose pixels have a common exposure time, event-based cameras are novel bio-inspired sensors whose pixels work independently and asynchronously output intensity changes (called "events"), with microsecond resolution.

Event Segmentation Motion Compensation +2

The Importance of Metric Learning for Robotic Vision: Open Set Recognition and Active Learning

no code implementations 27 Feb 2019 Benjamin J. Meyer, Tom Drummond

Robotic problems are dynamic and open world; a robot will likely observe objects that are from outside of the training set distribution.

Active Learning Metric Learning +1

Look No Deeper: Recognizing Places from Opposing Viewpoints under Varying Scene Appearance using Single-View Depth Estimation

1 code implementation 20 Feb 2019 Sourav Garg, Madhu Babu V, Thanuja Dharmasiri, Stephen Hausler, Niko Suenderhauf, Swagat Kumar, Tom Drummond, Michael Milford

Visual place recognition (VPR) - the act of recognizing a familiar visual place - becomes difficult when there is extreme environmental appearance change or viewpoint change.

Robotics

Traversing Latent Space using Decision Ferns

no code implementations 6 Dec 2018 Yan Zuo, Gil Avraham, Tom Drummond

The practice of transforming raw data to a feature space so that inference can be performed in that space has been popular for many years.

Approximate Fisher Information Matrix to Characterise the Training of Deep Neural Networks

1 code implementation 16 Oct 2018 Zhibin Liao, Tom Drummond, Ian Reid, Gustavo Carneiro

Furthermore, the proposed measurements also allow us to show that it is possible to optimise the training process with a new dynamic sampling training approach that continuously and automatically changes the mini-batch size and learning rate during the training process.

General Classification Image Classification

ENG: End-to-end Neural Geometry for Robust Depth and Pose Estimation using CNNs

no code implementations 16 Jul 2018 Thanuja Dharmasiri, Andrew Spek, Tom Drummond

Recovering structure and motion parameters given an image pair or a sequence of images is a well-studied problem in computer vision.

Depth Estimation Depth Prediction +5

Learning Factorized Representations for Open-set Domain Adaptation

no code implementations ICLR 2019 Mahsa Baktashmotlagh, Masoud Faraki, Tom Drummond, Mathieu Salzmann

To this end, we rely on the intuition that the source and target samples depicting the known classes can be generated by a shared subspace, whereas the target samples from unknown classes come from a different, private subspace.

Domain Adaptation

Generative Adversarial Forests for Better Conditioned Adversarial Learning

no code implementations 14 May 2018 Yan Zuo, Gil Avraham, Tom Drummond

In recent times, many of the breakthroughs in various vision-related tasks have revolved around improving learning of deep models; these methods have ranged from network architectural improvements such as Residual Networks, to various forms of regularisation such as Batch Normalisation.

Just-in-Time Reconstruction: Inpainting Sparse Maps using Single View Depth Predictors as Priors

no code implementations 11 May 2018 Chamara Saroj Weerasekera, Thanuja Dharmasiri, Ravi Garg, Tom Drummond, Ian Reid

Crucially, we obtain the confidence weights that parameterize the CRF model in a data-dependent manner via Convolutional Neural Networks (CNNs) which are trained to model the conditional depth error distributions given each source of input depth map and the associated RGB image.

Depth Estimation Depth Prediction

Efficient Subpixel Refinement with Symbolic Linear Predictors

no code implementations CVPR 2018 Vincent Lui, Jonathon Geeves, Winston Yii, Tom Drummond

We present an efficient subpixel refinement method using a learning-based approach called Linear Predictors.

Joint Pose and Principal Curvature Refinement Using Quadrics

1 code implementation 3 Jul 2017 Andrew Spek, Tom Drummond

In this paper we present a novel joint approach for optimising surface curvature and pose alignment.

Joint Prediction of Depths, Normals and Surface Curvature from RGB Images using CNNs

no code implementations 23 Jun 2017 Thanuja Dharmasiri, Andrew Spek, Tom Drummond

To this end, we present a novel deep learning based framework that estimates depth, surface normals and surface curvature by only using a single RGB image.

Deep Metric Learning and Image Classification with Nearest Neighbour Gaussian Kernels

no code implementations 27 May 2017 Benjamin J. Meyer, Ben Harwood, Tom Drummond

We present a Gaussian kernel loss function and training algorithm for convolutional neural networks that can be directly applied to both distance metric learning and image classification problems.

General Classification Image Classification +1

Smart Mining for Deep Metric Learning

no code implementations ICCV 2017 Ben Harwood, Vijay Kumar B G, Gustavo Carneiro, Ian Reid, Tom Drummond

In this paper, we propose a novel deep metric learning method that combines the triplet model and the global structure of the embedding space.

Metric Learning

FANNG: Fast Approximate Nearest Neighbour Graphs

no code implementations CVPR 2016 Ben Harwood, Tom Drummond

We also provide an efficient search algorithm that uses this graph to rapidly find the nearest neighbour to a query with high probability.

Faster and better: a machine learning approach to corner detection

1 code implementation 14 Oct 2008 Edward Rosten, Reid Porter, Tom Drummond

The repeatability and efficiency of a corner detector determine how likely it is to be useful in a real-world application.

BIG-bench Machine Learning
