We present VPN, a content attribution method for recovering provenance information from videos shared online.
Scene Designer is a novel method for searching and generating images using free-hand sketches of scene compositions, i.e., drawings that describe both the appearance and relative positions of objects.
Our key contribution is OSCAR-Net (Object-centric Scene Graph Attention for Image Attribution Network), a robust image hashing model inspired by recent successes of Transformers in the visual domain.
We present an algorithm for searching image collections using free-hand sketches that describe the appearance and relative positions of multiple objects.
We present ALADIN (All Layer AdaIN), a novel architecture for searching images based on the similarity of their artistic style.
Sketchformer is a novel transformer-based representation for encoding free-hand sketches input in vector form, i.e., as a sequence of strokes.
We present a neural architecture search (NAS) technique to enhance the performance of unsupervised image de-noising, in-painting and super-resolution under the recently proposed Deep Image Prior (DIP).
We propose a novel video inpainting algorithm that simultaneously hallucinates missing appearance and motion (optical flow) information, building upon the recent 'Deep Image Prior' (DIP) that exploits convolutional network architectures to enforce plausible texture in static images.
We aim to simultaneously estimate the 3D articulated pose and high-fidelity volumetric occupancy of human performance from multiple viewpoint video (MVV) with as few as two views.
Ranked #30 on 3D Human Pose Estimation on Human3.6M
We present a novel method for generating robust adversarial image examples, building upon the recent 'deep image prior' (DIP) that exploits convolutional network architectures to enforce plausible texture in image synthesis.
We present a novel blockchain-based service for proving the provenance of online digital identity, exposed as an assistive tool to help non-expert users make better decisions about whom to trust online.
We present ARCHANGEL, a novel distributed-ledger-based system for assuring the long-term integrity of digital video archives.
We present a convolutional autoencoder that enables high-fidelity volumetric reconstructions of human performance to be captured from multi-view video comprising only a small set of camera views.
We present a method for simultaneously estimating 3D human pose and body shape from a sparse set of wide-baseline camera views.
Ranked #5 on 3D Human Pose Estimation on Total Capture
Content-aware image completion or in-painting is a fundamental tool for the correction of defects or removal of objects in images.
We propose a novel measure of visual similarity for image retrieval that incorporates both structural and aesthetic (style) constraints.
Furthermore, we carry out baseline experiments to show the value of this dataset for artistic style prediction, for improving the generality of existing object classifiers, and for the study of visual domain adaptation.
We propose and evaluate several triplet CNN architectures for measuring the similarity between sketches and photographs, within the context of the sketch-based image retrieval (SBIR) task.
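For readers unfamiliar with triplet training, the following is a minimal sketch of the standard margin-based triplet loss that such architectures are commonly trained with. It is not taken from the paper: the function names, the Euclidean distance, the margin value, and the toy embeddings are all assumptions chosen for illustration. The anchor stands in for a sketch embedding, the positive for a matching photograph, and the negative for a non-matching one.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (hypothetical helper).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is; otherwise it is positive.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)


# Toy embeddings (illustrative values only).
sketch = [0.0, 0.0]          # anchor: embedding of a query sketch
matching_photo = [3.0, 4.0]  # positive: distance 5 from the anchor
far_photo = [6.0, 8.0]       # negative: distance 10 from the anchor
near_photo = [0.0, 5.0]      # negative: distance 5 from the anchor

loss_easy = triplet_loss(sketch, matching_photo, far_photo)
loss_hard = triplet_loss(sketch, matching_photo, near_photo)
```

With these values the first triplet already satisfies the margin, so it contributes no loss, while the second triplet, whose negative is as close as the positive, is penalised by exactly the margin.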