Search Results for author: Tao Xiang

Found 179 papers, 75 papers with code

DreamCreature: Crafting Photorealistic Virtual Creatures from Imagination

1 code implementation • 27 Nov 2023 • Kam Woh Ng, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

To bridge this gap, we introduce a novel task, Virtual Creatures Generation: Given a set of unlabeled images of the target concepts (e.g., 200 bird species), we aim to train a T2I model capable of creating new, hybrid concepts within diverse backgrounds and contexts.

Disentanglement Novel Concepts

Wired Perspectives: Multi-View Wire Art Embraces Generative AI

no code implementations • 26 Nov 2023 • Zhiyu Qu, Lan Yang, Honggang Zhang, Tao Xiang, Kaiyue Pang, Yi-Zhe Song

Creating multi-view wire art (MVWA), a static 3D sculpture with diverse interpretations from different viewpoints, is a complex task even for skilled artists.

Knowledge Distillation

FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing

no code implementations • 9 Oct 2023 • Yuren Cong, Mengmeng Xu, Christian Simon, Shoufa Chen, Jiawei Ren, Yanping Xie, Juan-Manuel Perez-Rua, Bodo Rosenhahn, Tao Xiang, Sen He

In this paper, for the first time, we introduce optical flow into the attention module in the diffusion model's U-Net to address the inconsistency issue for text-to-video editing.

Optical Flow Estimation Text-to-Video Editing +1

SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation

1 code implementation • 27 Aug 2023 • Zhiyu Qu, Tao Xiang, Yi-Zhe Song

Through this work, we hope to aspire the way we create visual content, democratise the creative process, and inspire further research in enhancing human creativity in AIGC.

Mitigating Cross-client GANs-based Attack in Federated Learning

no code implementations • 25 Jul 2023 • Hong Huang, Xinyu Lei, Tao Xiang

Since a benign client's data can be leaked to the adversary, this attack brings the risk of local data leakage for clients in many security-critical FL applications.

Federated Learning Knowledge Distillation

A Generalized Unbiased Risk Estimator for Learning with Augmented Classes

1 code implementation • 12 Jun 2023 • Senlin Shu, Shuo He, Haobo Wang, Hongxin Wei, Tao Xiang, Lei Feng

In this paper, we propose a generalized URE that can be equipped with arbitrary loss functions while maintaining the theoretical guarantees, given unlabeled data for LAC.

Multi-class Classification

HeadSculpt: Crafting 3D Head Avatars with Text

no code implementations • 5 Jun 2023 • Xiao Han, Yukang Cao, Kai Han, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang, Kwan-Yee K. Wong

Specifically, we first equip the diffusion model with 3D awareness by leveraging landmark-based control and a learned textual embedding representing the back view appearance of heads, enabling 3D-consistent head avatar generations.

SketchXAI: A First Look at Explainability for Human Sketches

no code implementations • CVPR 2023 • Zhiyu Qu, Yulia Gryaditskaya, Ke Li, Kaiyue Pang, Tao Xiang, Yi-Zhe Song

Following this, we design a simple explainability-friendly sketch encoder that accommodates the intrinsic properties of strokes: shape, location, and order.

Explainable Artificial Intelligence (XAI) +1

ChiroDiff: Modelling chirographic data with Diffusion Models

no code implementations • 7 Apr 2023 • Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

Such strictly-ordered discrete factorization however falls short of capturing key properties of chirographic data -- it fails to build holistic understanding of the temporal concept due to one-way visibility (causality).


Learning Garment DensePose for Robust Warping in Virtual Try-On

no code implementations • 30 Mar 2023 • Aiyu Cui, Sen He, Tao Xiang, Antoine Toisoul

In this work, we propose a robust warping method for virtual try-on based on a learned garment DensePose which has a direct correspondence with the person's DensePose.

Virtual Try-on

What Can Human Sketches Do for Object Detection?

no code implementations • CVPR 2023 • Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

In particular, we first perform independent prompting on both sketch and photo branches of an SBIR model to build highly generalisable sketch and photo encoders on the back of the generalisation ability of CLIP.

Object Detection +2

DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion

1 code implementation • ICCV 2023 • Sauradip Nag, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang

Concretely, we establish the denoising process in the Transformer decoder (e.g., DETR) by introducing a temporal location query design with faster convergence in training.

Action Detection Denoising

Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR

no code implementations • CVPR 2023 • Aneeshan Sain, Ayan Kumar Bhunia, Subhadeep Koley, Pinaki Nath Chowdhury, Soumitri Chattopadhyay, Tao Xiang, Yi-Zhe Song

This paper advances the fine-grained sketch-based image retrieval (FG-SBIR) literature by putting forward a strong baseline that outperforms the prior state of the art by ~11%.

Knowledge Distillation Sketch-Based Image Retrieval

Picture that Sketch: Photorealistic Image Generation from Abstract Sketches

no code implementations • CVPR 2023 • Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

We further introduce specific designs to tackle the abstract nature of human sketches, including a fine-grained discriminative loss on the back of a trained sketch-photo retrieval model, and a partial-aware sketch augmentation strategy.

Image Generation Retrieval +1

Generative Model Based Noise Robust Training for Unsupervised Domain Adaptation

no code implementations • 10 Mar 2023 • Zhongying Deng, Da Li, Junjun He, Yi-Zhe Song, Tao Xiang

D-CFA minimizes the domain gap by augmenting the source data with distribution-sampled target features, and trains a noise-robust discriminative classifier by using target domain knowledge from the generative models.

Unsupervised Domain Adaptation

FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks

1 code implementation • CVPR 2023 • Xiao Han, Xiatian Zhu, Licheng Yu, Li Zhang, Yi-Zhe Song, Tao Xiang

In the fashion domain, there exists a variety of vision-and-language (V+L) tasks, including cross-modal retrieval, text-guided image retrieval, multi-modal classification, and image captioning.

Cross-Modal Retrieval Image Captioning +4

A Survey of Secure Computation Using Trusted Execution Environments

no code implementations • 23 Feb 2023 • Xiaoguo Li, Bowen Zhao, Guomin Yang, Tao Xiang, Jian Weng, Robert H. Deng

To the best of our knowledge, this article is the first survey to review TEE-based secure computation protocols and the comprehensive comparison can serve as a guideline for selecting suitable protocols for deployment in practice.

Privacy Preserving

Unsupervised Hashing with Similarity Distribution Calibration

1 code implementation • 15 Feb 2023 • Kam Woh Ng, Xiatian Zhu, Jiun Tian Hoe, Chee Seng Chan, Tianyu Zhang, Yi-Zhe Song, Tao Xiang

However, these methods often overlook the fact that the similarity between data points in the continuous feature space may not be preserved in the discrete hash code space, due to the limited similarity range of hash codes.

Deep Hashing Image Retrieval
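
The "limited similarity range" mentioned in the excerpt above can be illustrated with a small sketch. It assumes similarity is measured as the fraction of matching bits (a normalised Hamming similarity); the function names are illustrative and are not taken from the paper's code. A K-bit code then admits only K + 1 distinct similarity values, however fine-grained the similarities in the continuous feature space are.

```python
from itertools import product


def hamming_similarity(a, b):
    """Fraction of matching bits between two equal-length binary codes."""
    assert len(a) == len(b)
    return sum(x == y for x, y in zip(a, b)) / len(a)


def distinct_similarities(num_bits):
    """Enumerate every pair of num_bits-long codes and collect the
    distinct similarity values that can occur."""
    codes = list(product((0, 1), repeat=num_bits))
    return sorted({hamming_similarity(a, b) for a in codes for b in codes})


# With 4-bit codes only 5 similarity levels exist.
print(distinct_similarities(4))
```

Any two continuous similarities that fall between adjacent levels must collapse onto the same discrete value, which is one way to read the mismatch the excerpt describes.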

Representing Noisy Image Without Denoising

1 code implementation • 18 Jan 2023 • Shuren Qi, Yushu Zhang, Chao Wang, Tao Xiang, Xiaochun Cao, Yong Xiang

In this paper, we explore a non-learning paradigm that aims to derive robust representation directly from noisy images, without the denoising as pre-processing.

Data Augmentation Image Denoising

Democratising 2D Sketch to 3D Shape Retrieval Through Pivoting

no code implementations • ICCV 2023 • Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

We perform pivoting on two existing datasets, each from a distant research domain to the other: 2D sketch and photo pairs from the sketch-based image retrieval field (SBIR), and 3D shapes from ShapeNet.

3D Shape Retrieval Retrieval +1

Controllable Person Image Synthesis with Pose-Constrained Latent Diffusion

no code implementations • ICCV 2023 • Xiao Han, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang

Controllable person image synthesis aims at rendering a source image based on user-specified changes in body pose or appearance.

Denoising Image Generation

FBLNet: FeedBack Loop Network for Driver Attention Prediction

no code implementations • ICCV 2023 • Yilong Chen, Zhixiong Nan, Tao Xiang

Driving experience is extremely important for safe driving: a skilled driver can effortlessly predict oncoming danger (before it becomes salient) based on that experience and quickly attend to the corresponding zones. However, subjective driving experience is difficult to model, so existing methods lack a mechanism that simulates how a driver accumulates experience, and they usually follow the approach of saliency prediction methods to predict driver attention.

Autonomous Driving Driver Attention Monitoring +1

Multi-Modal Few-Shot Temporal Action Detection

1 code implementation • 27 Nov 2022 • Sauradip Nag, Mengmeng Xu, Xiatian Zhu, Juan-Manuel Perez-Rua, Bernard Ghanem, Yi-Zhe Song, Tao Xiang

In this work, we introduce a new multi-modality few-shot (MMFS) TAD problem, which can be considered as a marriage of FS-TAD and ZS-TAD by leveraging few-shot support videos and new class names jointly.

Action Detection Few-Shot Object Detection +3

Post-Processing Temporal Action Detection

1 code implementation • CVPR 2023 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

To address this problem, in this work we introduce a novel model-agnostic post-processing method without model redesign and retraining.

Action Classification Action Detection +1

Single Stage Multi-Pose Virtual Try-On

no code implementations • 19 Nov 2022 • Sen He, Yi-Zhe Song, Tao Xiang

Key to our model is a parallel flow estimation module that predicts the flow fields for both person and garment images conditioned on the target pose.

Pose Transfer Virtual Try-on

Where is my Wallet? Modeling Object Proposal Sets for Egocentric Visual Query Localization

1 code implementation • CVPR 2023 • Mengmeng Xu, Yanghao Li, Cheng-Yang Fu, Bernard Ghanem, Tao Xiang, Juan-Manuel Perez-Rua

Our experiments show the proposed adaptations improve egocentric query detection, leading to a better visual query localization system in both 2D and 3D configurations.

Learning to Augment via Implicit Differentiation for Domain Generalization

no code implementations • 25 Oct 2022 • Tingwei Wang, Da Li, Kaiyang Zhou, Tao Xiang, Yi-Zhe Song

Machine learning models are intrinsically vulnerable to domain shift between training and testing data, resulting in poor performance in novel domains.

Data Augmentation Domain Generalization +1

Robust Target Training for Multi-Source Domain Adaptation

1 code implementation • 4 Oct 2022 • Zhongying Deng, Da Li, Yi-Zhe Song, Tao Xiang

Given any existing fully-trained one-step MSDA model, BORT$^2$ turns it into a labeling function to generate pseudo-labels for the target data and trains a target model using pseudo-labeled target data only.

Domain Adaptation

Fine-Grained VR Sketching: Dataset and Insights

1 code implementation • 20 Sep 2022 • Ling Luo, Yulia Gryaditskaya, Yongxin Yang, Tao Xiang, Yi-Zhe Song

We then, for the first time, study the scenario of fine-grained 3D VR sketch to 3D shape retrieval, as a novel VR sketching application and a proving ground to drive out generic insights to inform future research.

3D Shape Reconstruction 3D Shape Retrieval +1

Towards 3D VR-Sketch to 3D Shape Retrieval

1 code implementation • 20 Sep 2022 • Ling Luo, Yulia Gryaditskaya, Yongxin Yang, Tao Xiang, Yi-Zhe Song

In this paper, we offer a different perspective towards answering these questions -- we study the use of 3D sketches as an input modality and advocate a VR-scenario where retrieval is conducted.

3D Shape Retrieval Retrieval

Structure-Aware 3D VR Sketch to 3D Shape Retrieval

1 code implementation • 19 Sep 2022 • Ling Luo, Yulia Gryaditskaya, Tao Xiang, Yi-Zhe Song

In particular, we propose to use a triplet loss with an adaptive margin value driven by a "fitting gap", which is the similarity of two shapes under structure-preserving deformations.

3D Shape Retrieval Retrieval
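
A triplet loss with an adaptive margin, as described in the excerpt above, might look roughly like the following sketch. The excerpt does not say how the "fitting gap" is computed or how it maps to a margin, so `fitting_gap` is simply passed in as a number here and the linear `scale` factor is an assumption made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def adaptive_margin_triplet_loss(anchor, positive, negative,
                                 fitting_gap, scale=1.0):
    """Standard triplet hinge loss whose margin depends on the triplet.

    fitting_gap: hypothetical precomputed score for the positive and
    negative shapes; margin = scale * fitting_gap is an assumed mapping.
    """
    margin = scale * fitting_gap
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


sketch = [0.0, 0.0]
matching_shape = [0.0, 1.0]
other_shape = [3.0, 4.0]
print(adaptive_margin_triplet_loss(sketch, matching_shape, other_shape,
                                   fitting_gap=6.0))
```

Compared with a fixed margin, the per-triplet margin lets the loss demand a larger separation for some negative shapes than for others.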

Negative Frames Matter in Egocentric Visual Query 2D Localization

1 code implementation • 3 Aug 2022 • Mengmeng Xu, Cheng-Yang Fu, Yanghao Li, Bernard Ghanem, Juan-Manuel Perez-Rua, Tao Xiang

The repeated gradient computation of the same object leads to inefficient training; (2) The false positive rate is high on background frames.

Vision Transformers: From Semantic Segmentation to Dense Prediction

1 code implementation • 19 Jul 2022 • Li Zhang, Jiachen Lu, Sixiao Zheng, Xinxuan Zhao, Xiatian Zhu, Yanwei Fu, Tao Xiang, Jianfeng Feng, Philip H. S. Torr

In this work, for the first time we explore the global context learning potentials of ViTs for dense visual prediction (e.g., semantic segmentation).

Image Classification Instance Segmentation +5

Zero-Shot Temporal Action Detection via Vision-Language Prompting

1 code implementation • 17 Jul 2022 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

Such a novel design effectively eliminates the dependence between localization and classification by breaking the route for error propagation in-between.

Action Detection Classification +3

FashionViL: Fashion-Focused Vision-and-Language Representation Learning

1 code implementation • 17 Jul 2022 • Xiao Han, Licheng Yu, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang

We thus propose a Multi-View Contrastive Learning task for pulling closer the visual representation of one image to the compositional multimodal representation of another image+text.

Contrastive Learning Image Retrieval +2

Semi-Supervised Temporal Action Detection with Proposal-Free Masking

1 code implementation • 14 Jul 2022 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

Such a novel design effectively eliminates the dependence between localization and classification by cutting off the route for error propagation in-between.

Action Detection General Classification +1

Softmax-free Linear Transformers

1 code implementation • 5 Jul 2022 • Li Zhang, Jiachen Lu, Junge Zhang, Xiatian Zhu, Jianfeng Feng, Tao Xiang

With linear complexity, much longer token sequences are permitted by SOFT, resulting in superior trade-off between accuracy and complexity.

Adaptive Fine-Grained Sketch-Based Image Retrieval

1 code implementation • 4 Jul 2022 • Ayan Kumar Bhunia, Aneeshan Sain, Parth Shah, Animesh Gupta, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

To solve this new problem, we introduce a novel model-agnostic meta-learning (MAML) based framework with several key modifications: (1) As a retrieval task with a margin-based contrastive loss, we simplify the MAML training in the inner loop to make it more stable and tractable.

Meta-Learning Retrieval +1

UIGR: Unified Interactive Garment Retrieval

1 code implementation • 6 Apr 2022 • Xiao Han, Sen He, Li Zhang, Yi-Zhe Song, Tao Xiang

In this paper, we propose a Unified Interactive Garment Retrieval (UIGR) framework to unify TGR and VCR.


Style-Based Global Appearance Flow for Virtual Try-On

3 code implementations • CVPR 2022 • Sen He, Yi-Zhe Song, Tao Xiang

To achieve this, a key step is garment warping which spatially aligns the target garment with the corresponding body parts in the person image.

Virtual Try-on

Sketch3T: Test-Time Training for Zero-Shot SBIR

no code implementations • CVPR 2022 • Aneeshan Sain, Ayan Kumar Bhunia, Vaishnav Potlapalli, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

In this paper, we question this setup, arguing that it is by definition incompatible with the inherently abstract and subjective nature of sketches, i.e., the model might transfer well to new categories, but as a result will not understand sketches from a different test-time distribution.

Meta-Learning Retrieval +2

Doodle It Yourself: Class Incremental Learning by Drawing a Few Sketches

no code implementations • CVPR 2022 • Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Subhadeep Koley, Rohit Kundu, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

In this paper, we push the boundary further for FSCIL by addressing two key questions that bottleneck its ubiquitous application: (i) can the model learn from diverse modalities other than just photos (as humans do), and (ii) what if photos are not readily accessible (due to ethical and privacy constraints)?

Few-Shot Class-Incremental Learning Graph Attention +2

Dynamic Instance Domain Adaptation

1 code implementation • 9 Mar 2022 • Zhongying Deng, Kaiyang Zhou, Da Li, Junjun He, Yi-Zhe Song, Tao Xiang

In this paper, we address both single-source and multi-source UDA from a completely different perspective, which is to view each instance as a fine domain.

Unsupervised Domain Adaptation

One Sketch for All: One-Shot Personalized Sketch Segmentation

no code implementations • 20 Dec 2021 • Anran Qi, Yulia Gryaditskaya, Tao Xiang, Yi-Zhe Song

We aim to segment all sketches belonging to the same category provisioned with a single sketch with a given part annotation while (i) preserving the parts semantics embedded in the exemplar, and (ii) being robust to input style and abstraction.


Hybrid Graph Neural Networks for Few-Shot Learning

no code implementations • 13 Dec 2021 • Tianyuan Yu, Sen He, Yi-Zhe Song, Tao Xiang

This is because they use an instance GNN as a label propagation/classification module, which is jointly meta-learned with a feature embedding network.

Few-Shot Learning

Domain Attention Consistency for Multi-Source Domain Adaptation

1 code implementation • 6 Nov 2021 • Zhongying Deng, Kaiyang Zhou, Yongxin Yang, Tao Xiang

Importantly, the attention module is supervised by a consistency loss, which is imposed on the distributions of channel attention weights between source and target domains.

Domain Adaptation

Towards artificial general intelligence via a multimodal foundation model

1 code implementation • 27 Oct 2021 • Nanyi Fei, Zhiwu Lu, Yizhao Gao, Guoxing Yang, Yuqi Huo, Jingyuan Wen, Haoyu Lu, Ruihua Song, Xin Gao, Tao Xiang, Hao Sun, Ji-Rong Wen

To overcome this limitation and take a solid step towards artificial general intelligence (AGI), we develop a foundation model pre-trained with huge multimodal data, which can be quickly adapted for various downstream cognitive tasks.

Image Classification Reading Comprehension +2

SOFT: Softmax-free Transformer with Linear Complexity

2 code implementations • NeurIPS 2021 • Jiachen Lu, Jinghan Yao, Junge Zhang, Xiatian Zhu, Hang Xu, Weiguo Gao, Chunjing Xu, Tao Xiang, Li Zhang

Crucially, with a linear complexity, much longer token sequences are permitted in SOFT, resulting in superior trade-off between accuracy and complexity.

Text-Based Person Search with Limited Data

1 code implementation • 20 Oct 2021 • Xiao Han, Sen He, Li Zhang, Tao Xiang

Firstly, to fully utilize the existing small-scale benchmarking datasets for more discriminative feature learning, we introduce a cross-modal momentum contrastive learning framework to enrich the training data for a given mini-batch.

Ranked #8 on Text based Person Retrieval on CUHK-PEDES (using extra training data)

Benchmarking Contrastive Learning +7

Few-Shot Temporal Action Localization with Query Adaptive Transformer

1 code implementation • 20 Oct 2021 • Sauradip Nag, Xiatian Zhu, Tao Xiang

Further, a novel FS-TAL model is proposed which maximizes the knowledge transfer from training classes whilst enabling the model to be dynamically adapted to both the new class and each video of that class simultaneously.

Action Segmentation Few Shot Temporal Action Localization +4

Temporal Action Localization with Global Segmentation Mask Transformers

no code implementations • 29 Sep 2021 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

In this paper, to address the above two challenges, a novel Global Segmentation Mask Transformer (GSMT) is proposed.

Object Detection +1

SketchODE: Learning neural sketch representation in continuous time

no code implementations • ICLR 2022 • Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

Learning meaningful representations for chirographic drawing data such as sketches, handwriting, and flowcharts is a gateway for understanding and emulating human creative expression.

Data Augmentation

Disentangled Lifespan Face Synthesis

no code implementations • ICCV 2021 • Sen He, Wentong Liao, Michael Ying Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang

The generated face image given a target age code is expected to be age-sensitive reflected by bio-plausible transformations of shape and texture, while being identity preserving.

Face Generation

Global Aggregation then Local Distribution for Scene Parsing

1 code implementation • 28 Jul 2021 • Xiangtai Li, Li Zhang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Xiatian Zhu, Tao Xiang

Modelling long-range contextual relationships is critical for pixel-wise prediction tasks such as semantic segmentation.

Scene Parsing Segmentation +1

MixStyle Neural Networks for Domain Generalization and Adaptation

2 code implementations • 5 Jul 2021 • Kaiyang Zhou, Yongxin Yang, Yu Qiao, Tao Xiang

MixStyle is easy to implement with a few lines of code, does not require modification to training objectives, and can fit a variety of learning paradigms including supervised domain generalization, semi-supervised domain generalization, and unsupervised domain adaptation.

Data Augmentation Domain Generalization +6
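
To give a feel for the "few lines of code" claim above, here is a simplified, framework-free sketch of style mixing on a single feature channel. It is not the authors' reference implementation: the excerpt does not spell out the mechanism, so the sketch assumes that "style" means the per-instance mean and standard deviation of a feature channel, that two instances' statistics are linearly interpolated, and that the mixing weight is drawn from a Beta distribution. All names and the Beta parameters are illustrative.

```python
import math
import random


def mean_std(values, eps=1e-6):
    """Mean and (eps-stabilised) standard deviation of one channel."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, math.sqrt(var + eps)


def mix_style(x, y, lam):
    """Normalise channel x, then re-scale it with statistics interpolated
    between those of x (weight lam) and those of y (weight 1 - lam)."""
    mu_x, sig_x = mean_std(x)
    mu_y, sig_y = mean_std(y)
    mu_mix = lam * mu_x + (1 - lam) * mu_y
    sig_mix = lam * sig_x + (1 - lam) * sig_y
    return [sig_mix * (v - mu_x) / sig_x + mu_mix for v in x]


random.seed(0)
lam = random.betavariate(0.1, 0.1)  # assumed sampling scheme
photo_channel = [1.0, 2.0, 3.0, 4.0]
sketch_channel = [10.0, 20.0, 30.0, 40.0]
print(mix_style(photo_channel, sketch_channel, lam))
```

With `lam = 1.0` the input is returned unchanged, and with `lam = 0.0` it takes on the other instance's statistics while keeping its own normalised content. Because only features are altered, the training objective itself does not need to change.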

L2M-GAN: Learning To Manipulate Latent Space Semantics for Facial Attribute Editing

2 code implementations • CVPR 2021 • Guoxing Yang, Nanyi Fei, Mingyu Ding, Guangzhen Liu, Zhiwu Lu, Tao Xiang

To overcome these limitations, we propose a novel latent space factorization model, called L2M-GAN, which is learned end-to-end and effective for editing both local and global attributes.


Domain Generalization with MixStyle

3 code implementations • ICLR 2021 • Kaiyang Zhou, Yongxin Yang, Yu Qiao, Tao Xiang

Our method, termed MixStyle, is motivated by the observation that visual domain is closely related to image style (e.g., photo vs. sketch images).

Domain Generalization Retrieval

Cloud2Curve: Generation and Vectorization of Parametric Sketches

1 code implementation • CVPR 2021 • Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

Analysis of human sketches in deep learning has advanced immensely through the use of waypoint-sequences rather than raster-graphic representations.

StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

no code implementations • CVPR 2021 • Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song

With this meta-learning framework, our model can not only disentangle the cross-modal shared semantic content for SBIR, but can adapt the disentanglement to any unseen user style as well, making the SBIR model truly style-agnostic.

Disentanglement Meta-Learning +2

More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

1 code implementation • CVPR 2021 • Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, Yi-Zhe Song

A fundamental challenge faced by existing Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) models is the data scarcity -- model performances are largely bottlenecked by the lack of sketch-photo pairs.

Cross-Modal Retrieval Retrieval +2

Context-Aware Layout to Image Generation with Enhanced Object Appearance

1 code implementation • CVPR 2021 • Sen He, Wentong Liao, Michael Ying Yang, Yongxin Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang

We argue that these are caused by the lack of context-aware object and stuff feature encoding in their generators, and location-sensitive appearance representation in their discriminators.

Layout-to-Image Generation

Domain Generalization: A Survey

2 code implementations • 3 Mar 2021 • Kaiyang Zhou, Ziwei Liu, Yu Qiao, Tao Xiang, Chen Change Loy

Generalization to out-of-distribution (OOD) data is a capability natural to humans yet challenging for machines to reproduce.

Action Recognition Data Augmentation +8

Contrastive Prototype Learning with Augmented Embeddings for Few-Shot Learning

no code implementations • 23 Jan 2021 • Yizhao Gao, Nanyi Fei, Guangzhen Liu, Zhiwu Lu, Tao Xiang, Songfang Huang

First, data augmentations are introduced to both the support and query sets with each sample now being represented as an augmented embedding (AE) composed of concatenated embeddings of both the original and augmented versions.

Few-Shot Learning

Few-shot Action Recognition with Prototype-centered Attentive Learning

1 code implementation • 20 Jan 2021 • Xiatian Zhu, Antoine Toisoul, Juan-Manuel Perez-Rua, Li Zhang, Brais Martinez, Tao Xiang

Extensive experiments on four standard few-shot action benchmarks show that our method clearly outperforms previous state-of-the-art methods, with the improvement particularly significant (10+%) on the most challenging fine-grained action recognition benchmark.

Contrastive Learning Few-Shot action recognition +3

Local Black-box Adversarial Attacks: A Query Efficient Approach

no code implementations • 4 Jan 2021 • Tao Xiang, Hangcheng Liu, Shangwei Guo, Tianwei Zhang, Xiaofeng Liao

Based on this property, we identify the discriminative areas of a given clean example easily for local perturbations.

Z-Score Normalization, Hubness, and Few-Shot Learning

no code implementations • ICCV 2021 • Nanyi Fei, Yizhao Gao, Zhiwu Lu, Tao Xiang

This means that these methods are prone to the hubness problem, that is, a certain class prototype becomes the nearest neighbor of many test instances regardless of which classes they belong to.

Few-Shot Learning
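
The hubness problem described in the excerpt above can be checked with a simple count of how often each class prototype is the nearest neighbour of a test instance. The sketch below is only a diagnostic under assumed conventions (Euclidean distance, prototypes stored in a dictionary); it does not implement the paper's remedy, and the names are illustrative.

```python
import math
from collections import Counter


def nearest_prototype(query, prototypes):
    """Name of the class prototype closest to the query embedding."""
    return min(prototypes, key=lambda name: math.dist(query, prototypes[name]))


def hub_counts(queries, prototypes):
    """How many queries each prototype attracts as their nearest neighbour."""
    return Counter(nearest_prototype(q, prototypes) for q in queries)


prototypes = {"cat": [0.0, 0.0], "dog": [10.0, 10.0]}
queries = [[1.0, 1.0], [2.0, 0.0], [9.0, 9.0]]
print(hub_counts(queries, prototypes))
```

A prototype whose count is far above the number of test instances that truly belong to its class is acting as a hub.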

MELR: Meta-Learning via Modeling Episode-Level Relationships for Few-Shot Learning

no code implementations • ICLR 2021 • Nanyi Fei, Zhiwu Lu, Tao Xiang, Songfang Huang

Most recent few-shot learning (FSL) approaches are based on episodic training whereby each episode samples few training instances (shots) per class to imitate the test condition.

Few-Shot Learning

IEPT: Instance-Level and Episode-Level Pretext Tasks for Few-Shot Learning

1 code implementation • ICLR 2021 • Manli Zhang, Jianhong Zhang, Zhiwu Lu, Tao Xiang, Mingyu Ding, Songfang Huang

Importantly, at the episode-level, two SSL-FSL hybrid learning objectives are devised: (1) The consistency across the predictions of an FSL classifier from different extended episodes is maximized as an episode-level pretext task.

Few-Shot Learning Self-Supervised Learning +1

Self-Supervised Video Representation Learning with Constrained Spatiotemporal Jigsaw

no code implementations • 1 Jan 2021 • Yuqi Huo, Mingyu Ding, Haoyu Lu, Zhiwu Lu, Tao Xiang, Ji-Rong Wen, Ziyuan Huang, Jianwen Jiang, Shiwei Zhang, Mingqian Tang, Songfang Huang, Ping Luo

With the constrained jigsaw puzzles, instead of solving them directly, which could still be extremely hard, we carefully design four surrogate tasks that are more solvable but meanwhile still ensure that the learned representation is sensitive to spatiotemporal continuity at both the local and global levels.

Representation Learning

Margin-Based Transfer Bounds for Meta Learning with Deep Feature Embedding

no code implementations • 2 Dec 2020 • Jiechao Guan, Zhiwu Lu, Tao Xiang, Timothy Hospedales

By transferring knowledge learned from seen/previous tasks, meta learning aims to generalize well to unseen/future tasks.

Classification General Classification +2

Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image Retrieval

1 code implementation • 29 Jul 2020 • Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song

In this paper, we study a further trait of sketches that has been overlooked to date, that is, they are hierarchical in terms of the levels of detail -- a person typically sketches up to various extents of detail to depict an object.

Retrieval Sketch-Based Image Retrieval

On Learning Semantic Representations for Million-Scale Free-Hand Sketches

1 code implementation • 7 Jul 2020 • Peng Xu, Yongye Huang, Tongtong Yuan, Tao Xiang, Timothy M. Hospedales, Yi-Zhe Song, Liang Wang

Specifically, we use our dual-branch architecture as a universal representation framework to design two sketch-specific deep models: (i) We propose a deep hashing model for sketch retrieval, where a novel hashing loss is specifically designed to accommodate both the abstract and messy traits of sketches.

Deep Hashing Learning Semantic Representations +1

BézierSketch: A generative model for scalable vector sketches

1 code implementation • ECCV 2020 • Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

The study of neural generative models of human sketches is a fascinating contemporary modeling problem due to the links between sketch image generation and the human drawing process.

Image Generation

Egocentric Action Recognition by Video Attention and Temporal Context

no code implementations • 3 Jul 2020 • Juan-Manuel Perez-Rua, Antoine Toisoul, Brais Martinez, Victor Escorcia, Li Zhang, Xiatian Zhu, Tao Xiang

In this challenge, action recognition is posed as the problem of simultaneously predicting a single 'verb' and 'noun' class label given an input trimmed video clip.

Action Recognition

Topology-aware Differential Privacy for Decentralized Image Classification

no code implementations • 14 Jun 2020 • Shangwei Guo, Tianwei Zhang, Guowen Xu, Han Yu, Tao Xiang, Yang Liu

In this paper, we design Top-DP, a novel solution to optimize the differential privacy protection of decentralized image classification systems.

Classification Image Classification

Long-Term Cloth-Changing Person Re-identification

no code implementations • 26 May 2020 • Xuelin Qian, Wenxuan Wang, Li Zhang, Fangrui Zhu, Yanwei Fu, Tao Xiang, Yu-Gang Jiang, Xiangyang Xue

Specifically, we consider that under cloth-changes, soft-biometrics such as body shape would be more reliable.

Person Re-Identification

Domain-Adaptive Few-Shot Learning

1 code implementation • 19 Mar 2020 • An Zhao, Mingyu Ding, Zhiwu Lu, Tao Xiang, Yulei Niu, Jiechao Guan, Ji-Rong Wen, Ping Luo

Existing few-shot learning (FSL) methods make the implicit assumption that the few target class samples are from the same domain as the source class samples.

Domain Adaptation Few-Shot Learning

Domain Adaptive Ensemble Learning

1 code implementation • 16 Mar 2020 • Kaiyang Zhou, Yongxin Yang, Yu Qiao, Tao Xiang

Each such classifier is an expert to its own domain and a non-expert to others.

Domain Generalization Ensemble Learning +3

Deep Domain-Adversarial Image Generation for Domain Generalisation

no code implementations • 12 Mar 2020 • Kaiyang Zhou, Yongxin Yang, Timothy Hospedales, Tao Xiang

This is achieved by having a learning objective formulated to ensure that the generated data can be correctly classified by the label classifier while fooling the domain classifier.

Domain Generalization Image Generation
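
The learning objective in the excerpt above (generated data should be classified correctly by the label classifier while fooling the domain classifier) can be sketched as a difference of two cross-entropy terms. The cross-entropy form, the equal weighting of the two terms, and the function names are assumptions made for illustration; the excerpt does not give the exact formulation.

```python
import math


def cross_entropy(probs, target):
    """Negative log-probability assigned to the target class index."""
    return -math.log(probs[target])


def generator_objective(label_probs, label, domain_probs, domain):
    """Value the generator would minimise: low label loss (still classified
    correctly) and high domain loss (domain classifier is fooled)."""
    return cross_entropy(label_probs, label) - cross_entropy(domain_probs, domain)


# The label prediction is equally good in both cases; only the domain
# classifier's confidence in the true source domain differs.
fooled = generator_objective([0.9, 0.1], 0, [0.5, 0.5], 0)
not_fooled = generator_objective([0.9, 0.1], 0, [0.99, 0.01], 0)
print(fooled < not_fooled)
```

The objective is lower when the domain classifier is uncertain about the generated sample's domain, which is the behaviour the excerpt asks for.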

Incremental Few-Shot Object Detection

no code implementations • CVPR 2020 • Juan-Manuel Perez-Rua, Xiatian Zhu, Timothy Hospedales, Tao Xiang

To this end we propose OpeN-ended Centre nEt (ONCE), a detector designed for incrementally learning to detect novel class objects with few examples.

Few-Shot Learning Few-Shot Object Detection +2

AdarGCN: Adaptive Aggregation GCN for Few-Shot Learning

no code implementations • 28 Feb 2020 • Jianhong Zhang, Manli Zhang, Zhiwu Lu, Tao Xiang, Ji-Rong Wen

To address this problem, we propose a graph convolutional network (GCN)-based label denoising (LDN) method to remove the irrelevant images.

Denoising Few-Shot Learning +1

Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval

1 code implementation • 24 Feb 2020 • Ayan Kumar Bhunia, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song

Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo instance given a user's query sketch.

Cross-Modal Retrieval On-the-Fly Sketch Based Image Retrieval +1

Byzantine-resilient Decentralized Stochastic Gradient Descent

no code implementations 20 Feb 2020 Shangwei Guo, Tianwei Zhang, Han Yu, Xiaofei Xie, Lei Ma, Tao Xiang, Yang Liu

It guarantees that each benign node in a decentralized system can train a correct model under very strong Byzantine attacks with an arbitrary number of faulty nodes.

Edge-computing Image Classification

Meta-Learning across Meta-Tasks for Few-Shot Learning

no code implementations 11 Feb 2020 Nanyi Fei, Zhiwu Lu, Yizhao Gao, Jia Tian, Tao Xiang, Ji-Rong Wen

In this paper, we argue that the inter-meta-task relationships should be exploited and those tasks are sampled strategically to assist in meta-learning.

Domain Adaptation Few-Shot Learning +1

Few-Shot Learning as Domain Adaptation: Algorithm and Analysis

no code implementations 6 Feb 2020 Jiechao Guan, Zhiwu Lu, Tao Xiang, Ji-Rong Wen

Specifically, armed with a set transformer based attention module, we construct each episode with two sub-episodes without class overlap on the seen classes to simulate the domain shift between the seen and unseen classes.

Domain Adaptation Few-Shot Image Classification +1

Deep Learning for Person Re-identification: A Survey and Outlook

5 code implementations 13 Jan 2020 Mang Ye, Jianbing Shen, Gaojie Lin, Tao Xiang, Ling Shao, Steven C. H. Hoi

The widely studied closed-world setting is usually applied under various research-oriented assumptions, and has achieved inspiring success using deep learning techniques on a number of datasets.

Cross-Modal Person Re-Identification Metric Learning +2

Deep Learning for Free-Hand Sketch: A Survey

2 code implementations 8 Jan 2020 Peng Xu, Timothy M. Hospedales, Qiyue Yin, Yi-Zhe Song, Tao Xiang, Liang Wang

Free-hand sketches are highly illustrative, and have been widely used by humans to depict objects or stories from ancient times to the present.

Torchreid: A Library for Deep Learning Person Re-Identification in Pytorch

7 code implementations 22 Oct 2019 Kaiyang Zhou, Tao Xiang

Person re-identification (re-ID), which aims to re-identify people across different camera views, has been significantly advanced by deep learning in recent years, particularly with convolutional neural networks (CNNs).

Benchmarking Person Re-Identification

Learning Generalisable Omni-Scale Representations for Person Re-Identification

7 code implementations 15 Oct 2019 Kaiyang Zhou, Yongxin Yang, Andrea Cavallaro, Tao Xiang

An effective person re-identification (re-ID) model should learn feature representations that are both discriminative, for distinguishing similar-looking people, and generalisable, for deployment across datasets without any adaptation.

Unsupervised Domain Adaptation Unsupervised Person Re-Identification

Simple and Effective Stochastic Neural Networks

no code implementations 25 Sep 2019 Tianyuan Yu, Yongxin Yang, Da Li, Timothy Hospedales, Tao Xiang

Stochastic neural networks (SNNs) are currently topical, with several paradigms being actively investigated including dropout, Bayesian neural networks, variational information bottleneck (VIB) and noise regularized learning.

Adversarial Attack Adversarial Defense

Few-Shot Learning with Global Class Representations

2 code implementations ICCV 2019 Tiange Luo, Aoxue Li, Tao Xiang, Weiran Huang, Li-Wei Wang

In this paper, we propose to tackle the challenging few-shot learning (FSL) problem by learning global class representations using both base and novel class training samples.

Few-Shot Learning Generalized Few-Shot Classification +1

Goal-Driven Sequential Data Abstraction

no code implementations ICCV 2019 Umar Riaz Muhammad, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song

In the former, one asks whether a machine can 'understand' enough about the meaning of input data to produce a meaningful but more compact abstraction.

Benchmarking General Reinforcement Learning +2

Omni-Scale Feature Learning for Person Re-Identification

13 code implementations ICCV 2019 Kaiyang Zhou, Yongxin Yang, Andrea Cavallaro, Tao Xiang

As an instance-level recognition problem, person re-identification (ReID) relies on discriminative features, which not only capture different spatial scales but also encapsulate an arbitrary combination of multiple scales.

Person Re-Identification

Compressing deep neural networks by matrix product operators

1 code implementation 11 Apr 2019 Ze-Feng Gao, Song Cheng, Rong-Qiang He, Z. Y. Xie, Hui-Hai Zhao, Zhong-Yi Lu, Tao Xiang

A deep neural network is a parametrization of a multilayer mapping of signals in terms of many alternately arranged linear and nonlinear transformations.

Differentiable Programming Tensor Networks

1 code implementation 22 Mar 2019 Hai-Jun Liao, Jin-Guo Liu, Lei Wang, Tao Xiang

Differentiable programming is a fresh programming paradigm which composes parameterized algorithmic components and trains them using automatic differentiation (AD).

Strongly Correlated Electrons Quantum Physics

Tree Tensor Networks for Generative Modeling

no code implementations 8 Jan 2019 Song Cheng, Lei Wang, Tao Xiang, Pan Zhang

Matrix product states (MPS), a tensor network designed for one-dimensional quantum systems, have recently been proposed for generative modeling of natural data (such as images) in terms of the 'Born machine'.

BIG-bench Machine Learning Tensor Networks

Zero-Shot Learning with Sparse Attribute Propagation

no code implementations 11 Dec 2018 Nanyi Fei, Jiechao Guan, Zhiwu Lu, Tao Xiang, Ji-Rong Wen

The standard approach to ZSL requires a set of training images annotated with seen class labels and a semantic descriptor for seen/unseen classes (attribute vector is the most widely used).

Image Retrieval Zero-Shot Learning

Disjoint Label Space Transfer Learning with Common Factorised Space

no code implementations 6 Dec 2018 Xiaobin Chang, Yongxin Yang, Tao Xiang, Timothy M. Hospedales

In this paper, a unified approach is presented to transfer learning that addresses several source and target domain label-space and annotation assumptions with a single model.

Transfer Learning Unsupervised Domain Adaptation

Zero and Few Shot Learning with Semantic Feature Synthesis and Competitive Learning

no code implementations 19 Oct 2018 Zhiwu Lu, Jiechao Guan, Aoxue Li, Tao Xiang, An Zhao, Ji-Rong Wen

Specifically, we assume that each synthesised data point can belong to any unseen class; and the most likely two class candidates are exploited to learn a robust projection function in a competitive fashion.

Few-Shot Learning Zero-Shot Learning

Transferrable Feature and Projection Learning with Class Hierarchy for Zero-Shot Learning

no code implementations 19 Oct 2018 Aoxue Li, Zhiwu Lu, Jiechao Guan, Tao Xiang, Li-Wei Wang, Ji-Rong Wen

Inspired by the fact that an unseen class is not exactly 'unseen' if it belongs to the same superclass as a seen class, we propose a novel inductive ZSL model that leverages superclasses as the bridge between seen and unseen classes to narrow the domain gap.

Clustering Few-Shot Learning +1

SketchyScene: Richly-Annotated Scene Sketches

2 code implementations ECCV 2018 Changqing Zou, Qian Yu, Ruofei Du, Haoran Mo, Yi-Zhe Song, Tao Xiang, Chengying Gao, Baoquan Chen, Hao Zhang

We contribute the first large-scale dataset of scene sketches, SketchyScene, with the goal of advancing research on sketch understanding at both the object and scene level.

Colorization Image Retrieval +2

Deep Factorised Inverse-Sketching

no code implementations ECCV 2018 Kaiyue Pang, Da Li, Jifei Song, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales

Instead there is a fundamental process of abstraction and iconic rendering, where overall geometry is warped and salient details are selectively included.

Retrieval Sketch-Based Image Retrieval +1

Person Re-Identification in Identity Regression Space

no code implementations 25 Jun 2018 Hanxiao Wang, Xiatian Zhu, Shaogang Gong, Tao Xiang

Most existing person re-identification (re-id) methods are unsuitable for real-world deployment for two reasons: unscalability to large population sizes and inadaptability over time.

Benchmarking Incremental Learning +2

Learning to Sketch with Shortcut Cycle Consistency

no code implementations CVPR 2018 Jifei Song, Kaiyue Pang, Yi-Zhe Song, Tao Xiang, Timothy Hospedales

In this paper, we present a novel approach for translating an object photo to a sketch, mimicking the human sketching process.

Multi-Task Learning Retrieval +1

Learning Deep Sketch Abstraction

no code implementations CVPR 2018 Umar Riaz Muhammad, Yongxin Yang, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales

Human free-hand sketches have been studied in various contexts including sketch recognition, synthesis and fine-grained sketch-based image retrieval (FG-SBIR).

Retrieval Sketch-Based Image Retrieval +1

SketchMate: Deep Hashing for Million-Scale Human Sketch Retrieval

1 code implementation CVPR 2018 Peng Xu, Yongye Huang, Tongtong Yuan, Kaiyue Pang, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Zhanyu Ma, Jun Guo

Key to our network design is the embedding of unique characteristics of human sketch, where (i) a two-branch CNN-RNN architecture is adapted to explore the temporal ordering of strokes, and (ii) a novel hashing loss is specifically designed to accommodate both the temporal and abstract traits of sketches.

Deep Hashing Sketch Recognition

Multi-Level Factorisation Net for Person Re-Identification

no code implementations CVPR 2018 Xiaobin Chang, Timothy M. Hospedales, Tao Xiang

Key to effective person re-identification (Re-ID) is modelling discriminative and view-invariant factors of person appearance at both high and low semantic levels.

Person Re-Identification

Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward

6 code implementations 29 Dec 2017 Kaiyang Zhou, Yu Qiao, Tao Xiang

Video summarization aims to facilitate large-scale video browsing by producing short, concise summaries that are diverse and representative of original videos.

Decision Making reinforcement-learning +3

Pose-Normalized Image Generation for Person Re-identification

2 code implementations ECCV 2018 Xuelin Qian, Yanwei Fu, Tao Xiang, Wenxuan Wang, Jie Qiu, Yang Wu, Yu-Gang Jiang, Xiangyang Xue

Person Re-identification (re-id) faces two major challenges: the lack of cross-view paired training data and learning discriminative identity-sensitive and view-invariant features in the presence of large pose variations.

Image Generation Person Re-Identification +1

Learning to Compare: Relation Network for Few-Shot Learning

12 code implementations CVPR 2018 Flood Sung, Yongxin Yang, Li Zhang, Tao Xiang, Philip H. S. Torr, Timothy M. Hospedales

Once trained, a RN is able to classify images of new classes by computing relation scores between query images and the few examples of each new class without further updating the network.

Few-Shot Image Classification Few-Shot Learning +2

Recent Advances in Zero-shot Recognition

no code implementations 13 Oct 2017 Yanwei Fu, Tao Xiang, Yu-Gang Jiang, Xiangyang Xue, Leonid Sigal, Shaogang Gong

With the recent renaissance of deep convolutional neural networks, encouraging breakthroughs have been achieved on supervised recognition tasks, where each class has sufficient and fully annotated training data.

Open Set Learning Zero-Shot Learning

Multi-scale Deep Learning Architectures for Person Re-identification

no code implementations ICCV 2017 Xuelin Qian, Yanwei Fu, Yu-Gang Jiang, Tao Xiang, Xiangyang Xue

Our model is able to learn deep discriminative feature representations at different scales and automatically determine the most suitable scales for matching.

Person Re-Identification

Multi-Modal Multi-Scale Deep Learning for Large-Scale Image Annotation

no code implementations 5 Sep 2017 Yulei Niu, Zhiwu Lu, Ji-Rong Wen, Tao Xiang, Shih-Fu Chang

In this paper, we address two main issues in large-scale image annotation: 1) how to learn a rich feature representation suitable for predicting a diverse set of visual concepts ranging from object, scene to abstract concept; 2) how to annotate an image with the optimal number of class labels.

Weakly Supervised Image Annotation and Segmentation with Objects and Attributes

no code implementations 8 Aug 2017 Zhiyuan Shi, Yongxin Yang, Timothy M. Hospedales, Tao Xiang

We propose to model complex visual scenes using a non-parametric Bayesian model learned from weakly labelled images abundant on media sharing sites such as Flickr.

object-detection Object Detection +3

Scalable and Effective Deep CCA via Soft Decorrelation

no code implementations CVPR 2018 Xiaobin Chang, Tao Xiang, Timothy M. Hospedales

Specifically, exact decorrelation is replaced by soft decorrelation via a mini-batch based Stochastic Decorrelation Loss (SDL) to be optimised jointly with the other training objectives.


Zero-Shot Fine-Grained Classification by Deep Feature Learning with Semantics

no code implementations 4 Jul 2017 Aoxue Li, Zhiwu Lu, Li-Wei Wang, Tao Xiang, Xinqi Li, Ji-Rong Wen

In this paper, to address the two issues, we propose a two-phase framework for recognizing images from unseen fine-grained classes, i.e., zero-shot fine-grained classification.

Classification Domain Adaptation +4

Actor-Critic Sequence Training for Image Captioning

no code implementations 29 Jun 2017 Li Zhang, Flood Sung, Feng Liu, Tao Xiang, Shaogang Gong, Yongxin Yang, Timothy M. Hospedales

Generating natural language descriptions of images is an important capability for a robot or other visual-intelligence driven AI agent that may need to communicate with human users about what it is seeing.

Image Captioning reinforcement-learning +1

Bayesian Joint Modelling for Object Localisation in Weakly Labelled Images

no code implementations 19 Jun 2017 Zhiyuan Shi, Timothy M. Hospedales, Tao Xiang

We address the problem of localisation of objects as bounding boxes in images and videos with weak labels.

Domain Adaptation Transfer Learning

Transferring a Semantic Representation for Person Re-Identification and Search

no code implementations CVPR 2015 Zhiyuan Shi, Timothy M. Hospedales, Tao Xiang

Learning semantic attributes for person re-identification and description-based person search has gained increasing interest due to attributes' great potential as a pose and view-invariant representation.

Person Re-Identification Person Search

Deep Mutual Learning

8 code implementations CVPR 2018 Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu

Model distillation is an effective and widely used technique to transfer knowledge from a teacher to a student network.

Person Re-Identification

Bayesian Joint Topic Modelling for Weakly Supervised Object Localisation

no code implementations 9 May 2017 Zhiyuan Shi, Timothy M. Hospedales, Tao Xiang

(3) Our model can be learned with a mixture of weakly labelled and unlabelled data, allowing the large volume of unlabelled images on the Internet to be exploited for learning.

Transfer Learning by Ranking for Weakly Supervised Object Annotation

no code implementations 2 May 2017 Zhiyuan Shi, Parthipan Siva, Tao Xiang

Most existing approaches to training object detectors rely on fully supervised learning, which requires the tedious manual annotation of object location in a training set.

Learning-To-Rank Transfer Learning

Semantic Autoencoder for Zero-Shot Learning

4 code implementations CVPR 2017 Elyor Kodirov, Tao Xiang, Shaogang Gong

We show that with this additional reconstruction constraint, the learned projection function from the seen classes is able to generalise better to the new unseen classes.

Clustering Test +1

Equivalence of restricted Boltzmann machines and tensor network states

1 code implementation 17 Jan 2017 Jing Chen, Song Cheng, Haidong Xie, Lei Wang, Tao Xiang

Conversely, we give sufficient and necessary conditions to determine whether a TNS can be transformed into an RBM of given architectures.

Recommendation Systems

Highly Efficient Regression for Scalable Person Re-Identification

no code implementations 5 Dec 2016 Hanxiao Wang, Shaogang Gong, Tao Xiang

Existing person re-identification models are poor for scaling up to large data required in real-world applications due to: (1) Complexity: They employ complex models for optimal performance resulting in high computational cost for training at a large scale; (2) Inadaptability: Once trained, they are unsuitable for incremental update to incorporate any new data available.

Active Learning Person Re-Identification +1

Human-In-The-Loop Person Re-Identification

no code implementations 5 Dec 2016 Hanxiao Wang, Shaogang Gong, Xiatian Zhu, Tao Xiang

Current person re-identification (re-id) methods assume that (1) pre-labelled training data are available for every camera pair, (2) the gallery size for re-identification is moderate.

Ensemble Learning Incremental Learning +1

Deep Transfer Learning for Person Re-identification

1 code implementation 16 Nov 2016 Mengyue Geng, Yao-Wei Wang, Tao Xiang, Yonghong Tian

Second, a two-stepped fine-tuning strategy is developed to transfer knowledge from auxiliary datasets.

General Classification Image Classification +2