Search Results for author: Yi-Zhe Song

Found 121 papers, 62 papers with code

Sketch-a-Net that Beats Humans

2 code implementations 30 Jan 2015 Qian Yu, Yongxin Yang, Yi-Zhe Song, Tao Xiang, Timothy Hospedales

We propose a multi-scale multi-channel deep neural network framework that, for the first time, yields sketch recognition performance surpassing that of humans.

Sketch Recognition

Making Better Use of Edges via Perceptual Grouping

no code implementations CVPR 2015 Yonggang Qi, Yi-Zhe Song, Tao Xiang, Honggang Zhang, Timothy Hospedales, Yi Li, Jun Guo

We propose a perceptual grouping framework that organizes image edges into meaningful structures and demonstrate its usefulness on various computer vision tasks.

Learning-To-Rank Retrieval +1

Free-hand Sketch Synthesis with Deformable Stroke Models

no code implementations 9 Oct 2015 Yi Li, Yi-Zhe Song, Timothy Hospedales, Shaogang Gong

We present a generative model which can automatically summarize the stroke composition of free-hand sketches of a given category.

Sketch Me That Shoe

no code implementations CVPR 2016 Qian Yu, Feng Liu, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Chen-Change Loy

We investigate the problem of fine-grained sketch-based image retrieval (SBIR), where free-hand human sketches are used as queries to perform instance-level retrieval of images.

Data Augmentation Retrieval +1

Deeper, Broader and Artier Domain Generalization

6 code implementations ICCV 2017 Da Li, Yongxin Yang, Yi-Zhe Song, Timothy M. Hospedales

In this paper, we make two main contributions: Firstly, we build upon the favorable domain shift-robust properties of deep learning methods, and develop a low-rank parameterized CNN model for end-to-end DG learning.

Domain Generalization

SketchMate: Deep Hashing for Million-Scale Human Sketch Retrieval

1 code implementation CVPR 2018 Peng Xu, Yongye Huang, Tongtong Yuan, Kaiyue Pang, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Zhanyu Ma, Jun Guo

Key to our network design is the embedding of unique characteristics of human sketch, where (i) a two-branch CNN-RNN architecture is adapted to explore the temporal ordering of strokes, and (ii) a novel hashing loss is specifically designed to accommodate both the temporal and abstract traits of sketches.

Deep Hashing Sketch Recognition

Learning Deep Sketch Abstraction

no code implementations CVPR 2018 Umar Riaz Muhammad, Yongxin Yang, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales

Human free-hand sketches have been studied in various contexts including sketch recognition, synthesis and fine-grained sketch-based image retrieval (FG-SBIR).

Retrieval Sketch-Based Image Retrieval +1

Learning to Sketch with Shortcut Cycle Consistency

no code implementations CVPR 2018 Jifei Song, Kaiyue Pang, Yi-Zhe Song, Tao Xiang, Timothy Hospedales

In this paper, we present a novel approach for translating an object photo to a sketch, mimicking the human sketching process.

Multi-Task Learning Retrieval +1

SketchyScene: Richly-Annotated Scene Sketches

2 code implementations ECCV 2018 Changqing Zou, Qian Yu, Ruofei Du, Haoran Mo, Yi-Zhe Song, Tao Xiang, Chengying Gao, Baoquan Chen, Hao Zhang

We contribute the first large-scale dataset of scene sketches, SketchyScene, with the goal of advancing research on sketch understanding at both the object and scene level.

Colorization Image Retrieval +2

Deep Factorised Inverse-Sketching

no code implementations ECCV 2018 Kaiyue Pang, Da Li, Jifei Song, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales

Instead there is a fundamental process of abstraction and iconic rendering, where overall geometry is warped and salient details are selectively included.

Retrieval Sketch-Based Image Retrieval +1

Episodic Training for Domain Generalization

2 code implementations ICCV 2019 Da Li, Jianshu Zhang, Yongxin Yang, Cong Liu, Yi-Zhe Song, Timothy M. Hospedales

In this paper, we build on this strong baseline by designing an episodic training procedure that trains a single deep network in a way that exposes it to the domain shift that characterises a novel domain at runtime.

Domain Generalization

Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval

1 code implementation CVPR 2019 Sounak Dey, Pau Riba, Anjan Dutta, Josep Llados, Yi-Zhe Song

Highly abstract amateur human sketches are purposefully sourced to maximize the domain gap, instead of ones included in existing datasets that can often be semi-photorealistic.

Retrieval Sketch-Based Image Retrieval

Goal-Driven Sequential Data Abstraction

no code implementations ICCV 2019 Umar Riaz Muhammad, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song

In the former, one asks whether a machine can 'understand' enough about the meaning of input data to produce a meaningful but more compact abstraction.

Benchmarking General Reinforcement Learning +2

Semi-Heterogeneous Three-Way Joint Embedding Network for Sketch-Based Image Retrieval

no code implementations 10 Nov 2019 Jianjun Lei, Yuxin Song, Bo Peng, Zhanyu Ma, Ling Shao, Yi-Zhe Song

How to align abstract sketches and natural images into a common high-level semantic space remains a key problem in SBIR.

Retrieval Sketch-Based Image Retrieval

Deep Learning for Free-Hand Sketch: A Survey

2 code implementations 8 Jan 2020 Peng Xu, Timothy M. Hospedales, Qiyue Yin, Yi-Zhe Song, Tao Xiang, Liang Wang

Free-hand sketches are highly illustrative, and have been widely used by humans to depict objects or stories from ancient times to the present.

SketchDesc: Learning Local Sketch Descriptors for Multi-view Correspondence

no code implementations 16 Jan 2020 Deng Yu, Lei LI, Youyi Zheng, Manfred Lau, Yi-Zhe Song, Chiew-Lan Tai, Hongbo Fu

In this paper, we study the problem of multi-view sketch correspondence, where we take as input multiple freehand sketches with different views of the same object and predict as output the semantic correspondence among the sketches.

Semantic correspondence

Deep Self-Supervised Representation Learning for Free-Hand Sketch

1 code implementation 3 Feb 2020 Peng Xu, Zeyu Song, Qiyue Yin, Yi-Zhe Song, Liang Wang

In this paper, we tackle, for the first time, the problem of self-supervised representation learning for free-hand sketches.

Representation Learning Retrieval +1

Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval

1 code implementation 24 Feb 2020 Ayan Kumar Bhunia, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song

Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo instance given a user's query sketch.

Cross-Modal Retrieval On-the-Fly Sketch Based Image Retrieval +1

Mind the Gap: Enlarging the Domain Gap in Open Set Domain Adaptation

2 code implementations 8 Mar 2020 Dongliang Chang, Aneeshan Sain, Zhanyu Ma, Yi-Zhe Song, Jun Guo

The key insight lies with how we exploit the mutually beneficial information between two networks; (a) to separate samples of known and unknown classes, (b) to maximize the domain confusion between source and target domain without the influence of unknown samples.

Unsupervised Domain Adaptation

Sequential Learning for Domain Generalization

no code implementations 3 Apr 2020 Da Li, Yongxin Yang, Yi-Zhe Song, Timothy Hospedales

In DG this means encountering a sequence of domains and at each step training to maximise performance on the next domain.

Domain Generalization Meta-Learning

BézierSketch: A generative model for scalable vector sketches

1 code implementation ECCV 2020 Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

The study of neural generative models of human sketches is a fascinating contemporary modeling problem due to the links between sketch image generation and the human drawing process.

Image Generation

On Learning Semantic Representations for Million-Scale Free-Hand Sketches

1 code implementation 7 Jul 2020 Peng Xu, Yongye Huang, Tongtong Yuan, Tao Xiang, Timothy M. Hospedales, Yi-Zhe Song, Liang Wang

Specifically, we use our dual-branch architecture as a universal representation framework to design two sketch-specific deep models: (i) We propose a deep hashing model for sketch retrieval, where a novel hashing loss is specifically designed to accommodate both the abstract and messy traits of sketches.

Deep Hashing Learning Semantic Representations +1

Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image Retrieval

1 code implementation 29 Jul 2020 Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song

In this paper, we study a further trait of sketches that has been overlooked to date, that is, they are hierarchical in terms of the levels of detail -- a person typically sketches up to various extents of detail to depict an object.

Retrieval Sketch-Based Image Retrieval

Deep Sketch-Based Modeling: Tips and Tricks

1 code implementation 12 Nov 2020 Yue Zhong, Yulia Gryaditskaya, Honggang Zhang, Yi-Zhe Song

Deep image-based modeling has received a lot of attention in recent years, yet the parallel problem of sketch-based modeling has only been briefly studied, often as a potential application.

Your "Flamingo" is My "Bird": Fine-Grained, or Not

1 code implementation CVPR 2021 Dongliang Chang, Kaiyue Pang, Yixiao Zheng, Zhanyu Ma, Yi-Zhe Song, Jun Guo

For that, we re-envisage the traditional setting of FGVC, from single-label classification, to that of top-down traversal of a pre-defined coarse-to-fine label hierarchy -- so that our answer becomes "bird"-->"Phoenicopteriformes"-->"Phoenicopteridae"-->"flamingo".

Disentanglement Fine-Grained Image Classification +1

SketchAA: Abstract Representation for Abstract Sketches

no code implementations ICCV 2021 Lan Yang, Kaiyue Pang, Honggang Zhang, Yi-Zhe Song

The superiority of explicitly abstracting sketch representation is empirically validated on a number of sketch analysis tasks, including sketch recognition, fine-grained sketch-based image retrieval, and generative sketch healing.

Retrieval Sketch-Based Image Retrieval +1

Context-Aware Layout to Image Generation with Enhanced Object Appearance

1 code implementation CVPR 2021 Sen He, Wentong Liao, Michael Ying Yang, Yongxin Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang

We argue that these are caused by the lack of context-aware object and stuff feature encoding in their generators, and location-sensitive appearance representation in their discriminators.

Layout-to-Image Generation Object

More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

1 code implementation CVPR 2021 Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, Yi-Zhe Song

A fundamental challenge faced by existing Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) models is the data scarcity -- model performances are largely bottlenecked by the lack of sketch-photo pairs.

Cross-Modal Retrieval Retrieval +2

Cloud2Curve: Generation and Vectorization of Parametric Sketches

1 code implementation CVPR 2021 Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

Analysis of human sketches in deep learning has advanced immensely through the use of waypoint-sequences rather than raster-graphic representations.

StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

no code implementations CVPR 2021 Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song

With this meta-learning framework, our model can not only disentangle the cross-modal shared semantic content for SBIR, but can adapt the disentanglement to any unseen user style as well, making the SBIR model truly style-agnostic.

Disentanglement Meta-Learning +2

MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition

1 code implementation CVPR 2021 Ayan Kumar Bhunia, Shuvozit Ghose, Amandeep Kumar, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song

In this paper, we take a completely different perspective -- we work on the assumption that there is always a new style that is drastically different, and that we will only have very limited data during testing to perform adaptation.

Handwritten Text Recognition HTR +1

PQA: Perceptual Question Answering

1 code implementation CVPR 2021 Yonggang Qi, Kai Zhang, Aneeshan Sain, Yi-Zhe Song

Perceptual organization remains one of the very few established theories on the human visual system.

Question Answering

Towards Unsupervised Sketch-based Image Retrieval

no code implementations 18 May 2021 Conghui Hu, Yongxin Yang, Yunpeng Li, Timothy M. Hospedales, Yi-Zhe Song

The practical value of existing supervised sketch-based image retrieval (SBIR) algorithms is largely limited by the requirement for intensive data collection and labeling.

Representation Learning Retrieval +1

Text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation

no code implementations ICCV 2021 Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song

In this paper, for the first time, we argue for their unification -- we aim for a single model that can compete favourably with two separate state-of-the-art STR and HTR models.

Handwriting Recognition HTR +2

Towards the Unseen: Iterative Text Recognition by Distilling from Errors

no code implementations ICCV 2021 Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song

Our framework is iterative in nature, in that it utilises predicted knowledge of character sequences from a previous iteration, to augment the main network in improving the next prediction.

Disentangled Lifespan Face Synthesis

no code implementations ICCV 2021 Sen He, Wentong Liao, Michael Ying Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang

The generated face image given a target age code is expected to be age-sensitive, as reflected by bio-plausible transformations of shape and texture, while being identity-preserving.

Face Generation

SketchLattice: Latticed Representation for Sketch Manipulation

no code implementations ICCV 2021 Yonggang Qi, Guoyao Su, Pinaki Nath Chowdhury, Mingkang Li, Yi-Zhe Song

The key challenge in designing a sketch representation lies with handling the abstract and iconic nature of sketches.

SketchODE: Learning neural sketch representation in continuous time

no code implementations ICLR 2022 Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

Learning meaningful representations for chirographic drawing data such as sketches, handwriting, and flowcharts is a gateway for understanding and emulating human creative expression.

Data Augmentation

Temporal Action Localization with Global Segmentation Mask Transformers

no code implementations 29 Sep 2021 Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

In this paper, to address the above two challenges, a novel Global Segmentation Mask Transformer (GSMT) is proposed.

Object object-detection +2

Fine-Grained Image Analysis with Deep Learning: A Survey

no code implementations 11 Nov 2021 Xiu-Shen Wei, Yi-Zhe Song, Oisin Mac Aodha, Jianxin Wu, Yuxin Peng, Jinhui Tang, Jian Yang, Serge Belongie

Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer vision and pattern recognition, and underpins a diverse set of real-world applications.

Fine-Grained Image Recognition Image Retrieval +1

Clue Me In: Semi-Supervised FGVC with Out-of-Distribution Data

1 code implementation 6 Dec 2021 Ruoyi Du, Dongliang Chang, Zhanyu Ma, Yi-Zhe Song, Jun Guo

Despite great strides made on fine-grained visual classification (FGVC), current methods are still heavily reliant on fully-supervised paradigms where ample expert labels are called for.

Fine-Grained Image Classification

Hybrid Graph Neural Networks for Few-Shot Learning

no code implementations 13 Dec 2021 Tianyuan Yu, Sen He, Yi-Zhe Song, Tao Xiang

This is because they use an instance GNN as a label propagation/classification module, which is jointly meta-learned with a feature embedding network.

Few-Shot Learning

One Sketch for All: One-Shot Personalized Sketch Segmentation

no code implementations 20 Dec 2021 Anran Qi, Yulia Gryaditskaya, Tao Xiang, Yi-Zhe Song

We aim to segment all sketches belonging to the same category, given a single exemplar sketch with a part annotation, while (i) preserving the part semantics embedded in the exemplar, and (ii) being robust to input style and abstraction.

Segmentation

Finding Badly Drawn Bunnies

1 code implementation CVPR 2022 Lan Yang, Kaiyue Pang, Honggang Zhang, Yi-Zhe Song

Our key discovery lies in exploiting the magnitude (L2 norm) of a sketch feature as a quantitative quality metric.

Dynamic Instance Domain Adaptation

1 code implementation 9 Mar 2022 Zhongying Deng, Kaiyang Zhou, Da Li, Junjun He, Yi-Zhe Song, Tao Xiang

In this paper, we address both single-source and multi-source UDA from a completely different perspective, which is to view each instance as a fine domain.

Unsupervised Domain Adaptation

Doodle It Yourself: Class Incremental Learning by Drawing a Few Sketches

no code implementations CVPR 2022 Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Subhadeep Koley, Rohit Kundu, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

In this paper, we push the boundary further for FSCIL by addressing two key questions that bottleneck its ubiquitous application: (i) can the model learn from diverse modalities other than just photos (as humans do), and (ii) what if photos are not readily accessible (due to ethical and privacy constraints)?

Few-Shot Class-Incremental Learning Graph Attention +2

Sketch3T: Test-Time Training for Zero-Shot SBIR

no code implementations CVPR 2022 Aneeshan Sain, Ayan Kumar Bhunia, Vaishnav Potlapalli, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

In this paper, we argue that this setup is, by definition, not compatible with the inherently abstract and subjective nature of sketches, i.e., the model might transfer well to new categories, but as a result will not understand sketches drawn from a different test-time distribution.

Meta-Learning Retrieval +1

Style-Based Global Appearance Flow for Virtual Try-On

3 code implementations CVPR 2022 Sen He, Yi-Zhe Song, Tao Xiang

To achieve this, a key step is garment warping which spatially aligns the target garment with the corresponding body parts in the person image.

Virtual Try-on

UIGR: Unified Interactive Garment Retrieval

1 code implementation 6 Apr 2022 Xiao Han, Sen He, Li Zhang, Yi-Zhe Song, Tao Xiang

In this paper, we propose a Unified Interactive Garment Retrieval (UIGR) framework to unify TGR and VCR.

Retrieval

Adaptive Fine-Grained Sketch-Based Image Retrieval

1 code implementation 4 Jul 2022 Ayan Kumar Bhunia, Aneeshan Sain, Parth Shah, Animesh Gupta, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

To solve this new problem, we introduce a novel model-agnostic meta-learning (MAML) based framework with several key modifications: (1) As a retrieval task with a margin-based contrastive loss, we simplify the MAML training in the inner loop to make it more stable and tractable.

Meta-Learning Retrieval +1

Semi-Supervised Temporal Action Detection with Proposal-Free Masking

1 code implementation 14 Jul 2022 Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

Such a novel design effectively eliminates the dependence between localization and classification by cutting off the route for error propagation in-between.

Action Detection General Classification +1

FashionViL: Fashion-Focused Vision-and-Language Representation Learning

1 code implementation 17 Jul 2022 Xiao Han, Licheng Yu, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang

We thus propose a Multi-View Contrastive Learning task for pulling closer the visual representation of one image to the compositional multimodal representation of another image+text.

Contrastive Learning Image Retrieval +2

Zero-Shot Temporal Action Detection via Vision-Language Prompting

1 code implementation 17 Jul 2022 Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

Such a novel design effectively eliminates the dependence between localization and classification by breaking the route for error propagation in-between.

Action Detection Classification +3

SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling

1 code implementation 14 Aug 2022 Chenjian Gao, Qian Yu, Lu Sheng, Yi-Zhe Song, Dong Xu

Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape.

3D Reconstruction

DifferSketching: How Differently Do People Sketch 3D Objects?

1 code implementation 19 Sep 2022 Chufeng Xiao, Wanchao Su, Jing Liao, Zhouhui Lian, Yi-Zhe Song, Hongbo Fu

We invited 70 novice users and 38 expert users to sketch 136 3D objects, which were presented as 362 images rendered from multiple views.

3D Reconstruction

Structure-Aware 3D VR Sketch to 3D Shape Retrieval

1 code implementation 19 Sep 2022 Ling Luo, Yulia Gryaditskaya, Tao Xiang, Yi-Zhe Song

In particular, we propose to use a triplet loss with an adaptive margin value driven by a "fitting gap", which is the similarity of two shapes under structure-preserving deformations.

3D Shape Retrieval Retrieval

Towards 3D VR-Sketch to 3D Shape Retrieval

1 code implementation 20 Sep 2022 Ling Luo, Yulia Gryaditskaya, Yongxin Yang, Tao Xiang, Yi-Zhe Song

In this paper, we offer a different perspective towards answering these questions -- we study the use of 3D sketches as an input modality and advocate a VR-scenario where retrieval is conducted.

3D Shape Retrieval Retrieval

Fine-Grained VR Sketching: Dataset and Insights

1 code implementation 20 Sep 2022 Ling Luo, Yulia Gryaditskaya, Yongxin Yang, Tao Xiang, Yi-Zhe Song

We then, for the first time, study the scenario of fine-grained 3D VR sketch to 3D shape retrieval, as a novel VR sketching application and a proving ground to drive out generic insights to inform future research.

3D Shape Reconstruction 3D Shape Retrieval +1

Robust Target Training for Multi-Source Domain Adaptation

1 code implementation 4 Oct 2022 Zhongying Deng, Da Li, Yi-Zhe Song, Tao Xiang

Given any existing fully-trained one-step MSDA model, BORT$^2$ turns it into a labeling function to generate pseudo-labels for the target data and trains a target model using pseudo-labeled target data only.

Domain Adaptation

Learning to Augment via Implicit Differentiation for Domain Generalization

no code implementations 25 Oct 2022 Tingwei Wang, Da Li, Kaiyang Zhou, Tao Xiang, Yi-Zhe Song

Machine learning models are intrinsically vulnerable to domain shift between training and testing data, resulting in poor performance in novel domains.

Data Augmentation Domain Generalization +1

Single Stage Multi-Pose Virtual Try-On

no code implementations 19 Nov 2022 Sen He, Yi-Zhe Song, Tao Xiang

Key to our model is a parallel flow estimation module that predicts the flow fields for both person and garment images conditioned on the target pose.

Pose Transfer Virtual Try-on

Post-Processing Temporal Action Detection

1 code implementation CVPR 2023 Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

To address this problem, in this work we introduce a novel model-agnostic post-processing method without model redesign and retraining.

Action Classification Action Detection +1

Multi-Modal Few-Shot Temporal Action Detection

1 code implementation 27 Nov 2022 Sauradip Nag, Mengmeng Xu, Xiatian Zhu, Juan-Manuel Perez-Rua, Bernard Ghanem, Yi-Zhe Song, Tao Xiang

In this work, we introduce a new multi-modality few-shot (MMFS) TAD problem, which can be considered as a marriage of FS-TAD and ZS-TAD by leveraging few-shot support videos and new class names jointly.

Action Detection Few-Shot Object Detection +3

Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification

1 code implementation 30 Nov 2022 Jijie Wu, Dongliang Chang, Aneeshan Sain, Xiaoxu Li, Zhanyu Ma, Jie Cao, Jun Guo, Yi-Zhe Song

Conventional few-shot learning methods however cannot be naively adopted for this fine-grained setting -- a quick pilot study reveals that they in fact push for the opposite (i.e., lower inter-class variations and higher intra-class variations).

Few-Shot Image Classification Few-Shot Learning +2

An Erudite Fine-Grained Visual Classification Model

no code implementations CVPR 2023 Dongliang Chang, Yujun Tong, Ruoyi Du, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma

Therefore, we first propose a feature disentanglement module and a feature re-fusion module to reduce negative transfer and boost positive transfer between different datasets.

Classification Disentanglement +2

Controllable Person Image Synthesis with Pose-Constrained Latent Diffusion

no code implementations ICCV 2023 Xiao Han, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang

Controllable person image synthesis aims at rendering a source image based on user-specified changes in body pose or appearance.

Denoising Image Generation

Photo Pre-Training, but for Sketch

1 code implementation CVPR 2023 Ke Li, Kaiyue Pang, Yi-Zhe Song

This lack of sketch data has imposed on the community a few "peculiar" design choices -- the most representative of them all is perhaps the coerced utilisation of photo-based pre-training (i.e., no sketch), for many core tasks that otherwise dictate specific sketch understanding.

Sketch-Based Image Retrieval

Democratising 2D Sketch to 3D Shape Retrieval Through Pivoting

no code implementations ICCV 2023 Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

We perform pivoting on two existing datasets, each from a distant research domain to the other: 2D sketch and photo pairs from the sketch-based image retrieval field (SBIR), and 3D shapes from ShapeNet.

3D Shape Retrieval Retrieval +1

Unsupervised Hashing with Similarity Distribution Calibration

1 code implementation 15 Feb 2023 Kam Woh Ng, Xiatian Zhu, Jiun Tian Hoe, Chee Seng Chan, Tianyu Zhang, Yi-Zhe Song, Tao Xiang

However, these methods often overlook the fact that the similarity between data points in the continuous feature space may not be preserved in the discrete hash code space, due to the limited similarity range of hash codes.

Deep Hashing Image Retrieval

FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks

1 code implementation CVPR 2023 Xiao Han, Xiatian Zhu, Licheng Yu, Li Zhang, Yi-Zhe Song, Tao Xiang

In the fashion domain, there exists a variety of vision-and-language (V+L) tasks, including cross-modal retrieval, text-guided image retrieval, multi-modal classification, and image captioning.

Cross-Modal Retrieval Image Captioning +4

Generative Model Based Noise Robust Training for Unsupervised Domain Adaptation

no code implementations 10 Mar 2023 Zhongying Deng, Da Li, Junjun He, Yi-Zhe Song, Tao Xiang

D-CFA minimizes the domain gap by augmenting the source data with distribution-sampled target features, and trains a noise-robust discriminative classifier by using target domain knowledge from the generative models.

Unsupervised Domain Adaptation

Data-Free Sketch-Based Image Retrieval

1 code implementation CVPR 2023 Abhra Chaudhuri, Ayan Kumar Bhunia, Yi-Zhe Song, Anjan Dutta

For the first time, we identify that for data-scarce tasks like Sketch-Based Image Retrieval (SBIR), where the difficulty in acquiring paired photos and hand-drawn sketches limits data-dependent cross-modal learning algorithms, DFL can prove to be a much more practical paradigm.

Retrieval Sketch-Based Image Retrieval

Picture that Sketch: Photorealistic Image Generation from Abstract Sketches

no code implementations CVPR 2023 Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

We further introduce specific designs to tackle the abstract nature of human sketches, including a fine-grained discriminative loss on the back of a trained sketch-photo retrieval model, and a partial-aware sketch augmentation strategy.

Image Generation Retrieval +1

Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR

no code implementations CVPR 2023 Aneeshan Sain, Ayan Kumar Bhunia, Subhadeep Koley, Pinaki Nath Chowdhury, Soumitri Chattopadhyay, Tao Xiang, Yi-Zhe Song

This paper advances the fine-grained sketch-based image retrieval (FG-SBIR) literature by putting forward a strong baseline that overshoots the prior state of the art by ~11%.

Knowledge Distillation Sketch-Based Image Retrieval

Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style

1 code implementation CVPR 2023 Fengyin Lin, Mingkang Li, Da Li, Timothy Hospedales, Yi-Zhe Song, Yonggang Qi

This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR), however with two significant differentiators to prior art: (i) we tackle all variants (inter-category, intra-category, and cross datasets) of ZS-SBIR with just one network ("everything"), and (ii) we would really like to understand how this sketch-photo matching operates ("explainable").

Relation Network Retrieval +1

What Can Human Sketches Do for Object Detection?

no code implementations CVPR 2023 Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

In particular, we first perform independent prompting on both sketch and photo branches of an SBIR model to build highly generalisable sketch and photo encoders on the back of the generalisation ability of CLIP.

Object object-detection +3

DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion

1 code implementation ICCV 2023 Sauradip Nag, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang

Concretely, we establish the denoising process in the Transformer decoder (e.g., DETR) by introducing a temporal location query design with faster convergence in training.

Action Detection Denoising

ChiroDiff: Modelling chirographic data with Diffusion Models

no code implementations 7 Apr 2023 Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

Such strictly-ordered discrete factorization however falls short of capturing key properties of chirographic data -- it fails to build holistic understanding of the temporal concept due to one-way visibility (causality).

Denoising

SketchXAI: A First Look at Explainability for Human Sketches

no code implementations CVPR 2023 Zhiyu Qu, Yulia Gryaditskaya, Ke Li, Kaiyue Pang, Tao Xiang, Yi-Zhe Song

Following this, we design a simple explainability-friendly sketch encoder that accommodates the intrinsic properties of strokes: shape, location, and order.

Explainable artificial intelligence Explainable Artificial Intelligence (XAI) +1

SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation

1 code implementation 27 Aug 2023 Zhiyu Qu, Tao Xiang, Yi-Zhe Song

Through this work, we hope to transform the way we create visual content, democratise the creative process, and inspire further research in enhancing human creativity in AIGC.

Sketch-based Video Object Segmentation: Benchmark and Analysis

no code implementations 13 Nov 2023 Ruolin Yang, Da Li, Conghui Hu, Timothy Hospedales, Honggang Zhang, Yi-Zhe Song

Reference-based video object segmentation is an emerging topic which aims to segment the corresponding target object in each video frame referred by a given reference, such as a language expression or a photo mask.

Object Segmentation +3

DemoFusion: Democratising High-Resolution Image Generation With No $$$

1 code implementation 24 Nov 2023 Ruoyi Du, Dongliang Chang, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma

High-resolution image generation with Generative Artificial Intelligence (GenAI) has immense potential but, due to the enormous capital investment required for training, it is increasingly centralised to a few large corporations, and hidden behind paywalls.

Image Generation

Wired Perspectives: Multi-View Wire Art Embraces Generative AI

no code implementations 26 Nov 2023 Zhiyu Qu, Lan Yang, Honggang Zhang, Tao Xiang, Kaiyue Pang, Yi-Zhe Song

Creating multi-view wire art (MVWA), a static 3D sculpture with diverse interpretations from different viewpoints, is a complex task even for skilled artists.

Knowledge Distillation

DreamCreature: Crafting Photorealistic Virtual Creatures from Imagination

1 code implementation 27 Nov 2023 Kam Woh Ng, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

To bridge this gap, we introduce a novel task, Virtual Creatures Generation: Given a set of unlabeled images of the target concepts (e.g., 200 bird species), we aim to train a T2I model capable of creating new, hybrid concepts within diverse backgrounds and contexts.

Disentanglement Novel Concepts

Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes

no code implementations 7 Dec 2023 Hmrishav Bandyopadhyay, Subhadeep Koley, Ayan Das, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Ayan Kumar Bhunia, Yi-Zhe Song

In this paper, we democratise 3D content creation, enabling precise generation of 3D shapes from abstract sketches while overcoming limitations tied to drawing skills.

Position

DemoCaricature: Democratising Caricature Generation with a Rough Sketch

no code implementations 7 Dec 2023 Dar-Yen Chen, Ayan Kumar Bhunia, Subhadeep Koley, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song

In this paper, we democratise caricature generation, empowering individuals to effortlessly craft personalised caricatures with just a photo and a conceptual sketch.

Caricature Model Editing

Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers

no code implementations 12 Mar 2024 Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

This paper, for the first time, explores text-to-image diffusion models for Zero-Shot Sketch-based Image Retrieval (ZS-SBIR).

Retrieval Sketch-Based Image Retrieval

It's All About Your Sketch: Democratising Sketch Control in Diffusion Models

1 code implementation 12 Mar 2024 Subhadeep Koley, Ayan Kumar Bhunia, Deeptanshu Sekhri, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

This paper unravels the potential of sketches for diffusion models, addressing the deceptive promise of direct sketch control in generative AI.

Retrieval Sketch-Based Image Retrieval

What Sketch Explainability Really Means for Downstream Tasks

no code implementations 14 Mar 2024 Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

In this paper, we explore the unique modality of sketch for explainability, emphasising the profound impact of human strokes compared to conventional pixel-oriented studies.

Retrieval

SketchINR: A First Look into Sketches as Implicit Neural Representations

no code implementations 14 Mar 2024 Hmrishav Bandyopadhyay, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Tao Xiang, Timothy Hospedales, Yi-Zhe Song

(ii) SketchINR's auto-decoder provides a much higher-fidelity representation than other learned vector sketch representations, and is uniquely able to scale to complex vector sketches such as FS-COCO.

Data Compression

A Tree-Structured Decoder for Image-to-Markup Generation

1 code implementation ICML 2020 Jianshu Zhang, Jun Du, Yongxin Yang, Yi-Zhe Song, Si Wei, Li-Rong Dai

Recent encoder-decoder approaches typically employ string decoders to convert images into serialized strings for image-to-markup.

Math
