Search Results for author: Yi-Zhe Song

Found 121 papers, 62 papers with code

A Survey on Heterogeneous Face Recognition: Sketch, Infra-red, 3D and Low-resolution

no code implementations • 17 Sep 2014 • Shuxin Ouyang, Timothy Hospedales, Yi-Zhe Song, Xueming Li

Heterogeneous face recognition (HFR) refers to matching face imagery across different domains.

Face Recognition Heterogeneous Face Recognition

Sketch-a-Net that Beats Humans

2 code implementations • 30 Jan 2015 • Qian Yu, Yongxin Yang, Yi-Zhe Song, Tao Xiang, Timothy Hospedales

We propose a multi-scale multi-channel deep neural network framework that, for the first time, yields sketch recognition performance surpassing that of humans.

Sketch Recognition

Making Better Use of Edges via Perceptual Grouping

no code implementations • CVPR 2015 • Yonggang Qi, Yi-Zhe Song, Tao Xiang, Honggang Zhang, Timothy Hospedales, Yi Li, Jun Guo

We propose a perceptual grouping framework that organizes image edges into meaningful structures and demonstrate its usefulness on various computer vision tasks.

Learning-To-Rank Retrieval +1

Free-hand Sketch Synthesis with Deformable Stroke Models

no code implementations • 9 Oct 2015 • Yi Li, Yi-Zhe Song, Timothy Hospedales, Shaogang Gong

We present a generative model which can automatically summarize the stroke composition of free-hand sketches of a given category.

Sketch Me That Shoe

no code implementations • CVPR 2016 • Qian Yu, Feng Liu, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Chen-Change Loy

We investigate the problem of fine-grained sketch-based image retrieval (SBIR), where free-hand human sketches are used as queries to perform instance-level retrieval of images.

Ranked #3 on Sketch-Based Image Retrieval on Chairs

Data Augmentation Retrieval +1

ForgetMeNot: Memory-Aware Forensic Facial Sketch Matching

no code implementations • CVPR 2016 • Shuxin Ouyang, Timothy M. Hospedales, Yi-Zhe Song, Xueming Li

Based on this database we build a model to reverse the forgetting process.

Sketch Recognition

Cross-modal Subspace Learning for Fine-grained Sketch-based Image Retrieval

no code implementations • 28 May 2017 • Peng Xu, Qiyue Yin, Yongye Huang, Yi-Zhe Song, Zhanyu Ma, Liang Wang, Tao Xiang, W. Bastiaan Kleijn, Jun Guo

Sketch-based image retrieval (SBIR) is challenging due to the inherent domain-gap between sketch and photo.

Ranked #5 on Sketch-Based Image Retrieval on Chairs

Image-text matching Retrieval +2

Deep Spatial-Semantic Attention for Fine-Grained Sketch-Based Image Retrieval

no code implementations • ICCV 2017 • Jifei Song, Qian Yu, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales

Human sketches are unique in being able to capture both the spatial topology of a visual object, as well as its subtle appearance details.

Ranked #2 on Sketch-Based Image Retrieval on Handbags

Feature Correlation Representation Learning +2

Deeper, Broader and Artier Domain Generalization

6 code implementations • ICCV 2017 • Da Li, Yongxin Yang, Yi-Zhe Song, Timothy M. Hospedales

In this paper, we make two main contributions: Firstly, we build upon the favorable domain shift-robust properties of deep learning methods, and develop a low-rank parameterized CNN model for end-to-end DG learning.

Ranked #117 on Domain Generalization on PACS

Domain Generalization

1,323

Learning to Generalize: Meta-Learning for Domain Generalization

5 code implementations • 10 Oct 2017 • Da Li, Yongxin Yang, Yi-Zhe Song, Timothy M. Hospedales

We propose a novel meta-learning method for domain generalization.

Ranked #114 on Domain Generalization on PACS

Domain Generalization Image Classification +1

3,120

The Devil is in the Middle: Exploiting Mid-level Representations for Cross-Domain Instance Matching

no code implementations • 22 Nov 2017 • Qian Yu, Xiaobin Chang, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales

Many vision problems require matching images of object instances across different domains.

Person Re-Identification Retrieval +1

SketchMate: Deep Hashing for Million-Scale Human Sketch Retrieval

1 code implementation • CVPR 2018 • Peng Xu, Yongye Huang, Tongtong Yuan, Kaiyue Pang, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Zhanyu Ma, Jun Guo

Key to our network design is the embedding of unique characteristics of human sketch, where (i) a two-branch CNN-RNN architecture is adapted to explore the temporal ordering of strokes, and (ii) a novel hashing loss is specifically designed to accommodate both the temporal and abstract traits of sketches.

Deep Hashing Sketch Recognition

Learning Deep Sketch Abstraction

no code implementations • CVPR 2018 • Umar Riaz Muhammad, Yongxin Yang, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales

Human free-hand sketches have been studied in various contexts including sketch recognition, synthesis and fine-grained sketch-based image retrieval (FG-SBIR).

Retrieval Sketch-Based Image Retrieval +1

Sketch-a-Classifier: Sketch-based Photo Classifier Generation

no code implementations • CVPR 2018 • Conghui Hu, Da Li, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales

Contemporary deep learning techniques have made image recognition a reasonably reliable technology.

Transfer Learning Zero-Shot Learning

Learning to Sketch with Shortcut Cycle Consistency

no code implementations • CVPR 2018 • Jifei Song, Kaiyue Pang, Yi-Zhe Song, Tao Xiang, Timothy Hospedales

In this paper, we present a novel approach for translating an object photo to a sketch, mimicking the human sketching process.

Multi-Task Learning Retrieval +1

SketchyScene: Richly-Annotated Scene Sketches

2 code implementations • ECCV 2018 • Changqing Zou, Qian Yu, Ruofei Du, Haoran Mo, Yi-Zhe Song, Tao Xiang, Chengying Gao, Baoquan Chen, Hao Zhang

We contribute the first large-scale dataset of scene sketches, SketchyScene, with the goal of advancing research on sketch understanding at both the object and scene level.

Colorization Image Retrieval +2

101

Universal Perceptual Grouping

1 code implementation • 7 Aug 2018 • Ke Li, Kaiyue Pang, Jifei Song, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Honggang Zhang

In this work we aim to develop a universal sketch grouper.

Object Retrieval +1

Deep Factorised Inverse-Sketching

no code implementations • ECCV 2018 • Kaiyue Pang, Da Li, Jifei Song, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales

Instead there is a fundamental process of abstraction and iconic rendering, where overall geometry is warped and salient details are selectively included.

Retrieval Sketch-Based Image Retrieval +1

Universal Sketch Perceptual Grouping

no code implementations • ECCV 2018 • Ke Li, Kaiyue Pang, Jifei Song, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Honggang Zhang

In this work we aim to develop a universal sketch grouper.

Object Retrieval +1

Episodic Training for Domain Generalization

2 code implementations • ICCV 2019 • Da Li, Jianshu Zhang, Yongxin Yang, Cong Liu, Yi-Zhe Song, Timothy M. Hospedales

In this paper, we build on this strong baseline by designing an episodic training procedure that trains a single deep network in a way that exposes it to the domain shift that characterises a novel domain at runtime.

Ranked #76 on Domain Generalization on PACS

Domain Generalization

Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval

1 code implementation • CVPR 2019 • Sounak Dey, Pau Riba, Anjan Dutta, Josep Llados, Yi-Zhe Song

Highly abstract amateur human sketches are purposefully sourced to maximize the domain gap, instead of ones included in existing datasets that can often be semi-photorealistic.

Retrieval Sketch-Based Image Retrieval

Goal-Driven Sequential Data Abstraction

no code implementations • ICCV 2019 • Umar Riaz Muhammad, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song

In the former one asks whether a machine can 'understand' enough about the meaning of input data to produce a meaningful but more compact abstraction.

Benchmarking General Reinforcement Learning +2

Semi-Heterogeneous Three-Way Joint Embedding Network for Sketch-Based Image Retrieval

no code implementations • 10 Nov 2019 • Jianjun Lei, Yuxin Song, Bo Peng, Zhanyu Ma, Ling Shao, Yi-Zhe Song

How to align abstract sketches and natural images into a common high-level semantic space remains a key problem in SBIR.

Retrieval Sketch-Based Image Retrieval

Deep Learning for Free-Hand Sketch: A Survey

2 code implementations • 8 Jan 2020 • Peng Xu, Timothy M. Hospedales, Qiyue Yin, Yi-Zhe Song, Tao Xiang, Liang Wang

Free-hand sketches are highly illustrative, and have been widely used by humans to depict objects or stories from ancient times to the present.

5,932

SketchDesc: Learning Local Sketch Descriptors for Multi-view Correspondence

no code implementations • 16 Jan 2020 • Deng Yu, Lei LI, Youyi Zheng, Manfred Lau, Yi-Zhe Song, Chiew-Lan Tai, Hongbo Fu

In this paper, we study the problem of multi-view sketch correspondence, where we take as input multiple freehand sketches with different views of the same object and predict as output the semantic correspondence among the sketches.

Semantic correspondence

Deep Self-Supervised Representation Learning for Free-Hand Sketch

1 code implementation • 3 Feb 2020 • Peng Xu, Zeyu Song, Qiyue Yin, Yi-Zhe Song, Liang Wang

In this paper, we tackle for the first time, the problem of self-supervised representation learning for free-hand sketches.

Representation Learning Retrieval +1

The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification

3 code implementations • 11 Feb 2020 • Dongliang Chang, Yifeng Ding, Jiyang Xie, Ayan Kumar Bhunia, Xiaoxu Li, Zhanyu Ma, Ming Wu, Jun Guo, Yi-Zhe Song

The proposed loss function, termed as mutual-channel loss (MC-Loss), consists of two channel-specific components: a discriminality component and a diversity component.

Ranked #29 on Fine-Grained Image Classification on FGVC Aircraft

Fine-Grained Image Classification General Classification +1

299

Fine-Grained Instance-Level Sketch-Based Video Retrieval

no code implementations • 21 Feb 2020 • Peng Xu, Kun Liu, Tao Xiang, Timothy M. Hospedales, Zhanyu Ma, Jun Guo, Yi-Zhe Song

Existing sketch-analysis work studies sketches depicting static objects or scenes.

Cross-Modal Retrieval Image Retrieval +2

Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval

1 code implementation • 24 Feb 2020 • Ayan Kumar Bhunia, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song

Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo instance given a user's query sketch.

Cross-Modal Retrieval On-the-Fly Sketch Based Image Retrieval +1

Mind the Gap: Enlarging the Domain Gap in Open Set Domain Adaptation

2 code implementations • 8 Mar 2020 • Dongliang Chang, Aneeshan Sain, Zhanyu Ma, Yi-Zhe Song, Jun Guo

The key insight lies with how we exploit the mutually beneficial information between two networks; (a) to separate samples of known and unknown classes, (b) to maximize the domain confusion between source and target domain without the influence of unknown samples.

Unsupervised Domain Adaptation

Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches

5 code implementations • ECCV 2020 • Ruoyi Du, Dongliang Chang, Ayan Kumar Bhunia, Jiyang Xie, Zhanyu Ma, Yi-Zhe Song, Jun Guo

In this work, we propose a novel framework for fine-grained visual classification to tackle these problems.

Ranked #17 on Fine-Grained Image Classification on Stanford Cars

Classification Fine-Grained Image Classification +1

211

Sequential Learning for Domain Generalization

no code implementations • 3 Apr 2020 • Da Li, Yongxin Yang, Yi-Zhe Song, Timothy Hospedales

In DG this means encountering a sequence of domains and at each step training to maximise performance on the next domain.

Ranked #76 on Domain Generalization on PACS

Domain Generalization Meta-Learning

BézierSketch: A generative model for scalable vector sketches

1 code implementation • ECCV 2020 • Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

The study of neural generative models of human sketches is a fascinating contemporary modeling problem due to the links between sketch image generation and the human drawing process.

Image Generation

On Learning Semantic Representations for Million-Scale Free-Hand Sketches

1 code implementation • 7 Jul 2020 • Peng Xu, Yongye Huang, Tongtong Yuan, Tao Xiang, Timothy M. Hospedales, Yi-Zhe Song, Liang Wang

Specifically, we use our dual-branch architecture as a universal representation framework to design two sketch-specific deep models: (i) We propose a deep hashing model for sketch retrieval, where a novel hashing loss is specifically designed to accommodate both the abstract and messy traits of sketches.

Deep Hashing Learning Semantic Representations +1

Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image Retrieval

1 code implementation • 29 Jul 2020 • Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song

In this paper, we study a further trait of sketches that has been overlooked to date, that is, they are hierarchical in terms of the levels of detail -- a person typically sketches up to various extents of detail to depict an object.

Retrieval Sketch-Based Image Retrieval

Deep Sketch-Based Modeling: Tips and Tricks

1 code implementation • 12 Nov 2020 • Yue Zhong, Yulia Gryaditskaya, Honggang Zhang, Yi-Zhe Song

Deep image-based modeling received lots of attention in recent years, yet the parallel problem of sketch-based modeling has only been briefly studied, often as a potential application.

Your "Flamingo" is My "Bird": Fine-Grained, or Not

1 code implementation • CVPR 2021 • Dongliang Chang, Kaiyue Pang, Yixiao Zheng, Zhanyu Ma, Yi-Zhe Song, Jun Guo

For that, we re-envisage the traditional setting of FGVC, from single-label classification, to that of top-down traversal of a pre-defined coarse-to-fine label hierarchy -- so that our answer becomes "bird"-->"Phoenicopteriformes"-->"Phoenicopteridae"-->"flamingo".

Ranked #16 on Fine-Grained Image Classification on FGVC Aircraft

Disentanglement Fine-Grained Image Classification +1

SketchAA: Abstract Representation for Abstract Sketches

no code implementations • ICCV 2021 • Lan Yang, Kaiyue Pang, Honggang Zhang, Yi-Zhe Song

The superiority of explicitly abstracting sketch representation is empirically validated on a number of sketch analysis tasks, including sketch recognition, fine-grained sketch-based image retrieval, and generative sketch healing.

Retrieval Sketch-Based Image Retrieval +1

Context-Aware Layout to Image Generation with Enhanced Object Appearance

1 code implementation • CVPR 2021 • Sen He, Wentong Liao, Michael Ying Yang, Yongxin Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang

We argue that these are caused by the lack of context-aware object and stuff feature encoding in their generators, and location-sensitive appearance representation in their discriminators.

Ranked #1 on Layout-to-Image Generation on COCO-Stuff 128x128

Layout-to-Image Generation Object

More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

1 code implementation • CVPR 2021 • Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, Yi-Zhe Song

A fundamental challenge faced by existing Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) models is the data scarcity -- model performances are largely bottlenecked by the lack of sketch-photo pairs.

Cross-Modal Retrieval Retrieval +2

Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting

1 code implementation • CVPR 2021 • Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song

This data is uniquely characterised by its existence in dual modalities of rasterized images and vector coordinate sequences.

Self-Supervised Learning Translation

Cloud2Curve: Generation and Vectorization of Parametric Sketches

1 code implementation • CVPR 2021 • Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

Analysis of human sketches in deep learning has advanced immensely through the use of waypoint-sequences rather than raster-graphic representations.

StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

no code implementations • CVPR 2021 • Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song

With this meta-learning framework, our model can not only disentangle the cross-modal shared semantic content for SBIR, but can adapt the disentanglement to any unseen user style as well, making the SBIR model truly style-agnostic.

Disentanglement Meta-Learning +2

MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition

1 code implementation • CVPR 2021 • Ayan Kumar Bhunia, Shuvozit Ghose, Amandeep Kumar, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song

In this paper, we take a completely different perspective -- we work on the assumption that there is always a new style that is drastically different, and that we will only have very limited data during testing to perform adaptation.

Handwritten Text Recognition HTR +1

PQA: Perceptual Question Answering

1 code implementation • CVPR 2021 • Yonggang Qi, Kai Zhang, Aneeshan Sain, Yi-Zhe Song

Perceptual organization remains one of the very few established theories on the human visual system.

Question Answering

Towards Unsupervised Sketch-based Image Retrieval

no code implementations • 18 May 2021 • Conghui Hu, Yongxin Yang, Yunpeng Li, Timothy M. Hospedales, Yi-Zhe Song

The practical value of existing supervised sketch-based image retrieval (SBIR) algorithms is largely limited by the requirement for intensive data collection and labeling.

Representation Learning Retrieval +1

Text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation

no code implementations • ICCV 2021 • Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song

In this paper, for the first time, we argue for their unification -- we aim for a single model that can compete favourably with two separate state-of-the-art STR and HTR models.

Handwriting Recognition HTR +2

Towards the Unseen: Iterative Text Recognition by Distilling from Errors

no code implementations • ICCV 2021 • Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song

Our framework is iterative in nature, in that it utilises predicted knowledge of character sequences from a previous iteration, to augment the main network in improving the next prediction.

Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition

no code implementations • ICCV 2021 • Ayan Kumar Bhunia, Aneeshan Sain, Amandeep Kumar, Shuvozit Ghose, Pinaki Nath Chowdhury, Yi-Zhe Song

In this paper, we argue that semantic information offers a complementary role in addition to visual only.

Rolling Shutter Correction

Disentangled Lifespan Face Synthesis

no code implementations • ICCV 2021 • Sen He, Wentong Liao, Michael Ying Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang

The generated face image given a target age code is expected to be age-sensitive reflected by bio-plausible transformations of shape and texture, while being identity preserving.

Face Generation

Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer

1 code implementation • ICCV 2021 • Zhihe Lu, Sen He, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang

A few-shot semantic segmentation model is typically composed of a CNN encoder, a CNN decoder and a simple classifier (separating foreground and background pixels).

Ranked #9 on Few-Shot Semantic Segmentation on COCO-20i -> Pascal VOC (5-shot)

Few-Shot Semantic Segmentation Meta-Learning +1

126

SketchLattice: Latticed Representation for Sketch Manipulation

no code implementations • ICCV 2021 • Yonggang Qi, Guoyao Su, Pinaki Nath Chowdhury, Mingkang Li, Yi-Zhe Song

The key challenge in designing a sketch representation lies with handling the abstract and iconic nature of sketches.

One Loss for All: Deep Hashing with a Single Cosine Similarity based Learning Objective

2 code implementations • NeurIPS 2021 • Jiun Tian Hoe, Kam Woh Ng, Tianyu Zhang, Chee Seng Chan, Yi-Zhe Song, Tao Xiang

In this work, we propose a novel deep hashing model with only a single learning objective.

Deep Hashing Multi-Label Classification +1

SketchODE: Learning neural sketch representation in continuous time

no code implementations • ICLR 2022 • Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

Learning meaningful representations for chirographic drawing data such as sketches, handwriting, and flowcharts is a gateway for understanding and emulating human creative expression.

Data Augmentation

Temporal Action Localization with Global Segmentation Mask Transformers

no code implementations • 29 Sep 2021 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

In this paper, to address the above two challenges, a novel Global Segmentation Mask Transformer (GSMT) is proposed.

Object object-detection +2

Fine-Grained Image Analysis with Deep Learning: A Survey

no code implementations • 11 Nov 2021 • Xiu-Shen Wei, Yi-Zhe Song, Oisin Mac Aodha, Jianxin Wu, Yuxin Peng, Jinhui Tang, Jian Yang, Serge Belongie

Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer vision and pattern recognition, and underpins a diverse set of real-world applications.

Fine-Grained Image Recognition Image Retrieval +1

Clue Me In: Semi-Supervised FGVC with Out-of-Distribution Data

1 code implementation • 6 Dec 2021 • Ruoyi Du, Dongliang Chang, Zhanyu Ma, Yi-Zhe Song, Jun Guo

Despite great strides made on fine-grained visual classification (FGVC), current methods are still heavily reliant on fully-supervised paradigms where ample expert labels are called for.

Fine-Grained Image Classification

Making a Bird AI Expert Work for You and Me

1 code implementation • 6 Dec 2021 • Dongliang Chang, Kaiyue Pang, Ruoyi Du, Zhanyu Ma, Yi-Zhe Song, Jun Guo

Fig. 1 lays out our approach in answering this question.

Fine-Grained Image Classification

Hybrid Graph Neural Networks for Few-Shot Learning

no code implementations • 13 Dec 2021 • Tianyuan Yu, Sen He, Yi-Zhe Song, Tao Xiang

This is because they use an instance GNN as a label propagation/classification module, which is jointly meta-learned with a feature embedding network.

Few-Shot Learning

One Sketch for All: One-Shot Personalized Sketch Segmentation

no code implementations • 20 Dec 2021 • Anran Qi, Yulia Gryaditskaya, Tao Xiang, Yi-Zhe Song

We aim to segment all sketches belonging to the same category provisioned with a single sketch with a given part annotation while (i) preserving the parts semantics embedded in the exemplar, and (ii) being robust to input style and abstraction.

Segmentation

Finding Badly Drawn Bunnies

1 code implementation • CVPR 2022 • Lan Yang, Kaiyue Pang, Honggang Zhang, Yi-Zhe Song

Our key discovery lies in exploiting the magnitude (L2 norm) of a sketch feature as a quantitative quality metric.
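
The idea of scoring a sketch by the magnitude of its feature vector can be illustrated with a minimal sketch. The feature values and names below are hypothetical, and the assumption that a larger norm indicates a better-drawn sketch is made only for illustration; the listing does not say how the features are learned or how the norm is calibrated.

```python
import math


def l2_norm(feature):
    """Return the Euclidean (L2) norm of a feature vector."""
    return math.sqrt(sum(x * x for x in feature))


def rank_by_feature_norm(features):
    """Sort sketch names from largest to smallest feature norm.

    Assumption for illustration: a larger norm is read as higher quality.
    """
    return sorted(features, key=lambda name: l2_norm(features[name]), reverse=True)


# Hypothetical embeddings; a real system would take these from a trained encoder.
features = {
    "sketch_a": [3.0, 4.0],
    "sketch_b": [0.6, 0.8],
    "sketch_c": [1.0, 2.0, 2.0],
}

ranking = rank_by_feature_norm(features)
print(ranking)
```

The ranking step is independent of the encoder, so the same helper could be applied to any set of precomputed feature vectors.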

FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context

1 code implementation • 4 Mar 2022 • Pinaki Nath Chowdhury, Aneeshan Sain, Ayan Kumar Bhunia, Tao Xiang, Yulia Gryaditskaya, Yi-Zhe Song

We advance sketch research to scenes with the first dataset of freehand scene sketches, FS-COCO.

Image Captioning Image Retrieval +2

Dynamic Instance Domain Adaptation

1 code implementation • 9 Mar 2022 • Zhongying Deng, Kaiyang Zhou, Da Li, Junjun He, Yi-Zhe Song, Tao Xiang

In this paper, we address both single-source and multi-source UDA from a completely different perspective, which is to view each instance as a fine domain.

Unsupervised Domain Adaptation

Doodle It Yourself: Class Incremental Learning by Drawing a Few Sketches

no code implementations • CVPR 2022 • Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Subhadeep Koley, Rohit Kundu, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

In this paper, we push the boundary further for FSCIL by addressing two key questions that bottleneck its ubiquitous application (i) can the model learn from diverse modalities other than just photo (as humans do), and (ii) what if photos are not readily accessible (due to ethical and privacy constraints).

Few-Shot Class-Incremental Learning Graph Attention +2

Sketching without Worrying: Noise-Tolerant Sketch-Based Image Retrieval

1 code implementation • CVPR 2022 • Ayan Kumar Bhunia, Subhadeep Koley, Abdullah Faiz Ur Rahman Khilji, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

We first conducted a pilot study that revealed the secret lies in the existence of noisy strokes, but not so much of the "I can't sketch".

Retrieval Sketch-Based Image Retrieval

Partially Does It: Towards Scene-Level FG-SBIR with Partial Input

no code implementations • CVPR 2022 • Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

We scrutinise an important observation plaguing scene-level sketch research -- that a significant portion of scene sketches are "partial".

Retrieval Sketch-Based Image Retrieval

Sketch3T: Test-Time Training for Zero-Shot SBIR

no code implementations • CVPR 2022 • Aneeshan Sain, Ayan Kumar Bhunia, Vaishnav Potlapalli, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

In this paper, we argue that this setup is by definition not compatible with the inherently abstract and subjective nature of sketches, i.e., the model might transfer well to new categories, but will not understand sketches existing in a different test-time distribution as a result.

Meta-Learning Retrieval +1

Style-Based Global Appearance Flow for Virtual Try-On

3 code implementations • CVPR 2022 • Sen He, Yi-Zhe Song, Tao Xiang

To achieve this, a key step is garment warping which spatially aligns the target garment with the corresponding body parts in the person image.

Ranked #1 on Virtual Try-on on VITON

Virtual Try-on

266

UIGR: Unified Interactive Garment Retrieval

1 code implementation • 6 Apr 2022 • Xiao Han, Sen He, Li Zhang, Yi-Zhe Song, Tao Xiang

In this paper, we propose a Unified Interactive Garment Retrieval (UIGR) framework to unify TGR and VCR.

Retrieval

SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text

no code implementations • CVPR 2023 • Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

In this paper, we extend scene understanding to include that of human sketch.

Image Retrieval Retrieval +1

Adaptive Fine-Grained Sketch-Based Image Retrieval

1 code implementation • 4 Jul 2022 • Ayan Kumar Bhunia, Aneeshan Sain, Parth Shah, Animesh Gupta, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

To solve this new problem, we introduce a novel model-agnostic meta-learning (MAML) based framework with several key modifications: (1) As a retrieval task with a margin-based contrastive loss, we simplify the MAML training in the inner loop to make it more stable and tractable.

Meta-Learning Retrieval +1

Semi-Supervised Temporal Action Detection with Proposal-Free Masking

1 code implementation • 14 Jul 2022 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

Such a novel design effectively eliminates the dependence between localization and classification by cutting off the route for error propagation in-between.

Ranked #1 on Semi-Supervised Action Detection on ActivityNet-1.3

Action Detection General Classification +1

Proposal-Free Temporal Action Detection via Global Segmentation Mask Learning

2 code implementations • 14 Jul 2022 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

Existing temporal action detection (TAD) methods rely on generating an overwhelmingly large number of proposals per video.

Ranked #15 on Temporal Action Localization on ActivityNet-1.3

Action Detection Representation Learning +1

FashionViL: Fashion-Focused Vision-and-Language Representation Learning

1 code implementation • 17 Jul 2022 • Xiao Han, Licheng Yu, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang

We thus propose a Multi-View Contrastive Learning task for pulling closer the visual representation of one image to the compositional multimodal representation of another image+text.

Contrastive Learning Image Retrieval +2

Zero-Shot Temporal Action Detection via Vision-Language Prompting

1 code implementation • 17 Jul 2022 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

Such a novel design effectively eliminates the dependence between localization and classification by breaking the route for error propagation in-between.

Ranked #1 on Zero-Shot Action Detection on THUMOS'14

Action Detection Classification +3

SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling

1 code implementation • 14 Aug 2022 • Chenjian Gao, Qian Yu, Lu Sheng, Yi-Zhe Song, Dong Xu

Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape.

3D Reconstruction

DifferSketching: How Differently Do People Sketch 3D Objects?

1 code implementation • 19 Sep 2022 • Chufeng Xiao, Wanchao Su, Jing Liao, Zhouhui Lian, Yi-Zhe Song, Hongbo Fu

We invited 70 novice users and 38 expert users to sketch 136 3D objects, which were presented as 362 images rendered from multiple views.

3D Reconstruction

Structure-Aware 3D VR Sketch to 3D Shape Retrieval

1 code implementation • 19 Sep 2022 • Ling Luo, Yulia Gryaditskaya, Tao Xiang, Yi-Zhe Song

In particular, we propose to use a triplet loss with an adaptive margin value driven by a "fitting gap", which is the similarity of two shapes under structure-preserving deformations.
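
A triplet loss with a per-triplet margin can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the fitting gap is passed in as a precomputed non-negative number, and the linear mapping `margin = scale * fitting_gap` as well as the names `adaptive_triplet_loss` and `scale` are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def adaptive_triplet_loss(anchor, positive, negative, fitting_gap, scale=1.0):
    """Triplet hinge loss whose margin varies per triplet.

    Assumption for illustration: the margin is scale * fitting_gap, where
    fitting_gap is a precomputed non-negative score for the shape pair.
    """
    margin = scale * fitting_gap
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings: the positive is at distance 1, the negative at 5.
anchor, positive, negative = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]

small_gap_loss = adaptive_triplet_loss(anchor, positive, negative, fitting_gap=0.5)
large_gap_loss = adaptive_triplet_loss(anchor, positive, negative, fitting_gap=6.0)
print(small_gap_loss, large_gap_loss)
```

With a small margin the triplet is already satisfied and contributes no loss, while a larger margin demands more separation between the positive and negative distances.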

3D Shape Retrieval Retrieval

Towards 3D VR-Sketch to 3D Shape Retrieval

1 code implementation • 20 Sep 2022 • Ling Luo, Yulia Gryaditskaya, Yongxin Yang, Tao Xiang, Yi-Zhe Song

In this paper, we offer a different perspective towards answering these questions -- we study the use of 3D sketches as an input modality and advocate a VR-scenario where retrieval is conducted.

3D Shape Retrieval Retrieval

Fine-Grained VR Sketching: Dataset and Insights

1 code implementation • 20 Sep 2022 • Ling Luo, Yulia Gryaditskaya, Yongxin Yang, Tao Xiang, Yi-Zhe Song

We then, for the first time, study the scenario of fine-grained 3D VR sketch to 3D shape retrieval, as a novel VR sketching application and a proving ground to drive out generic insights to inform future research.

3D Shape Reconstruction 3D Shape Retrieval +1

Robust Target Training for Multi-Source Domain Adaptation

1 code implementation • 4 Oct 2022 • Zhongying Deng, Da Li, Yi-Zhe Song, Tao Xiang

Given any existing fully-trained one-step MSDA model, BORT$^2$ turns it into a labeling function to generate pseudo-labels for the target data and trains a target model using pseudo-labeled target data only.

Domain Adaptation

Paper
Code

Prediction Calibration for Generalized Few-shot Semantic Segmentation

no code implementations • 15 Oct 2022 • Zhihe Lu, Sen He, Da Li, Yi-Zhe Song, Tao Xiang

To ensure that the fused scores are not biased to either the base or novel classes, a new Transformer-based calibration module is introduced.

Ranked #3 on Generalized Few-Shot Semantic Segmentation on PASCAL-5i (5-Shot)

Generalized Few-Shot Semantic Segmentation Semantic Segmentation

Paper
Add Code

Learning to Augment via Implicit Differentiation for Domain Generalization

no code implementations • 25 Oct 2022 • Tingwei Wang, Da Li, Kaiyang Zhou, Tao Xiang, Yi-Zhe Song

Machine learning models are intrinsically vulnerable to domain shift between training and testing data, resulting in poor performance in novel domains.

Data Augmentation Domain Generalization +1

Paper
Add Code

Single Stage Multi-Pose Virtual Try-On

no code implementations • 19 Nov 2022 • Sen He, Yi-Zhe Song, Tao Xiang

Key to our model is a parallel flow estimation module that predicts the flow fields for both person and garment images conditioned on the target pose.

Pose Transfer Virtual Try-on

Paper
Add Code

Post-Processing Temporal Action Detection

1 code implementation • CVPR 2023 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

To address this problem, in this work we introduce a novel model-agnostic post-processing method without model redesign and retraining.

Action Classification Action Detection +1

Paper
Code

Multi-Modal Few-Shot Temporal Action Detection

1 code implementation • 27 Nov 2022 • Sauradip Nag, Mengmeng Xu, Xiatian Zhu, Juan-Manuel Perez-Rua, Bernard Ghanem, Yi-Zhe Song, Tao Xiang

In this work, we introduce a new multi-modality few-shot (MMFS) TAD problem, which can be considered as a marriage of FS-TAD and ZS-TAD by leveraging few-shot support videos and new class names jointly.

Action Detection Few-Shot Object Detection +3

Paper
Code

Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification

1 code implementation • 30 Nov 2022 • Jijie Wu, Dongliang Chang, Aneeshan Sain, Xiaoxu Li, Zhanyu Ma, Jie Cao, Jun Guo, Yi-Zhe Song

Conventional few-shot learning methods, however, cannot be naively adopted for this fine-grained setting -- a quick pilot study reveals that they in fact push for the opposite (i.e., lower inter-class variations and higher intra-class variations).

Few-Shot Image Classification Few-Shot Learning +2

Paper
Code

An Erudite Fine-Grained Visual Classification Model

no code implementations • CVPR 2023 • Dongliang Chang, Yujun Tong, Ruoyi Du, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma

Therefore, we first propose a feature disentanglement module and a feature re-fusion module to reduce negative transfer and boost positive transfer between different datasets.

Classification Disentanglement +2

Paper
Add Code

Controllable Person Image Synthesis with Pose-Constrained Latent Diffusion

no code implementations • ICCV 2023 • Xiao Han, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang

Controllable person image synthesis aims at rendering a source image based on user-specified changes in body pose or appearance.

Denoising Image Generation

Paper
Add Code

Photo Pre-Training, but for Sketch

1 code implementation • CVPR 2023 • Ke Li, Kaiyue Pang, Yi-Zhe Song

This lack of sketch data has imposed on the community a few "peculiar" design choices -- the most representative of them all is perhaps the coerced utilisation of photo-based pre-training (i.e., no sketch) for many core tasks that otherwise dictate specific sketch understanding.

Sketch-Based Image Retrieval

Paper
Code

Democratising 2D Sketch to 3D Shape Retrieval Through Pivoting

no code implementations • ICCV 2023 • Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

We perform pivoting on two existing datasets, each from a distant research domain to the other: 2D sketch and photo pairs from the sketch-based image retrieval field (SBIR), and 3D shapes from ShapeNet.

3D Shape Retrieval Retrieval +1

Paper
Add Code

On-the-Fly Category Discovery

1 code implementation • CVPR 2023 • Ruoyi Du, Dongliang Chang, Kongming Liang, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma

Our code is available at https://github.com/PRIS-CV/On-the-fly-Category-Discovery.

Disentanglement Novel Class Discovery

Paper
Code

Task-aware Adaptive Learning for Cross-domain Few-shot Learning

no code implementations • ICCV 2023 • Yurong Guo, Ruoyi Du, Yuan Dong, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma

In this paper, we first observe the dependence of task-specific parameter configuration on the target task.

cross-domain few-shot learning Domain Generalization +1

Paper
Add Code

Unsupervised Hashing with Similarity Distribution Calibration

1 code implementation • 15 Feb 2023 • Kam Woh Ng, Xiatian Zhu, Jiun Tian Hoe, Chee Seng Chan, Tianyu Zhang, Yi-Zhe Song, Tao Xiang

However, these methods often overlook the fact that the similarity between data points in the continuous feature space may not be preserved in the discrete hash code space, due to the limited similarity range of hash codes.

Deep Hashing Image Retrieval

Paper
Code

FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks

1 code implementation • CVPR 2023 • Xiao Han, Xiatian Zhu, Licheng Yu, Li Zhang, Yi-Zhe Song, Tao Xiang

In the fashion domain, there exists a variety of vision-and-language (V+L) tasks, including cross-modal retrieval, text-guided image retrieval, multi-modal classification, and image captioning.

Cross-Modal Retrieval Image Captioning +4

Paper
Code

Generative Model Based Noise Robust Training for Unsupervised Domain Adaptation

no code implementations • 10 Mar 2023 • Zhongying Deng, Da Li, Junjun He, Yi-Zhe Song, Tao Xiang

D-CFA minimizes the domain gap by augmenting the source data with distribution-sampled target features, and trains a noise-robust discriminative classifier by using target domain knowledge from the generative models.

Unsupervised Domain Adaptation

Paper
Add Code

Data-Free Sketch-Based Image Retrieval

1 code implementation • CVPR 2023 • Abhra Chaudhuri, Ayan Kumar Bhunia, Yi-Zhe Song, Anjan Dutta

For the first time, we identify that for data-scarce tasks like Sketch-Based Image Retrieval (SBIR), where the difficulty in acquiring paired photos and hand-drawn sketches limits data-dependent cross-modal learning algorithms, DFL can prove to be a much more practical paradigm.

Retrieval Sketch-Based Image Retrieval

Paper
Code

Picture that Sketch: Photorealistic Image Generation from Abstract Sketches

no code implementations • CVPR 2023 • Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

We further introduce specific designs to tackle the abstract nature of human sketches, including a fine-grained discriminative loss on the back of a trained sketch-photo retrieval model, and a partial-aware sketch augmentation strategy.

Image Generation Retrieval +1

Paper
Add Code

Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings

no code implementations • CVPR 2023 • Ayan Kumar Bhunia, Subhadeep Koley, Amandeep Kumar, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

Human sketch has already proved its worth in various visual understanding tasks (e.g., retrieval, segmentation, image-captioning, etc.).

Image Captioning Retrieval +1

Paper
Add Code

CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not

no code implementations • CVPR 2023 • Aneeshan Sain, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

At the very core of our solution is a prompt learning setup.

Retrieval Sketch-Based Image Retrieval

Paper
Add Code

Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR

no code implementations • CVPR 2023 • Aneeshan Sain, Ayan Kumar Bhunia, Subhadeep Koley, Pinaki Nath Chowdhury, Soumitri Chattopadhyay, Tao Xiang, Yi-Zhe Song

This paper advances the fine-grained sketch-based image retrieval (FG-SBIR) literature by putting forward a strong baseline that overshoots the prior state of the art by ~11%.

Knowledge Distillation Sketch-Based Image Retrieval

Paper
Add Code

Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style

1 code implementation • CVPR 2023 • Fengyin Lin, Mingkang Li, Da Li, Timothy Hospedales, Yi-Zhe Song, Yonggang Qi

This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR), however with two significant differentiators to prior art: (i) we tackle all variants (inter-category, intra-category, and cross datasets) of ZS-SBIR with just one network ("everything"), and (ii) we would really like to understand how this sketch-photo matching operates ("explainable").

Relation Network Retrieval +1

Paper
Code

What Can Human Sketches Do for Object Detection?

no code implementations • CVPR 2023 • Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

In particular, we first perform independent prompting on both sketch and photo branches of an SBIR model to build highly generalisable sketch and photo encoders on the back of the generalisation ability of CLIP.

Object object-detection +3

Paper
Add Code

DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion

1 code implementation • ICCV 2023 • Sauradip Nag, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang

Concretely, we establish the denoising process in the Transformer decoder (e.g., DETR) by introducing a temporal location query design with faster convergence in training.

Action Detection Denoising

Paper
Code

ChiroDiff: Modelling chirographic data with Diffusion Models

no code implementations • 7 Apr 2023 • Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song

Such strictly-ordered discrete factorization however falls short of capturing key properties of chirographic data -- it fails to build holistic understanding of the temporal concept due to one-way visibility (causality).

Denoising

Paper
Add Code

SketchXAI: A First Look at Explainability for Human Sketches

no code implementations • CVPR 2023 • Zhiyu Qu, Yulia Gryaditskaya, Ke Li, Kaiyue Pang, Tao Xiang, Yi-Zhe Song

Following this, we design a simple explainability-friendly sketch encoder that accommodates the intrinsic properties of strokes: shape, location, and order.

Explainable artificial intelligence Explainable Artificial Intelligence (XAI) +1

Paper
Add Code

3D VR Sketch Guided 3D Shape Prototyping and Exploration

1 code implementation • ICCV 2023 • Ling Luo, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song, Yulia Gryaditskaya

3D shape modeling is labor-intensive, time-consuming, and requires years of expertise.

3D Shape Generation 3D Shape Modeling +1

Paper
Code

SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation

1 code implementation • 27 Aug 2023 • Zhiyu Qu, Tao Xiang, Yi-Zhe Song

Through this work, we hope to aspire the way we create visual content, democratise the creative process, and inspire further research in enhancing human creativity in AIGC.

Paper
Code

Sketch-based Video Object Segmentation: Benchmark and Analysis

no code implementations • 13 Nov 2023 • Ruolin Yang, Da Li, Conghui Hu, Timothy Hospedales, Honggang Zhang, Yi-Zhe Song

Reference-based video object segmentation is an emerging topic which aims to segment the corresponding target object in each video frame referred by a given reference, such as a language expression or a photo mask.

Object Segmentation +3

Paper
Add Code

DemoFusion: Democratising High-Resolution Image Generation With No $$$

1 code implementation • 24 Nov 2023 • Ruoyi Du, Dongliang Chang, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma

High-resolution image generation with Generative Artificial Intelligence (GenAI) has immense potential but, due to the enormous capital investment required for training, it is increasingly centralised to a few large corporations, and hidden behind paywalls.

Image Generation

Paper
Code

Wired Perspectives: Multi-View Wire Art Embraces Generative AI

no code implementations • 26 Nov 2023 • Zhiyu Qu, Lan Yang, Honggang Zhang, Tao Xiang, Kaiyue Pang, Yi-Zhe Song

Creating multi-view wire art (MVWA), a static 3D sculpture with diverse interpretations from different viewpoints, is a complex task even for skilled artists.

Knowledge Distillation

Paper
Add Code

DreamCreature: Crafting Photorealistic Virtual Creatures from Imagination

1 code implementation • 27 Nov 2023 • Kam Woh Ng, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

To bridge this gap, we introduce a novel task, Virtual Creatures Generation: Given a set of unlabeled images of the target concepts (e.g., 200 bird species), we aim to train a T2I model capable of creating new, hybrid concepts within diverse backgrounds and contexts.

Disentanglement Novel Concepts

Paper
Code

Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes

no code implementations • 7 Dec 2023 • Hmrishav Bandyopadhyay, Subhadeep Koley, Ayan Das, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Ayan Kumar Bhunia, Yi-Zhe Song

In this paper, we democratise 3D content creation, enabling precise generation of 3D shapes from abstract sketches while overcoming limitations tied to drawing skills.

Position

Paper
Add Code

DemoCaricature: Democratising Caricature Generation with a Rough Sketch

no code implementations • 7 Dec 2023 • Dar-Yen Chen, Ayan Kumar Bhunia, Subhadeep Koley, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song

In this paper, we democratise caricature generation, empowering individuals to effortlessly craft personalised caricatures with just a photo and a conceptual sketch.

Caricature Model Editing

Paper
Add Code

How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval?

no code implementations • 11 Mar 2024 • Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

@q loss to inject that understanding into the system.

Retrieval Sketch-Based Image Retrieval

Paper
Add Code

Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers

no code implementations • 12 Mar 2024 • Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

This paper, for the first time, explores text-to-image diffusion models for Zero-Shot Sketch-based Image Retrieval (ZS-SBIR).

Retrieval Sketch-Based Image Retrieval

Paper
Add Code

It's All About Your Sketch: Democratising Sketch Control in Diffusion Models

1 code implementation • 12 Mar 2024 • Subhadeep Koley, Ayan Kumar Bhunia, Deeptanshu Sekhri, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

This paper unravels the potential of sketches for diffusion models, addressing the deceptive promise of direct sketch control in generative AI.

Retrieval Sketch-Based Image Retrieval

Paper
Code

You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval

no code implementations • 12 Mar 2024 • Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

Two primary input modalities prevail in image retrieval: sketch and text.

Attribute Image Retrieval +1

Paper
Add Code

What Sketch Explainability Really Means for Downstream Tasks

no code implementations • 14 Mar 2024 • Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

In this paper, we explore the unique modality of sketch for explainability, emphasising the profound impact of human strokes compared to conventional pixel-oriented studies.

Retrieval

Paper
Add Code

SketchINR: A First Look into Sketches as Implicit Neural Representations

no code implementations • 14 Mar 2024 • Hmrishav Bandyopadhyay, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Tao Xiang, Timothy Hospedales, Yi-Zhe Song

(ii) SketchINR's auto-decoder provides a much higher-fidelity representation than other learned vector sketch representations, and is uniquely able to scale to complex vector sketches such as FS-COCO.

Data Compression

Paper
Add Code

A Tree-Structured Decoder for Image-to-Markup Generation

1 code implementation • ICML 2020 • Jianshu Zhang, Jun Du, Yongxin Yang, Yi-Zhe Song, Si Wei, Li-Rong Dai

Recent encoder-decoder approaches typically employ string decoders to convert images into serialized strings for image-to-markup.

Math

Paper
Code
