In this paper, we introduced the novel concept of advisor network to address the problem of noisy labels in image classification.
Ranked #4 on Image Classification on Clothing1M (using extra training data)
The proposed method is based on an initial training stage where a simple combination of visual and textual features is used, to fine-tune the CLIP text encoder.
Ranked #1 on Text-Image Retrieval on Fashion IQ
the visual content of the query image.
Ranked #2 on Text-Image Retrieval on CIRR
To understand human behavior we must not just recognize individual actions but model possibly complex group activity and interactions.
Ranked #8 on Group Activity Recognition on Volleyball
In this paper, we address the problem of image retrieval by learning images representation based on the activations of a Convolutional Neural Network.
In this paper we deal with the problem of predicting action progress in videos.
Automatic image annotation is among the fundamental problems in computer vision and pattern recognition, and it is becoming increasingly important in order to develop algorithms that are able to search and browse large-scale image collections.
Where previous reviews on content-based image retrieval emphasize on what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image.