Overall, we integrate a style classifier and an event classifier into a state-of-the-art garment recommendation framework in order to condition recommendations on a given query.
The use of Deep Learning and Computer Vision in the Cultural Heritage domain has become highly relevant in the last few years, with many applications such as smart audio guides, interactive museums and augmented reality.
Several unsupervised and self-supervised approaches have been developed in recent years to learn visual features from large-scale unlabeled datasets.
Effective modeling of human interactions is of utmost importance when forecasting behaviors such as future trajectories.
In particular, we aim at retrieving a variety of modalities in which a certain garment can be combined.
Trajectory prediction is an important task, especially in autonomous driving.
In this paper we present an event aggregation strategy to convert the output of an event camera into frames processable by traditional Computer Vision algorithms.
Ranked #1 on Gesture Recognition on DVS128 Gesture (using extra training data)
Autonomous vehicles are expected to drive in complex scenarios with several independent, non-cooperating agents.
Current deep-learning-based autonomous driving approaches yield impressive results, which have also led to in-production deployment in certain controlled scenarios.
This will turn the classic audio guide into a smart personal instructor with which the visitor can interact by asking for explanations focused on specific interests.
Autonomous driving is becoming a reality, yet vehicles still need to rely on complex sensor fusion to understand the scene they act in.
In this paper we deal with the problem of predicting action progress in videos.
In this paper we present a simple yet effective approach to extend any object proposal from static images to videos without supervision.