Highlight detection models are typically trained to identify cues that make
visual content appealing or interesting to the general public, with the
objective of reducing a video to such moments. However, the "interestingness"
of a video segment or image is subjective.
Thus, such highlight models provide
results of limited relevance to the individual user. Training one model per
user, on the other hand, is inefficient and requires large amounts of personal
information, which is typically unavailable. To overcome these limitations, we
present a global ranking model that conditions on each particular user's
interests. Rather than training one model per user, our model is personalized
via its inputs, which allows it to effectively adapt its predictions given only
a few user-specific examples. To train this model, we
create a large-scale dataset of users and the GIFs they created, giving us an
accurate indication of their interests. Our experiments show that using the
user history substantially improves prediction accuracy. On our test set of
850 videos, our model improves recall by 8% over generic highlight detectors.
Furthermore, our method is more precise than the user-agnostic baselines even
when given just a single user-specific example.
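
To make the input-based personalization concrete, the following is a minimal
sketch, assuming a PyTorch setup: features of a video's candidate segments and
of the user's previously created GIFs are the only inputs, and the model's
weights are shared across all users. The feature dimensions, mean-pooled
history aggregation, two-layer score head, and margin ranking loss are all
illustrative assumptions, not the paper's exact architecture or objective.

```python
import torch
import torch.nn as nn


class PersonalizedRanker(nn.Module):
    """Scores candidate video segments conditioned on a user's history.

    Hypothetical sketch: the layer sizes and the mean-pooled history
    aggregation are assumptions for illustration only.
    """

    def __init__(self, feat_dim: int = 512, hidden_dim: int = 256):
        super().__init__()
        # The score head sees each segment feature concatenated with the
        # aggregated user-history feature, so personalization enters
        # purely through the inputs rather than through per-user weights.
        self.scorer = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, segments: torch.Tensor, history: torch.Tensor) -> torch.Tensor:
        # segments: (num_segments, feat_dim) candidate segments of one video
        # history:  (num_history, feat_dim)  GIFs the user created before
        user = history.mean(dim=0, keepdim=True)   # aggregate the history
        user = user.expand(segments.size(0), -1)   # one copy per segment
        return self.scorer(torch.cat([segments, user], dim=1)).squeeze(-1)


def pairwise_ranking_loss(pos: torch.Tensor, neg: torch.Tensor, margin: float = 1.0):
    # A common choice for highlight ranking (assumed, not the paper's loss):
    # segments the user turned into GIFs should outscore the rest by a margin.
    return torch.clamp(margin - pos + neg, min=0).mean()


model = PersonalizedRanker()
segments = torch.randn(10, 512)    # features of a new video's segments
history = torch.randn(1, 512)      # even a single user-specific example
scores = model(segments, history)  # higher score = more likely a highlight
```

Because all per-user information flows through the history input, the same
trained weights serve every user, and adapting to a new user requires no
fine-tuning, only that user's examples.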