Action spotting in soccer videos is the task of identifying the specific time when a certain key action of the game occurs.
In this paper, we propose a new approach to understand actions in egocentric videos that exploits the semantics of object interactions at both frame and temporal levels.
Sounds are an important source of information on our daily interactions with objects.
Event boundaries play a crucial role as a pre-processing step for the detection, localization, and recognition of human activities in videos.
Wearable cameras can gather large amounts of image data that provide rich visual information about the daily activities of the wearer.
Recently, there has been a growing interest in analyzing human daily activities from data collected by wearable cameras.
However, one of its main technical challenges is to deal with the low frame rate of wearable photo-cameras, which causes abrupt appearance changes between consecutive frames.
Recognizing Activities of Daily Living (ADLs) has a large number of health applications, such as characterizing lifestyle for habit improvement, nursing, and rehabilitation services.