Based on the observations, we propose a scheme to fuse global and local motion patterns (MPs) and key visual information (KVI) for semantic event recognition in basketball videos.
Under the new schema, the proposed method can achieve superior accuracy (WIDER FACE Val/Test -- Easy: 0.910/0.896, Medium: 0.881/0.865, Hard: 0.780/0.770; FDDB -- discontinuous: 0.973, continuous: 0.724).
Therefore, a semantic event in broadcast basketball videos is closely related to both the global motion (camera motion) and the collective motion.