Jointly Localizing and Describing Events for Dense Video Captioning

CVPR 2018 • Yehao Li, Ting Yao, Yingwei Pan, Hongyang Chao, Tao Mei

Automatically describing a video with natural language is regarded as a fundamental challenge in computer vision. The problem nevertheless is not trivial especially when a video contains multiple events to be worthy of mention, which often happens in real videos...



