Existing methods for arbitrary-shaped text detection in natural scenes face two critical issues: 1) fractured detections at the gaps within a text instance; and 2) inaccurate detections of arbitrary-shaped text instances with diverse background context.
Ranked #1 on Scene Text Detection on IC19-Art (Recall metric)
Multi-source neural machine translation aims to translate from parallel sources of information (e.g., languages, images, etc.)
More specifically, we propose to perceive texts from three levels of feature representations, i.e., character-, word- and global-level, and then introduce a novel text representation fusion technique to help achieve robust arbitrary text detection.
Ranked #1 on Scene Text Detection on ICDAR 2015
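The idea of fusing character-, word- and global-level representations can be illustrated with a minimal sketch. The snippet above does not say how the fusion is performed, so the element-wise weighted sum below is purely an assumption for illustration, and the function name `fuse_text_features` and its `weights` parameter are hypothetical. It is not the authors' fusion technique.

```python
from typing import List, Sequence


def fuse_text_features(
    char_feat: Sequence[float],
    word_feat: Sequence[float],
    global_feat: Sequence[float],
    weights: Sequence[float] = (1.0, 1.0, 1.0),
) -> List[float]:
    """Fuse three same-length feature vectors by element-wise weighted sum.

    ASSUMPTION: the weighted sum is a stand-in chosen for illustration;
    the fusion operator actually used in the paper is not given here.
    """
    levels = [char_feat, word_feat, global_feat]
    if len(weights) != len(levels):
        raise ValueError("expected one weight per feature level")
    length = len(char_feat)
    if any(len(feat) != length for feat in levels):
        raise ValueError("all feature vectors must have the same length")
    # Combine the i-th entry of every level into one fused value.
    return [
        sum(w * feat[i] for w, feat in zip(weights, levels))
        for i in range(length)
    ]


# Toy feature vectors standing in for the three levels of representation.
fused = fuse_text_features([1.0, 2.0], [0.5, 0.5], [0.0, 1.0])
```

In a real detector the three inputs would be feature maps produced by separate network branches, and the fusion step would typically be learned rather than fixed.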