Recent advances in Large Language Models (LLMs) have stimulated a surge of research aimed at extending their applications to the visual domain.
Since only a single point is required to recognize the text, the proposed method enables text spotting without an arbitrarily-shaped detector or bounding polygon annotations.
Existing methods learn to disentangle style and content elements by developing a universal style representation for each font style.
MX-Font extracts multiple style features that are not explicitly conditioned on component labels; instead, multiple experts automatically learn to represent different local concepts, e.g., a left-side sub-glyph.
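The multi-expert idea can be illustrated with a minimal sketch (this is not the MX-Font implementation; all names, shapes, and the expert count are hypothetical): each expert applies its own projection to the same glyph representation, producing a separate local style feature with no component label involved.

```python
import numpy as np

def extract_expert_styles(glyph, expert_weights):
    """Map one flattened glyph to k local style features.

    Each expert applies its own linear projection followed by ReLU, so
    the experts can specialize on different local concepts (e.g. a
    left-side sub-glyph) without being told which component they cover.
    """
    return [np.maximum(W @ glyph, 0.0) for W in expert_weights]

rng = np.random.default_rng(0)
glyph = rng.standard_normal(64)                              # hypothetical 8x8 glyph, flattened
experts = [rng.standard_normal((16, 64)) for _ in range(6)]  # 6 experts, 16-dim style each
styles = extract_expert_styles(glyph, experts)
```

In a trained model the projections would be learned jointly so that each expert converges to a distinct local concept; here random weights only demonstrate the data flow.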
However, learning component-wise styles solely from reference glyphs is infeasible in the few-shot font generation scenario when a target script has a large number of components, e.g., over 200 for Chinese.
By leveraging the compositionality of compositional scripts, we propose a novel font generation framework, named Dual Memory-augmented Font Generation Network (DM-Font), which enables us to generate a high-quality font library with only a few samples.
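Compositionality here means that every glyph decomposes into a small set of reusable components. For Korean Hangul this decomposition is defined arithmetically by the Unicode standard, as the short sketch below shows (background illustration of compositional scripts, not DM-Font code):

```python
def decompose_hangul(syllable):
    """Split a precomposed Hangul syllable into its three component indices.

    Unicode encodes each of the 11,172 precomposed syllables as
    code = 0xAC00 + (initial * 21 + medial) * 28 + final,
    so a glyph library only needs the component parts, not every syllable.
    """
    code = ord(syllable) - 0xAC00
    if not 0 <= code < 11172:
        raise ValueError("not a precomposed Hangul syllable")
    initial, rest = divmod(code, 21 * 28)
    medial, final = divmod(rest, 28)
    return initial, medial, final

# '한' decomposes into ㅎ (initial 18), ㅏ (medial 0), ㄴ (final 4)
```

This is why a few reference samples can cover a whole script: styles learned per component transfer to every syllable built from those components.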
OCR is inherently linked to NLP since its final output is text.
Parsing textual information embedded in images is important for various downstream tasks.
Scene text detection methods based on neural networks have emerged recently and have shown promising results.