A TWO-STAGE FRAMEWORK FOR MATHEMATICAL EXPRESSION RECOGNITION

25 Sep 2019  ·  Jin Zhang, Weipeng Ming, PengFei Liu ·

Although mathematical expressions (MEs) recognition have achieved great progress, the development of MEs recognition in real scenes is still unsatisfactory. Inspired by the recent work of neutral network, this paper proposes a novel two-stage approach which takes a printed mathematical expression image as input and generates LaTeX sequence as output. In the first stage, this method locates and recognizes the math symbols of input image by object detection algorithm. In the second stage, it translates math symbols with position information into LaTeX sequences by seq2seq model equipped with attention mechanism. In particular, the detection of mathematical symbols and the structural analysis of mathematical formulas are carried out separately in two steps, which effectively improves the recognition accuracy and enhances the generalization ability. The experiment demonstrates that the two-stage method significantly outperforms the end-to-end method. Especially, the ExpRate(expression recognition rate) of our model is 74.1%, 20.3 percentage points higher than that of the end-to-end model on the test data that doesn’t come from the same source as training data.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here