BezierAlign

Introduced by Liu et al. in ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network

BezierAlign is a feature sampling method for arbitrarily-shaped scene text recognition that exploits the parametric nature of a compact Bezier curve bounding box. Unlike RoIAlign, the sampling grid of BezierAlign is not rectangular. Instead, each column of the arbitrarily-shaped grid is orthogonal to the Bezier curve boundary of the text. The sampling points are equidistant in width and in height, and their values are bilinearly interpolated with respect to their coordinates.

Formally, given an input feature map and Bezier curve control points, we process all the output pixels of the rectangular output feature map of size $h_{\text{out}} \times w_{\text{out}}$ concurrently. Taking pixel $g_{i}$ at position $\left(g_{iw}, g_{ih}\right)$ in the output feature map as an example, we calculate $t$ by:

$$ t=\frac{g_{iw}}{w_{\text{out}}} $$

We then use $t$ as the curve parameter to calculate the point $tp$ on the upper Bezier curve boundary and the point $bp$ on the lower Bezier curve boundary. Using $tp$ and $bp$, we can linearly index the sampling point $op$ by:

$$ op=bp \cdot \frac{g_{ih}}{h_{\text{out}}}+tp \cdot\left(1-\frac{g_{ih}}{h_{\text{out}}}\right) $$
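The two formulas above can be sketched in NumPy as follows. This is a minimal illustration rather than the reference implementation: the function names `bezier_point` and `bezier_align_grid` are hypothetical, control points are assumed to be given as (x, y) pairs, pixel indices $g_{iw}$ and $g_{ih}$ are assumed to start at 0, and the curve is evaluated with the general Bernstein form so that any number of control points works.

```python
import numpy as np
from math import comb


def bezier_point(ctrl, t):
    """Evaluate a Bezier curve at parameters t (hypothetical helper).

    ctrl: (n + 1, 2) control points as (x, y).
    t:    (m,) parameter values in [0, 1].
    Returns an (m, 2) array of points on the curve.
    """
    ctrl = np.asarray(ctrl, dtype=float)
    t = np.asarray(t, dtype=float)
    n = len(ctrl) - 1
    # Bernstein basis: B_{k,n}(t) = C(n, k) * t^k * (1 - t)^(n - k)
    basis = np.stack(
        [comb(n, k) * t**k * (1 - t) ** (n - k) for k in range(n + 1)], axis=1
    )
    return basis @ ctrl


def bezier_align_grid(top_ctrl, bottom_ctrl, h_out, w_out):
    """Return an (h_out, w_out, 2) array holding the sampling point op
    for every output pixel."""
    t = np.arange(w_out) / w_out      # t = g_iw / w_out, one value per column
    s = np.arange(h_out) / h_out      # g_ih / h_out, one value per row
    tp = bezier_point(top_ctrl, t)    # points on the upper boundary, (w_out, 2)
    bp = bezier_point(bottom_ctrl, t) # points on the lower boundary, (w_out, 2)
    s = s[:, None, None]
    # op = bp * (g_ih / h_out) + tp * (1 - g_ih / h_out)
    return bp[None, :, :] * s + tp[None, :, :] * (1 - s)


# Example: two straight "curves" given as cubic control points.
top = [(0, 0), (1, 0), (2, 0), (3, 0)]
bottom = [(0, 4), (1, 4), (2, 4), (3, 4)]
grid = bezier_align_grid(top, bottom, h_out=4, w_out=3)
```

Because $t$ is computed as $g_{iw} / w_{\text{out}}$ with zero-based indices, it never reaches 1 in this sketch, so the last column stops short of the curve endpoints; other index conventions would shift the grid slightly.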

Given the position of $op$, the output value is obtained by bilinear interpolation on the input feature map. Comparisons between previous sampling methods and BezierAlign are shown in the figure.
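The bilinear interpolation step can be sketched as below. This is an illustrative NumPy version under stated assumptions, not the authors' code: the function name `bilinear_sample` is hypothetical, integer coordinates are assumed to coincide with feature-map cells, and points outside the map are clamped to the border, which is one of several possible border policies.

```python
import numpy as np


def bilinear_sample(feature, points):
    """Bilinearly interpolate a feature map at fractional positions.

    feature: (C, H, W) array.
    points:  (..., 2) array of (x, y) positions in feature-map coordinates.
    Returns a (C, ...) array of sampled values.
    """
    feature = np.asarray(feature, dtype=float)
    points = np.asarray(points, dtype=float)
    _, height, width = feature.shape

    # Assumption: clamp to the map so all four neighbours are valid indices.
    x = np.clip(points[..., 0], 0, width - 1)
    y = np.clip(points[..., 1], 0, height - 1)

    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, width - 1)
    y1 = np.minimum(y0 + 1, height - 1)

    wx = x - x0  # horizontal weight of the right-hand neighbours
    wy = y - y0  # vertical weight of the lower neighbours

    # Interpolate along x on the two neighbouring rows, then along y.
    upper = feature[:, y0, x0] * (1 - wx) + feature[:, y0, x1] * wx
    lower = feature[:, y1, x0] * (1 - wx) + feature[:, y1, x1] * wx
    return upper * (1 - wy) + lower * wy


# Example: a single-channel 3 x 4 map whose value at (x, y) is 4 * y + x.
feature = np.arange(12, dtype=float).reshape(1, 3, 4)
samples = bilinear_sample(feature, np.array([[1.5, 0.5], [3.0, 2.0]]))
```

Passing an array of sampling points of shape $(h_{\text{out}}, w_{\text{out}}, 2)$ would return a $(C, h_{\text{out}}, w_{\text{out}})$ output, which matches the rectangular output feature map described above.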

Source: ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network

Tasks

Task	Papers	Share
Text Spotting	2	40.00%
Document Layout Analysis	1	20.00%
Scene Text Detection	1	20.00%
Text Detection	1	20.00%

Categories

RoI Feature Extractors