Matching two texts is a fundamental problem in many natural language
processing tasks. An effective approach is to extract meaningful matching patterns
from words, phrases, and sentences to produce the matching score. Inspired by
the success of convolutional neural networks in image recognition, where neurons
can capture many complicated patterns based on the extracted elementary visual
patterns such as oriented edges and corners, we propose to model text matching
as an image recognition problem. First, a matching matrix whose entries
represent the similarities between words is constructed and viewed as an image.
Then a convolutional neural network is utilized to capture rich matching
patterns in a layer-by-layer way. We show that by mirroring the compositional
hierarchies of patterns in image recognition, our model can successfully
identify salient signals such as n-gram and n-term matchings. Experimental
results demonstrate its superiority over the baselines.
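
The two steps described above, building a word-by-word matching matrix and scanning it with a convolution, can be illustrated with a minimal sketch. It rests on assumptions that are not part of the text: similarity is taken to be exact word identity (1.0 or 0.0), the kernel is hand-set rather than learned, and the names `matching_matrix`, `conv2d_valid`, and `DIAGONAL_KERNEL` are hypothetical. It is meant only to show why a diagonal pattern in the matrix corresponds to an n-gram match, not to reproduce the proposed model.

```python
# Illustrative sketch only. Assumptions: exact-match similarity and a
# hand-set kernel; the model described in the text learns its kernels.

def matching_matrix(words_a, words_b):
    """Entry (i, j) is 1.0 if words_a[i] equals words_b[j], else 0.0."""
    return [[1.0 if a == b else 0.0 for b in words_b] for a in words_a]


def conv2d_valid(image, kernel):
    """Plain 2D cross-correlation with stride 1 and no padding."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_rows)
                for dj in range(k_cols)
            )
            for j in range(out_cols)
        ]
        for i in range(out_rows)
    ]


# A diagonal kernel responds most strongly where consecutive words match
# in the same order in both texts, i.e. an n-gram match (here n = 3).
DIAGONAL_KERNEL = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

text_a = "the cat sat on the mat".split()
text_b = "a cat sat on a rug".split()

matrix = matching_matrix(text_a, text_b)              # 6 x 6 "image"
feature_map = conv2d_valid(matrix, DIAGONAL_KERNEL)   # 4 x 4 responses
strongest = max(max(row) for row in feature_map)
print(strongest)  # 3.0, produced by the shared trigram "cat sat on"
```

In this toy setting the peak response sits where the shared trigram lies on the diagonal of the matrix. A learned network would presumably stack several such layers, with pooling in between, to compose these elementary patterns into higher-level matching signals.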