Leveraging Entanglement Entropy for Deep Understanding of Attention Matrix in Text Matching

25 Sep 2019 · Peng Zhang, Xiaoliu Mao, Xindian Ma, Benyou Wang, Jing Zhang, Jun Wang, Dawei Song

The formal understanding of deep learning has made great progress based on quantum many-body physics. For example, the entanglement entropy in quantum many-body systems can interpret the inductive bias of a neural network and then guide the design of the network structure and parameters for certain tasks. However, there are two unsolved problems in the current study of entanglement entropy, which limit its application potential. First, the theoretical benefits of entanglement entropy have only been investigated in the representation of a single object (e.g., an image or a sentence), and have not been well studied in the matching of two objects (e.g., question-answering pairs). Second, the entanglement entropy cannot be quantitatively calculated because the dimension of the matching matrix increases exponentially. In this paper, we address these two problems by investigating the fundamental connections between the entanglement entropy and the attention matrix. We prove that a low-dimensional attention matrix can be derived from the high-dimensional matching matrix by a mapping (via the trace operator). Based on such an attention matrix, we can provide a feasible solution to the entanglement entropy that describes the correlation between the two objects in matching tasks. Inspired by the theoretical properties of the entanglement entropy, we can design the network architecture adaptively in a typical text matching task, i.e., the question-answering task.
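
As a rough illustration of the quantity discussed above, the following sketch computes the standard entanglement entropy of a bipartite coefficient matrix from its singular values (the Schmidt spectrum). It is not the construction proved in the paper: treating a small matrix directly as the coefficient matrix, the function name `entanglement_entropy`, and the `eps` cutoff are all assumptions made for this example.

```python
import numpy as np


def entanglement_entropy(matrix, eps=1e-12):
    """Entanglement entropy of a bipartite state given by a coefficient matrix.

    Hypothetical helper for illustration: the squared singular values,
    normalized to sum to one, are the Schmidt probabilities p_i, and the
    entropy is S = -sum_i p_i * log(p_i).
    """
    m = np.asarray(matrix, dtype=float)
    singular_values = np.linalg.svd(m, compute_uv=False)
    probs = singular_values ** 2
    total = probs.sum()
    if total <= 0.0:
        raise ValueError("matrix must contain at least one non-zero entry")
    probs = probs / total
    # Drop numerically zero probabilities so that log() stays finite.
    probs = probs[probs > eps]
    return float(-(probs * np.log(probs)).sum())


# A rank-one matrix corresponds to a product state (no correlation).
product = np.outer([0.6, 0.8], [1.0, 0.0])

# A scaled identity corresponds to a maximally entangled two-level pair.
entangled = np.eye(2) / np.sqrt(2)

print(entanglement_entropy(product))
print(entanglement_entropy(entangled))
```

Under these assumptions, a rank-one matrix gives an entropy of zero, while the scaled identity gives the maximum for a 2 x 2 matrix, log 2. Because the cost is that of a singular value decomposition of the matrix passed in, the computation is only practical when that matrix is low-dimensional.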
