SAPPHIRE: Simple Aligner for Phrasal Paraphrase with Hierarchical Representation

We present SAPPHIRE, a Simple Aligner for Phrasal Paraphrase with HIerarchical REpresentation. Monolingual phrase alignment is a fundamental problem in natural language understanding and a crucial technique in applications such as natural language inference and semantic textual similarity assessment. Previous methods for monolingual phrase alignment are language-resource intensive: they require large-scale synonym/paraphrase lexica and high-quality parsers. In contrast, SAPPHIRE depends only on a monolingual corpus for training word embeddings, so it is easily transferable to specific domains and to other languages. Specifically, SAPPHIRE first obtains word alignments using pre-trained word embeddings and then expands them to phrase alignments using bilingual phrase extraction methods. To estimate the likelihood of phrase alignments, SAPPHIRE uses phrase embeddings that are hierarchically composed of word embeddings. Finally, SAPPHIRE searches for a set of consistent phrase alignments on a lattice of phrase alignment candidates. It keeps the search efficient by constraining the lattice so that all paths go through the phrase alignment pair with the highest alignment score. Experimental results on the standard dataset for phrase alignment evaluation show that SAPPHIRE outperforms the previous method and establishes state-of-the-art performance.
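
The first stage described above, word alignment from pre-trained word embeddings, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy three-dimensional vectors in `TOY_EMB`, the mutual-best-match rule, the `threshold` value, and the mean pooling in `phrase_vector` (a simple stand-in for hierarchical composition, which the abstract does not specify). The function names `cosine`, `align_words`, and `phrase_vector` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def align_words(src, tgt, emb, threshold=0.5):
    """Align word i in src to word j in tgt when each is the other's most
    similar word and their similarity reaches the threshold.
    The mutual-best-match rule and the threshold are assumptions."""
    sim = [[cosine(emb[s], emb[t]) for t in tgt] for s in src]
    best_for_src = [max(range(len(tgt)), key=lambda j: sim[i][j])
                    for i in range(len(src))]
    best_for_tgt = [max(range(len(src)), key=lambda i: sim[i][j])
                    for j in range(len(tgt))]
    return [(i, j) for i, j in enumerate(best_for_src)
            if best_for_tgt[j] == i and sim[i][j] >= threshold]


def phrase_vector(words, emb):
    """Mean-pool word vectors into one phrase vector (an assumed, simplified
    composition; not the hierarchical scheme named in the abstract)."""
    vecs = [emb[w] for w in words]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


# Toy embeddings, invented for this example only.
TOY_EMB = {
    "the": [1.0, 0.0, 0.0],
    "a": [0.9, 0.1, 0.0],
    "boy": [0.0, 1.0, 0.0],
    "kid": [0.1, 0.9, 0.1],
    "runs": [0.0, 0.0, 1.0],
    "sprints": [0.0, 0.1, 0.9],
}

SRC = ["the", "boy", "runs"]
TGT = ["a", "kid", "sprints"]
ALIGNMENT = align_words(SRC, TGT, TOY_EMB)
print(ALIGNMENT)
```

With these toy vectors each source word pairs with the target word in the same position. In a real setting the word alignments would then be expanded into phrase candidates and scored with phrase embeddings, as the abstract describes.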
