Fast, Small and Exact: Infinite-order Language Modelling with Compressed Suffix Trees

TACL 2016 · Ehsan Shareghi, Matthias Petri, Gholamreza Haffari, Trevor Cohn

Efficient methods for storing and querying are critical for scaling high-order n-gram language models to large corpora. We propose a language model based on compressed suffix trees, a representation that is highly compact and can be easily held in memory, while supporting queries needed in computing language model probabilities on-the-fly...
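
To make the idea of computing probabilities on the fly from a suffix-based index concrete, here is a minimal sketch. It is not the paper's method: it assumes a plain, uncompressed suffix array in place of a compressed suffix tree, and it uses an unsmoothed relative-frequency estimate rather than any smoothing scheme. The names `build_suffix_array`, `count`, and `mle_prob` are hypothetical and chosen only for illustration.

```python
def build_suffix_array(tokens):
    # Naive construction: sort suffix start positions by the suffix they begin.
    # Fine for a toy corpus; real systems use linear-time, compressed structures.
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def _bound(tokens, sa, pattern, upper):
    # Binary search over the sorted suffixes, comparing only the first
    # len(pattern) tokens of each suffix against the pattern.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        prefix = tokens[sa[mid]:sa[mid] + m]
        if prefix < pattern or (upper and prefix == pattern):
            lo = mid + 1
        else:
            hi = mid
    return lo


def count(tokens, sa, pattern):
    # Number of occurrences of `pattern` (a token sequence of any length).
    pattern = list(pattern)
    if not pattern:
        return len(tokens)
    return _bound(tokens, sa, pattern, True) - _bound(tokens, sa, pattern, False)


def mle_prob(tokens, sa, context, word):
    # Unsmoothed estimate: count(context + word) / count(context).
    # Simplification: a context occurrence at the very end of the corpus
    # is still counted in the denominator.
    denom = count(tokens, sa, context)
    if denom == 0:
        return 0.0
    return count(tokens, sa, list(context) + [word]) / denom


corpus = "a b a b a c".split()
sa = build_suffix_array(corpus)
p = mle_prob(corpus, sa, ["a"], "b")
```

Because every n-gram count is answered by a range lookup over the same index, the context length is not fixed in advance, which is the property that lets a suffix-based model go beyond a preset n-gram order.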
