A topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Topic modeling is a frequently used text-mining tool for the discovery of hidden semantic structures in a text body.
Large corpora are ubiquitous in today’s world and memory quickly becomes the limiting factor in practical applications of the Vector Space Model (VSM).
Distributed dense word vectors have been shown to be effective at capturing token-level semantic and syntactic regularities in language, while topic models can form interpretable representations over documents.
In order to relieve burdens of software engineers without knowledge of Bayesian networks, Familia is able to conduct automatic parameter inference for a variety of topic models.
These two types of recommendations are referred to as substitutes and complements: substitutes are products that can be purchased instead of each other, while complements are products that can be purchased in addition to each other.
Distributed representations of documents and words have gained popularity due to their ability to capture semantics of words and documents.
LEMMATIZATION SEMANTIC SIMILARITY SEMANTIC TEXTUAL SIMILARITY TOPIC MODELS
We validate this framework on two very different text modelling applications, generative document modelling and supervised question answering.
Ranked #1 on
Question Answering
on QASent
ANSWER SELECTION LATENT VARIABLE MODELS TOPIC MODELS VARIATIONAL INFERENCE
They all cover the same content, but the linguistic differences make it impossible to use traditional, bag-of-word-based topic models.
TOPIC MODELS TRANSFER LEARNING VARIATIONAL INFERENCE ZERO-SHOT LEARNING
Topic models extract meaningful groups of words from documents, allowing for a better understanding of data.
To this end, we develop the Embedded Topic Model (ETM), a generative model of documents that marries traditional topic models with word embeddings.
Unlike topic models which typically assume independently generated words, word embedding models encourage words that appear in similar contexts to be located close to each other in the embedding space.
ASPECT EXTRACTION DOMAIN ADAPTATION TOPIC MODELS WORD EMBEDDINGS