Document Embedding
25 papers with code • 0 benchmarks • 2 datasets
Most implemented papers
Document Embedding with Paragraph Vectors
Paragraph Vectors was recently proposed as an unsupervised method for learning distributed representations of pieces of text.
An Empirical Evaluation of doc2vec with Practical Insights into Document Embedding Generation
Recently, Le and Mikolov (2014) proposed doc2vec as an extension to word2vec (Mikolov et al., 2013a) to learn document-level embeddings.
BERTopic: Neural topic modeling with a class-based TF-IDF procedure
BERTopic generates coherent topics and remains competitive across a variety of benchmarks involving classical models and those that follow the more recent clustering approach of topic modeling.
Sentiment Classification Using Document Embeddings Trained with Cosine Similarity
In document-level sentiment classification, each document must be mapped to a fixed-length vector.
Neural Document Embeddings for Intensive Care Patient Mortality Prediction
We present an automatic mortality prediction scheme based on the unstructured textual content of clinical notes.
hyperdoc2vec: Distributed Representations of Hypertext Documents
Hypertext documents, such as web pages and academic papers, are of great importance in delivering information in our daily life.
Word Mover's Embedding: From Word2Vec to Document Embedding
While the celebrated Word2Vec technique yields semantically rich representations for individual words, there has been relatively little success in extending it to generate unsupervised sentence or document embeddings.
Learning Outside the Box: Discourse-level Features Improve Metaphor Identification
Most current approaches to metaphor identification use restricted linguistic contexts, e.g. by considering only a verb's arguments or the sentence containing a phrase.
Crosslingual Document Embedding as Reduced-Rank Ridge Regression
Although not trained for embedding sentences and words, the proposed method also achieves competitive performance on crosslingual sentence and word retrieval tasks.
Unsupervised Learning of Discourse-Aware Text Representation for Essay Scoring
Existing document embedding approaches mainly focus on capturing sequences of words in documents.
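Several of the entries above rest on the same basic requirement: a document of any length must be mapped to a fixed-length vector, and vectors are then compared, often by cosine similarity. The sketch below illustrates only that requirement, using a hashed bag of words. It is not the method of any listed paper (doc2vec, BERTopic, and the others learn their representations), and the names `embed`, `cosine`, and the dimension of 64 are assumptions made for illustration.

```python
import math
import zlib


def embed(text, dim=64):
    """Map a document of any length to a fixed-length vector.

    Hypothetical helper: a hashed bag of words, not a learned embedding.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        index = zlib.crc32(token.encode("utf-8")) % dim
        vec[index] += 1.0
    return vec


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    short_doc = embed("a great film")
    long_doc = embed("a great film with a great cast and a weak ending")
    # Both vectors have the same length regardless of document length.
    print(len(short_doc), len(long_doc))
    print(cosine(short_doc, long_doc))
```

A bag of words like this ignores word order entirely, which is the kind of limitation that learned document embeddings aim to address.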