Document Embedding
22 papers with code • 0 benchmarks • 2 datasets
Most implemented papers
STRASS: A Light and Effective Method for Extractive Summarization Based on Sentence Embeddings
Our method creates an extractive summary by selecting the sentences with the closest embeddings to the document embedding.
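The selection rule described above can be sketched in a few lines. This is a hedged illustration, not the paper's implementation: the embeddings here are toy 2-D vectors, and the document embedding is taken as the mean of the sentence embeddings, which is one common choice rather than STRASS's learned transformation.

```python
import numpy as np

def extractive_summary(sentence_embs, doc_emb, k=2):
    """Select the k sentences whose embeddings are closest
    (by cosine similarity) to the document embedding."""
    sims = sentence_embs @ doc_emb / (
        np.linalg.norm(sentence_embs, axis=1) * np.linalg.norm(doc_emb)
    )
    # Take the k most similar sentences, returned in document order.
    return sorted(np.argsort(sims)[-k:])

# Toy example: 4 sentence embeddings; doc embedding as their mean.
sents = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]])
doc = sents.mean(axis=0)
summary_idx = extractive_summary(sents, doc, k=2)
```

With real sentence encoders the same ranking step applies unchanged; only the source of the embeddings differs.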
Can x2vec Save Lives? Integrating Graph and Language Embeddings for Automatic Mental Health Classification
Visualizing graph embeddings annotated with predictions of potentially suicidal individuals shows that the integrated model can identify such individuals even when they are positioned far from the support group.
Every Document Owns Its Structure: Inductive Text Classification via Graph Neural Networks
We first build an individual graph for each document and then use a GNN to learn fine-grained word representations based on their local structures, which can also effectively produce embeddings for words unseen in the new document.
SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative Sampling
Canonical automatic summary evaluation metrics, such as ROUGE, focus on lexical similarity, which cannot capture semantics or linguistic quality well, and they require a reference summary that is costly to obtain.
Unsupervised Document Embedding via Contrastive Augmentation
We present a contrastive learning approach with data augmentation techniques to learn document representations in an unsupervised manner.
Multifaceted Domain-Specific Document Embeddings
Current document embeddings require large training corpora but fail to learn high-quality representations when confronted with a small number of domain-specific documents and rare terms.
Unsupervised Keyphrase Extraction by Jointly Modeling Local and Global Context
In terms of the local view, we first build a graph structure based on the document where phrases are regarded as vertices and the edges are similarities between vertices.
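The local-view construction above (phrases as vertices, pairwise similarities as edge weights) can be sketched as follows. This is a minimal illustration with toy phrase embeddings; the similarity threshold and the embedding source are assumptions, not details taken from the paper.

```python
import numpy as np

def build_phrase_graph(phrase_embs, threshold=0.5):
    """Phrases are vertices; edge weights are cosine similarities
    between their embeddings. Edges below `threshold` are dropped."""
    normed = phrase_embs / np.linalg.norm(phrase_embs, axis=1, keepdims=True)
    sim = normed @ normed.T          # pairwise cosine similarities
    np.fill_diagonal(sim, 0.0)       # no self-loops
    sim[sim < threshold] = 0.0       # sparsify weak edges
    return sim                       # symmetric adjacency matrix

# Toy example: three phrase embeddings, two of them near-parallel.
embs = np.array([[1.0, 0.0], [0.9, 0.4], [0.0, 1.0]])
adj = build_phrase_graph(embs, threshold=0.5)
```

Graph-ranking algorithms such as PageRank can then be run on `adj` to score candidate phrases.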
MDERank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction
In this work, we propose Masked Document Embedding Rank (MDERank), a novel unsupervised embedding-based KPE approach that addresses this problem by masking each candidate phrase and ranking candidates by the similarity between the embeddings of the source document and the masked document.
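The mask-and-compare idea can be sketched as below. MDERank uses a BERT-style encoder; here a bag-of-words count vector stands in as a toy document embedder so the example is self-contained. A candidate whose removal shifts the document embedding the most (lowest similarity between original and masked embeddings) ranks highest.

```python
import numpy as np
from collections import Counter

def embed(text, vocab):
    # Toy stand-in for a BERT document encoder: bag-of-words counts.
    counts = Counter(text.split())
    return np.array([counts[w] for w in vocab], dtype=float)

def mderank(doc, candidates):
    """Rank candidate keyphrases: mask each candidate out of the document
    and score it by the cosine similarity between the original and masked
    document embeddings (lower similarity = more important phrase)."""
    vocab = sorted(set(doc.split()))
    d = embed(doc, vocab)
    scores = {}
    for c in candidates:
        masked = doc.replace(c, "[MASK]")
        m = embed(masked, vocab)
        scores[c] = (d @ m) / (np.linalg.norm(d) * np.linalg.norm(m) + 1e-9)
    return sorted(candidates, key=lambda c: scores[c])  # most important first

doc = "graph embeddings help retrieval graph embeddings help ranking"
ranking = mderank(doc, ["graph embeddings", "ranking"])
```

Masking the frequent phrase "graph embeddings" perturbs the embedding more than masking "ranking", so it ranks first.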
CODER: An efficient framework for improving retrieval through COntextual Document Embedding Reranking
Contrastive learning has been the dominant approach to training dense retrieval models.
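The contrastive objective that dominates dense retrieval training is typically a variant of InfoNCE: pull a query embedding toward its relevant document and push it away from negatives. The sketch below is a generic NumPy illustration of that loss, not CODER's specific reranking formulation; the temperature value is an assumption.

```python
import numpy as np

def info_nce(query, pos, negs, tau=0.1):
    """InfoNCE loss for one query: low when the query is closer to the
    positive document embedding than to every negative."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([cos(query, pos)] + [cos(query, n) for n in negs]) / tau
    logits -= logits.max()                      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                    # positive sits at index 0

q = np.array([1.0, 0.0])
loss_good = info_nce(q, np.array([0.9, 0.1]), [np.array([0.0, 1.0])])
loss_bad = info_nce(q, np.array([0.0, 1.0]), [np.array([0.9, 0.1])])
# A well-aligned positive yields a much lower loss than a misaligned one.
```

In practice the loss is computed in batches with in-batch negatives and learned encoders; the ranking behavior is the same.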
Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings
Learning scientific document representations can be substantially improved through contrastive learning objectives, where the challenge lies in creating positive and negative training samples that encode the desired similarity semantics.