Document Embedding

22 papers with code • 0 benchmarks • 2 datasets

Latest papers with no code

Word Embeddings Revisited: Do LLMs Offer Something New?

no code yet • 16 Feb 2024

Learning meaningful word embeddings is key to training a robust language model.

KeyGen2Vec: Learning Document Embedding via Multi-label Keyword Generation in Question-Answering

no code yet • 30 Oct 2023

Interestingly, although the absolute advantage of learning embeddings through label supervision is, in general, highly positive across evaluation datasets, KeyGen2Vec is shown to be competitive with a classifier that exploits topic label supervision in Yahoo!

The Role of Document Embedding in Research Paper Recommender Systems: To Breakdown or to Bolster Disciplinary Borders?

no code yet • 26 Sep 2023

In the extensive recommender systems literature, novelty and diversity have been identified as key properties of useful recommendations.

A Novel Method of Fuzzy Topic Modeling based on Transformer Processing

no code yet • 18 Sep 2023

Topic modeling is admittedly a convenient way to monitor market trends.

Shuffle & Divide: Contrastive Learning for Long Text

no code yet • 19 Apr 2023

We propose a self-supervised learning method for long text documents based on contrastive learning.

Caching Historical Embeddings in Conversational Search

no code yet • 25 Nov 2022

The high cache hit rates we achieve significantly improve the responsiveness of conversational systems while also reducing the number of queries handled by the search back-end.

Assessing the trade-off between prediction accuracy and interpretability for topic modeling on energetic materials corpora

no code yet • 1 Jun 2022

Alongside our accuracy results, we introduce local interpretable model-agnostic explanations (LIME) for each prediction, both to provide a localized understanding of that prediction and to validate classifier decisions with our team of energetics experts.

Academic Resource Text Level Multi-label Classification based on Attention

no code yet • 21 Mar 2022

We propose an attention-based hierarchical multi-label classification algorithm for academic texts (AHMCA). By integrating features such as text, keywords, and hierarchical structure, it classifies academic documents into the most relevant categories.

Sentiment Analysis on Brazilian Portuguese User Reviews

no code yet • 10 Dec 2021

Sentiment analysis is one of the most classical and widely studied natural language processing tasks.

MDERank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction

no code yet • ACL ARR November 2021

In this work, we propose a novel unsupervised embedding-based KPE approach, Masked Document Embedding Rank (MDERank), to address this problem by leveraging a mask strategy and ranking candidates by the similarity between embeddings of the source document and the masked document.
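
The mask-and-rank idea in the MDERank excerpt above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a real document encoder, the names `embed`, `cosine`, and `rank_by_masking` are hypothetical, and the ordering (a candidate whose masking lowers similarity the most is ranked first) is an assumption about how the similarity is used.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a document encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9\[\]]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_masking(document, candidates, mask_token="[MASK]"):
    """Rank candidate phrases by how much masking them changes the document.

    Assumption: lower similarity between the original and the masked
    document means the candidate carried more of the document's content.
    """
    doc_vec = embed(document)
    scored = []
    for cand in candidates:
        # Replace each word of the candidate with one mask token.
        replacement = " ".join([mask_token] * len(cand.split()))
        masked = re.sub(re.escape(cand), replacement, document, flags=re.IGNORECASE)
        scored.append((cosine(doc_vec, embed(masked)), cand))
    scored.sort(key=lambda pair: pair[0])  # least similar first
    return [cand for _, cand in scored]


doc = (
    "Document embedding maps a document to a vector. "
    "Document embedding supports retrieval."
)
print(rank_by_masking(doc, ["retrieval", "document embedding"]))
```

In practice the toy encoder would be replaced by a pretrained language model, and candidate phrases would come from a separate extraction step; neither detail is specified in the excerpt.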