Word Embeddings

1108 papers with code • 0 benchmarks • 52 datasets

Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers.
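
As a toy sketch of that mapping (a minimal illustration in Python; the vectors below are invented, not learned from data), an embedding can be viewed as a lookup table from vocabulary items to dense real-valued vectors:

```python
import numpy as np

# Toy illustration: an embedding is a lookup table from words to
# dense real-valued vectors (4-dimensional here, with invented
# values; real embeddings are learned from large corpora).
embeddings = {
    "king":  np.array([0.50, 0.68, -0.59, 0.10]),
    "queen": np.array([0.54, 0.86, -0.38, 0.07]),
    "apple": np.array([-0.11, 0.23, 0.61, 0.75]),
}

def embed(word: str) -> np.ndarray:
    """Map a word to its vector representation."""
    return embeddings[word]

print(embed("king"))  # -> a 4-dimensional real vector
```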

Techniques for learning word embeddings include Word2Vec, GloVe, and other neural network-based approaches that train on an NLP task such as language modeling or document classification.
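
For a concrete (hedged) example, here is a minimal Word2Vec training sketch using the gensim library; the corpus and hyperparameters are purely illustrative and not taken from any paper below:

```python
from gensim.models import Word2Vec

# Tiny toy corpus: each sentence is a list of tokens. A realistic
# corpus would be orders of magnitude larger.
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "lay", "on", "the", "rug"],
    ["a", "cat", "and", "a", "dog", "played"],
]

model = Word2Vec(
    sentences=corpus,
    vector_size=50,  # dimensionality of the learned vectors
    window=3,        # context window size
    min_count=1,     # keep every token in this tiny corpus
    sg=1,            # 1 = skip-gram, 0 = CBOW
    epochs=50,
)

vec = model.wv["cat"]                # the 50-d vector for "cat"
print(model.wv.most_similar("cat"))  # nearest neighbours by cosine similarity
```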

(Image credit: Dynamic Word Embedding for Evolving Semantic Discovery)

Latest papers with no code

Machine Learning to Promote Translational Research: Predicting Patent and Clinical Trial Inclusion in Dementia Research

no code yet • 10 Jan 2024

Projected to affect 1.6 million people in the UK by 2040 and to cost £25 billion annually, dementia presents a growing challenge to society.

Estimating Text Similarity based on Semantic Concept Embeddings

no code yet • 9 Jan 2024

Due to their ease of use and high accuracy, Word2Vec (W2V) word embeddings enjoy great success in the semantic representation of words, sentences, and whole documents, as well as in semantic similarity estimation.
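
The snippet does not spell out the authors' method, but a common W2V-based baseline for similarity estimation is cosine similarity between word vectors, with sentences or documents represented by the mean of their word vectors. A minimal sketch, with invented vectors standing in for learned embeddings:

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Invented vectors standing in for learned W2V embeddings.
w2v = {
    "doctor": np.array([0.9, 0.1, 0.3]),
    "nurse":  np.array([0.8, 0.2, 0.4]),
    "banana": np.array([0.1, 0.9, 0.2]),
}

print(cosine(w2v["doctor"], w2v["nurse"]))   # high: related words
print(cosine(w2v["doctor"], w2v["banana"]))  # low: unrelated words

# A common baseline for sentence/document similarity: average the
# word vectors, then compare the averages.
def doc_vector(tokens):
    return np.mean([w2v[t] for t in tokens if t in w2v], axis=0)

print(cosine(doc_vector(["doctor", "nurse"]), doc_vector(["banana", "nurse"])))
```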

MoSECroT: Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer

no code yet • 9 Jan 2024

In this paper, we introduce MoSECroT (Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer), a novel and challenging task that is especially relevant to low-resource languages for which static word embeddings are available.

An Analysis of Embedding Layers and Similarity Scores using Siamese Neural Networks

no code yet • 31 Dec 2023

Using medical data, we analyzed the similarity scores of each embedding layer, observing differences in performance across the algorithms.

Effect of dimensionality change on the bias of word embeddings

no code yet • 28 Dec 2023

First, the bias of word embeddings varies significantly as their dimensionality changes.

On the Representation of a Multi-Level Prototype Concept in Multilingual Automatic Language Generation: From the Corpus via Word Embeddings to the Automatic Dictionary

no code yet • 26 Dec 2023

The multilingual dictionary of noun valency Portlex is considered to be the trigger for the creation of the automatic language generators Xera and Combinatoria, whose development and use are presented in this paper.

Multi-level biomedical NER through multi-granularity embeddings and enhanced labeling

no code yet • 24 Dec 2023

These results illustrate the proficiency of our proposed model in performing biomedical Named Entity Recognition.

Diffusion-EXR: Controllable Review Generation for Explainable Recommendation via Diffusion Models

no code yet • 24 Dec 2023

Denoising Diffusion Probabilistic Model (DDPM) has shown great competence in image and audio generation tasks.

Multi-Modal Cognitive Maps based on Neural Networks trained on Successor Representations

no code yet • 22 Dec 2023

Cognitive maps, as represented by the entorhinal-hippocampal complex in the brain, organize and retrieve context from memories, suggesting that large language models (LLMs) like ChatGPT could harness similar architectures to function as a high-level processing center, akin to how the hippocampus operates within the cortex hierarchy.

Disentangling continuous and discrete linguistic signals in transformer-based sentence embeddings

no code yet • 18 Dec 2023

We explore whether we can compress transformer-based sentence embeddings into a representation that separates different linguistic signals -- in particular, information relevant to subject-verb agreement and verb alternations.
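
The abstract does not show the compression method itself, but a standard way to test whether such linguistic signals are present in sentence embeddings is a linear probe. The sketch below uses random arrays as hypothetical stand-ins for real transformer embeddings and agreement labels:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical setup: X holds transformer-based sentence embeddings
# (one row per sentence) and y labels a linguistic property, e.g.
# subject-verb agreement (1 = grammatical, 0 = ungrammatical).
# Random data stands in for real embeddings and labels here.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 384))   # e.g. 384-d sentence embeddings
y = rng.integers(0, 2, size=200)

# Linear probe: if a simple classifier can read the property off the
# embeddings, that signal is linearly recoverable from them.
probe = LogisticRegression(max_iter=1000).fit(X[:150], y[:150])
print("probe accuracy:", probe.score(X[150:], y[150:]))
```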