Word Embeddings

1106 papers with code • 0 benchmarks • 52 datasets

Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers.

Techniques for learning word embeddings include Word2Vec, GloVe, and other neural network-based approaches that train on an NLP task such as language modeling or document classification.

(Image credit: Dynamic Word Embedding for Evolving Semantic Discovery)
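
The mapping described above can be sketched with toy vectors. The tiny hand-made 3-dimensional "embedding table" below is invented purely for illustration (real models such as Word2Vec or GloVe learn vectors with hundreds of dimensions from large corpora); the cosine-similarity check shows how semantically related words end up closer in the vector space.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
embeddings = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Related words ("king", "queen") score higher than unrelated ones.
print(cosine(embeddings["king"], embeddings["queen"]) >
      cosine(embeddings["king"], embeddings["apple"]))  # True
```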

Decoupled Textual Embeddings for Customized Image Generation

PrototypeNx/DETEX 19 Dec 2023

To decouple irrelevant attributes (i.e., background and pose) from the subject embedding, we further present several attribute mappers that encode each image as several image-specific subject-unrelated embeddings.

Def2Vec: Extensible Word Embeddings from Dictionary Definitions

vincenzo-scotti/def_2_vec ICNLSP 2023

Def2Vec introduces a novel paradigm for word embeddings, leveraging dictionary definitions to learn semantic representations.

16 Dec 2023
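
One simple way to derive a word vector from a dictionary definition is to average the vectors of the words in the definition. This is an assumed simplification for illustration, not necessarily Def2Vec's actual algorithm, and the base vectors are invented toy values.

```python
# Toy pre-trained vectors; real tables would come from a trained model.
base_vectors = {"pet": [0.8, 0.2], "animal": [0.6, 0.4], "domestic": [0.9, 0.1]}

def embed_from_definition(definition_words, table):
    """Average the known vectors of the words appearing in a definition."""
    vecs = [table[w] for w in definition_words if w in table]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

vec = embed_from_definition(["domestic", "pet", "animal"], base_vectors)
print([round(x, 2) for x in vec])  # [0.77, 0.23]
```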

Robust Concept Erasure via Kernelized Rate-Distortion Maximization

brcsomnath/kram NeurIPS 2023

Distributed representations provide a vector space that captures meaningful relationships between data instances.

30 Nov 2023

Quantifying the redundancy between prosody and text

lu-wo/quantifying-redundancy 28 Nov 2023

Using a large spoken corpus of English audiobooks, we extract prosodic features aligned to individual words and test how well they can be predicted from LLM embeddings, compared to non-contextual word embeddings.

OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining

cisnlp/ofa 15 Nov 2023

Instead of pretraining multilingual language models from scratch, a more efficient method is to adapt existing pretrained language models (PLMs) to new languages via vocabulary extension and continued pretraining.

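
A common heuristic for this kind of vocabulary extension (an assumption for illustration, not necessarily OFA's exact method) is to initialize a new token's embedding from the embeddings of the pieces the old tokenizer splits it into, e.g. by averaging them. The table below uses invented toy values.

```python
# Toy source-model embeddings for two subword pieces (values are invented).
old_embeddings = {
    "un":     [0.2, 0.4],
    "##seen": [0.6, 0.0],
}

def init_new_token(pieces, table):
    """Initialize a new token's vector as the mean of its pieces' vectors."""
    dim = len(next(iter(table.values())))
    summed = [0.0] * dim
    for p in pieces:
        for i, x in enumerate(table[p]):
            summed[i] += x
    return [s / len(pieces) for s in summed]

print([round(x, 2) for x in init_new_token(["un", "##seen"], old_embeddings)])
# [0.4, 0.2]
```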

Solving ARC visual analogies with neural embeddings and vector arithmetic: A generalized method

foger3/arc_deeplearning 14 Nov 2023

This project focuses on visual analogical reasoning, applying a generalized version of the mechanism originally used to solve verbal analogies to the visual realm.

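
The verbal-analogy mechanism referenced above ("a is to b as c is to ?") is classically solved in embedding space by vector arithmetic: find the word closest to b - a + c. A minimal sketch with invented toy vectors, not taken from the paper's models:

```python
# Toy word vectors chosen so the analogy works out (invented values).
vocab = {
    "man":   [1.0, 0.0, 0.2],
    "woman": [1.0, 1.0, 0.2],
    "king":  [1.0, 0.0, 0.9],
    "queen": [1.0, 1.0, 0.9],
}

def solve_analogy(a, b, c, vocab):
    """Return the word whose vector is nearest to b - a + c (inputs excluded)."""
    target = [vb - va + vc for va, vb, vc in zip(vocab[a], vocab[b], vocab[c])]
    def sq_dist(v):
        return sum((x - y) ** 2 for x, y in zip(v, target))
    candidates = {w: v for w, v in vocab.items() if w not in (a, b, c)}
    return min(candidates, key=lambda w: sq_dist(candidates[w]))

print(solve_analogy("man", "woman", "king", vocab))  # queen
```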

How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure

clay-lab/structural-alternations 8 Nov 2023

We find that LLMs perform well in generalizing the distribution of a novel noun argument between related contexts that were seen during pre-training (e.g., the active object and passive subject of the verb spray), succeeding by making use of the semantically organized structure of the word-embedding space.

An Embedded Diachronic Sense Change Model with a Case Study from Ancient Greek

schyanzafar/edisc 1 Nov 2023

These models represent the senses of a given target word such as "kosmos" (meaning decoration, order, or world) as distributions over context words, and sense prevalence as a distribution over senses.

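
The generative view described above can be sketched as a mixture: each sense is a distribution over context words, prevalence is a distribution over senses, and the probability of observing a context word sums over senses. All probabilities below are invented toy numbers, not estimates from the paper.

```python
# Toy sense model for a target word: sense -> distribution over context words.
senses = {
    "decoration": {"beauty": 0.7, "order": 0.3},
    "world":      {"earth": 0.8, "order": 0.2},
}
# Prevalence: distribution over senses.
prevalence = {"decoration": 0.4, "world": 0.6}

def p_context(word):
    """Mixture probability of a context word under the sense model."""
    return sum(prevalence[s] * senses[s].get(word, 0.0) for s in senses)

print(round(p_context("order"), 2))  # 0.24
```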

ProMap: Effective Bilingual Lexicon Induction via Language Model Prompting

4mekki4/promap 28 Oct 2023

We also demonstrate the effectiveness of ProMap in re-ranking results from other BLI methods such as with aligned static word embeddings.

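
Bilingual lexicon induction with aligned static embeddings is often done by nearest-neighbour retrieval across the shared space, which is the kind of candidate list such prompting-based methods can re-rank. A hedged sketch with invented toy vectors:

```python
import math

# Toy source- and target-language vectors, assumed already aligned
# into one shared space (all values are invented for illustration).
src = {"cat": [0.9, 0.1], "dog": [0.1, 0.9]}
tgt = {"chat": [0.88, 0.12], "chien": [0.12, 0.88]}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def translate(word):
    """Nearest target-language neighbour of a source word by cosine."""
    return max(tgt, key=lambda t: cosine(src[word], tgt[t]))

print(translate("cat"))  # chat
```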

MLFMF: Data Sets for Machine Learning for Mathematical Formalization

ul-fmf/mlfmf-data NeurIPS 2023

The collection includes the largest Lean 4 library Mathlib, and some of the largest Agda libraries: the standard library, the library of univalent mathematics Agda-unimath, and the TypeTopology library.

24 Oct 2023