Word Embeddings

1106 papers with code • 0 benchmarks • 52 datasets

Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers.

Techniques for learning word embeddings include Word2Vec, GloVe, and other neural network-based approaches that train on an NLP task such as language modeling or document classification.

(Image credit: Dynamic Word Embedding for Evolving Semantic Discovery)
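
The mapping described above can be sketched with toy vectors. The tiny hand-made 3-dimensional "embedding table" below is invented purely for illustration (real models such as Word2Vec or GloVe learn vectors with hundreds of dimensions from large corpora); the cosine-similarity check shows how semantically related words end up closer in the vector space.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
embeddings = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Related words ("king", "queen") score higher than unrelated ones.
print(cosine(embeddings["king"], embeddings["queen"]) >
      cosine(embeddings["king"], embeddings["apple"]))  # True
```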

Decoupled Textual Embeddings for Customized Image Generation

PrototypeNx/DETEX 19 Dec 2023

To decouple irrelevant attributes (i.e., background and pose) from the subject embedding, we further present several attribute mappers that encode each image as several image-specific subject-unrelated embeddings.

Def2Vec: Extensible Word Embeddings from Dictionary Definitions

vincenzo-scotti/def_2_vec ICNLSP 2023

Def2Vec introduces a novel paradigm for word embeddings, leveraging dictionary definitions to learn semantic representations.

16 Dec 2023
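
One simple way to derive a word vector from a dictionary definition is to average the vectors of the words in the definition. This is an assumed simplification for illustration, not necessarily Def2Vec's actual algorithm, and the base vectors are invented toy values.

```python
# Toy pre-trained vectors; real tables would come from a trained model.
base_vectors = {"pet": [0.8, 0.2], "animal": [0.6, 0.4], "domestic": [0.9, 0.1]}

def embed_from_definition(definition_words, table):
    """Average the known vectors of the words appearing in a definition."""
    vecs = [table[w] for w in definition_words if w in table]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

vec = embed_from_definition(["domestic", "pet", "animal"], base_vectors)
print([round(x, 2) for x in vec])  # [0.77, 0.23]
```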

Robust Concept Erasure via Kernelized Rate-Distortion Maximization

brcsomnath/kram NeurIPS 2023

Distributed representations provide a vector space that captures meaningful relationships between data instances.

30 Nov 2023

Quantifying the redundancy between prosody and text

lu-wo/quantifying-redundancy 28 Nov 2023

Using a large spoken corpus of English audiobooks, we extract prosodic features aligned to individual words and test how well they can be predicted from LLM embeddings, compared to non-contextual word embeddings.

OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining

cisnlp/ofa 15 Nov 2023

Instead of pretraining multilingual language models from scratch, a more efficient method is to adapt existing pretrained language models (PLMs) to new languages via vocabulary extension and continued pretraining.

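
A common heuristic for this kind of vocabulary extension (an assumption for illustration, not necessarily OFA's exact method) is to initialize a new token's embedding from the embeddings of the pieces the old tokenizer splits it into, e.g. by averaging them. The table below uses invented toy values.

```python
# Toy source-model embeddings for two subword pieces (values are invented).
old_embeddings = {
    "un":     [0.2, 0.4],
    "##seen": [0.6, 0.0],
}

def init_new_token(pieces, table):
    """Initialize a new token's vector as the mean of its pieces' vectors."""
    dim = len(next(iter(table.values())))
    summed = [0.0] * dim
    for p in pieces:
        for i, x in enumerate(table[p]):
            summed[i] += x
    return [s / len(pieces) for s in summed]

print([round(x, 2) for x in init_new_token(["un", "##seen"], old_embeddings)])
# [0.4, 0.2]
```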

Solving ARC visual analogies with neural embeddings and vector arithmetic: A generalized method

foger3/arc_deeplearning 14 Nov 2023

This project focuses on visual analogical reasoning, applying a generalized version of the mechanism originally used to solve verbal analogies to the visual realm.

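
The verbal-analogy mechanism referenced above ("a is to b as c is to ?") is classically solved in embedding space by vector arithmetic: find the word closest to b - a + c. A minimal sketch with invented toy vectors, not taken from the paper's models:

```python
# Toy word vectors chosen so the analogy works out (invented values).
vocab = {
    "man":   [1.0, 0.0, 0.2],
    "woman": [1.0, 1.0, 0.2],
    "king":  [1.0, 0.0, 0.9],
    "queen": [1.0, 1.0, 0.9],
}

def solve_analogy(a, b, c, vocab):
    """Return the word whose vector is nearest to b - a + c (inputs excluded)."""
    target = [vb - va + vc for va, vb, vc in zip(vocab[a], vocab[b], vocab[c])]
    def sq_dist(v):
        return sum((x - y) ** 2 for x, y in zip(v, target))
    candidates = {w: v for w, v in vocab.items() if w not in (a, b, c)}
    return min(candidates, key=lambda w: sq_dist(candidates[w]))

print(solve_analogy("man", "woman", "king", vocab))  # queen
```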

How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure

clay-lab/structural-alternations 8 Nov 2023

We find that LLMs perform well in generalizing the distribution of a novel noun argument between related contexts that were seen during pre-training (e.g., the active object and passive subject of the verb spray), succeeding by making use of the semantically organized structure of the word-embedding space.

An Embedded Diachronic Sense Change Model with a Case Study from Ancient Greek

schyanzafar/edisc 1 Nov 2023

These models represent the senses of a given target word such as "kosmos" (meaning decoration, order, or world) as distributions over context words, and sense prevalence as a distribution over senses.

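
The generative view described above can be sketched as a mixture: each sense is a distribution over context words, prevalence is a distribution over senses, and the probability of observing a context word sums over senses. All probabilities below are invented toy numbers, not estimates from the paper.

```python
# Toy sense model for a target word: sense -> distribution over context words.
senses = {
    "decoration": {"beauty": 0.7, "order": 0.3},
    "world":      {"earth": 0.8, "order": 0.2},
}
# Prevalence: distribution over senses.
prevalence = {"decoration": 0.4, "world": 0.6}

def p_context(word):
    """Mixture probability of a context word under the sense model."""
    return sum(prevalence[s] * senses[s].get(word, 0.0) for s in senses)

print(round(p_context("order"), 2))  # 0.24
```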

ProMap: Effective Bilingual Lexicon Induction via Language Model Prompting

4mekki4/promap 28 Oct 2023

We also demonstrate the effectiveness of ProMap in re-ranking results from other BLI methods such as with aligned static word embeddings.

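
Bilingual lexicon induction with aligned static embeddings is often done by nearest-neighbour retrieval across the shared space, which is the kind of candidate list such prompting-based methods can re-rank. A hedged sketch with invented toy vectors:

```python
import math

# Toy source- and target-language vectors, assumed already aligned
# into one shared space (all values are invented for illustration).
src = {"cat": [0.9, 0.1], "dog": [0.1, 0.9]}
tgt = {"chat": [0.88, 0.12], "chien": [0.12, 0.88]}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def translate(word):
    """Nearest target-language neighbour of a source word by cosine."""
    return max(tgt, key=lambda t: cosine(src[word], tgt[t]))

print(translate("cat"))  # chat
```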

MLFMF: Data Sets for Machine Learning for Mathematical Formalization

ul-fmf/mlfmf-data NeurIPS 2023

The collection includes the largest Lean 4 library Mathlib, and some of the largest Agda libraries: the standard library, the library of univalent mathematics Agda-unimath, and the TypeTopology library.

24 Oct 2023