Word Embeddings

1096 papers with code • 0 benchmarks • 52 datasets

Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers.

Techniques for learning word embeddings include Word2Vec, GloVe, and neural network-based approaches that train on an auxiliary NLP task such as language modeling or document classification.
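
As a minimal, hedged illustration of the idea, here is a sketch using the gensim library (assuming gensim >= 4.0; the toy corpus and hyperparameters are illustrative only):

```python
# Minimal sketch: learning word vectors with skip-gram Word2Vec via gensim.
from gensim.models import Word2Vec

# Toy corpus; real embeddings are trained on millions of tokens.
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["words", "are", "mapped", "to", "vectors", "of", "real", "numbers"],
]

model = Word2Vec(
    sentences,
    vector_size=50,  # dimensionality of the real-valued word vectors
    window=2,        # context window size
    min_count=1,     # keep every word in this toy corpus
    sg=1,            # skip-gram; sg=0 would use CBOW
)

vec = model.wv["king"]  # a 50-dimensional numpy array
print(model.wv.most_similar("king", topn=2))  # nearest neighbours by cosine similarity
```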

(Image credit: Dynamic Word Embedding for Evolving Semantic Discovery)

Def2Vec: Extensible Word Embeddings from Dictionary Definitions

vincenzo-scotti/def_2_vec ICNLSP 2023

Def2Vec introduces a novel paradigm for word embeddings, leveraging dictionary definitions to learn semantic representations.

0 • 16 Dec 2023
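
A hedged sketch of the general idea of definition-based embeddings; this is a generic recipe (averaging the vectors of definition words), not necessarily Def2Vec's exact formulation:

```python
# Generic recipe (hedged; not necessarily Def2Vec's exact model): embed a word
# as the average of the embeddings of the words in its dictionary definition.
import numpy as np

def definition_embedding(definition_tokens, word_vectors):
    """word_vectors: dict mapping token -> fixed-dimension np.ndarray."""
    known = [word_vectors[t] for t in definition_tokens if t in word_vectors]
    if not known:
        raise ValueError("no definition token has a known embedding")
    return np.mean(known, axis=0)
```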

Robust Concept Erasure via Kernelized Rate-Distortion Maximization

brcsomnath/kram NeurIPS 2023

Distributed representations provide a vector space that captures meaningful relationships between data instances.

0 • 30 Nov 2023

Quantifying the redundancy between prosody and text

lu-wo/quantifying-redundancy 28 Nov 2023

Using a large spoken corpus of English audiobooks, we extract prosodic features aligned to individual words and test how well they can be predicted from LLM embeddings, compared to non-contextual word embeddings.

7 • 28 Nov 2023
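
The setup above is a probing task: fit a regressor from embeddings to a prosodic feature and measure held-out fit. An illustrative sketch with hypothetical random data (not the paper's code or corpus):

```python
# Illustrative probe (not the paper's code): how well can a word-level prosodic
# feature be predicted from word embeddings? Ridge regression, held-out R^2.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_words, dim = 1000, 300

# Hypothetical stand-ins: in the paper, X comes from LLM or static embeddings
# aligned to words in audiobooks, and y is an extracted prosodic feature.
X = rng.normal(size=(n_words, dim))
y = 0.5 * X[:, 0] + rng.normal(size=n_words)   # feature partially redundant with X

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
probe = Ridge(alpha=1.0).fit(X_tr, y_tr)
print(f"held-out R^2: {probe.score(X_te, y_te):.2f}")  # higher = more redundancy
```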

OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining

cisnlp/ofa 15 Nov 2023

Instead of pretraining multilingual language models from scratch, a more efficient method is to adapt existing pretrained language models (PLMs) to new languages via vocabulary extension and continued pretraining.

11 • 15 Nov 2023
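
A hedged sketch of the vocabulary-extension step: one common recipe initializes each unseen subword's embedding as a similarity-weighted combination of existing embeddings. The similarity scores below are hypothetical placeholders, and OFA's actual method additionally factorizes the embedding matrix; this shows only the general idea:

```python
# Hedged sketch: initialize embeddings of unseen subwords as a softmax-weighted
# average of their top-k most similar existing subword embeddings.
import numpy as np

rng = np.random.default_rng(0)
old_emb = rng.normal(size=(5000, 128))   # pretrained subword embeddings (toy sizes)
sim = rng.normal(size=(200, 5000))       # hypothetical new-to-old similarity scores

k = 10
topk = np.argpartition(-sim, k, axis=1)[:, :k]      # indices of k most similar old subwords
w = np.take_along_axis(sim, topk, axis=1)           # their similarity scores
w = np.exp(w - w.max(axis=1, keepdims=True))        # softmax over the k scores
w /= w.sum(axis=1, keepdims=True)

new_emb = np.einsum("nk,nkd->nd", w, old_emb[topk]) # (200, 128) initializations
print(new_emb.shape)
```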

Solving ARC visual analogies with neural embeddings and vector arithmetic: A generalized method

foger3/arc_deeplearning 14 Nov 2023

This project focuses on visual analogical reasoning, applying to the visual realm the generalized vector-arithmetic mechanism initially used to solve verbal analogies.

5 • 14 Nov 2023
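
The vector-arithmetic mechanism referenced in the title, sketched for the generic case; function and variable names are illustrative:

```python
# Sketch of the verbal-analogy mechanism ("a is to b as c is to ?"):
# answer = the item whose embedding is nearest to c + (b - a).
import numpy as np

def solve_analogy(a, b, c, vocab_emb, vocab_items):
    """a, b, c: 1-D embeddings; vocab_emb: (n, d) matrix; vocab_items: n labels."""
    query = c + (b - a)
    sims = vocab_emb @ query / (
        np.linalg.norm(vocab_emb, axis=1) * np.linalg.norm(query) + 1e-9
    )
    return vocab_items[int(np.argmax(sims))]
```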

How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure

clay-lab/structural-alternations 8 Nov 2023

We find that LLMs perform well in generalizing the distribution of a novel noun argument between related contexts seen during pre-training (e.g., the active object and passive subject of the verb spray), succeeding by making use of the semantically organized structure of the word-embedding space.

0 • 08 Nov 2023

An Embedded Diachronic Sense Change Model with a Case Study from Ancient Greek

schyanzafar/edisc 1 Nov 2023

These models represent the senses of a given target word such as "kosmos" (meaning decoration, order or world) as distributions over context words, and sense prevalence as a distribution over senses.

2 • 01 Nov 2023
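
A small numerical sketch of the generative picture described above, with hypothetical probabilities: each sense is a distribution over context words, sense prevalence is a distribution over senses, and the two mix to give a marginal distribution over context words:

```python
# Hypothetical numbers for the target word "kosmos": three senses, each a
# distribution over context words; prevalence is a distribution over senses.
import numpy as np

context_vocab = ["decoration", "order", "world", "stars"]
sense_word_probs = np.array([
    [0.70, 0.20, 0.05, 0.05],   # sense "decoration"
    [0.10, 0.70, 0.10, 0.10],   # sense "order"
    [0.05, 0.10, 0.55, 0.30],   # sense "world"
])
prevalence = np.array([0.2, 0.3, 0.5])  # sense prevalence at one time slice

# Marginal probability of seeing each context word near "kosmos".
p_word = prevalence @ sense_word_probs
print(dict(zip(context_vocab, np.round(p_word, 3))))
```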

ProMap: Effective Bilingual Lexicon Induction via Language Model Prompting

4mekki4/promap 28 Oct 2023

We also demonstrate the effectiveness of ProMap in re-ranking results from other BLI methods, such as those based on aligned static word embeddings.

0 • 28 Oct 2023
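
For context, a hedged sketch of the kind of static-embedding BLI baseline whose candidate lists ProMap can re-rank: nearest-neighbour retrieval in two pre-aligned embedding spaces (the names and the alignment step are assumptions, not taken from the paper):

```python
# Hedged sketch: nearest-neighbour bilingual lexicon induction in two embedding
# spaces assumed to be already aligned (e.g., via a Procrustes mapping).
import numpy as np

def bli_candidates(src_vec, tgt_emb, tgt_words, topn=5):
    """Return the topn target-language candidates for one source word vector."""
    sims = tgt_emb @ src_vec / (
        np.linalg.norm(tgt_emb, axis=1) * np.linalg.norm(src_vec) + 1e-9
    )
    order = np.argsort(-sims)[:topn]
    return [(tgt_words[i], float(sims[i])) for i in order]
```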

GARI: Graph Attention for Relative Isomorphism of Arabic Word Embeddings

asif6827/gari 19 Oct 2023

Bilingual Lexicon Induction (BLI) is a core challenge in NLP; it relies on the relative isomorphism of individual embedding spaces.

1 • 19 Oct 2023
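
Relative isomorphism is often approximated by comparing intra-space similarity structure over translation pairs. A hedged sketch of one such proxy metric (not GARI's graph-attention objective):

```python
# Proxy for relative isomorphism of two embedding spaces: correlate their
# intra-space cosine-similarity matrices over n translation-equivalent words.
import numpy as np

def relational_similarity(emb_a, emb_b):
    # emb_a, emb_b: (n, d_a) and (n, d_b) arrays, row i in each space
    # corresponding to the same translation pair.
    def cosine_matrix(E):
        E = E / (np.linalg.norm(E, axis=1, keepdims=True) + 1e-9)
        return E @ E.T
    iu = np.triu_indices(emb_a.shape[0], k=1)  # off-diagonal upper triangle
    return np.corrcoef(cosine_matrix(emb_a)[iu], cosine_matrix(emb_b)[iu])[0, 1]
```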

ChatGPT-guided Semantics for Zero-shot Learning

fhshubho/cgs-zsl 18 Oct 2023

Then, we enrich word vectors by combining the word embeddings from class names and descriptions generated by ChatGPT.

1 • 18 Oct 2023
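
A hedged sketch of the enrichment step described above: fuse the class-name embedding with the embedding of a ChatGPT-generated class description (the convex combination and names here are assumptions, not the paper's exact fusion):

```python
# Hedged sketch: combine a class-name embedding with the embedding of a
# ChatGPT-generated class description, then L2-normalize the result.
import numpy as np

def enrich_class_vector(name_vec, desc_vec, alpha=0.5):
    """Fuse two 1-D embeddings of the same dimension via a convex combination."""
    v = alpha * name_vec + (1.0 - alpha) * desc_vec
    return v / (np.linalg.norm(v) + 1e-9)
```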