Search Results for author: Zhenisbek Assylbekov

Found 18 papers, 10 papers with code

Gradient Descent Fails to Learn High-frequency Functions and Modular Arithmetic

no code implementations · 19 Oct 2023 · Rustem Takhanov, Maxat Tezekbayev, Artur Pak, Arman Bolatov, Zhenisbek Assylbekov

In the novel framework, the hardness of a class is usually quantified by the variance of the gradient with respect to a random choice of a target function.
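
As an aside on the hardness measure mentioned in the excerpt: the variance of the gradient over a random choice of target function can be estimated by simple Monte Carlo. The toy model, target family, and loss below are illustrative choices of mine, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (my own choice, not the paper's): model f_theta(x) = sin(theta * x) with
# squared loss against a random target f_k(x) = sin(k * x) of frequency k.
xs = rng.uniform(-np.pi, np.pi, size=2000)   # Monte Carlo sample for the loss
theta = 1.0                                  # fixed evaluation point in parameter space

def loss_grad(k):
    """Gradient of the empirical squared loss w.r.t. theta for target frequency k."""
    residual = np.sin(theta * xs) - np.sin(k * xs)
    # d/dtheta 0.5 * residual^2 = residual * x * cos(theta * x)
    return np.mean(residual * xs * np.cos(theta * xs))

# Hardness proxy: variance of the gradient over a random target.  For high-frequency
# targets the gradient barely depends on which target was drawn, so the variance
# collapses and the gradient carries almost no information about the target.
for low, high in [(1, 5), (50, 200)]:
    grads = np.array([loss_grad(k) for k in rng.integers(low, high, size=100)])
    print(f"frequencies in [{low}, {high}): gradient variance = {grads.var():.3e}")
```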

Intractability of Learning the Discrete Logarithm with Gradient-Based Methods

1 code implementation · 2 Oct 2023 · Rustem Takhanov, Maxat Tezekbayev, Artur Pak, Arman Bolatov, Zhibek Kadyrsizova, Zhenisbek Assylbekov

The discrete logarithm problem is a fundamental challenge in number theory with significant implications for cryptographic protocols.
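
For reference, the discrete logarithm problem asks to recover x from p, g, and h = g^x mod p. The sketch below only states the problem with toy parameters of my own choosing; it says nothing about the paper's learning setup, and real cryptographic instances use primes hundreds of digits long, where exhaustive search is hopeless.

```python
# The discrete logarithm problem: given a prime p, a base g, and h = g^x mod p, find x.
# Toy brute-force baseline (illustrative parameters only).
p, g = 100_003, 2            # small prime modulus and base, chosen for illustration
x_secret = 54_321
h = pow(g, x_secret, p)      # public value

def brute_force_dlog(g, h, p):
    """Exhaustive search over exponents -- cost grows exponentially in the bit length of p."""
    acc = 1                  # g^0
    for x in range(p - 1):
        if acc == h:
            return x
        acc = acc * g % p
    return None

x_found = brute_force_dlog(g, h, p)
assert pow(g, x_found, p) == h
```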

Long-Tail Theory under Gaussian Mixtures

1 code implementation · 20 Jul 2023 · Arman Bolatov, Maxat Tezekbayev, Igor Melnykov, Artur Pak, Vassilina Nikoulina, Zhenisbek Assylbekov

We suggest a simple Gaussian mixture model for data generation that complies with Feldman's long tail theory (2020).

Memorization
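
A minimal data generator in the spirit of the abstract, assuming nothing about the paper's actual construction: a Gaussian mixture with a frequent head component and a rare tail component, so that any model ignoring the tail pays an error roughly proportional to the tail weight.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy long-tail generator (all parameters are illustrative assumptions, not the paper's):
# a frequent "head" Gaussian and a rare "tail" Gaussian.
n, d, tail_weight = 10_000, 2, 0.02
head_mean, tail_mean = np.zeros(d), np.full(d, 4.0)

is_tail = rng.random(n) < tail_weight
X = np.where(is_tail[:, None],
             rng.normal(tail_mean, 1.0, size=(n, d)),   # rare component
             rng.normal(head_mean, 1.0, size=(n, d)))   # frequent component

print(f"tail fraction: {is_tail.mean():.3f}")
# Under Feldman's long-tail view, a classifier that fails to memorize the rare
# component incurs an irreducible error roughly proportional to its weight.
```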

Speeding Up Entmax

1 code implementation · Findings (NAACL) 2022 · Maxat Tezekbayev, Vassilina Nikoulina, Matthias Gallé, Zhenisbek Assylbekov

Softmax is the de facto standard in modern neural networks for language processing when it comes to normalizing logits.

Machine Translation · Text Generation · +1
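
For contrast with softmax, the sketch below is a generic reference implementation of sparsemax, the alpha = 2 member of the entmax family, which can assign exact zeros to low-scoring classes. It is not the speed-up proposed in the paper.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sparsemax(z):
    """Sparsemax (alpha = 2 entmax): Euclidean projection of the logits onto the simplex."""
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    ks = np.arange(1, len(z) + 1)
    support = 1 + ks * z_sorted > cumsum          # coordinates kept in the support
    k = ks[support][-1]
    tau = (cumsum[k - 1] - 1) / k                 # threshold subtracted from the logits
    return np.maximum(z - tau, 0.0)

logits = np.array([1.2, 0.8, 0.1, -1.0])
print(softmax(logits))    # dense: every class gets non-zero probability
print(sparsemax(logits))  # sparse: [0.7, 0.3, 0.0, 0.0]
```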

The Rediscovery Hypothesis: Language Models Need to Meet Linguistics

no code implementations · 2 Mar 2021 · Vassilina Nikoulina, Maxat Tezekbayev, Nuradil Kozhakhmet, Madina Babazhanova, Matthias Gallé, Zhenisbek Assylbekov

In this paper, we study whether linguistic knowledge is a necessary condition for the good performance of modern language models, which we call the rediscovery hypothesis.

Language Modelling

Squashed Shifted PMI Matrix: Bridging Word Embeddings and Hyperbolic Spaces

2 code implementations · 27 Feb 2020 · Zhenisbek Assylbekov, Alibi Jangeldin

We show that removing the sigmoid transformation from the skip-gram with negative sampling (SGNS) objective does not significantly harm the quality of word vectors, and that it corresponds to factorizing a squashed shifted PMI matrix, which in turn can be treated as the connection-probability matrix of a random graph.

Clustering · Word Embeddings
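
A rough sketch of the object in the title, under my reading of it: take the PMI matrix of word-context co-occurrences, shift it by log k as in the SGNS analysis of Levy and Goldberg (2014), and squash it through a sigmoid so the entries can act as edge probabilities of a random graph. The paper's exact definition may differ.

```python
import numpy as np

def squashed_shifted_pmi(counts, k=5):
    """Toy squashed shifted PMI matrix from word-context co-occurrence counts.
    The shift by log(k) follows Levy & Goldberg (2014); using the logistic sigmoid
    as the 'squash' is my reading of the title, not necessarily the paper's definition."""
    total = counts.sum()
    p_wc = counts / total
    p_w = counts.sum(axis=1, keepdims=True) / total
    p_c = counts.sum(axis=0, keepdims=True) / total
    with np.errstate(divide="ignore"):
        pmi = np.log(p_wc / (p_w * p_c))
    shifted = pmi - np.log(k)
    return 1.0 / (1.0 + np.exp(-shifted))     # entries in (0, 1): usable as edge probabilities

counts = np.array([[10.0, 2.0, 0.0],
                   [3.0,  8.0, 1.0],
                   [0.0,  1.0, 6.0]])         # made-up co-occurrence counts
print(squashed_shifted_pmi(counts))
```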

Semantics- and Syntax-related Subvectors in the Skip-gram Embeddings

1 code implementation · 23 Dec 2019 · Maxat Tezekbayev, Zhenisbek Assylbekov, Rustem Takhanov

We show that the skip-gram embedding of any word can be decomposed into two subvectors which roughly correspond to semantic and syntactic roles of the word.

A Critique of the Smooth Inverse Frequency Sentence Embeddings

no code implementations · 30 Sep 2019 · Aidana Karipbayeva, Alena Sorokina, Zhenisbek Assylbekov

We critically review the smooth inverse frequency sentence embedding method of Arora, Liang, and Ma (2017), and show inconsistencies in its setup, derivation, and evaluation.

Sentence · Sentence Embedding · +1
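
For context, the method under critique (Arora, Liang, and Ma, 2017) computes sentence vectors as frequency-weighted averages of word vectors followed by removal of the first principal component. The sketch below is a reference implementation of that method, not of the critique; the corpus, vectors, and the value of a are made-up placeholders.

```python
import numpy as np

def sif_embeddings(sentences, vectors, word_freq, a=1e-3):
    """Smooth inverse frequency (SIF) sentence embeddings: weight each word vector by
    a / (a + p(w)), average per sentence, then remove the first principal component."""
    total = sum(word_freq.values())
    emb = np.stack([
        np.mean([a / (a + word_freq[w] / total) * vectors[w] for w in sent], axis=0)
        for sent in sentences
    ])
    _, _, vt = np.linalg.svd(emb, full_matrices=False)
    u = vt[0]                                  # common component (top singular vector)
    return emb - np.outer(emb @ u, u)

# Tiny illustrative vocabulary and corpus (all numbers are made up).
rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "dog", "ran"]
vectors = {w: rng.normal(size=8) for w in vocab}
word_freq = {"the": 1000, "cat": 30, "sat": 20, "dog": 25, "ran": 15}
sents = [["the", "cat", "sat"], ["the", "dog", "ran"]]
print(sif_embeddings(sents, vectors, word_freq).shape)   # (2, 8)
```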

Low-Rank Approximation of Matrices for PMI-based Word Embeddings

no code implementations · 21 Sep 2019 · Alena Sorokina, Aidana Karipbayeva, Zhenisbek Assylbekov

We perform an empirical evaluation of several methods of low-rank approximation in the problem of obtaining PMI-based word embeddings.

Word Embeddings
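
The most common such pipeline factorizes a (positive or shifted) PMI matrix with truncated SVD and takes the rows of the rank-d factor as word vectors. Below is a minimal sketch with a random stand-in matrix; which approximation methods the paper actually compares is not stated in the excerpt.

```python
import numpy as np

def svd_word_vectors(ppmi, dim):
    """Rank-`dim` factorization of a PPMI matrix via truncated SVD; rows of the
    returned matrix serve as word embeddings (the U * sqrt(S) convention)."""
    u, s, _ = np.linalg.svd(ppmi, full_matrices=False)
    return u[:, :dim] * np.sqrt(s[:dim])

# Illustrative random PPMI-like matrix (a real one would come from corpus counts).
rng = np.random.default_rng(0)
ppmi = np.maximum(rng.normal(size=(50, 50)), 0.0)
vectors = svd_word_vectors(ppmi, dim=10)
print(vectors.shape)                       # (50, 10)

# Relative reconstruction error of the rank-10 truncation, one way to compare methods.
u, s, vt = np.linalg.svd(ppmi, full_matrices=False)
approx = (u[:, :10] * s[:10]) @ vt[:10]
print(np.linalg.norm(ppmi - approx) / np.linalg.norm(ppmi))
```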

Context Vectors are Reflections of Word Vectors in Half the Dimensions

no code implementations · 26 Feb 2019 · Zhenisbek Assylbekov, Rustem Takhanov

This paper takes a step towards a theoretical analysis of the relationship between word embeddings and context embeddings in models such as word2vec.

Text Generation · Word Embeddings

Fourier Neural Networks: A Comparative Study

no code implementations · 8 Feb 2019 · Abylay Zhumekenov, Malika Uteuliyeva, Olzhas Kabdolov, Rustem Takhanov, Zhenisbek Assylbekov, Alejandro J. Castro

We review neural network architectures which were motivated by Fourier series and integrals and which are referred to as Fourier neural networks.
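
In the broad sense used by such reviews, a Fourier neural network replaces the usual activation with a sinusoid, so a hidden layer behaves like a learnable truncated Fourier expansion. The one-hidden-layer variant below, trained by plain gradient descent on a toy 1-D target, is my own illustrative choice rather than any specific architecture from the survey.

```python
import numpy as np

rng = np.random.default_rng(0)

# One-hidden-layer network with a cosine activation, y = cos(x @ w1 + b) @ w2,
# trained by plain gradient descent on a toy 1-D regression task (all
# hyperparameters are illustrative).
x = np.linspace(-np.pi, np.pi, 200)[:, None]
y = np.sign(np.sin(3 * x))                       # square-wave target

H, lr = 32, 0.01
w1 = rng.normal(size=(1, H))                     # frequencies
b = rng.uniform(-np.pi, np.pi, size=H)           # phases
w2 = rng.normal(size=(H, 1)) * 0.1               # output weights

for step in range(2000):
    pre = x @ w1 + b
    h = np.cos(pre)                              # (200, H) Fourier-style features
    err = h @ w2 - y
    grad_w2 = h.T @ err / len(x)
    grad_pre = -np.sin(pre) * (err @ w2.T)       # chain rule through the cosine
    grad_w1 = x.T @ grad_pre / len(x)
    grad_b = grad_pre.mean(axis=0)
    w1, w2, b = w1 - lr * grad_w1, w2 - lr * grad_w2, b - lr * grad_b

print(f"final MSE: {np.mean((np.cos(x @ w1 + b) @ w2 - y) ** 2):.4f}")
```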

Reproducing and Regularizing the SCRN Model

1 code implementation · COLING 2018 · Olzhas Kabdolov, Zhenisbek Assylbekov, Rustem Takhanov

We reproduce the Structurally Constrained Recurrent Network (SCRN) model, and then regularize it using existing, widely used techniques such as naive dropout, variational dropout, and weight tying.

Language Modelling
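
Of the regularizers listed, weight tying is the simplest to show in code: the output softmax reuses the input embedding matrix. The sketch below applies naive dropout and weight tying to a generic LSTM language model, not to the SCRN architecture itself, and omits variational dropout for brevity.

```python
import torch
import torch.nn as nn

class TinyRNNLM(nn.Module):
    """Generic recurrent language model illustrating two of the regularizers above:
    naive dropout on activations and input/output weight tying.  This is a minimal
    stand-in, not the SCRN architecture."""
    def __init__(self, vocab_size=10_000, emb_dim=256, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, emb_dim, batch_first=True)
        self.dropout = nn.Dropout(dropout)             # "naive" dropout
        self.decoder = nn.Linear(emb_dim, vocab_size, bias=False)
        self.decoder.weight = self.embedding.weight    # weight tying: one shared matrix

    def forward(self, tokens):
        x = self.dropout(self.embedding(tokens))
        h, _ = self.rnn(x)
        return self.decoder(self.dropout(h))           # logits over the vocabulary

logits = TinyRNNLM()(torch.randint(0, 10_000, (2, 35)))
print(logits.shape)                                    # torch.Size([2, 35, 10000])
```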

Reusing Weights in Subword-aware Neural Language Models

1 code implementation · NAACL 2018 · Zhenisbek Assylbekov, Rustem Takhanov

We propose several ways of reusing subword embeddings and other weights in subword-aware neural language models.
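
One simple form of such reuse, sketched below under my own minimal scheme (not necessarily one of the paper's): compose each word vector as the sum of its subword embeddings, and reuse the same composed matrix both as the input lookup table and as the output softmax projection.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative weight reuse in a subword-aware setup (my own minimal scheme):
# word vector = sum of subword embeddings; the same composed matrix serves as both
# the input lookup table and the output softmax projection.
subword_vocab = {"un": 0, "break": 1, "able": 2, "do": 3}
words = {"unbreakable": ["un", "break", "able"], "undo": ["un", "do"], "doable": ["do", "able"]}

dim = 16
subword_emb = rng.normal(size=(len(subword_vocab), dim))

def compose(word):
    """Word embedding composed as the sum of its subword embeddings."""
    return subword_emb[[subword_vocab[s] for s in words[word]]].sum(axis=0)

word_matrix = np.stack([compose(w) for w in words])    # shared input/output matrix

hidden = rng.normal(size=dim)                          # some hidden state from the LM
logits = word_matrix @ hidden                          # output layer reuses the same weights
probs = np.exp(logits - logits.max()); probs /= probs.sum()
print(dict(zip(words, probs.round(3))))
```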
