Search Results for author: Maxat Tezekbayev

Found 7 papers, 5 papers with code

Gradient Descent Fails to Learn High-frequency Functions and Modular Arithmetic

no code implementations • 19 Oct 2023 • Rustem Takhanov, Maxat Tezekbayev, Artur Pak, Arman Bolatov, Zhenisbek Assylbekov

In the novel framework, the hardness of a class is usually quantified by the variance of the gradient with respect to a random choice of a target function.
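A rough sketch of that quantity, under the assumption that "variance of the gradient" means the variance, over a random draw of the target function from the class, of the loss gradient at a fixed parameter point. The toy linear model, the loss, and all sizes below are illustrative stand-ins, not the paper's setup:

```python
# Toy estimate of gradient variance over random modular-arithmetic targets.
import numpy as np

rng = np.random.default_rng(0)
p = 97                                    # modulus for the toy target class
n, d = 256, 32                            # sample size, input/parameter dimension
X = rng.integers(0, p, size=(n, d)).astype(float)
theta = rng.standard_normal(d)            # fixed parameter point

def grad_at(theta, y):
    """Gradient of the mean-squared error of a linear model at theta."""
    resid = X @ theta - y
    return X.T @ resid / n

grads = []
for _ in range(500):                      # draw random target functions
    a = rng.integers(1, p, size=d)        # random coefficients of the target
    y = (X @ a) % p / p                   # crude modular-arithmetic target, rescaled
    grads.append(grad_at(theta, y))

grads = np.stack(grads)
variance = grads.var(axis=0).sum()        # total variance across coordinates
print(f"estimated gradient variance: {variance:.4f}")
```

In the framework the snippet refers to, a vanishingly small variance means the gradient barely depends on which target was drawn, so a gradient-based learner gets almost no signal about it.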

Intractability of Learning the Discrete Logarithm with Gradient-Based Methods

1 code implementation • 2 Oct 2023 • Rustem Takhanov, Maxat Tezekbayev, Artur Pak, Arman Bolatov, Zhibek Kadyrsizova, Zhenisbek Assylbekov

The discrete logarithm problem is a fundamental challenge in number theory with significant implications for cryptographic protocols.
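For context, a learning formulation of the problem presents pairs (g^x mod p, x) and asks a model trained by gradient descent to recover x. A hypothetical data-generation sketch, with toy parameters chosen only for illustration:

```python
# Build a toy discrete-logarithm dataset: inputs g^x mod p, labels x.
# Illustrative parameters; cryptographic moduli are vastly larger.
import numpy as np

p, g = 101, 2                       # small prime modulus and a generator (toy values)
xs = np.arange(1, p)                # exponents: the labels a learner must recover
ys = np.array([pow(g, int(x), p) for x in xs])   # g^x mod p: the inputs

def dlog_bruteforce(y, g, p):
    """Exhaustive search; feasible only because p is tiny."""
    for x in range(1, p):
        if pow(g, x, p) == y:
            return x

assert dlog_bruteforce(ys[10], g, p) == xs[10]
```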

Long-Tail Theory under Gaussian Mixtures

1 code implementation • 20 Jul 2023 • Arman Bolatov, Maxat Tezekbayev, Igor Melnykov, Artur Pak, Vassilina Nikoulina, Zhenisbek Assylbekov

We suggest a simple Gaussian mixture model for data generation that complies with Feldman's (2020) long-tail theory.

Memorization
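A minimal sketch of the kind of generative setup this suggests: a mixture in which a small fraction of the mass sits in a rare "tail" component, so a classifier cannot reach low error without fitting (memorizing) the rare examples. The component weights, means, and dimensions below are made up for illustration:

```python
# Sample from a two-component Gaussian mixture with a rare tail component.
import numpy as np

rng = np.random.default_rng(1)
n, tail_frac = 10_000, 0.02              # 2% of points come from the rare component
is_tail = rng.random(n) < tail_frac

mu_head, mu_tail = np.array([0.0, 0.0]), np.array([4.0, 4.0])
X = np.where(is_tail[:, None],
             rng.normal(mu_tail, 1.0, size=(n, 2)),
             rng.normal(mu_head, 1.0, size=(n, 2)))
y = is_tail.astype(int)                  # label = which subpopulation the point came from

print(f"tail examples: {y.sum()} of {n}")
```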

Autoencoders for a manifold learning problem with a Jacobian rank constraint

1 code implementation • 25 Jun 2023 • Rustem Takhanov, Y. Sultan Abylkairov, Maxat Tezekbayev

This constraint is included in the objective function as a new term, namely a squared Ky-Fan $k$-antinorm of the Jacobian function.
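A sketch of how such a penalty could be computed, under the assumption that the Ky-Fan $k$-antinorm here denotes the sum of the $k$ smallest singular values (so driving it to zero pushes the Jacobian toward lower rank). The sizes and the toy Jacobian are placeholders, not the paper's architecture:

```python
# Squared Ky-Fan k-antinorm penalty on a Jacobian (assumed: sum of the
# k smallest singular values, then squared, following the abstract's wording).
import numpy as np

def kyfan_antinorm_sq(jacobian, k):
    """Square of the sum of the k smallest singular values of `jacobian`."""
    s = np.linalg.svd(jacobian, compute_uv=False)   # singular values, descending
    return float(np.sum(s[-k:]) ** 2)

rng = np.random.default_rng(0)
J = rng.standard_normal((5, 3))           # toy Jacobian of a map R^3 -> R^5
penalty = kyfan_antinorm_sq(J, k=1)       # driving this to 0 pushes rank toward 2
print(f"penalty term: {penalty:.4f}")
```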

Speeding Up Entmax

1 code implementation • Findings (NAACL) 2022 • Maxat Tezekbayev, Vassilina Nikoulina, Matthias Gallé, Zhenisbek Assylbekov

Softmax is the de facto standard in modern neural networks for language processing when it comes to normalizing logits.

Machine Translation • Text Generation +1
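For contrast with softmax, a related sparse normalizer (sparsemax) can be written in a few lines. This is a generic illustration of sparse normalization of logits, not the paper's entmax speed-up, and the function names are mine:

```python
# Softmax vs. sparsemax on the same logits: sparsemax can assign exact zeros.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sparsemax(z):
    """Euclidean projection of the logits onto the probability simplex."""
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    ks = np.arange(1, len(z) + 1)
    support = 1 + ks * z_sorted > cumsum          # coordinates that stay nonzero
    k = ks[support].max()
    tau = (cumsum[k - 1] - 1) / k
    return np.maximum(z - tau, 0.0)

logits = np.array([2.0, 1.2, 0.1, -1.0])
print("softmax:  ", softmax(logits))              # strictly positive everywhere
print("sparsemax:", sparsemax(logits))            # small logits get exactly 0
```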

The Rediscovery Hypothesis: Language Models Need to Meet Linguistics

no code implementations • 2 Mar 2021 • Vassilina Nikoulina, Maxat Tezekbayev, Nuradil Kozhakhmet, Madina Babazhanova, Matthias Gallé, Zhenisbek Assylbekov

In this paper, we study whether linguistic knowledge is a necessary condition for the good performance of modern language models, which we call the "rediscovery hypothesis".

Language Modelling

Semantics- and Syntax-related Subvectors in the Skip-gram Embeddings

1 code implementation • 23 Dec 2019 • Maxat Tezekbayev, Zhenisbek Assylbekov, Rustem Takhanov

We show that the skip-gram embedding of any word can be decomposed into two subvectors which roughly correspond to semantic and syntactic roles of the word.
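A toy illustration of the general idea of splitting a word vector into two subvectors and inspecting what each captures. The block split, the tiny vocabulary, and the random embeddings below are invented for illustration and are not the decomposition used in the paper:

```python
# Split each word vector into two halves and compare nearest neighbours
# under each half separately (stand-in for semantic vs. syntactic subvectors).
import numpy as np

rng = np.random.default_rng(0)
vocab = ["run", "runs", "running", "walk", "cat", "dog"]
E = rng.standard_normal((len(vocab), 100))      # pretend skip-gram embeddings (random here)
E_a, E_b = E[:, :50], E[:, 50:]                 # two candidate subvectors

def nearest(word, M):
    """Two nearest neighbours of `word` by cosine similarity under embedding M."""
    v = M[vocab.index(word)]
    sims = M @ v / (np.linalg.norm(M, axis=1) * np.linalg.norm(v))
    order = np.argsort(-sims)
    return [vocab[i] for i in order if vocab[i] != word][:2]

for name, M in [("subvector A", E_a), ("subvector B", E_b)]:
    print(name, "neighbours of 'run':", nearest("run", M))
```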
