Search Results for author: Maxat Tezekbayev

Found 7 papers, 5 papers with code

Gradient Descent Fails to Learn High-frequency Functions and Modular Arithmetic

no code implementations • 19 Oct 2023 • Rustem Takhanov, Maxat Tezekbayev, Artur Pak, Arman Bolatov, Zhenisbek Assylbekov

In the novel framework, the hardness of a class is usually quantified by the variance of the gradient with respect to a random choice of a target function.
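A rough sketch of that quantity, under the assumption that "variance of the gradient" means the variance, over a random draw of the target function from the class, of the loss gradient at a fixed parameter point. The toy linear model, the loss, and all sizes below are illustrative stand-ins, not the paper's setup:

```python
# Toy estimate of gradient variance over random modular-arithmetic targets.
import numpy as np

rng = np.random.default_rng(0)
p = 97                                    # modulus for the toy target class
n, d = 256, 32                            # sample size, input/parameter dimension
X = rng.integers(0, p, size=(n, d)).astype(float)
theta = rng.standard_normal(d)            # fixed parameter point

def grad_at(theta, y):
    """Gradient of the mean-squared error of a linear model at theta."""
    resid = X @ theta - y
    return X.T @ resid / n

grads = []
for _ in range(500):                      # draw random target functions
    a = rng.integers(1, p, size=d)        # random coefficients of the target
    y = (X @ a) % p / p                   # crude modular-arithmetic target, rescaled
    grads.append(grad_at(theta, y))

grads = np.stack(grads)
variance = grads.var(axis=0).sum()        # total variance across coordinates
print(f"estimated gradient variance: {variance:.4f}")
```

In the framework the snippet refers to, a vanishingly small variance means the gradient barely depends on which target was drawn, so a gradient-based learner gets almost no signal about it.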

Intractability of Learning the Discrete Logarithm with Gradient-Based Methods

1 code implementation • 2 Oct 2023 • Rustem Takhanov, Maxat Tezekbayev, Artur Pak, Arman Bolatov, Zhibek Kadyrsizova, Zhenisbek Assylbekov

The discrete logarithm problem is a fundamental challenge in number theory with significant implications for cryptographic protocols.
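For context, a learning formulation of the problem presents pairs (g^x mod p, x) and asks a model trained by gradient descent to recover x. A hypothetical data-generation sketch, with toy parameters chosen only for illustration:

```python
# Build a toy discrete-logarithm dataset: inputs g^x mod p, labels x.
# Illustrative parameters; cryptographic moduli are vastly larger.
import numpy as np

p, g = 101, 2                       # small prime modulus and a generator (toy values)
xs = np.arange(1, p)                # exponents: the labels a learner must recover
ys = np.array([pow(g, int(x), p) for x in xs])   # g^x mod p: the inputs

def dlog_bruteforce(y, g, p):
    """Exhaustive search; feasible only because p is tiny."""
    for x in range(1, p):
        if pow(g, x, p) == y:
            return x

assert dlog_bruteforce(ys[10], g, p) == xs[10]
```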

Long-Tail Theory under Gaussian Mixtures

1 code implementation • 20 Jul 2023 • Arman Bolatov, Maxat Tezekbayev, Igor Melnykov, Artur Pak, Vassilina Nikoulina, Zhenisbek Assylbekov

We suggest a simple Gaussian mixture model for data generation that complies with Feldman's (2020) long-tail theory.

Memorization
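A minimal sketch of the kind of generative setup this suggests: a mixture in which a small fraction of the mass sits in a rare "tail" component, so a classifier cannot reach low error without fitting (memorizing) the rare examples. The component weights, means, and dimensions below are made up for illustration:

```python
# Sample from a two-component Gaussian mixture with a rare tail component.
import numpy as np

rng = np.random.default_rng(1)
n, tail_frac = 10_000, 0.02              # 2% of points come from the rare component
is_tail = rng.random(n) < tail_frac

mu_head, mu_tail = np.array([0.0, 0.0]), np.array([4.0, 4.0])
X = np.where(is_tail[:, None],
             rng.normal(mu_tail, 1.0, size=(n, 2)),
             rng.normal(mu_head, 1.0, size=(n, 2)))
y = is_tail.astype(int)                  # label = which subpopulation the point came from

print(f"tail examples: {y.sum()} of {n}")
```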

Autoencoders for a manifold learning problem with a Jacobian rank constraint

1 code implementation • 25 Jun 2023 • Rustem Takhanov, Y. Sultan Abylkairov, Maxat Tezekbayev

This constraint is included in the objective function as a new term, namely a squared Ky-Fan $k$-antinorm of the Jacobian function.
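A sketch of how such a penalty could be computed, under the assumption that the Ky-Fan $k$-antinorm here denotes the sum of the $k$ smallest singular values (so driving it to zero pushes the Jacobian toward lower rank). The sizes and the toy Jacobian are placeholders, not the paper's architecture:

```python
# Squared Ky-Fan k-antinorm penalty on a Jacobian (assumed: sum of the
# k smallest singular values, then squared, following the abstract's wording).
import numpy as np

def kyfan_antinorm_sq(jacobian, k):
    """Square of the sum of the k smallest singular values of `jacobian`."""
    s = np.linalg.svd(jacobian, compute_uv=False)   # singular values, descending
    return float(np.sum(s[-k:]) ** 2)

rng = np.random.default_rng(0)
J = rng.standard_normal((5, 3))           # toy Jacobian of a map R^3 -> R^5
penalty = kyfan_antinorm_sq(J, k=1)       # driving this to 0 pushes rank toward 2
print(f"penalty term: {penalty:.4f}")
```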

Speeding Up Entmax

1 code implementation • Findings (NAACL) 2022 • Maxat Tezekbayev, Vassilina Nikoulina, Matthias Gallé, Zhenisbek Assylbekov

Softmax is the de facto standard in modern neural networks for language processing when it comes to normalizing logits.

Machine Translation • Text Generation +1
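For contrast with softmax, a related sparse normalizer (sparsemax) can be written in a few lines. This is a generic illustration of sparse normalization of logits, not the paper's entmax speed-up, and the function names are mine:

```python
# Softmax vs. sparsemax on the same logits: sparsemax can assign exact zeros.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sparsemax(z):
    """Euclidean projection of the logits onto the probability simplex."""
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    ks = np.arange(1, len(z) + 1)
    support = 1 + ks * z_sorted > cumsum          # coordinates that stay nonzero
    k = ks[support].max()
    tau = (cumsum[k - 1] - 1) / k
    return np.maximum(z - tau, 0.0)

logits = np.array([2.0, 1.2, 0.1, -1.0])
print("softmax:  ", softmax(logits))              # strictly positive everywhere
print("sparsemax:", sparsemax(logits))            # small logits get exactly 0
```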

The Rediscovery Hypothesis: Language Models Need to Meet Linguistics

no code implementations • 2 Mar 2021 • Vassilina Nikoulina, Maxat Tezekbayev, Nuradil Kozhakhmet, Madina Babazhanova, Matthias Gallé, Zhenisbek Assylbekov

In this paper, we study whether linguistic knowledge is a necessary condition for the good performance of modern language models, which we call the "rediscovery hypothesis".

Language Modelling

Semantics- and Syntax-related Subvectors in the Skip-gram Embeddings

1 code implementation • 23 Dec 2019 • Maxat Tezekbayev, Zhenisbek Assylbekov, Rustem Takhanov

We show that the skip-gram embedding of any word can be decomposed into two subvectors which roughly correspond to semantic and syntactic roles of the word.
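A toy illustration of the general idea of splitting a word vector into two subvectors and inspecting what each captures. The block split, the tiny vocabulary, and the random embeddings below are invented for illustration and are not the decomposition used in the paper:

```python
# Split each word vector into two halves and compare nearest neighbours
# under each half separately (stand-in for semantic vs. syntactic subvectors).
import numpy as np

rng = np.random.default_rng(0)
vocab = ["run", "runs", "running", "walk", "cat", "dog"]
E = rng.standard_normal((len(vocab), 100))      # pretend skip-gram embeddings (random here)
E_a, E_b = E[:, :50], E[:, 50:]                 # two candidate subvectors

def nearest(word, M):
    """Two nearest neighbours of `word` by cosine similarity under embedding M."""
    v = M[vocab.index(word)]
    sims = M @ v / (np.linalg.norm(M, axis=1) * np.linalg.norm(v))
    order = np.argsort(-sims)
    return [vocab[i] for i in order if vocab[i] != word][:2]

for name, M in [("subvector A", E_a), ("subvector B", E_b)]:
    print(name, "neighbours of 'run':", nearest("run", M))
```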
