no code implementations • GWC 2016 • Pavel Braslavski, Dmitry Ustalov, Mikhail Mukhin, Yuri Kiselev
YARN (Yet Another RussNet), a project started in 2013, aims at creating a large open WordNet-like thesaurus for Russian by means of crowdsourcing.
no code implementations • GWC 2016 • Yuri Kiselev, Dmitry Ustalov, Sergey Porshnev
Collaborative creation of lexical resources is a trending approach to building high-quality thesauri in a short time span at a remarkably low price.
1 code implementation • COLING (TextGraphs) 2022 • Marco Valentino, Deborah Ferreira, Mokanarangan Thayaparan, André Freitas, Dmitry Ustalov
In this summary paper, we present the results of the 1st edition of the NLPS task, providing a description of the evaluation data and the participating systems.
1 code implementation • NAACL (TextGraphs) 2021 • Peter Jansen, Mokanarangan Thayaparan, Marco Valentino, Dmitry Ustalov
While previous editions of this shared task aimed to evaluate explanatory completeness (finding a set of facts that form a complete inference chain, without gaps, to arrive from question to correct answer), this 2021 instantiation concentrates on the subtask of determining relevance in large multi-hop explanations.
no code implementations • COLING (TextGraphs) 2020 • Peter Jansen, Dmitry Ustalov
In this second iteration of the explanation regeneration shared task, participants are supplied with more than double the training and evaluation data of the first shared task, as well as a knowledge base nearly double in size, both of which expand into more challenging scientific topics that increase the difficulty of the task.
1 code implementation • 15 Dec 2024 • Dmitry Ustalov
The rapid advancement of natural language processing (NLP) technologies, such as instruction-tuned large language models (LLMs), urges the development of modern evaluation protocols with human and machine feedback.
1 code implementation • 28 Sep 2023 • Dmitry Ustalov, Nikita Pavlichenko, Sergey Koshelev, Daniil Likhobaba, Alisa Smirnova
In this paper, we present Toloka Visual Question Answering, a new crowdsourced dataset that allows comparing the performance of machine learning systems against the human level of expertise in the grounding visual question answering task.
1 code implementation • 23 Sep 2022 • Nikita Pavlichenko, Dmitry Ustalov
Recent progress in generative models, especially in text-guided diffusion models, has enabled the production of aesthetically-pleasing imagery resembling the works of professional human artists.
1 code implementation • 21 Sep 2022 • Daniil Likhobaba, Daniil Fedulov, Dmitry Ustalov
Crowdsourcing allows running simple human intelligence tasks on a large crowd of workers, which makes it possible to solve problems for which it is difficult to formulate an algorithm or train a machine learning model in reasonable time.
1 code implementation • 2 Jul 2021 • Nikita Pavlichenko, Ivan Stelmakh, Dmitry Ustalov
The main obstacle towards designing aggregation methods for more advanced applications is the absence of training data, and in this work, we focus on bridging this gap in speech recognition.
no code implementations • NAACL 2021 • Alexey Drutsa, Dmitry Ustalov, Valentina Fedorova, Olga Megorskaya, Daria Baidakova
In this tutorial, we present a portion of unique industry experience in efficient natural language data annotation via crowdsourcing shared by both leading researchers and engineers from Yandex.
no code implementations • LREC 2020 • Varvara Logacheva, Denis Teslenko, Artem Shelmanov, Steffen Remus, Dmitry Ustalov, Andrey Kutuzov, Ekaterina Artemova, Chris Biemann, Simone Paolo Ponzetto, Alexander Panchenko
We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings by Grave et al. (2018), enabling WSD in these languages.
no code implementations • WS 2019 • Peter Jansen, Dmitry Ustalov
While automated question answering systems are increasingly able to retrieve answers to natural language questions, their ability to generate detailed human-readable explanations for their answers is still quite limited.
1 code implementation • SEMEVAL 2019 • Saba Anwar, Dmitry Ustalov, Nikolay Arefyev, Simone Paolo Ponzetto, Chris Biemann, Alexander Panchenko
We present our system for semantic frame induction that showed the best performance in Subtask B.1 and finished as the runner-up in Subtask A of the SemEval 2019 Task 2 on unsupervised semantic frame induction (QasemiZadeh et al., 2019).
1 code implementation • 17 Sep 2018 • Dmitry Ustalov, Alexander Panchenko, Chris Biemann, Simone Paolo Ponzetto
In this paper, we show how unsupervised sense representations can be used to improve hypernymy extraction.
2 code implementations • CL 2019 • Dmitry Ustalov, Alexander Panchenko, Chris Biemann, Simone Paolo Ponzetto
We present a detailed theoretical and computational analysis of the Watset meta-algorithm for fuzzy graph clustering, which has been found to be widely applicable in a variety of domains.
1 code implementation • ACL 2018 • Dmitry Ustalov, Alexander Panchenko, Andrei Kutuzov, Chris Biemann, Simone Paolo Ponzetto
We use dependency triples automatically extracted from a Web-scale corpus to perform unsupervised semantic frame induction.
1 code implementation • LREC 2018 • Dmitry Ustalov, Denis Teslenko, Alexander Panchenko, Mikhail Chernoskutov, Chris Biemann, Simone Paolo Ponzetto
The sparse mode uses the traditional vector space model to estimate the most similar word sense corresponding to its context.
no code implementations • 15 Mar 2018 • Alexander Panchenko, Anastasiya Lopukhina, Dmitry Ustalov, Konstantin Lopukhin, Nikolay Arefyev, Alexey Leontyev, Natalia Loukachevitch
The paper describes the results of the first shared task on word sense induction (WSI) for the Russian language.
no code implementations • 15 Mar 2018 • Alexander Panchenko, Natalia Loukachevitch, Dmitry Ustalov, Denis Paperno, Christian Meyer, Natalia Konstantinova
The paper gives an overview of the Russian Semantic Similarity Evaluation (RUSSE) shared task held in conjunction with the Dialogue 2015 conference.
1 code implementation • LREC 2018 • Alexander Panchenko, Dmitry Ustalov, Stefano Faralli, Simone P. Ponzetto, Chris Biemann
In this paper, we show how distributionally-induced semantic classes can be helpful for extracting hypernyms.
no code implementations • 31 Aug 2017 • Alexander Panchenko, Dmitry Ustalov, Nikolay Arefyev, Denis Paperno, Natalia Konstantinova, Natalia Loukachevitch, Chris Biemann
On the one hand, humans easily make judgments about semantic relatedness.
no code implementations • 30 Aug 2017 • Dmitry Ustalov, Mikhail Chernoskutov, Chris Biemann, Alexander Panchenko
Graph-based synset induction methods, such as MaxMax and Watset, induce synsets by performing a global clustering of a synonymy graph.
1 code implementation • EMNLP 2017 • Alexander Panchenko, Fide Marten, Eugen Ruppert, Stefano Faralli, Dmitry Ustalov, Simone Paolo Ponzetto, Chris Biemann
In word sense disambiguation (WSD), knowledge-based systems tend to be much more interpretable than knowledge-free counterparts as they rely on the wealth of manually-encoded elements representing word senses, such as hypernyms, usage examples, and images.
1 code implementation • EACL 2017 • Dmitry Ustalov, Nikolay Arefyev, Chris Biemann, Alexander Panchenko
We present a new approach to extraction of hypernyms based on projection learning and word embeddings.
1 code implementation • ACL 2017 • Dmitry Ustalov, Alexander Panchenko, Chris Biemann
This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings.
no code implementations • 19 Aug 2014 • Dmitry Ustalov
Linguistic resources can be populated with data through the use of such approaches as crowdsourcing and gamification when motivated people are involved.