no code implementations • 17 Feb 2025 • Sherzod Hakimov, Lara Pfennigschmidt, David Schlangen
This study utilizes the game Codenames as a benchmarking tool to evaluate large language models (LLMs) with respect to specific linguistic and cognitive skills.
no code implementations • 17 Feb 2025 • Jonathan Jordan, Sherzod Hakimov, David Schlangen
Large language models (LLMs) have risen to prominence as 'chatbots' that users interact with via natural language.
no code implementations • 17 Sep 2024 • Chalamalasetti Kranti, Sherzod Hakimov, David Schlangen
While there has been a lot of research recently on robots in household environments, at the present time, most robots in existence can be found on shop floors, and most interactions between humans and robots happen there.
no code implementations • 1 Jul 2024 • Yi-Sheng Hsu, Nils Feldhus, Sherzod Hakimov
Free-text rationales justify model decisions in natural language and thus become likable and accessible among approaches to explanation across many tasks.
no code implementations • 25 Jun 2024 • Chalamalasetti Kranti, Sherzod Hakimov, David Schlangen
In the Minecraft Collaborative Building Task, two players collaborate: an Architect (A) provides instructions to a Builder (B) to assemble a specified structure using 3D blocks.
no code implementations • 20 Jun 2024 • Sherzod Hakimov, Yerkezhan Abdullayeva, Kushal Koshti, Antonia Schmidt, Yan Weiser, Anne Beyer, David Schlangen
On further analysis, we find that the exceptional deep captioning capabilities of the largest models drive some of the performance.
no code implementations • 20 Jun 2024 • Nidhir Bhavsar, Jonathan Jordan, Sherzod Hakimov, David Schlangen
But what makes the model perform well?
no code implementations • 31 May 2024 • Anne Beyer, Kranti Chalamalasetti, Sherzod Hakimov, Brielen Madureira, Philipp Sadler, David Schlangen
In this paper, we take one of the proposed frameworks for setting up such game-play environments, and further test its usefulness as an evaluation instrument, along a number of dimensions: We show that it can easily keep up with new developments while avoiding data contamination, we show that the tests implemented within it are not yet saturated (human performance is substantially higher than that of even the best models), and we show that it lends itself to investigating additional questions, such as the impact of the prompting language on performance.
1 code implementation • 2 Apr 2024 • Gaurish Thakkar, Sherzod Hakimov, Marko Tadić
In recent years, multimodal natural language processing, aimed at learning from diverse data types, has garnered significant attention.
1 code implementation • 26 Mar 2024 • Philipp Sadler, Sherzod Hakimov, David Schlangen
In collaborative goal-oriented settings, the participants are not only interested in achieving a successful outcome, but also implicitly negotiate the effort they put into the interaction (by adapting to each other).
no code implementations • 7 Feb 2024 • Philipp Sadler, Sherzod Hakimov, David Schlangen
Albrecht and Stone (2018) state that modeling of changing behaviors remains an open problem "due to the essentially unconstrained nature of what other agents may do".
1 code implementation • 22 Jun 2023 • Sherzod Hakimov, Gullal S. Cheema
The ongoing Russo-Ukrainian conflict has been a subject of intense media coverage worldwide.
1 code implementation • 29 May 2023 • Sahar Tahmasebi, Sherzod Hakimov, Ralph Ewerth, Eric Müller-Budack
To reduce the bias and improve model generalization, we suggest training data augmentation to conduct more meaningful experiments for fake news detection on social media.
1 code implementation • 23 May 2023 • Sherzod Hakimov, David Schlangen
Specifically, we investigate the performance of open-source, open-access language models against GPT-3 on five vision-language tasks when given textually-encoded visual information.
1 code implementation • 22 May 2023 • Kranti Chalamalasetti, Jana Götze, Sherzod Hakimov, Brielen Madureira, Philipp Sadler, David Schlangen
Recent work has proposed a methodology for the systematic evaluation of "Situated Language Understanding Agents" (agents that operate in rich linguistic and non-linguistic contexts) through testing them in carefully constructed interactive settings.
1 code implementation • 22 May 2023 • Philipp Sadler, Sherzod Hakimov, David Schlangen
The ability to pick up on language signals in an ongoing interaction is crucial for future machine learning models to collaborate and interact with humans naturally.
1 code implementation • 15 Nov 2022 • Golsa Tahmasebzadeh, Eric Müller-Budack, Sherzod Hakimov, Ralph Ewerth
In this paper, a novel dataset called Multimodal Focus Location of News (MM-Locate-News) is introduced.
1 code implementation • Findings (NAACL) 2022 • Gullal S. Cheema, Sherzod Hakimov, Abdul Sittar, Eric Müller-Budack, Christian Otto, Ralph Ewerth
In recent years, the problem of misinformation on the web has become widespread across languages, countries, and various social media platforms.
1 code implementation • SemEval (NAACL) 2022 • Sherzod Hakimov, Gullal S. Cheema, Ralph Ewerth
The detection of offensive, hateful content on social media is a challenging problem that affects many online users on a daily basis.
1 code implementation • 9 Dec 2021 • Sherzod Hakimov, Ralph Ewerth
The detection of offensive, hateful and profane language has become a critical challenge since many users in social networks are exposed to cyberbullying activities on a daily basis.
1 code implementation • 16 Jun 2021 • Gullal S. Cheema, Sherzod Hakimov, Eric Müller-Budack, Ralph Ewerth
Opinion and sentiment analysis is a vital task to characterize subjective information in social media posts.
1 code implementation • 26 May 2021 • Hussain Kanafani, Junaid Ahmed Ghauri, Sherzod Hakimov, Ralph Ewerth
Our evaluation shows that we obtain state-of-the-art results on both datasets, while also highlighting the shortcomings of previous work with regard to the evaluation methodology.
1 code implementation • 30 Apr 2021 • Golsa Tahmasebzadeh, Endri Kacupaj, Eric Müller-Budack, Sherzod Hakimov, Jens Lehmann, Ralph Ewerth
The first module is a state-of-the-art model for geolocation estimation of images.
3 code implementations • 23 Apr 2021 • Junaid Ahmed Ghauri, Sherzod Hakimov, Ralph Ewerth
The proposed architecture utilizes an attention mechanism before fusing motion features and features representing the (static) visual content, i.e., derived from an image classification model.
Ranked #1 on Supervised Video Summarization on TvSum
1 code implementation • 17 Mar 2021 • Gullal S. Cheema, Sherzod Hakimov, Eric Müller-Budack, Ralph Ewerth
Fake news is a severe problem in social media.
1 code implementation • 9 Nov 2020 • Eric Müller-Budack, Matthias Springstein, Sherzod Hakimov, Kevin Mrutzek, Ralph Ewerth
Event classification can add valuable information for semantic search and the increasingly important topic of fact validation in news.
1 code implementation • 26 Oct 2020 • Junaid Ahmed Ghauri, Sherzod Hakimov, Ralph Ewerth
Our experiments investigate the impact of visual and temporal information, as well as the combination of multimodal features on importance prediction.
1 code implementation • 21 Jul 2020 • Gullal S. Cheema, Sherzod Hakimov, Ralph Ewerth
In this digital age of news consumption, a news reader has the ability to react, express and share opinions with others in a highly interactive and fast manner.
no code implementations • 13 Jul 2020 • Golsa Tahmasebzadeh, Sherzod Hakimov, Eric Müller-Budack, Ralph Ewerth
Content-based information retrieval is based on the information contained in documents rather than using metadata such as keywords.
no code implementations • 6 Dec 2018 • Sherzod Hakimov, Soufian Jebbara, Philipp Cimiano
We address the task of answering simple questions, consisting in predicting the subject and predicate of a triple given a question.
1 code implementation • 26 Feb 2018 • Sherzod Hakimov, Soufian Jebbara, Philipp Cimiano
We present the first multilingual QALD pipeline that induces a model from training data for mapping a natural language question into logical form as probabilistic inference.