1 code implementation • LREC 2022 • Marco Bombieri, Marco Rospocher, Simone Paolo Ponzetto, Paolo Fiorini
Robot-assisted minimally invasive surgery is the gold standard for the surgical treatment of many pathological conditions, and several manuals and academic papers describe how to perform these interventions.
no code implementations • RANLP 2021 • Javier Sánchez-Junquera, Paolo Rosso, Manuel Montes-y-Gómez, Simone Paolo Ponzetto
Hyperpartisan news shows an extreme manipulation of reality based on an underlying and extreme ideological orientation.
1 code implementation • ParlaCLARIN (LREC) 2022 • Christopher Klamm, Ines Rehbein, Simone Paolo Ponzetto
In addition, we present a new annotated data set of parliamentary debates, following the coding schema of policy topics developed in the Comparative Agendas Project (CAP), and release models for topic classification in parliamentary debates.
no code implementations • EMNLP 2021 • Ines Rehbein, Simone Paolo Ponzetto, Anna Adendorf, Oke Bahnsen, Lukas Stoetzer, Heiner Stuckenschmidt
In this paper, we introduce the task of political coalition signal prediction from text, that is, the task of recognizing from the news coverage leading up to an election the (un)willingness of political parties to form a government coalition.
1 code implementation • 24 May 2024 • Jonas Belouadi, Simone Paolo Ponzetto, Steffen Eger
Creating high-quality scientific figures can be time-consuming and challenging, even though sketching ideas on paper is relatively easy.
1 code implementation • 8 Mar 2024 • Sotaro Takeshita, Simone Paolo Ponzetto, Kai Eckert
Keywords, that is, content-relevant words in summaries, play an important role in efficient information conveyance, making it critical to assess if system-generated summaries contain such informative words during evaluation.
1 code implementation • 8 Mar 2024 • Sotaro Takeshita, Tommaso Green, Ines Reinig, Kai Eckert, Simone Paolo Ponzetto
Extensive efforts in the past have been directed toward the development of summarization datasets.
1 code implementation • 26 Jan 2024 • Marco Bombieri, Paolo Fiorini, Simone Paolo Ponzetto, Marco Rospocher
Large language models (LLMs) have recently revolutionized automated text understanding and generation.
no code implementations • 3 Nov 2023 • Gretel Liz De la Peña Sarracén, Paolo Rosso, Robert Litschko, Goran Glavaš, Simone Paolo Ponzetto
In this work, we resort to data augmentation and continual pre-training for domain adaptation to improve cross-lingual abusive language detection.
1 code implementation • 5 Aug 2023 • Yueling Li, Sebastian Martschat, Simone Paolo Ponzetto
We present a cross-domain approach for automated measurement and context extraction based on pre-trained language models.
1 code implementation • 13 Oct 2022 • Chia-Chien Hung, Anne Lauscher, Dirk Hovy, Simone Paolo Ponzetto, Goran Glavaš
Previous work showed that incorporating demographic factors can consistently improve performance for various NLP tasks with traditional NLP models.
1 code implementation • sdp (COLING) 2022 • Tornike Tsereteli, Yavuz Selim Kartal, Simone Paolo Ponzetto, Andrea Zielinski, Kai Eckert, Philipp Mayr
In this paper, we provide an overview of the SV-Ident shared task as part of the 3rd Workshop on Scholarly Document Processing (SDP) at COLING 2022.
Ranked #1 on Variable Disambiguation on SV-Ident
no code implementations • 14 Sep 2022 • Yavuz Selim Kartal, Sotaro Takeshita, Tornike Tsereteli, Kai Eckert, Henning Kroll, Philipp Mayr, Simone Paolo Ponzetto, Benjamin Zapilko, Andrea Zielinski
Nowadays there is a growing trend in many scientific disciplines to support researchers by providing enhanced information access through linking of publications and underlying datasets, so as to support research with infrastructure to enhance reproducibility and reusability of research results.
no code implementations • 1 Aug 2022 • Tommaso Green, Simone Paolo Ponzetto, Goran Glavaš
While pretrained language models (PLMs) primarily serve as general-purpose text encoders that can be fine-tuned for a wide variety of downstream tasks, recent work has shown that they can also be rewired to produce high-quality word representations (i.e., static word embeddings) and yield good performance in type-level lexical tasks.
1 code implementation • 1 Aug 2022 • Chia-Chien Hung, Anne Lauscher, Dirk Hovy, Simone Paolo Ponzetto, Goran Glavaš
We adapt the language representations for the sociodemographic dimensions of gender and age, using continuous language modeling and dynamic multi-task learning for adaptation, where we couple language modeling with the prediction of a sociodemographic class.
1 code implementation • 30 May 2022 • Sotaro Takeshita, Tommaso Green, Niklas Friedrich, Kai Eckert, Simone Paolo Ponzetto
The number of scientific publications nowadays is rapidly increasing, causing information overload for researchers and making it hard for scholars to keep up to date with current trends and lines of work.
3 code implementations • NAACL (MIA) 2022 • Chia-Chien Hung, Tommaso Green, Robert Litschko, Tornike Tsereteli, Sotaro Takeshita, Marco Bombieri, Goran Glavaš, Simone Paolo Ponzetto
This paper introduces our proposed system for the MIA Shared Task on Cross-lingual Open-retrieval Question Answering (COQA).
1 code implementation • NAACL 2022 • Chia-Chien Hung, Anne Lauscher, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš
We then introduce a new framework for multilingual conversational specialization of pretrained language models (PrLMs) that aims to facilitate cross-lingual transfer for arbitrary downstream TOD tasks.
1 code implementation • ACL 2022 • Carolin Holtermann, Anne Lauscher, Simone Paolo Ponzetto
We employ our resource to assess the effect of argumentative fine-tuning and debiasing on the intrinsic bias found in transformer-based language models using a lightweight adapter-based approach that is more sustainable and parameter-efficient than full fine-tuning.
no code implementations • 9 Mar 2022 • Patrizio Bellan, Han van der Aa, Mauro Dragoni, Chiara Ghidini, Simone Paolo Ponzetto
Therefore, to bridge this gap, we present the PET dataset, a first corpus of business process descriptions annotated with activities, gateways, actors, and flow information.
1 code implementation • 21 Dec 2021 • Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš
In this work we present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks across a number of diverse language pairs.
1 code implementation • 15 Oct 2021 • Chia-Chien Hung, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš
Recent work has shown that self-supervised dialog-specific pretraining on large conversational datasets yields substantial gains over traditional language modeling (LM) pretraining in downstream task-oriented dialog (TOD).
2 code implementations • 7 Oct 2021 • Patrizio Bellan, Mauro Dragoni, Chiara Ghidini, Han van der Aa, Simone Paolo Ponzetto
The extraction of process models from text refers to the problem of turning the information contained in unstructured textual process descriptions into a formal representation, i.e., a process model.
1 code implementation • 13 Aug 2021 • Tobias Walter, Celina Kirschner, Steffen Eger, Goran Glavaš, Anne Lauscher, Simone Paolo Ponzetto
We analyze bias in historical corpora as encoded in diachronic distributional semantic models by focusing on two specific forms of bias, namely a political (i.e., anti-communism) and racist (i.e., antisemitism) one.
no code implementations • 4 May 2021 • Petar Ristoski, Stefano Faralli, Simone Paolo Ponzetto, Heiko Paulheim
Taxonomies are an important ingredient of knowledge organization, and serve as a backbone for more sophisticated knowledge representations in intelligent systems, such as formal ontologies.
2 code implementations • EACL 2021 • Niklas Friedrich, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš
In this work, we present DebIE, the first integrated platform for (1) measuring and (2) mitigating bias in word embeddings.
1 code implementation • EACL 2021 • Bilal Ghanem, Simone Paolo Ponzetto, Paolo Rosso, Francisco Rangel
To capture this, we propose in this paper to model the flow of affective information in fake news articles using a neural architecture.
1 code implementation • 21 Jan 2021 • Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš
Therefore, in this work we present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks across a large number of language pairs.
no code implementations • 21 Dec 2020 • Shintaro Yamamoto, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš, Shigeo Morishima
Providing visual summaries of scientific publications can increase information access for readers and thereby help deal with the exponential growth in the number of scientific publications.
no code implementations • SEMEVAL 2020 • Goran Glavaš, Ivan Vulić, Anna Korhonen, Simone Paolo Ponzetto
The shared task spans three dimensions: (1) monolingual vs. cross-lingual LE, (2) binary vs. graded LE, and (3) a set of 6 diverse languages (and 15 corresponding language pairs).
no code implementations • COLING (WANLP) 2020 • Anne Lauscher, Rafik Takieddin, Simone Paolo Ponzetto, Goran Glavaš
Our analysis yields several interesting findings, e.g., that implicit gender bias in embeddings trained on Arabic news corpora steadily increases over time (between 2007 and 2017).
no code implementations • LREC 2020 • Varvara Logacheva, Denis Teslenko, Artem Shelmanov, Steffen Remus, Dmitry Ustalov, Andrey Kutuzov, Ekaterina Artemova, Chris Biemann, Simone Paolo Ponzetto, Alexander Panchenko
We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings by Grave et al. (2018), enabling WSD in these languages.
no code implementations • CONLL 2019 • Gavin Abercrombie, Federico Nanni, Riza Batista-Navarro, Simone Paolo Ponzetto
Debate motions (proposals) tabled in the UK Parliament contain information about the stated policy preferences of the Members of Parliament who propose them, and are key to the analysis of all subsequent speeches given in response to them.
no code implementations • IJCNLP 2019 • Fabian David Schmidt, Markus Dietsche, Simone Paolo Ponzetto, Goran Glavaš
We introduce Seagle, a platform for comparative evaluation of semantic text encoding models on information retrieval (IR) tasks.
no code implementations • 15 Oct 2019 • Bilal Ghanem, Simone Paolo Ponzetto, Paolo Rosso
We present an approach to detect fake news in Twitter at the account level using a neural recurrent model and a variety of different semantic and stylistic features.
4 code implementations • 13 Sep 2019 • Anne Lauscher, Goran Glavaš, Simone Paolo Ponzetto, Ivan Vulić
Moreover, we successfully transfer debiasing models, by means of cross-lingual embedding spaces, and remove or attenuate biases in distributional word vector spaces of languages that lack readily available bias specifications.
no code implementations • ACL 2019 • Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš
Starting from HyperLex, the only available GR-LE dataset in English, we construct new monolingual GR-LE datasets for three other languages, and combine those to create a set of six cross-lingual GR-LE datasets termed CL-HYPERLEX.
no code implementations • ACL 2019 • Goran Glavaš, Federico Nanni, Simone Paolo Ponzetto
Political scientists created resources and used available NLP methods to process textual data largely in isolation from the NLP community.
no code implementations • 11 Jun 2019 • Javier Sánchez-Junquera, Paolo Rosso, Manuel Montes-y-Gómez, Simone Paolo Ponzetto
We present experiments on detecting hyperpartisanship in news using a 'masking' method that allows us to assess the role of style vs. content for the task at hand.
1 code implementation • SEMEVAL 2019 • Saba Anwar, Dmitry Ustalov, Nikolay Arefyev, Simone Paolo Ponzetto, Chris Biemann, Alexander Panchenko
We present our system for semantic frame induction that showed the best performance in Subtask B.1 and finished as the runner-up in Subtask A of the SemEval 2019 Task 2 on unsupervised semantic frame induction (QasemiZadeh et al., 2019).
no code implementations • 18 Apr 2019 • Lydia Weiland, Ioana Hulpus, Simone Paolo Ponzetto, Wolfgang Effelsberg, Laura Dietz
We investigate the problem of understanding the message (gist) conveyed by images and their captions as found, for instance, on websites or news articles.
2 code implementations • 12 Apr 2019 • Federico Nanni, Goran Glavaš, Ines Rehbein, Simone Paolo Ponzetto, Heiner Stuckenschmidt
During the last fifteen years, automatic text scaling has become one of the key tools of the Text as Data community in political science.
no code implementations • 8 Apr 2019 • Marco Rovera, Federico Nanni, Simone Paolo Ponzetto
The progressive digitization of historical archives provides new, often domain specific, textual resources that report on facts and events which have happened in the past; among these, memoirs are a very common type of primary source.
no code implementations • WS 2018 • Anne Lauscher, Goran Glavaš, Simone Paolo Ponzetto
We analyze the annotated argumentative structures and investigate the relations between argumentation and other rhetorical aspects of scientific writing, such as discourse roles and citation contexts.
no code implementations • EMNLP 2018 • Anne Lauscher, Goran Glavaš, Simone Paolo Ponzetto, Kai Eckert
Exponential growth in the number of scientific publications yields the need for effective automatic analysis of rhetorical aspects of scientific writing.
1 code implementation • 17 Sep 2018 • Dmitry Ustalov, Alexander Panchenko, Chris Biemann, Simone Paolo Ponzetto
In this paper, we show how unsupervised sense representations can be used to improve hypernymy extraction.
2 code implementations • CL 2019 • Dmitry Ustalov, Alexander Panchenko, Chris Biemann, Simone Paolo Ponzetto
We present a detailed theoretical and computational analysis of the Watset meta-algorithm for fuzzy graph clustering, which has been found to be widely applicable in a variety of domains.
no code implementations • SEMEVAL 2018 • Thorsten Keiper, Zhonghao Lyu, Sara Pooladzadeh, Yuan Xu, Jingyi Zhang, Anne Lauscher, Simone Paolo Ponzetto
Large repositories of scientific literature call for the development of robust methods to extract information from scholarly papers.
1 code implementation • ACL 2018 • Dmitry Ustalov, Alexander Panchenko, Andrei Kutuzov, Chris Biemann, Simone Paolo Ponzetto
We use dependency triples automatically extracted from a Web-scale corpus to perform unsupervised semantic frame induction.
1 code implementation • 2 May 2018 • Robert Litschko, Goran Glavaš, Simone Paolo Ponzetto, Ivan Vulić
We propose a fully unsupervised framework for ad-hoc cross-lingual information retrieval (CLIR) which requires no bilingual data at all.
1 code implementation • LREC 2018 • Dmitry Ustalov, Denis Teslenko, Alexander Panchenko, Mikhail Chernoskutov, Chris Biemann, Simone Paolo Ponzetto
The sparse mode uses the traditional vector space model to estimate the most similar word sense corresponding to its context.
no code implementations • LREC 2018 • Stefano Faralli, Alexander Panchenko, Chris Biemann, Simone Paolo Ponzetto
We introduce a new lexical resource that enriches the Framester knowledge graph, which links FrameNet, WordNet, VerbNet and other resources, with semantic features from text corpora.
1 code implementation • 19 Jan 2018 • Goran Glavaš, Marc Franco-Salvador, Simone Paolo Ponzetto, Paolo Rosso
In contrast, we propose an unsupervised and a very resource-light approach for measuring semantic similarity between texts in different languages.
no code implementations • 23 Dec 2017 • Chris Biemann, Stefano Faralli, Alexander Panchenko, Simone Paolo Ponzetto
While both kinds of semantic resources are available with high lexical coverage, our aligned resource combines the domain specificity and availability of contextual information from distributional models with the conciseness and high quality of manually crafted lexical networks.
no code implementations • LREC 2018 • Alexander Panchenko, Eugen Ruppert, Stefano Faralli, Simone Paolo Ponzetto, Chris Biemann
We present DepCC, the largest-to-date linguistically analyzed corpus in English including 365 million documents, composed of 252 billion tokens and 7.5 billion named entity occurrences in 14.3 billion sentences from a web-scale crawl of the Common Crawl project.
no code implementations • WS 2017 • Sanja Štajner, Victoria Yaneva, Ruslan Mitkov, Simone Paolo Ponzetto
Eye tracking studies from the past few decades have shaped the way we think of word complexity and cognitive load: words that are long, rare and ambiguous are more difficult to read.
no code implementations • EMNLP 2017 • Stefano Menini, Federico Nanni, Simone Paolo Ponzetto, Sara Tonelli
We present a topic-based analysis of agreement and disagreement in political manifestos, which relies on a new method for topic detection based on key concept clustering.
no code implementations • EMNLP 2017 • Goran Glavaš, Simone Paolo Ponzetto
Detection of lexico-semantic relations is one of the central tasks of computational semantics.
no code implementations • WS 2017 • Goran Glavaš, Federico Nanni, Simone Paolo Ponzetto
In this paper, we propose an approach for cross-lingual topical coding of sentences from electoral manifestos of political parties in different languages.
1 code implementation • EMNLP 2017 • Alexander Panchenko, Fide Marten, Eugen Ruppert, Stefano Faralli, Dmitry Ustalov, Simone Paolo Ponzetto, Chris Biemann
In word sense disambiguation (WSD), knowledge-based systems tend to be much more interpretable than knowledge-free counterparts as they rely on the wealth of manually-encoded elements representing word senses, such as hypernyms, usage examples, and images.
1 code implementation • ACL 2017 • Sergiu Nisioi, Sanja Štajner, Simone Paolo Ponzetto, Liviu P. Dinu
Unlike the previously proposed automated TS systems, our neural text simplification (NTS) systems are able to simultaneously perform lexical simplification and content reduction.
Ranked #14 on Text Simplification on TurkCorpus
no code implementations • ACL 2017 • Sanja Štajner, Marc Franco-Salvador, Simone Paolo Ponzetto, Paolo Rosso, Heiner Stuckenschmidt
We provide several methods for sentence-alignment of texts with different complexity levels.
no code implementations • WS 2017 • Alexander Panchenko, Stefano Faralli, Simone Paolo Ponzetto, Chris Biemann
We introduce a new method for unsupervised knowledge-based word sense disambiguation (WSD) based on a resource that links two types of sense-aware lexical networks: one is induced from a corpus using distributional semantics, the other is manually constructed.
no code implementations • EACL 2017 • Patrick Klein, Simone Paolo Ponzetto, Goran Glavaš
We exploit multilingual synsets from BabelNet to translate English triples to other languages and then augment the reference knowledge base with cross-lingual triples.
1 code implementation • EACL 2017 • Goran Glavaš, Federico Nanni, Simone Paolo Ponzetto
Political text scaling aims to linearly order parties and politicians across political dimensions (e.g., left-to-right ideology) based on textual content (e.g., politician speeches or party manifestos).
no code implementations • EACL 2017 • Stefano Faralli, Alexander Panchenko, Chris Biemann, Simone Paolo Ponzetto
In this paper, we present ContrastMedium, an algorithm that transforms noisy semantic networks into full-fledged, clean taxonomies.
no code implementations • EACL 2017 • Alexander Panchenko, Eugen Ruppert, Stefano Faralli, Simone Paolo Ponzetto, Chris Biemann
Using word sense induction and disambiguation (WSID) as an example, we show that it is possible to develop an interpretable model that matches the state-of-the-art models in accuracy.
no code implementations • SEMEVAL 2016 • Alexander Panchenko, Stefano Faralli, Eugen Ruppert, Steffen Remus, Hubert Naets, Cédrick Fairon, Simone Paolo Ponzetto, Chris Biemann
no code implementations • LREC 2016 • Julian Seitner, Christian Bizer, Kai Eckert, Stefano Faralli, Robert Meusel, Heiko Paulheim, Simone Paolo Ponzetto
Hypernymy relations (those where a hyponym term shares an "is-a" relationship with its hypernym) play a key role for many Natural Language Processing (NLP) tasks, e.g., ontology learning, automatically building or extending knowledge bases, or word sense disambiguation and induction.
no code implementations • LREC 2014 • Gregor Titze, Volha Bryl, Cäcilia Zirn, Simone Paolo Ponzetto
We present an approach for augmenting DBpedia, a very large ontology lying at the heart of the Linked Open Data (LOD) cloud, with domain information.