1 code implementation • EMNLP 2021 • Miloš Stanojević, Shay B. Cohen
We describe two approaches to single-root dependency parsing that yield significant speed-ups for this type of parsing.
no code implementations • CL (ACL) 2021 • Jiangming Liu, Shay B. Cohen, Mirella Lapata, Johan Bos
We consider the task of cross-lingual semantic parsing in the style of Discourse Representation Theory (DRT), where knowledge from annotated corpora in a resource-rich language is transferred via bitext to guide learning in other languages.
1 code implementation • Findings (EMNLP) 2021 • Mohammad Javad Hosseini, Shay B. Cohen, Mark Johnson, Mark Steedman
In this paper, we introduce the new task of open-domain contextual link prediction which has access to both the textual context and the KG structure to perform link prediction.
no code implementations • 14 Jan 2025 • Yifu Qiu, Varun Embar, Yizhe Zhang, Navdeep Jaitly, Shay B. Cohen, Benjamin Han
Recent advancements in long-context language models (LCLMs) promise to transform Retrieval-Augmented Generation (RAG) by simplifying pipelines.
1 code implementation • 18 Nov 2024 • Weixian Waylon Li, Yftah Ziser, Yifei Xie, Shay B. Cohen, Tiejun Ma
TSPRank reframes the ranking problem as a Travelling Salesman Problem (TSP), a well-known combinatorial optimisation challenge that has been extensively studied for its numerous solution algorithms and applications.
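As a rough illustration of the reframing described above, the sketch below reads a ranking off the highest-scoring visiting order over a pairwise score matrix. The score matrix, the open-path objective, and the brute-force search are illustrative assumptions, not TSPRank's actual model or solver.

```python
# Illustrative sketch only: recover a ranking by searching for the
# highest-scoring visiting order (an open-path TSP-style objective) over
# pairwise preference scores. Real systems would use a proper TSP solver.
from itertools import permutations
import numpy as np

def rank_via_tsp(pairwise_scores: np.ndarray) -> list[int]:
    """pairwise_scores[i, j]: assumed score for placing item i directly before item j."""
    n = pairwise_scores.shape[0]
    best_order, best_score = None, float("-inf")
    for order in permutations(range(n)):  # exhaustive search; only feasible for small n
        score = sum(pairwise_scores[a, b] for a, b in zip(order, order[1:]))
        if score > best_score:
            best_order, best_score = list(order), score
    return best_order

# Toy example: 4 items with random preference scores.
rng = np.random.default_rng(0)
print(rank_via_tsp(rng.normal(size=(4, 4))))
```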
no code implementations • 25 Oct 2024 • Zheng Zhao, Yftah Ziser, Shay B. Cohen
This study investigates the task-specific information encoded in pre-trained LLMs and the effects of instruction tuning on their representations across a diverse set of over 60 NLP tasks.
1 code implementation • 14 Oct 2024 • Joshua Ong Jun Leang, Aryo Pradipta Gema, Shay B. Cohen
Mathematical reasoning remains a significant challenge for large language models (LLMs), despite progress in prompting techniques such as Chain-of-Thought (CoT).
no code implementations • 14 Oct 2024 • Mengyu Wang, Shay B. Cohen, Tiejun Ma
The diffusion of financial news into market prices is a complex process, making it challenging to evaluate the connections between news events and market movements.
1 code implementation • 11 Oct 2024 • Tingchen Fu, Mrinank Sharma, Philip Torr, Shay B. Cohen, David Krueger, Fazl Barez
Preference learning is a central component for aligning current LLMs, but this process can be vulnerable to data poisoning attacks.
no code implementations • 20 Aug 2024 • Nickil Maveli, Antonio Vergari, Shay B. Cohen
Code-LLMs, LLMs pre-trained on large code corpora, have shown great progress in learning rich representations of the structure and syntax of code, successfully using these representations to generate or classify code fragments.
1 code implementation • 3 Jul 2024 • Guojun Wu, Shay B. Cohen, Rico Sennrich
We introduce a dataset comprising commercial machine translations, gathered weekly over six years across 12 translation directions.
1 code implementation • 31 May 2024 • Linus Ericsson, Miguel Espinosa, Chenhongyi Yang, Antreas Antoniou, Amos Storkey, Shay B. Cohen, Steven McDonagh, Elliot J. Crowley
Using this search space, we perform experiments to find novel architectures as well as improvements on existing ones on the diverse Unseen NAS datasets.
1 code implementation • 15 May 2024 • Yifu Qiu, Zheng Zhao, Yftah Ziser, Anna Korhonen, Edoardo M. Ponti, Shay B. Cohen
Large language models (LLMs) often exhibit undesirable behaviours, such as generating untruthful or biased content.
1 code implementation • 20 Mar 2024 • Dongwei Jiang, Marcio Fonseca, Shay B. Cohen
Large language models (LLMs) often struggle with complex logical reasoning due to logical inconsistencies and the inherent difficulty of such reasoning.
1 code implementation • 11 Mar 2024 • Balint Gyevnar, Stephanie Droop, Tadeg Quillien, Shay B. Cohen, Neil R. Bramley, Christopher G. Lucas, Stefano V. Albrecht
Based on insights from cognitive science, we propose a framework of explanatory modes to analyze how people frame explanations, whether mechanistic, teleological, or counterfactual.
no code implementations • 23 Feb 2024 • Clement Neo, Shay B. Cohen, Fazl Barez
Understanding the inner workings of large language models (LLMs) is crucial for advancing their theoretical foundations and real-world applications.
no code implementations • 16 Feb 2024 • Ronald Cardenas, Matthias Galle, Shay B. Cohen
Extractive summaries are usually presented as lists of sentences with no expected cohesion between them.
no code implementations • 18 Jan 2024 • Marcio Fonseca, Shay B. Cohen
Also, we show that we can improve the controllability of LLMs with keyword-based classifier-free guidance (CFG) while achieving lexical overlap comparable to strong fine-tuned baselines on arXiv and PubMed.
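A minimal sketch of the generic classifier-free guidance (CFG) combination rule referenced above, applied to next-token logits; this is the standard CFG formula, not necessarily the paper's exact keyword-conditioning setup, and `logits_cond`, `logits_uncond`, and `gamma` are illustrative placeholders.

```python
# Generic classifier-free guidance on next-token logits:
# interpolate away from the unconditional distribution toward the
# keyword-conditioned one; gamma = 1 recovers the conditional model.
import numpy as np

def cfg_logits(logits_cond: np.ndarray, logits_uncond: np.ndarray, gamma: float) -> np.ndarray:
    return logits_uncond + gamma * (logits_cond - logits_uncond)

vocab = 5
cond = np.log(np.array([0.1, 0.4, 0.2, 0.2, 0.1]))   # toy conditional distribution
uncond = np.log(np.full(vocab, 1 / vocab))            # toy unconditional distribution
print(cfg_logits(cond, uncond, gamma=2.0))
```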
1 code implementation • 3 Jan 2024 • Michelle Lo, Shay B. Cohen, Fazl Barez
This demonstrates that models exhibit polysemantic capacities and can blend old and new concepts in individual neurons.
1 code implementation • 6 Dec 2023 • Jonas Groschwitz, Shay B. Cohen, Lucia Donatelli, Meaghan Fowlie
We present the Granular AMR Parsing Evaluation Suite (GrAPES), a challenge set for Abstract Meaning Representation (AMR) parsing with accompanying evaluation metrics.
1 code implementation • 16 Nov 2023 • Yifu Qiu, Varun Embar, Shay B. Cohen, Benjamin Han
Knowledge-to-text generators often struggle to faithfully generate descriptions for the input facts: they may produce hallucinations that contradict the input, or describe facts not present in the input.
1 code implementation • 15 Nov 2023 • Marcio Fonseca, Shay B. Cohen
Although large language models (LLMs) exhibit remarkable capacity to leverage in-context demonstrations, it is still unclear to what extent they can learn new concepts or facts from ground-truth labels.
1 code implementation • 14 Nov 2023 • Yifu Qiu, Zheng Zhao, Yftah Ziser, Anna Korhonen, Edoardo M. Ponti, Shay B. Cohen
Instead, we provide LLMs with textual narratives and probe them with respect to their common-sense knowledge of the structure and duration of events, their ability to order events along a timeline, and self-consistency within their temporal model (e.g., temporal relations such as after and before are mutually exclusive for any pair of events).
1 code implementation • 24 Oct 2023 • Zheng Zhao, Yftah Ziser, Bonnie Webber, Shay B. Cohen
Using this tool, we study to what extent and how morphosyntactic features are reflected in the representations learned by multilingual pre-trained models.
1 code implementation • 31 May 2023 • Paul Darm, Antonio Valerio Miceli-Barone, Shay B. Cohen, Annalisa Riccardi
In this work we present a system, developed for the European Space Agency (ESA), that can answer complex natural language queries to support engineers in accessing the information contained in a KB that models the orbital space debris environment.
1 code implementation • 26 May 2023 • Matt Grenander, Shay B. Cohen, Mark Steedman
We propose a sentence-incremental neural coreference resolution system which incrementally builds clusters after marking mention boundaries in a shift-reduce method.
1 code implementation • 24 May 2023 • Antonio Valerio Miceli-Barone, Fazl Barez, Ioannis Konstas, Shay B. Cohen
Large Language Models (LLMs) have successfully been applied to code generation tasks, raising the question of how well these models understand programming.
1 code implementation • 23 May 2023 • Yifu Qiu, Yftah Ziser, Anna Korhonen, Edoardo M. Ponti, Shay B. Cohen
With existing faithfulness metrics focusing on English, even measuring the extent of this phenomenon in cross-lingual settings is hard.
1 code implementation • 15 May 2023 • Ashok Urlana, Pinzhen Chen, Zheng Zhao, Shay B. Cohen, Manish Shrivastava, Barry Haddow
This paper introduces PMIndiaSum, a multilingual and massively parallel summarization corpus focused on languages in India.
1 code implementation • 21 Feb 2023 • Balint Gyevnar, Cheng Wang, Christopher G. Lucas, Shay B. Cohen, Stefano V. Albrecht
We present CEMA (Causal Explanations in Multi-Agent systems), a framework for creating causal natural language explanations of an agent's decisions in dynamic sequential multi-agent systems, to build more trustworthy autonomous agents.
1 code implementation • 18 Feb 2023 • Weixian Waylon Li, Yftah Ziser, Maximin Coavoux, Shay B. Cohen
While the first decoding method matches a proof to a statement without being aware of other statements or proofs, the second method treats the task as a global matching problem.
1 code implementation • 17 Nov 2022 • Yifu Qiu, Shay B. Cohen
Sequential abstractive neural summarizers often do not use the underlying structure in the input article or dependencies between the input sentences.
1 code implementation • 22 Oct 2022 • Zheng Zhao, Yftah Ziser, Shay B. Cohen
We investigate how different domains are encoded in modern neural network architectures.
1 code implementation • 25 May 2022 • Marcio Fonseca, Yftah Ziser, Shay B. Cohen
We argue that disentangling content selection from the budget used to cover salient content improves the performance and applicability of abstractive summarizers.
Ranked #1 on Text Summarization on GovReport
1 code implementation • 20 May 2022 • Ronald Cardenas, Matthias Galle, Shay B. Cohen
Extractive summaries are usually presented as lists of sentences with no expected cohesion between them and with plenty of redundant information if not accounted for.
1 code implementation • 15 Mar 2022 • Shun Shao, Yftah Ziser, Shay B. Cohen
We describe a simple and effective method (Spectral Attribute removaL; SAL) to remove private or guarded information from neural representations.
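The sketch below illustrates the general idea of removing an attribute's dominant directions from representations via an SVD of a cross-covariance matrix; it is a generic spectral illustration under assumed variable names, not the published SAL algorithm.

```python
# Rough spectral sketch: find the directions of the representation space that
# covary most with a protected attribute and project them out.
# Illustrative only; not the published SAL algorithm.
import numpy as np

def remove_attribute_directions(X: np.ndarray, z: np.ndarray, k: int = 2) -> np.ndarray:
    """X: (n, d) representations; z: (n, a) one-hot protected attribute; k: directions removed."""
    Xc = X - X.mean(axis=0)
    zc = z - z.mean(axis=0)
    cov = Xc.T @ zc                           # (d, a) cross-covariance with the attribute
    U, _, _ = np.linalg.svd(cov, full_matrices=False)
    U_k = U[:, :k]                            # top-k directions most predictive of z
    return X - (X @ U_k) @ U_k.T              # project the representations off those directions

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 16))
z = np.eye(2)[rng.integers(0, 2, size=100)]   # toy binary protected attribute
print(remove_attribute_directions(X, z).shape)
```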
1 code implementation • Findings (ACL) 2022 • Nickil Maveli, Shay B. Cohen
We introduce a method for unsupervised parsing that relies on bootstrapping classifiers to identify if a node dominates a specific span in a sentence.
Ranked #7 on Constituency Grammar Induction on the Penn Treebank (PTB) (using extra training data)
no code implementations • NAACL 2021 • Jiangming Liu, Shay B. Cohen, Mirella Lapata
We propose neural models to generate text from formal meaning representations based on Discourse Representation Structures (DRSs).
no code implementations • 16 Apr 2021 • Ronald Cardenas, Matthias Galle, Shay B. Cohen
We introduce a wide range of heuristics that leverage cognitive representations of content units and how these are retained or forgotten in human memory.
2 code implementations • 3 Feb 2021 • Maximin Coavoux, Shay B. Cohen
The task is designed to improve the processing of research-level mathematical texts.
no code implementations • 17 Jan 2021 • Nikos Papasarantopoulos, Shay B. Cohen
Research on text generation from multimodal inputs has largely focused on static images, and less on video data.
no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Yuan He, Shay B. Cohen
Approaching named entity transliteration as a Neural Machine Translation (NMT) problem is common practice.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Ida Szubert, Marco Damonte, Shay B. Cohen, Mark Steedman
Abstract Meaning Representation (AMR) parsing aims at converting sentences into AMR representations.
no code implementations • EMNLP 2021 • Chunchuan Lyu, Shay B. Cohen, Ivan Titov
In contrast, we treat both alignment and segmentation as latent variables in our model and induce them as part of end-to-end training.
Ranked #22 on AMR Parsing on LDC2017T10
1 code implementation • EMNLP 2020 • Yan Zhang, Zhijiang Guo, Zhiyang Teng, Wei Lu, Shay B. Cohen, Zuozhu Liu, Lidong Bing
With the help of these strategies, we are able to train a model with fewer parameters while maintaining the model capacity.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Zheng Zhao, Shay B. Cohen, Bonnie Webber
It is well known that abstractive summaries are subject to hallucination, including material that is not supported by the original text.
1 code implementation • 17 Aug 2020 • Zhunxuan Wang, Linyun He, Chunchuan Lyu, Shay B. Cohen
We describe an algorithm that learns two-layer residual units using rectified linear unit (ReLU) activation: suppose the input $\mathbf{x}$ is from a distribution with support space $\mathbb{R}^d$ and the ground-truth generative model is a residual unit of this type, given by $\mathbf{y} = \boldsymbol{B}^\ast\left[\left(\boldsymbol{A}^\ast\mathbf{x}\right)^+ + \mathbf{x}\right]$, where ground-truth network parameters $\boldsymbol{A}^\ast \in \mathbb{R}^{d\times d}$ represent a full-rank matrix with nonnegative entries and $\boldsymbol{B}^\ast \in \mathbb{R}^{m\times d}$ is full-rank with $m \geq d$ and for $\boldsymbol{c} \in \mathbb{R}^d$, $[\boldsymbol{c}^{+}]_i = \max\{0, c_i\}$.
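Since the generative model is stated explicitly above, a direct evaluation of it is easy to write down; the snippet below simply computes y = B*[(A*x)^+ + x] for random parameters satisfying the stated conditions (nonnegative full-rank A*, m >= d).

```python
# Direct evaluation of the generative model described above:
# y = B [ (A x)^+ + x ], where (.)^+ is an element-wise ReLU.
import numpy as np

def residual_unit(A: np.ndarray, B: np.ndarray, x: np.ndarray) -> np.ndarray:
    return B @ (np.maximum(A @ x, 0.0) + x)

d, m = 4, 6
rng = np.random.default_rng(0)
A = np.abs(rng.normal(size=(d, d)))   # nonnegative entries, as in the text
B = rng.normal(size=(m, d))           # m >= d
x = rng.normal(size=d)
print(residual_unit(A, B, x))
```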
1 code implementation • ACL 2020 • Or Honovich, Lucas Torroba Hennigen, Omri Abend, Shay B. Cohen
Machine reading is an ambitious goal in NLP that subsumes a wide range of text understanding capabilities.
no code implementations • ACL 2020 • Jiangming Liu, Shay B. Cohen, Mirella Lapata
Discourse representation structures (DRSs) are scoped semantic representations for texts of arbitrary length.
no code implementations • ACL 2020 • Gabriel Gordon-Hall, Philip John Gorinski, Shay B. Cohen
Deep reinforcement learning is a promising approach to training a dialog manager, but current methods struggle with the large state and action spaces of multi-domain dialog systems.
no code implementations • EMNLP 2020 • Jiangming Liu, Matt Gardner, Shay B. Cohen, Mirella Lapata
Complex reasoning over text requires understanding and chaining together free-form predicates and logical connectives.
1 code implementation • ICLR 2020 • Yi Ren, Shangmin Guo, Matthieu Labeau, Shay B. Cohen, Simon Kirby
The principle of compositionality, which enables natural language to represent complex concepts via a structured combination of simpler ones, allows us to convey an open-ended set of messages using a limited vocabulary.
no code implementations • IJCNLP 2019 • Matthieu Labeau, Shay B. Cohen
In this paper, we experiment with several families (alpha, beta and gamma) of power divergences, generalized from the KL divergence, for learning language models with an objective different than standard MLE.
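As one illustration of the divergence families mentioned above, the alpha-divergence can be written in the following standard form, which recovers the KL divergence in the limit; this is a textbook parameterisation and not necessarily the exact one used in the paper.

```latex
% One common parameterisation of the alpha-divergence; KL is recovered as alpha -> 1.
D_\alpha(p \,\|\, q) \;=\; \frac{1}{\alpha(\alpha-1)}
  \left( \int p(x)^{\alpha}\, q(x)^{1-\alpha}\, dx \;-\; 1 \right),
\qquad
\lim_{\alpha \to 1} D_\alpha(p \,\|\, q) \;=\; \mathrm{KL}(p \,\|\, q).
```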
no code implementations • IJCNLP 2019 • Nikos Papasarantopoulos, Lea Frermann, Mirella Lapata, Shay B. Cohen
Multi-view learning algorithms are powerful representation learning tools, often exploited in the context of multimodal problems.
1 code implementation • IJCNLP 2019 • Chunchuan Lyu, Shay B. Cohen, Ivan Titov
Modern state-of-the-art Semantic Role Labeling (SRL) methods rely on expressive sentence encoders (e.g., multi-layer LSTMs) but tend to model only local (if any) interactions between individual argument labeling decisions.
no code implementations • WS 2019 • Johanna Bj{\"o}rklund, Shay B. Cohen, Frank Drewes, Giorgio Satta
We propose a formal model for translating unranked syntactic trees, such as dependency trees, into semantic graphs.
1 code implementation • 19 Jul 2019 • Shashi Narayan, Shay B. Cohen, Mirella Lapata
We introduce 'extreme summarization', a new single-document summarization task which aims at creating a short, one-sentence news summary answering the question "What is the article about?".
no code implementations • ACL 2019 • John Torr, Milos Stanojevic, Mark Steedman, Shay B. Cohen
Minimalist Grammars (Stabler, 1997) are a computationally oriented and rigorous formalisation of many aspects of Chomsky's (1995) Minimalist Program.
1 code implementation • ACL 2019 • Mohammad Javad Hosseini, Shay B. Cohen, Mark Johnson, Mark Steedman
The new entailment score outperforms prior state-of-the-art results on a standard entailment dataset and the new link prediction scores show improvements over the raw link prediction scores.
no code implementations • ACL 2019 • Jiangming Liu, Shay B. Cohen, Mirella Lapata
We introduce a novel semantic parsing task based on Discourse Representation Theory (DRT; Kamp and Reyle 1993).
no code implementations • WS 2019 • Jiangming Liu, Shay B. Cohen, Mirella Lapata
Our best system achieves a score of 84.8% F1 in the DRS parsing shared task.
Ranked #2 on DRS Parsing on PMB-2.2.0
1 code implementation • WS 2020 • Zhifeng Hu, Serhii Havrylov, Ivan Titov, Shay B. Cohen
We introduce an idea for a privacy-preserving transformation on natural language data, inspired by homomorphic encryption.
1 code implementation • NAACL 2019 • Afonso Mendes, Shashi Narayan, Sebastião Miranda, Zita Marinho, André F. T. Martins, Shay B. Cohen
We present a new neural model for text summarization that first extracts sentences from a document and then compresses them.
1 code implementation • NAACL 2019 • Maximin Coavoux, Shay B. Cohen
We introduce a novel transition system for discontinuous constituency parsing.
2 code implementations • NAACL 2019 • Marco Damonte, Shay B. Cohen
AMR-to-text generation is a problem recently introduced to the NLP community, in which the goal is to generate sentences from Abstract Meaning Representation (AMR) graphs.
Ranked #2 on Graph-to-Sequence on LDC2015E86
1 code implementation • TACL 2019 • Maximin Coavoux, Benoît Crabbé, Shay B. Cohen
Lexicalized parsing models are based on the assumptions that (i) constituents are organized around a lexical head (ii) bilexical statistics are crucial to solve ambiguities.
2 code implementations • EMNLP 2018 • Sebastião Miranda, Artūrs Znotiņš, Shay B. Cohen, Guntis Barzdins
Clustering news across languages enables efficient media monitoring by aggregating articles from multilingual sources into coherent stories.
1 code implementation • EMNLP 2018 • Maximin Coavoux, Shashi Narayan, Shay B. Cohen
This article deals with adversarial attacks towards deep learning systems for Natural Language Processing (NLP), in the context of privacy protection.
3 code implementations • EMNLP 2018 • Shashi Narayan, Shay B. Cohen, Mirella Lapata
We introduce extreme summarization, a new single-document summarization task which does not favor extractive strategies and calls for an abstractive modeling approach.
Ranked #9 on Text Summarization on X-Sum
no code implementations • COLING 2018 • Joana Ribeiro, Shashi Narayan, Shay B. Cohen, Xavier Carreras
We show that the general problem of string transduction can be reduced to the problem of sequence labeling.
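As a toy illustration of casting transduction as labeling (not the paper's actual reduction), the sketch below derives per-character edit labels from a difflib alignment, so that predicting the labels reconstructs the target string; the label scheme and alignment are assumptions.

```python
# Toy reduction of string transduction to sequence labeling: each source
# character receives an edit label (KEEP, SUB:x, DEL, or a trailing INS:x).
# difflib alignment is a stand-in for the paper's actual reduction.
import difflib

def edit_labels(src: str, tgt: str) -> list[str]:
    labels = []
    matcher = difflib.SequenceMatcher(a=src, b=tgt, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            labels.extend("KEEP" for _ in range(i2 - i1))
        elif op == "replace":
            labels.extend(f"SUB:{c}" for c in tgt[j1:j2])   # toy: assumes equal-length spans
        elif op == "delete":
            labels.extend("DEL" for _ in range(i2 - i1))
        elif op == "insert":
            labels.append(f"INS:{tgt[j1:j2]}")              # attach insertion to previous position
    return labels

print(edit_labels("recieve", "receive"))  # ['KEEP', 'KEEP', 'KEEP', 'SUB:e', 'SUB:i', 'KEEP', 'KEEP']
```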
1 code implementation • ACL 2018 • Shashi Narayan, Ronald Cardenas, Nikos Papasarantopoulos, Shay B. Cohen, Mirella Lapata, Jiangsheng Yu, Yi Chang
Document modeling is essential to a variety of natural language understanding tasks.
1 code implementation • ACL 2018 • Jiangming Liu, Shay B. Cohen, Mirella Lapata
We introduce an open-domain neural semantic parser which generates formal meaning representations in the style of Discourse Representation Theory (DRT; Kamp and Reyle 1993).
1 code implementation • ACL 2018 • Yumo Xu, Shay B. Cohen
Stock movement prediction is a challenging problem: the market is highly stochastic, and we make temporally-dependent predictions from chaotic data.
Ranked #2 on Stock Market Prediction on stocknet (using extra training data)
no code implementations • NAACL 2018 • Fuad Issa, Marco Damonte, Shay B. Cohen, Xiaohui Yan, Yi Chang
Abstract Meaning Representation (AMR) parsing aims at abstracting away from the syntactic realization of a sentence, and denote only its meaning in a canonical form.
1 code implementation • NAACL 2018 • Shashi Narayan, Shay B. Cohen, Mirella Lapata
In this paper we conceptualize extractive summarization as a sentence ranking task and propose a novel training algorithm which globally optimizes the ROUGE evaluation metric through a reinforcement learning objective.
Ranked #13 on Extractive Text Summarization on CNN / Daily Mail
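A minimal REINFORCE-style sketch of the kind of objective described in the entry above: sample an extract, score it against a reference, and push up the log-probability of high-reward samples. The unigram-F1 reward is a crude stand-in for ROUGE, and the scoring model is a placeholder; this is not the paper's training algorithm.

```python
# REINFORCE sketch for extractive selection with a summary-level reward.
import torch

def unigram_f1(candidate: str, reference: str) -> float:
    c, r = set(candidate.split()), set(reference.split())
    overlap = len(c & r)
    if overlap == 0 or not c or not r:
        return 0.0
    prec, rec = overlap / len(c), overlap / len(r)
    return 2 * prec * rec / (prec + rec)

sentences = ["the cat sat", "stocks fell sharply", "the dog barked"]
reference = "the cat sat on the mat"
logits = torch.zeros(len(sentences), requires_grad=True)       # per-sentence selection scores
probs = torch.sigmoid(logits)
sample = torch.bernoulli(probs.detach())                        # sample an extract
summary = " ".join(s for s, keep in zip(sentences, sample.tolist()) if keep)
reward = unigram_f1(summary, reference)                         # stand-in for ROUGE
log_prob = (sample * torch.log(probs) + (1 - sample) * torch.log(1 - probs)).sum()
loss = -reward * log_prob                                       # REINFORCE objective
loss.backward()
print(reward, logits.grad)
```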
1 code implementation • TACL 2018 • Mohammad Javad Hosseini, Nathanael Chambers, Siva Reddy, Xavier R. Holt, Shay B. Cohen, Mark Johnson, Mark Steedman
We instead propose a scalable method that learns globally consistent similarity scores based on new soft constraints that consider both the structures across typed entailment graphs and inside each graph.
1 code implementation • TACL 2018 • Lea Frermann, Shay B. Cohen, Mirella Lapata
In this paper we argue that crime drama exemplified in television programs such as CSI:Crime Scene Investigation is an ideal testbed for approximating real-world natural language understanding and the complex inferences associated with it.
2 code implementations • EMNLP 2017 • Shashi Narayan, Claire Gardent, Shay B. Cohen, Anastasia Shimorina
We propose a new sentence simplification task (Split-and-Rephrase) where the aim is to split a complex sentence into a meaning preserving sequence of shorter sentences.
1 code implementation • NAACL 2018 • Marco Damonte, Shay B. Cohen
Abstract Meaning Representation (AMR) annotation efforts have mostly focused on English.
1 code implementation • 14 Apr 2017 • Shashi Narayan, Nikos Papasarantopoulos, Shay B. Cohen, Mirella Lapata
Most extractive summarization methods focus on the main body of the document from which sentences need to be extracted.
no code implementations • EACL 2017 • Renars Liepins, Ulrich Germann, Guntis Barzdins, Alexandra Birch, Steve Renals, Susanne Weber, Peggy van der Kreeft, Hervé Bourlard, João Prieto, Ondřej Klejch, Peter Bell, Alexandros Lazaridis, Alfonso Mendes, Sebastian Riedel, Mariana S. C. Almeida, Pedro Balage, Shay B. Cohen, Tomasz Dwojak, Philip N. Garner, Andreas Giefer, Marcin Junczys-Dowmunt, Hina Imran, David Nogueira, Ahmed Ali, Sebastião Miranda, Andrei Popescu-Belis, Lesly Miculicich Werlen, Nikos Papasarantopoulos, Abiola Obamuyide, Clive Jones, Fahim Dalvi, Andreas Vlachos, Yang Wang, Sibo Tong, Rico Sennrich, Nikolaos Pappas, Shashi Narayan, Marco Damonte, Nadir Durrani, Sameer Khurana, Ahmed Abdelali, Hassan Sajjad, Stephan Vogel, David Sheppey, Chris Hernon, Jeff Mitchell
We present the first prototype of the SUMMA Platform: an integrated platform for multilingual media monitoring.
4 code implementations • EACL 2017 • Marco Damonte, Shay B. Cohen, Giorgio Satta
We describe a transition-based parser for AMR that parses sentences left-to-right, in linear time.
Ranked #5 on AMR Parsing on LDC2015E86
no code implementations • 9 Aug 2016 • Nikos Papasarantopoulos, Helen Jiang, Shay B. Cohen
We describe a technique for structured prediction, based on canonical correlation analysis.
no code implementations • ACL 2016 • Shashi Narayan, Shay B. Cohen
We describe a search algorithm for optimizing the number of latent states when estimating latent-variable PCFGs with spectral methods.
no code implementations • WS 2016 • Shashi Narayan, Siva Reddy, Shay B. Cohen
One of the limitations of semantic parsing approaches to open-domain question answering is the lexicosyntactic gap between natural language questions and knowledge base entries: there are many ways to ask a question, all with the same answer.
no code implementations • 4 Nov 2015 • Guillaume Rabusseau, Borja Balle, Shay B. Cohen
We describe a technique to minimize weighted tree automata (WTA), a powerful formalism that subsumes probabilistic context-free grammars (PCFGs) and latent-variable PCFGs.
no code implementations • TACL 2016 • Dominique Osborne, Shashi Narayan, Shay B. Cohen
Canonical correlation analysis (CCA) is a method for reducing the dimension of data represented using two views.
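A small example of the general CCA setup described above: two views of the same underlying signal are projected into a shared low-dimensional space where their correlation is maximised. It uses scikit-learn's CCA for illustration and synthetic data; it is not the paper's specific method.

```python
# Two synthetic views driven by a shared latent signal, reduced with CCA.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))                                          # shared signal
X = latent @ rng.normal(size=(2, 10)) + 0.1 * rng.normal(size=(200, 10))    # view 1
Y = latent @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(200, 8))      # view 2

cca = CCA(n_components=2)
cca.fit(X, Y)
X_c, Y_c = cca.transform(X, Y)                  # low-dimensional, correlated projections
print(np.corrcoef(X_c[:, 0], Y_c[:, 0])[0, 1])  # close to 1 for the first component pair
```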
no code implementations • EMNLP 2015 • Shashi Narayan, Shay B. Cohen
We describe an approach to create a diverse set of predictions with spectral learning of latent-variable PCFGs (L-PCFGs).
no code implementations • CL 2016 • Shay B. Cohen, Daniel Gildea
Our result provides another proof for the best known result for parsing mildly context-sensitive formalisms such as combinatory categorial grammars, head grammars, linear indexed grammars, and tree adjoining grammars, which can be parsed in time $O(n^{4.76})$.
no code implementations • 18 Oct 2014 • Chiraag Lala, Shay B. Cohen
We describe a visualization tool that can be used to view the change in meaning of words over time.
no code implementations • TACL 2014 • Ke Zhai, Jordan Boyd-Graber, Shay B. Cohen
Adaptor grammars are a flexible, powerful formalism for defining nonparametric, unsupervised models of grammar productions.
no code implementations • NeurIPS 2012 • Michael Collins, Shay B. Cohen
We describe an approach to speed-up inference with latent variable PCFGs, which have been shown to be highly effective for natural language parsing.
no code implementations • NeurIPS 2010 • Noah A. Smith, Shay B. Cohen
Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures.