1 code implementation • EACL (AdaptNLP) 2021 • Antonis Maronikolakis, Hinrich Schütze
Thus, instead of training multiple models, we can train a single multidomain model, saving on computational resources and training time.
no code implementations • EMNLP 2021 • Timo Schick, Hinrich Schütze
Providing pretrained language models with simple task descriptions in natural language enables them to solve some tasks in a fully unsupervised fashion.
no code implementations • Findings (EMNLP) 2021 • Antonis Maronikolakis, Philipp Dufter, Hinrich Schütze
The size of the vocabulary is a central design choice in large pretrained language models, with respect to both performance and memory requirements.
no code implementations • 13 May 2022 • Antonis Maronikolakis, Philip Baader, Hinrich Schütze
To tackle the rising phenomenon of hate speech, efforts have been made towards data curation and analysis.
no code implementations • ACL 2022 • Yihong Liu, Haris Jabbar, Hinrich Schütze
The primary novelties of our model are: (a) capturing language-specific sentence representations separately for each language using normalizing flows and (b) using a simple transformation of these latent representations for translating from one language to another.
no code implementations • 31 Mar 2022 • Marina Sedinkina, Martin Schmitt, Hinrich Schütze
The practical success of much of NLP depends on the availability of training data.
1 code implementation • ACL 2022 • Leonie Weissweiler, Valentin Hofmann, Masoud Jalili Sabet, Hinrich Schütze
We introduce CaMEL (Case Marker Extraction without Labels), a novel and challenging task in computational morphology that is especially relevant for low-resource languages.
no code implementations • 17 Mar 2022 • Zhen Han, Ruotong Liao, Beiyan Liu, Yao Zhang, Zifeng Ding, Heinz Köppl, Hinrich Schütze, Volker Tresp
We align structured knowledge contained in temporal knowledge graphs with their textual descriptions extracted from news articles and propose a novel knowledge-text prediction task to inject the abundant information from descriptions into temporal knowledge embeddings.
Knowledge Graph Completion
Temporal Knowledge Graph Completion
no code implementations • Findings (ACL) 2022 • Ayyoob Imani, Lütfi Kerem Şenel, Masoud Jalili Sabet, François Yvon, Hinrich Schütze
First, we create a multiparallel word alignment graph, joining all bilingual word alignment pairs in one graph.
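As a toy sketch of that first step, the snippet below joins a hypothetical set of bilingual word-alignment links into one graph with networkx; the data and node naming are invented, not the paper's code.

```python
# Sketch: join bilingual word alignments for all language pairs into one
# multiparallel alignment graph. Nodes are (language, token index) pairs of
# one parallel sentence; edges are alignment links. Toy, assumed data.
import networkx as nx

bilingual_links = [
    (("en", 0), ("de", 0)), (("en", 1), ("de", 1)),
    (("en", 1), ("fr", 2)), (("de", 1), ("fr", 2)),
]

G = nx.Graph()
G.add_edges_from(bilingual_links)

# Connected components group token occurrences that are mutual translations.
print([sorted(c) for c in nx.connected_components(G)])
```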
no code implementations • 16 Mar 2022 • Valentin Hofmann, Goran Glavaš, Nikola Ljubešić, Janet B. Pierrehumbert, Hinrich Schütze
Geographic linguistic features are commonly used to improve the performance of pretrained language models (PLMs) on NLP tasks where geographic knowledge is intuitively beneficial (e.g., geolocation prediction and dialect feature prediction).
no code implementations • Findings (ACL) 2022 • Sheng Liang, Mengjie Zhao, Hinrich Schütze
Recent research has made impressive progress in large-scale multimodal pre-training.
1 code implementation • ACL 2022 • Lütfi Kerem Senel, Timo Schick, Hinrich Schütze
Pretrained language models (PLMs) have achieved superhuman performance on many benchmarks, creating a need for harder tasks.
no code implementations • 12 Feb 2022 • Yanchen Liu, Timo Schick, Hinrich Schütze
Due to the high costs associated with finetuning large language models, various recent works propose to adapt them to specific tasks without any parameter updates through in-context learning.
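For illustration, a minimal sketch of such parameter-free adaptation via in-context learning; the prompt format and demonstrations are invented, and the frozen language model itself is left out.

```python
# Sketch: adapt to a task by prepending labelled demonstrations to the test
# input and letting a frozen LM continue the pattern; no weights are updated.
demonstrations = [
    ("The movie was fantastic.", "positive"),
    ("A dull, lifeless film.", "negative"),
]
query = "I loved every minute of it."

prompt = "".join(f"Review: {x}\nSentiment: {y}\n\n" for x, y in demonstrations)
prompt += f"Review: {query}\nSentiment:"

print(prompt)  # pass this prompt to a frozen language model
```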
no code implementations • 28 Jan 2022 • Silvia Severini, Ayyoob Imani, Philipp Dufter, Hinrich Schütze
Prior work on extracting MNE datasets from parallel corpora required resources such as large monolingual corpora or word aligners that are unavailable or perform poorly for underresourced languages.
no code implementations • 14 Dec 2021 • Mengjie Zhao, Fei Mi, Yasheng Wang, Minglei Li, Xin Jiang, Qun Liu, Hinrich Schütze
We propose LMTurk, a novel approach that treats few-shot learners as crowdsourcing workers.
no code implementations • 26 Nov 2021 • Timo Schick, Hinrich Schütze
Prompt-based approaches are strong at few-shot learning.
no code implementations • EMNLP 2021 • Nora Kassner, Oyvind Tafjord, Hinrich Schütze, Peter Clark
We show that, in a controlled experimental setting, these two mechanisms result in more consistent beliefs in the overall system, improving both the accuracy and consistency of its answers over time.
no code implementations • 28 Sep 2021 • Nikolai Solmsdorf, Dietrich Trautmann, Hinrich Schütze
Despite considerable recent progress, the creation of well-balanced and diverse resources remains a time-consuming and costly challenge in Argument Mining.
no code implementations • 23 Sep 2021 • Maximilian Mozes, Martin Schmitt, Vladimir Golkov, Hinrich Schütze, Daniel Cremers
We investigate the incorporation of visual relationships into the task of supervised image caption generation by proposing a model that leverages detected objects and auto-generated visual relationships to describe images in natural language.
no code implementations • EMNLP (insights) 2021 • Antonis Maronikolakis, Philipp Dufter, Hinrich Schütze
We show that the closer two languages are, the better BERT can align them on the character level.
1 code implementation • 16 Sep 2021 • Sheng Liang, Philipp Dufter, Hinrich Schütze
Multilingual pretrained language models (MPLMs) exhibit multilinguality and are well suited for transfer across languages.
no code implementations • 13 Sep 2021 • Antonis Maronikolakis, Philipp Dufter, Hinrich Schütze
The size of the vocabulary is a central design choice in large pretrained language models, with respect to both performance and memory requirements.
1 code implementation • EMNLP 2021 • Ayyoob Imani, Masoud Jalili Sabet, Lütfi Kerem Şenel, Philipp Dufter, François Yvon, Hinrich Schütze
With the advent of end-to-end deep learning approaches in machine translation, interest in word alignments initially decreased; however, they have again become a focus of research more recently.
1 code implementation • EMNLP 2021 • Martin Schmitt, Hinrich Schütze
If we allow for tokens outside the PLM's vocabulary, patterns can be adapted more flexibly to a PLM's idiosyncrasies.
Ranked #1 on Few-Shot NLI on SherLIiC
1 code implementation • EMNLP 2021 • Mengjie Zhao, Hinrich Schütze
It has been shown for English that discrete and soft prompting perform strongly in few-shot learning with pretrained language models (PLMs).
no code implementations • ACL 2021 • Ayyoob Imani, Masoud Jalili Sabet, Philipp Dufter, Michael Cysouw, Hinrich Schütze
With more than 7000 languages worldwide, multilingual natural language processing (NLP) is essential both from an academic and commercial perspective.
2 code implementations • 2 Jul 2021 • Luisa März, Stefan Schweter, Nina Poerner, Benjamin Roth, Hinrich Schütze
We propose new methods for in-domain and cross-domain Named Entity Recognition (NER) on historical data for Dutch and French.
no code implementations • 18 Apr 2021 • Valentin Hofmann, Janet B. Pierrehumbert, Hinrich Schütze
The increasing polarization of online political discourse calls for computational tools that are able to automatically detect and monitor ideological divides in social media.
1 code implementation • NAACL 2021 • Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze
Though word embeddings and topics are complementary representations, several past works have only used pretrained word embeddings in (neural) topic modeling to address data sparsity in short texts or small collections of documents.
1 code implementation • EMNLP 2021 • Timo Schick, Hinrich Schütze
To obtain high-quality sentence embeddings from pretrained language models (PLMs), they must either be augmented with additional pretraining objectives or finetuned on a large set of labeled text pairs.
Ranked #5 on Semantic Textual Similarity on SICK
1 code implementation • NAACL 2021 • Philipp Dufter, Nora Kassner, Hinrich Schütze
Recent research investigates factual knowledge stored in large pretrained language models (PLMs).
1 code implementation • 28 Feb 2021 • Timo Schick, Sahana Udupa, Hinrich Schütze
In this paper, we first demonstrate a surprising finding: pretrained language models recognize, to a considerable degree, their undesirable biases and the toxicity of the content they produce.
no code implementations • 22 Feb 2021 • Philipp Dufter, Martin Schmitt, Hinrich Schütze
Transformers are arguably the main workhorse in recent Natural Language Processing research.
1 code implementation • EACL 2021 • Martin Schmitt, Hinrich Schütze
Lexical inference in context (LIiC) is the task of recognizing textual entailment between two very similar sentences, i.e., sentences that only differ in one expression.
Ranked #2 on Few-Shot NLI on SherLIiC
no code implementations • 9 Feb 2021 • Sahand Sharifzadeh, Sina Moayed Baharlou, Martin Schmitt, Hinrich Schütze, Volker Tresp
We show that by fine-tuning the classification pipeline with the extracted knowledge from texts, we can achieve ~8x more accurate results in scene graph classification, ~3x in object classification, and ~1.5x in predicate classification, compared to the supervised baselines with only 1% of the annotated images.
no code implementations • 6 Feb 2021 • Lutfi Kerem Senel, Hinrich Schütze
Recent progress in pretraining language models on large corpora has resulted in large performance gains on many NLP tasks.
1 code implementation • EACL 2021 • Nora Kassner, Philipp Dufter, Hinrich Schütze
(i) Can mBERT be used as a multilingual knowledge base?
1 code implementation • 1 Feb 2021 • Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, Yoav Goldberg
In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge?
1 code implementation • ACL 2021 • Valentin Hofmann, Janet B. Pierrehumbert, Hinrich Schütze
How does the input segmentation of pretrained language models (PLMs) affect their interpretations of complex words?
no code implementations • ACL 2021 • Mengjie Zhao, Yi Zhu, Ehsan Shareghi, Ivan Vulić, Roi Reichart, Anna Korhonen, Hinrich Schütze
Few-shot crosslingual transfer has been shown to outperform its zero-shot counterpart with pretrained encoders like multilingual BERT.
2 code implementations • 22 Dec 2020 • Timo Schick, Hinrich Schütze
Providing pretrained language models with simple task descriptions in natural language enables them to solve some tasks in a fully unsupervised fashion.
no code implementations • 21 Dec 2020 • Ehsaneddin Asgari, Masoud Jalili Sabet, Philipp Dufter, Christopher Ringlstetter, Hinrich Schütze
This method's hypothesis is that the aggregation of different granularities of text for certain language pairs can help word-level alignment.
1 code implementation • COLING 2020 • Timo Schick, Helmut Schmid, Hinrich Schütze
A recent approach for few-shot text classification is to convert textual inputs to cloze questions that contain some form of task description, process them with a pretrained language model and map the predicted words to labels.
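As a rough sketch of this cloze-question recipe (not the paper's PET implementation), the snippet below uses Hugging Face transformers with an assumed sentiment pattern and verbalizer.

```python
# Minimal sketch of cloze-style classification: convert the input to a cloze
# question, score the mask position with a masked LM, map words to labels.
# Pattern and verbalizer are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

verbalizer = {"positive": "great", "negative": "terrible"}  # label -> word

def classify(text: str) -> str:
    prompt = f"{text} It was {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
    scores = {
        label: logits[0, mask_pos, tokenizer.convert_tokens_to_ids(word)].item()
        for label, word in verbalizer.items()
    }
    return max(scores, key=scores.get)

print(classify("Best pizza I have ever had!"))  # expected: positive
```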
1 code implementation • ACL 2021 • Valentin Hofmann, Janet B. Pierrehumbert, Hinrich Schütze
Static word embeddings that represent words by a single vector cannot capture the variability of word meaning in different linguistic and extralinguistic contexts.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Yatin Chaudhary, Pankaj Gupta, Khushbu Saxena, Vivek Kulkarni, Thomas Runkler, Hinrich Schütze
Our work thus focuses on optimizing the computational cost of fine-tuning for document classification.
4 code implementations • NAACL 2021 • Timo Schick, Hinrich Schütze
When scaled to hundreds of billions of parameters, pretrained language models such as GPT-3 (Brown et al., 2020) achieve remarkable few-shot performance.
3 code implementations • EMNLP (NLP4ConvAI) 2021 • Leonardo F. R. Ribeiro, Martin Schmitt, Hinrich Schütze, Iryna Gurevych
We show that the PLMs BART and T5 achieve new state-of-the-art results and that task-adaptive pretraining strategies improve their performance even further.
Ranked #1 on KG-to-Text Generation on WebNLG (All)
no code implementations • ACL 2019 • Marina Sedinkina, Nikolas Breitkopf, Hinrich Schütze
In our experiments, we demonstrate that the automatically adapted sentiment dictionary outperforms the previous state of the art in predicting the financial outcomes excess return and volatility.
1 code implementation • ICML 2020 • Pankaj Gupta, Yatin Chaudhary, Thomas Runkler, Hinrich Schütze
To address the problem, we propose a lifelong learning framework for neural topic modeling that can continuously process streams of document collections, accumulate topics and guide future topic modeling tasks by knowledge transfer from several sources to better deal with the sparse data.
1 code implementation • ICML 2020 • Yatin Chaudhary, Hinrich Schütze, Pankaj Gupta
Marrying topic models and language models exposes language understanding to a broader source of document-level context beyond sentences via topics.
1 code implementation • CONLL 2020 • Nora Kassner, Benno Krojer, Hinrich Schütze
How can pretrained language models (PLMs) learn factual knowledge from the training set?
no code implementations • NAACL (TextGraphs) 2021 • Martin Schmitt, Leonardo F. R. Ribeiro, Philipp Dufter, Iryna Gurevych, Hinrich Schütze
We present Graformer, a novel Transformer-based encoder-decoder architecture for graph-to-text generation.
Ranked #5 on KG-to-Text Generation on AGENDA
no code implementations • 16 May 2020 • Ehsaneddin Asgari, Christoph Ringlstetter, Hinrich Schütze
This paper describes EmbLexChange, a system introduced by the "Life-Language" team for SemEval-2020 Task 1, on unsupervised detection of lexical-semantic changes.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Nora Kassner, Hinrich Schütze
Khandelwal et al. (2020) use a k-nearest-neighbor (kNN) component to improve language model performance.
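A toy numpy sketch of the underlying kNN-LM interpolation: the next-word distribution of the LM is mixed with a distribution induced by nearest neighbours in a datastore of (context vector, next word) pairs. All numbers and dimensions below are made up.

```python
# Sketch of kNN-LM-style interpolation: p = lam * p_knn + (1 - lam) * p_lm.
import numpy as np

def knn_lm_prob(p_lm, query, datastore, lam=0.25, k=2, temp=1.0):
    keys = np.stack([key for key, _ in datastore])
    dists = np.linalg.norm(keys - query, axis=1)
    nearest = np.argsort(dists)[:k]
    weights = np.exp(-dists[nearest] / temp)
    weights /= weights.sum()
    p_knn = np.zeros_like(p_lm)
    for w, (_, word_id) in zip(weights, [datastore[i] for i in nearest]):
        p_knn[word_id] += w
    return lam * p_knn + (1 - lam) * p_lm

# Vocabulary of size 3; datastore maps 2-d context vectors to next-word ids.
p_lm = np.array([0.7, 0.2, 0.1])
datastore = [(np.array([0.0, 1.0]), 2), (np.array([1.0, 0.0]), 0)]
print(knn_lm_prob(p_lm, query=np.array([0.1, 0.9]), datastore=datastore))
```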
1 code implementation • EMNLP 2020 • Valentin Hofmann, Janet B. Pierrehumbert, Hinrich Schütze
Can pretrained language models (PLMs) generate derivationally complex words?
1 code implementation • 1 May 2020 • Philipp Dufter, Hinrich Schütze
We aim to identify architectural properties of BERT and linguistic properties of languages that are necessary for BERT to become multilingual.
no code implementations • EMNLP 2020 • Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze
We present an efficient method of utilizing pretrained language models, where we learn selective binary masks for pretrained weights in lieu of modifying them through finetuning.
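A minimal PyTorch sketch of the general idea of learning binary masks over frozen pretrained weights; a straight-through estimator stands in here for the paper's exact training procedure.

```python
# Sketch: keep pretrained weights frozen and learn a binary mask over them.
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    def __init__(self, pretrained: nn.Linear):
        super().__init__()
        self.weight = pretrained.weight.detach()      # frozen pretrained weights
        self.bias = pretrained.bias.detach()
        self.scores = nn.Parameter(torch.zeros_like(self.weight))  # mask logits

    def forward(self, x):
        soft = torch.sigmoid(self.scores)
        hard = (soft > 0.5).float()
        mask = hard + soft - soft.detach()            # straight-through estimator
        return nn.functional.linear(x, self.weight * mask, self.bias)

layer = MaskedLinear(nn.Linear(8, 4))
out = layer(torch.randn(2, 8))                        # only `scores` receives gradients
print(out.shape)
```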
no code implementations • Findings of the Association for Computational Linguistics 2020 • Mengjie Zhao, Philipp Dufter, Yadollah Yaghoobzadeh, Hinrich Schütze
Pretrained language models have achieved a new state of the art on many NLP tasks, but there are still many open questions about how and why they work so well.
2 code implementations • Findings of the Association for Computational Linguistics 2020 • Masoud Jalili Sabet, Philipp Dufter, François Yvon, Hinrich Schütze
We find that alignments created from embeddings are superior for four and comparable for two language pairs compared to those produced by traditional statistical aligners, even with abundant parallel data; e.g., contextualized embeddings achieve a word alignment F1 for English-German that is 5 percentage points higher than eflomal, a high-quality statistical aligner, trained on 100k parallel sentences.
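A simplified sketch of alignment from contextualized embeddings in this spirit: build a cosine similarity matrix between the two sentences and keep mutual-argmax links. Subword handling is reduced to taking each word's first subword; this is an illustration, not the released aligner.

```python
# Sketch: similarity-based word alignment from multilingual BERT embeddings.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed(words):
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]
    idx = [enc.word_ids().index(i) for i in range(len(words))]  # first subword
    return torch.nn.functional.normalize(hidden[idx], dim=-1)

src = ["the", "house", "is", "small"]
tgt = ["das", "Haus", "ist", "klein"]
sim = embed(src) @ embed(tgt).T                      # cosine similarity matrix
forward = sim.argmax(dim=1)                          # src -> tgt
backward = sim.argmax(dim=0)                         # tgt -> src
links = [(i, int(forward[i])) for i in range(len(src))
         if int(backward[forward[i]]) == i]          # keep mutual argmaxes
print(links)
```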
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Nina Poerner, Ulli Waltinger, Hinrich Schütze
Domain adaptation of Pretrained Language Models (PTLMs) is typically achieved by unsupervised pretraining on target-domain text.
5 code implementations • 21 Jan 2020 • Timo Schick, Hinrich Schütze
Some NLP tasks can be solved in a fully unsupervised fashion by providing a pretrained language model with "task descriptions" in natural language (e.g., Radford et al., 2019).
no code implementations • 7 Jan 2020 • Alena Moiseeva, Dietrich Trautmann, Michael Heimann, Hinrich Schütze
Such intelligent agents can assist the user by answering specific questions and executing routine tasks that are ordinarily performed in a natural language (i.e., customer support).
no code implementations • 12 Dec 2019 • James L. McClelland, Felix Hill, Maja Rudolph, Jason Baldridge, Hinrich Schütze
We take language to be a part of a system for understanding and communicating about situations.
no code implementations • EMNLP 2016 • Ryan Cotterell, Arun Kumar, Hinrich Schütze
Morphological segmentation has traditionally been modeled with non-hierarchical models, which yield flat segmentations as output.
no code implementations • ACL 2020 • Nina Poerner, Ulli Waltinger, Hinrich Schütze
We address the task of unsupervised Semantic Textual Similarity (STS) by ensembling diverse pre-trained sentence encoders into sentence meta-embeddings.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Nina Poerner, Ulli Waltinger, Hinrich Schütze
We present a novel way of injecting factual knowledge about entities into the pretrained BERT model (Devlin et al., 2019): We align Wikipedia2Vec entity vectors (Yamada et al., 2016) with BERT's native wordpiece vector space and use the aligned entity vectors as if they were wordpiece vectors.
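A toy sketch of the alignment step with random stand-in data: fit a linear map from the external embedding space into the wordpiece space on a shared vocabulary, then apply it to entity vectors. The exact fitting procedure in the paper may differ.

```python
# Sketch: map external entity/word vectors into BERT's wordpiece space.
import numpy as np

rng = np.random.default_rng(0)
ext_dim, bert_dim, n_shared = 100, 768, 5000

X = rng.normal(size=(n_shared, ext_dim))    # external vectors of shared words
Y = rng.normal(size=(n_shared, bert_dim))   # BERT wordpiece vectors of same words

W, *_ = np.linalg.lstsq(X, Y, rcond=None)   # least-squares map: X @ W ~ Y

entity_vec = rng.normal(size=ext_dim)       # stand-in for an entity vector
aligned = entity_vec @ W                    # usable as if it were a wordpiece vector
print(aligned.shape)                        # (768,)
```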
2 code implementations • ACL 2020 • Nora Kassner, Hinrich Schütze
We find that PLMs do not distinguish between negated ("Birds cannot [MASK]") and non-negated ("Birds can [MASK]") cloze questions.
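A quick way to reproduce the flavour of this probe with a masked LM (illustrative, not the paper's evaluation code):

```python
# Sketch: compare top mask fills for a negated and a non-negated cloze question.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

for prompt in ["Birds can [MASK].", "Birds cannot [MASK]."]:
    top = [p["token_str"] for p in fill(prompt, top_k=3)]
    print(prompt, "->", top)  # often near-identical, which is the problem
```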
1 code implementation • ACL 2020 • Timo Schick, Hinrich Schütze
In this work, we transfer this idea to pretrained language models: We introduce BERTRAM, a powerful architecture based on BERT that is capable of inferring high-quality embeddings for rare words that are suitable as input representations for deep language models.
1 code implementation • WS 2019 • Usama Yaseen, Pankaj Gupta, Hinrich Schütze
Our RE system ranked first in the SeeDev-binary Relation Extraction Task with an F1-score of 0.3738.
no code implementations • 1 Oct 2019 • Heike Adel, Hinrich Schütze
In particular, we explore different ways of integrating the named entity types of the relation arguments into a neural network for relation classification, including a joint training and a structured prediction approach.
1 code implementation • WS 2019 • Yatin Chaudhary, Pankaj Gupta, Hinrich Schütze
This paper presents our system details and results of participation in the RDoC Tasks of BioNLP-OST 2019.
no code implementations • 25 Sep 2019 • Sanjeev Kumar Karn, Francine Chen, Yan-Ying Chen, Ulli Waltinger, Hinrich Schütze
The interleaved posts are encoded hierarchically, i.e., word-to-word (words in a post) followed by post-to-post (posts in a channel).
no code implementations • 25 Sep 2019 • Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze
Though word embeddings and topics are complementary representations, several past works have only used pretrained word embeddings in (neural) topic modeling to address the data sparsity problem in short texts or small collections of documents.
no code implementations • 14 Sep 2019 • Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze
Though word embeddings and topics are complementary representations, several past works have only used pre-trained word embeddings in (neural) topic modeling to address the data sparsity problem in short texts or small collections of documents.
no code implementations • WS 2019 • Pankaj Gupta, Khushbu Saxena, Usama Yaseen, Thomas Runkler, Hinrich Schütze
To address the tasks of sentence-level (SLC) and fragment-level (FLC) propaganda detection, we explore different neural architectures (e.g., CNN, LSTM-CRF and BERT) and extract linguistic features (e.g., part-of-speech, named entity, readability, sentiment, emotion, etc.).
no code implementations • 4 Jul 2019 • Ryan Cotterell, Hinrich Schütze
Linguistic similarity is multi-faceted.
1 code implementation • ACL 2019 • Yadollah Yaghoobzadeh, Katharina Kann, Timothy J. Hazen, Eneko Agirre, Hinrich Schütze
Word embeddings typically represent different meanings of a word in a single conflated vector.
no code implementations • 5 Jun 2019 • Sanjeev Kumar Karn, Francine Chen, Yan-Ying Chen, Ulli Waltinger, Hinrich Schütze
Interleaved texts, where posts belonging to different threads occur in one sequence, are a common occurrence, e.g., online chat conversations.
1 code implementation • ACL 2019 • Martin Schmitt, Hinrich Schütze
We present SherLIiC, a testbed for lexical inference in context (LIiC), consisting of 3985 manually annotated inference rule candidates (InfCands), accompanied by (i) ~960k unlabeled InfCands, and (ii) ~190k typed textual relations between Freebase entities extracted from the large entity-linked corpus ClueWeb09.
Ranked #1 on Lexical Entailment on SherLIiC
1 code implementation • 22 Apr 2019 • Dietrich Trautmann, Johannes Daxenberger, Christian Stab, Hinrich Schütze, Iryna Gurevych
In this work, we argue that the task should be performed on a more fine-grained level of sequence labeling.
1 code implementation • EMNLP 2020 • Martin Schmitt, Sahand Sharifzadeh, Volker Tresp, Hinrich Schütze
To this end, we present the first approach to unsupervised text generation from KGs and show simultaneously how it can be used for unsupervised semantic parsing.
Ranked #1 on Unsupervised KG-to-Text Generation on VG graph-text
1 code implementation • IJCNLP 2019 • Philipp Dufter, Hinrich Schütze
In this work, we investigate three methods for making word spaces interpretable by rotation: Densifier (Rothe et al., 2016), linear SVMs and DensRay, a new method we propose.
2 code implementations • 14 Apr 2019 • Timo Schick, Hinrich Schütze
Pretraining deep neural network architectures with a language modeling objective has brought large improvements for many natural language processing tasks.
1 code implementation • NAACL 2019 • Timo Schick, Hinrich Schütze
Learning high-quality embeddings for rare words is a hard problem because of sparse context information.
1 code implementation • 9 Nov 2018 • Timo Schick, Hinrich Schütze
The general problem setting is that word embeddings are induced on an unlabeled training corpus and then a model is trained that embeds novel words into this induced embedding space.
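A toy baseline sketch of that setting, embedding a novel word as the average of the embeddings of its observed context words; this is a simple stand-in, not the model proposed in the paper.

```python
# Sketch: place a novel word into an existing embedding space via its contexts.
import numpy as np

embeddings = {                        # pre-induced word embeddings (toy, 3-d)
    "drinks": np.array([0.9, 0.1, 0.0]),
    "hot":    np.array([0.8, 0.2, 0.1]),
    "cup":    np.array([0.7, 0.0, 0.2]),
}
contexts = [["drinks", "hot"], ["cup", "hot"]]   # contexts of the novel word

vecs = [embeddings[w] for ctx in contexts for w in ctx if w in embeddings]
novel_vector = np.mean(vecs, axis=0)             # lives in the same space
print(novel_vector)
```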
no code implementations • 6 Nov 2018 • Heike Adel, Hinrich Schütze
Especially, it focuses on the coreference and classification component.
no code implementations • 1 Nov 2018 • Philipp Dufter, Mengjie Zhao, Hinrich Schütze
A simple and effective context-based multilingual embedding learner is Levy et al. (2017)'s S-ID (sentence ID) method.
no code implementations • 31 Oct 2018 • Nina Poerner, Masoud Jalili Sabet, Benjamin Roth, Hinrich Schütze
Count-based word alignment methods, such as the IBM models or fast-align, struggle on very small parallel corpora.
1 code implementation • EMNLP 2018 • Yadollah Yaghoobzadeh, Hinrich Schütze
For representation, we consider representations based on the context distribution of the entity (i.e., on its embedding), on the entity's name (i.e., on its surface form) and on its description in Wikipedia.
1 code implementation • 11 Oct 2018 • Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Bernt Andrassy, Thomas Runkler
iDepNN models the shortest and augmented dependency paths via recurrent and recursive neural networks to extract relationships within (intra-) and across (inter-) sentence boundaries.
Ranked #1 on Relation Extraction on MUC6
1 code implementation • ICLR 2019 • Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze
We address two challenges of probabilistic topic modelling in order to better estimate the probability of a word in a given context, i.e., P(word|context): (1) No Language Structure in Context: probabilistic topic models ignore word order by summarizing a given context as a "bag of words", and consequently the semantics of the words in the context are lost.
no code implementations • EMNLP 2018 • Katharina Kann, Hinrich Schütze
Neural state-of-the-art sequence-to-sequence (seq2seq) models often do not perform well for small training sets.
2 code implementations • WS 2018 • Nina Poerner, Benjamin Roth, Hinrich Schütze
Input optimization methods, such as Google Deep Dream, create interpretable representations of neurons for computer vision DNNs.
1 code implementation • 15 Sep 2018 • Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze
Here, we extend a neural autoregressive topic model to exploit the full context information around words in a document in a language modeling fashion.
no code implementations • NAACL 2019 • Apostolos Kemos, Heike Adel, Hinrich Schütze
Character-level models of tokens have been shown to be effective at dealing with within-token noise and out-of-vocabulary words.
1 code implementation • 11 Aug 2018 • Pankaj Gupta, Florian Buettner, Hinrich Schütze
Context information around words helps in determining their actual meaning, for example "networks" used in contexts of artificial neural networks or biological neuron networks.
no code implementations • WS 2018 • Pankaj Gupta, Hinrich Schütze
Recurrent neural networks (RNNs) are temporal and cumulative in nature and have shown promising results in various natural language processing tasks.
2 code implementations • NAACL 2019 • Sanjeev Kumar Karn, Mark Buckley, Ulli Waltinger, Hinrich Schütze
In this work, we define the task of teaser generation and provide an evaluation benchmark and baseline systems for the process of generating teasers.
1 code implementation • WS 2018 • Yadollah Yaghoobzadeh, Katharina Kann, Hinrich Schütze
We propose a new evaluation method for word embeddings based on multi-label classification given a word embedding.
no code implementations • COLING 2018 • Pankaj Gupta, Bernt Andrassy, Hinrich Schütze
The task is challenging due to significant term mismatch in query-ticket pairs of asymmetric lengths, where the subject is a short text but the description and solution are multi-sentence texts.
no code implementations • COLING 2018 • Wenpeng Yin, Yadollah Yaghoobzadeh, Hinrich Schütze
Large scale knowledge graphs (KGs) such as Freebase are generally incomplete.
1 code implementation • NAACL 2018 • Pankaj Gupta, Benjamin Roth, Hinrich Schütze
Semi-supervised bootstrapping techniques for relationship extraction from text iteratively expand a set of initial seed instances.
1 code implementation • ACL 2018 • Wenpeng Yin, Hinrich Schütze, Dan Roth
This work deals with SciTail, a natural entailment challenge derived from a multi-choice question answering problem.
no code implementations • NAACL 2018 • Katharina Kann, Manuel Mager, Ivan Meza-Ruiz, Hinrich Schütze
Morphological segmentation for polysynthetic languages is challenging, because a word may consist of many individual morphemes and training data can be extremely scarce.
no code implementations • 5 Mar 2018 • Benjamin Roth, Costanza Conforti, Nina Poerner, Sanjeev Karn, Hinrich Schütze
In this work, we introduce the task of Open-Type Relation Argument Extraction (ORAE): Given a corpus, a query entity Q and a knowledge base relation (e.g., "Q authored notable work with title X"), the model has to extract an argument of non-standard entity type (entities that cannot be extracted by a standard named entity tagger, e.g., X: the title of a book or a work of art) from the corpus.
no code implementations • ACL 2018 • Philipp Dufter, Mengjie Zhao, Martin Schmitt, Alexander Fraser, Hinrich Schütze
We present a new method for estimating vector space representations of words: embedding learning by concept induction.
1 code implementation • 19 Jan 2018 • Nina Poerner, Benjamin Roth, Hinrich Schütze
The behavior of deep neural networks (DNNs) is hard to understand.
no code implementations • NAACL 2018 • Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Bernt Andrassy
We also introduce a metric (named SPAN) to quantify the capability of a dynamic topic model to capture word evolution in topics over time.
no code implementations • 26 Oct 2017 • Heike Adel, Hinrich Schütze
In this paper, we demonstrate the importance of coreference resolution for natural language processing on the example of the TAC Slot Filling shared task.
1 code implementation • TACL 2018 • Wenpeng Yin, Hinrich Schütze
We hypothesize that this is because the attention in CNNs has been mainly implemented as attentive pooling (i.e., it is applied to pooling) rather than as attentive convolution (i.e., it is integrated into convolution).
no code implementations • 7 Aug 2017 • Yadollah Yaghoobzadeh, Heike Adel, Hinrich Schütze
This paper addresses the problem of corpus-level entity typing, i.e., inferring from a large corpus that an entity is a member of a class such as "food" or "artist".
no code implementations • EMNLP 2017 • Heike Adel, Hinrich Schütze
We introduce globally normalized convolutional neural networks for joint entity classification and relation extraction.
no code implementations • WS 2017 • Katharina Kann, Hinrich Schütze
We present a semi-supervised way of training a character-based encoder-decoder recurrent neural network for morphological reinflection, the task of generating one inflected word form from another.
no code implementations • EMNLP 2017 • Ehsaneddin Asgari, Hinrich Schütze
We present SuperPivot, an analysis method for low-resource languages that occur in a superparallel corpus, i.e., in a corpus that contains an order of magnitude more languages than parallel corpora currently in use.
no code implementations • ACL 2017 • Katharina Kann, Ryan Cotterell, Hinrich Schütze
We present a novel cross-lingual transfer method for paradigm completion, the task of mapping a lemma to its inflected forms, using a neural encoder-decoder model, the state of the art for the monolingual task.
4 code implementations • 7 Feb 2017 • Wenpeng Yin, Katharina Kann, Mo Yu, Hinrich Schütze
Deep neural networks (DNN) have revolutionized the field of natural language processing (NLP).
no code implementations • EACL 2017 • Wenpeng Yin, Hinrich Schütze
This work studies comparatively two typical sentence matching tasks: textual entailment (TE) and answer selection (AS), observing that weaker phrase alignments are more critical in TE, while stronger phrase alignments deserve more attention in AS.
no code implementations • EACL 2017 • Yadollah Yaghoobzadeh, Hinrich Schütze
Entities are essential elements of natural language.
no code implementations • TACL 2018 • Ryan Cotterell, Hinrich Schütze
Since morphology obeys the principle of compositionality, the semantics of the word can be systematically derived from the meaning of its parts.
no code implementations • EACL 2017 • Yadollah Yaghoobzadeh, Heike Adel, Hinrich Schütze
For the second noise type, we propose ways to improve the integration of noisy entity type predictions into relation extraction.
no code implementations • EACL 2017 • Heike Adel, Hinrich Schütze
Neural networks with attention have proven effective for many natural language processing tasks.
no code implementations • EACL 2017 • Katharina Kann, Ryan Cotterell, Hinrich Schütze
We explore the task of multi-source morphological reinflection, which generalizes the standard, single-source version.
no code implementations • ACL 2016 • Yadollah Yaghoobzadeh, Hinrich Schütze
We introduce a new methodology for intrinsic evaluation of word representations.
no code implementations • EMNLP 2015 • Yadollah Yaghoobzadeh, Hinrich Schütze
This paper addresses the problem of corpus-level entity typing, i.e., inferring from a large corpus that an entity is a member of a class such as "food" or "artist".
no code implementations • COLING 2016 • Wenpeng Yin, Mo Yu, Bing Xiang, Bo-Wen Zhou, Hinrich Schütze
In fact selection, we match the subject entity in a fact candidate with the entity mention in the question by a character-level convolutional neural network (char-CNN), and match the predicate in that fact with the question by a word-level CNN (word-CNN).
1 code implementation • ACL 2016 • Katharina Kann, Hinrich Schütze
Morphological reinflection is the task of generating a target form given a source form, a source tag and a target tag.
no code implementations • NAACL 2016 • Ngoc Thang Vu, Heike Adel, Pankaj Gupta, Hinrich Schütze
This paper investigates two different neural architectures for the task of relation classification: convolutional neural networks and recurrent neural networks.
no code implementations • 23 Apr 2016 • Wenpeng Yin, Hinrich Schütze
We address the problems of identifying phrase alignments of flexible granularity and pooling alignments of different intensities for these tasks.
no code implementations • HLT 2015 • Wenpeng Yin, Hinrich Schütze
This work, concerning the paraphrase identification task, on the one hand contributes to expanding deep learning embeddings to include continuous and discontinuous linguistic phrases.
no code implementations • EMNLP 2015 • Wenpeng Yin, Tobias Schnabel, Hinrich Schütze
We propose online unsupervised domain adaptation (DA), which is performed incrementally as data comes in and is applicable when batch DA is not possible.
no code implementations • NAACL 2016 • Heike Adel, Benjamin Roth, Hinrich Schütze
We address relation classification in the context of slot filling, the task of finding and evaluating fillers like "Steve Jobs" for the slot X in "X founded Apple".
no code implementations • CONLL 2015 • Wenpeng Yin, Hinrich Schütze
We propose MVCNN, a convolution neural network (CNN) architecture for sentence classification.
1 code implementation • NAACL 2016 • Sascha Rothe, Sebastian Ebert, Hinrich Schütze
Embeddings are generic representations that are useful for many NLP tasks.
no code implementations • WS 2016 • Wenpeng Yin, Sebastian Ebert, Hinrich Schütze
Understanding open-domain text is one of the primary challenges in natural language processing (NLP).
8 code implementations • TACL 2016 • Wenpeng Yin, Hinrich Schütze, Bing Xiang, Bo-Wen Zhou
(ii) We propose three attention schemes that integrate mutual influence between sentences into CNN; thus, the representation of each sentence takes into consideration its counterpart.
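A small numpy sketch of such an attention matrix between two sentences' feature maps, using the 1/(1 + ||x - y||) match score from the paper; the surrounding convolution and pooling layers are omitted here.

```python
# Sketch: attention matrix coupling two sentences' convolutional feature maps.
import numpy as np

def attention_matrix(F0, F1):
    # F0: (len0, d), F1: (len1, d) feature maps of the two sentences
    diff = F0[:, None, :] - F1[None, :, :]
    return 1.0 / (1.0 + np.linalg.norm(diff, axis=-1))

F0 = np.random.rand(5, 50)   # sentence 0, 5 positions
F1 = np.random.rand(7, 50)   # sentence 1, 7 positions
A = attention_matrix(F0, F1)
print(A.shape)               # (5, 7); row/column sums re-weight each sentence
```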
1 code implementation • 18 Aug 2015 • Wenpeng Yin, Hinrich Schütze
Word embeddings -- distributed representations of words -- in deep learning are beneficial for many tasks in natural language processing (NLP).
no code implementations • IJCNLP 2015 • Sascha Rothe, Hinrich Schütze
We present AutoExtend, a system to learn embeddings for synsets and lexemes.
no code implementations • 19 Dec 2013 • Irina Sergienya, Hinrich Schütze
There are two main approaches to the distributed representation of words: low-dimensional deep learning embeddings and high-dimensional distributional models, in which each dimension corresponds to a context word.
no code implementations • 18 Dec 2013 • Wenpeng Yin, Hinrich Schütze
Deep learning embeddings have been successfully used for many natural language processing problems.