Search Results for author: Hinrich Schütze

Found 141 papers, 62 papers with code

Multidomain Pretrained Language Models for Green NLP

1 code implementation EACL (AdaptNLP) 2021 Antonis Maronikolakis, Hinrich Schütze

Thus, instead of training multiple models, we can train a single multidomain model, saving on computational resources and training time.

Domain Adaptation Pretrained Language Models

Few-Shot Text Generation with Natural Language Instructions

no code implementations EMNLP 2021 Timo Schick, Hinrich Schütze

Providing pretrained language models with simple task descriptions in natural language enables them to solve some tasks in a fully unsupervised fashion.

Headline generation Pretrained Language Models +2

Wine is not v i n. On the Compatibility of Tokenizations across Languages

no code implementations Findings (EMNLP) 2021 Antonis Maronikolakis, Philipp Dufter, Hinrich Schütze

The size of the vocabulary is a central design choice in large pretrained language models, with respect to both performance and memory requirements.

Pretrained Language Models

Analyzing Hate Speech Data along Racial, Gender and Intersectional Axes

no code implementations 13 May 2022 Antonis Maronikolakis, Philip Baader, Hinrich Schütze

To tackle the rising phenomenon of hate speech, efforts have been made towards data curation and analysis.

Flow-Adapter Architecture for Unsupervised Machine Translation

no code implementations ACL 2022 Yihong Liu, Haris Jabbar, Hinrich Schütze

The primary novelties of our model are: (a) capturing language-specific sentence representations separately for each language using normalizing flows and (b) using a simple transformation of these latent representations for translating from one language to another.

Translation Unsupervised Machine Translation

CaMEL: Case Marker Extraction without Labels

1 code implementation ACL 2022 Leonie Weissweiler, Valentin Hofmann, Masoud Jalili Sabet, Hinrich Schütze

We introduce CaMEL (Case Marker Extraction without Labels), a novel and challenging task in computational morphology that is especially relevant for low-resource languages.

Enhanced Temporal Knowledge Embeddings with Contextualized Language Representations

no code implementations 17 Mar 2022 Zhen Han, Ruotong Liao, Beiyan Liu, Yao Zhang, Zifeng Ding, Heinz Köppl, Hinrich Schütze, Volker Tresp

We align structured knowledge contained in temporal knowledge graphs with their textual descriptions extracted from news articles and propose a novel knowledge-text prediction task to inject the abundant information from descriptions into temporal knowledge embeddings.

Knowledge Graph Completion Temporal Knowledge Graph Completion

Geographic Adaptation of Pretrained Language Models

no code implementations 16 Mar 2022 Valentin Hofmann, Goran Glavaš, Nikola Ljubešić, Janet B. Pierrehumbert, Hinrich Schütze

Geographic linguistic features are commonly used to improve the performance of pretrained language models (PLMs) on NLP tasks where geographic knowledge is intuitively beneficial (e.g., geolocation prediction and dialect feature prediction).

Language Modelling Masked Language Modeling +1

Semantic-Oriented Unlabeled Priming for Large-Scale Language Models

no code implementations 12 Feb 2022 Yanchen Liu, Timo Schick, Hinrich Schütze

Due to the high costs associated with finetuning large language models, various recent works propose to adapt them to specific tasks without any parameter updates through in-context learning.

Pretrained Language Models

Towards a Broad Coverage Named Entity Resource: A Data-Efficient Approach for Many Diverse Languages

no code implementations 28 Jan 2022 Silvia Severini, Ayyoob Imani, Philipp Dufter, Hinrich Schütze

Prior work on extracting MNE datasets from parallel corpora required resources such as large monolingual corpora or word aligners that are unavailable or perform poorly for underresourced languages.

Bilingual Lexicon Induction Transliteration

BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief

no code implementations EMNLP 2021 Nora Kassner, Oyvind Tafjord, Hinrich Schütze, Peter Clark

We show that, in a controlled experimental setting, these two mechanisms result in more consistent beliefs in the overall system, improving both the accuracy and consistency of its answers over time.

Language Modelling Pretrained Language Models

Active Learning for Argument Mining: A Practical Approach

no code implementations 28 Sep 2021 Nikolai Solmsdorf, Dietrich Trautmann, Hinrich Schütze

Despite considerable recent progress, the creation of well-balanced and diverse resources remains a time-consuming and costly challenge in Argument Mining.

Active Learning Argument Mining

Scene Graph Generation for Better Image Captioning?

no code implementations 23 Sep 2021 Maximilian Mozes, Martin Schmitt, Vladimir Golkov, Hinrich Schütze, Daniel Cremers

We investigate the incorporation of visual relationships into the task of supervised image caption generation by proposing a model that leverages detected objects and auto-generated visual relationships to describe images in natural language.

Graph Generation Image Captioning +1

BERT Cannot Align Characters

no code implementations EMNLP (insights) 2021 Antonis Maronikolakis, Philipp Dufter, Hinrich Schütze

We show that the closer two languages are, the better BERT can align them on the character level.

Locating Language-Specific Information in Contextualized Embeddings

1 code implementation 16 Sep 2021 Sheng Liang, Philipp Dufter, Hinrich Schütze

Multilingual pretrained language models (MPLMs) exhibit multilinguality and are well suited for transfer across languages.

Pretrained Language Models

Wine is Not v i n. -- On the Compatibility of Tokenizations Across Languages

no code implementations 13 Sep 2021 Antonis Maronikolakis, Philipp Dufter, Hinrich Schütze

The size of the vocabulary is a central design choice in large pretrained language models, with respect to both performance and memory requirements.

Pretrained Language Models

Graph Algorithms for Multiparallel Word Alignment

1 code implementation EMNLP 2021 Ayyoob Imani, Masoud Jalili Sabet, Lütfi Kerem Şenel, Philipp Dufter, François Yvon, Hinrich Schütze

With the advent of end-to-end deep learning approaches in machine translation, interest in word alignments initially decreased; however, they have again become a focus of research more recently.

Link Prediction Machine Translation +3

Continuous Entailment Patterns for Lexical Inference in Context

1 code implementation EMNLP 2021 Martin Schmitt, Hinrich Schütze

If we allow for tokens outside the PLM's vocabulary, patterns can be adapted more flexibly to a PLM's idiosyncrasies.

Few-Shot NLI Lexical Entailment +1
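
The idea of patterns with tokens outside the PLM's vocabulary amounts to trainable "continuous" pattern embeddings. The sketch below is illustrative only (not the authors' released code); the model name, pattern length, and the premise/hypothesis cloze format are assumptions.

```python
# Illustrative sketch: trainable pattern tokens outside the PLM vocabulary,
# prepended to the input embeddings. Model name and pattern length are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModelForMaskedLM, AutoTokenizer

class SoftPattern(nn.Module):
    def __init__(self, model_name="bert-base-uncased", n_pattern_tokens=4):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.plm = AutoModelForMaskedLM.from_pretrained(model_name)
        hidden = self.plm.config.hidden_size
        # Continuous pattern vectors: not tied to any vocabulary entry.
        self.pattern = nn.Parameter(torch.randn(n_pattern_tokens, hidden) * 0.02)

    def forward(self, premise: str, hypothesis: str):
        text = f"{premise} {self.tokenizer.mask_token} {hypothesis}"
        enc = self.tokenizer(text, return_tensors="pt")
        tok_emb = self.plm.get_input_embeddings()(enc["input_ids"])
        # Prepend the trainable pattern embeddings to the ordinary token embeddings.
        inputs = torch.cat([self.pattern.unsqueeze(0), tok_emb], dim=1)
        mask = torch.cat(
            [torch.ones(1, self.pattern.size(0), dtype=torch.long),
             enc["attention_mask"]], dim=1)
        # The masked position can then be scored with verbalizer tokens.
        return self.plm(inputs_embeds=inputs, attention_mask=mask).logits
```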

Discrete and Soft Prompting for Multilingual Models

1 code implementation EMNLP 2021 Mengjie Zhao, Hinrich Schütze

It has been shown for English that discrete and soft prompting perform strongly in few-shot learning with pretrained language models (PLMs).

Few-Shot Learning Natural Language Inference +1

ParCourE: A Parallel Corpus Explorer for a Massively Multilingual Corpus

no code implementations ACL 2021 Ayyoob Imani, Masoud Jalili Sabet, Philipp Dufter, Michael Cysouw, Hinrich Schütze

With more than 7000 languages worldwide, multilingual natural language processing (NLP) is essential both from an academic and commercial perspective.

Multilingual NLP Transfer Learning

Modeling Ideological Agenda Setting and Framing in Polarized Online Groups with Graph Neural Networks and Structured Sparsity

no code implementations 18 Apr 2021 Valentin Hofmann, Janet B. Pierrehumbert, Hinrich Schütze

The increasing polarization of online political discourse calls for computational tools that are able to automatically detect and monitor ideological divides in social media.

Multi-source Neural Topic Modeling in Multi-view Embedding Spaces

1 code implementation NAACL 2021 Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze

Though word embeddings and topics are complementary representations, several past works have only used pretrained word embeddings in (neural) topic modeling to address data sparsity in short texts or small collections of documents.

Information Retrieval Word Embeddings

Generating Datasets with Pretrained Language Models

1 code implementation EMNLP 2021 Timo Schick, Hinrich Schütze

To obtain high-quality sentence embeddings from pretrained language models (PLMs), they must either be augmented with additional pretraining objectives or finetuned on a large set of labeled text pairs.

Pretrained Language Models Semantic Textual Similarity +1

Static Embeddings as Efficient Knowledge Bases?

1 code implementation NAACL 2021 Philipp Dufter, Nora Kassner, Hinrich Schütze

Recent research investigates factual knowledge stored in large pretrained language models (PLMs).

Pretrained Language Models

Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP

1 code implementation 28 Feb 2021 Timo Schick, Sahana Udupa, Hinrich Schütze

In this paper, we first demonstrate a surprising finding: pretrained language models recognize, to a considerable degree, their undesirable biases and the toxicity of the content they produce.

Language Modelling Pretrained Language Models

Position Information in Transformers: An Overview

no code implementations 22 Feb 2021 Philipp Dufter, Martin Schmitt, Hinrich Schütze

Transformers are arguably the main workhorse in recent Natural Language Processing research.

Language Models for Lexical Inference in Context

1 code implementation EACL 2021 Martin Schmitt, Hinrich Schütze

Lexical inference in context (LIiC) is the task of recognizing textual entailment between two very similar sentences, i.e., sentences that only differ in one expression.

Few-Shot NLI Natural Language Inference +1

Improving Scene Graph Classification by Exploiting Knowledge from Texts

no code implementations 9 Feb 2021 Sahand Sharifzadeh, Sina Moayed Baharlou, Martin Schmitt, Hinrich Schütze, Volker Tresp

We show that by fine-tuning the classification pipeline with the extracted knowledge from texts, we can achieve ~8x more accurate results in scene graph classification, ~3x in object classification, and ~1.5x in predicate classification, compared to the supervised baselines with only 1% of the annotated images.

Classification General Classification +8

Measuring and Improving Consistency in Pretrained Language Models

1 code implementation 1 Feb 2021 Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, Yoav Goldberg

In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge?

Pretrained Language Models

A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters

no code implementations ACL 2021 Mengjie Zhao, Yi Zhu, Ehsan Shareghi, Ivan Vulić, Roi Reichart, Anna Korhonen, Hinrich Schütze

Few-shot crosslingual transfer has been shown to outperform its zero-shot counterpart with pretrained encoders like multilingual BERT.

Few-Shot Learning

Few-Shot Text Generation with Pattern-Exploiting Training

2 code implementations 22 Dec 2020 Timo Schick, Hinrich Schütze

Providing pretrained language models with simple task descriptions in natural language enables them to solve some tasks in a fully unsupervised fashion.

Headline generation Pretrained Language Models +3

Subword Sampling for Low Resource Word Alignment

no code implementations 21 Dec 2020 Ehsaneddin Asgari, Masoud Jalili Sabet, Philipp Dufter, Christopher Ringlstetter, Hinrich Schütze

This method's hypothesis is that the aggregation of different granularities of text for certain language pairs can help word-level alignment.

Machine Translation Word Alignment

Automatically Identifying Words That Can Serve as Labels for Few-Shot Text Classification

1 code implementation COLING 2020 Timo Schick, Helmut Schmid, Hinrich Schütze

A recent approach for few-shot text classification is to convert textual inputs to cloze questions that contain some form of task description, process them with a pretrained language model and map the predicted words to labels.

Few-Shot Text Classification General Classification +2
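
For readers unfamiliar with the cloze setup described above, here is a minimal illustrative sketch (not the paper's implementation); the pattern text, verbalizer words, and model choice are placeholders.

```python
# Illustrative cloze-style classifier: pattern and verbalizers are placeholders.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
verbalizer = {"great": "positive", "terrible": "negative"}

def classify(review: str) -> str:
    pattern = f"{review} All in all, it was [MASK]."
    # Let the masked LM choose among the verbalizer words only.
    preds = fill(pattern, targets=list(verbalizer))
    best = max(preds, key=lambda p: p["score"])["token_str"]
    return verbalizer[best]

print(classify("A gripping movie with wonderful performances."))
```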

Dynamic Contextualized Word Embeddings

1 code implementation ACL 2021 Valentin Hofmann, Janet B. Pierrehumbert, Hinrich Schütze

Static word embeddings that represent words by a single vector cannot capture the variability of word meaning in different linguistic and extralinguistic contexts.

Language Modelling Word Embeddings

It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners

4 code implementations NAACL 2021 Timo Schick, Hinrich Schütze

When scaled to hundreds of billions of parameters, pretrained language models such as GPT-3 (Brown et al., 2020) achieve remarkable few-shot performance.

Natural Language Understanding Pretrained Language Models

Automatic Domain Adaptation Outperforms Manual Domain Adaptation for Predicting Financial Outcomes

no code implementations ACL 2019 Marina Sedinkina, Nikolas Breitkopf, Hinrich Schütze

In our experiments, we demonstrate that the automatically adapted sentiment dictionary outperforms the previous state of the art in predicting the financial outcomes excess return and volatility.

Domain Adaptation

Neural Topic Modeling with Continual Lifelong Learning

1 code implementation ICML 2020 Pankaj Gupta, Yatin Chaudhary, Thomas Runkler, Hinrich Schütze

To address the problem, we propose a lifelong learning framework for neural topic modeling that can continuously process streams of document collections, accumulate topics and guide future topic modeling tasks by knowledge transfer from several sources to better deal with the sparse data.

Data Augmentation Information Retrieval +1

Explainable and Discourse Topic-aware Neural Language Understanding

1 code implementation ICML 2020 Yatin Chaudhary, Hinrich Schütze, Pankaj Gupta

Marrying topic models and language models exposes language understanding to a broader source of document-level context beyond sentences via topics.

Document Classification Language Modelling +3

Unsupervised Embedding-based Detection of Lexical Semantic Changes

no code implementations 16 May 2020 Ehsaneddin Asgari, Christoph Ringlstetter, Hinrich Schütze

This paper describes EmbLexChange, a system introduced by the "Life-Language" team for SemEval-2020 Task 1, on unsupervised detection of lexical-semantic changes.

Identifying Necessary Elements for BERT's Multilinguality

1 code implementation 1 May 2020 Philipp Dufter, Hinrich Schütze

We aim to identify architectural properties of BERT and linguistic properties of languages that are necessary for BERT to become multilingual.

Masking as an Efficient Alternative to Finetuning for Pretrained Language Models

no code implementations EMNLP 2020 Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze

We present an efficient method of utilizing pretrained language models, where we learn selective binary masks for pretrained weights in lieu of modifying them through finetuning.

Pretrained Language Models
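
A minimal sketch of the binary-masking idea, assuming a straight-through estimator over frozen weights (an illustration of the general technique, not the authors' implementation):

```python
# Sketch: learn a binary mask over frozen pretrained weights (straight-through).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedLinear(nn.Module):
    def __init__(self, pretrained: nn.Linear, init_score: float = 0.01):
        super().__init__()
        self.weight = pretrained.weight.detach()   # frozen pretrained weights
        self.bias = pretrained.bias.detach() if pretrained.bias is not None else None
        # Real-valued scores; only these are trained. Thresholding gives the mask.
        self.scores = nn.Parameter(torch.full_like(self.weight, init_score))

    def forward(self, x):
        hard = (self.scores > 0).float()
        # Straight-through estimator: forward uses the hard 0/1 mask,
        # backward sends gradients to the real-valued scores.
        mask = hard + self.scores - self.scores.detach()
        return F.linear(x, self.weight * mask, self.bias)

layer = MaskedLinear(nn.Linear(768, 768))   # stand-in for one pretrained layer
out = layer(torch.randn(2, 768))
```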

SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings

2 code implementations Findings of the Association for Computational Linguistics 2020 Masoud Jalili Sabet, Philipp Dufter, François Yvon, Hinrich Schütze

We find that alignments created from embeddings are superior for four and comparable for two language pairs compared to those produced by traditional statistical aligners, even with abundant parallel data; e.g., contextualized embeddings achieve a word alignment F1 for English-German that is 5 percentage points higher than eflomal, a high-quality statistical aligner, trained on 100k parallel sentences.

Machine Translation Multilingual Word Embeddings +2
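
The core of the embedding-based alignment idea can be illustrated as follows (a simplified sketch of a mutual-argmax heuristic, not the released SimAlign code; the word embeddings are assumed to be given):

```python
# Simplified mutual-argmax alignment over given word embeddings.
import numpy as np

def argmax_align(src_vecs: np.ndarray, tgt_vecs: np.ndarray):
    """src_vecs: (m, d), tgt_vecs: (n, d) word vectors (static or contextualized)."""
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sim = src @ tgt.T                       # (m, n) cosine similarities
    fwd = sim.argmax(axis=1)                # best target word for each source word
    bwd = sim.argmax(axis=0)                # best source word for each target word
    # Keep only mutual best matches (the intersection heuristic).
    return [(i, int(j)) for i, j in enumerate(fwd) if bwd[j] == i]

# Toy usage with random vectors standing in for real embeddings.
links = argmax_align(np.random.randn(5, 768), np.random.randn(6, 768))
print(links)
```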

Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference

5 code implementations 21 Jan 2020 Timo Schick, Hinrich Schütze

Some NLP tasks can be solved in a fully unsupervised fashion by providing a pretrained language model with "task descriptions" in natural language (e.g., Radford et al., 2019).

Few-Shot Text Classification General Classification +3

Multipurpose Intelligent Process Automation via Conversational Assistant

no code implementations 7 Jan 2020 Alena Moiseeva, Dietrich Trautmann, Michael Heimann, Hinrich Schütze

Such intelligent agents can assist the user by answering specific questions and executing routine tasks that are ordinarily performed in a natural language (i.e., customer support).

Transfer Learning

Extending Machine Language Models toward Human-Level Language Understanding

no code implementations 12 Dec 2019 James L. McClelland, Felix Hill, Maja Rudolph, Jason Baldridge, Hinrich Schütze

We take language to be a part of a system for understanding and communicating about situations.

Morphological Segmentation Inside-Out

no code implementations EMNLP 2016 Ryan Cotterell, Arun Kumar, Hinrich Schütze

Morphological segmentation has traditionally been modeled with non-hierarchical models, which yield flat segmentations as output.

Morphological Analysis

Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity

no code implementations ACL 2020 Nina Poerner, Ulli Waltinger, Hinrich Schütze

We address the task of unsupervised Semantic Textual Similarity (STS) by ensembling diverse pre-trained sentence encoders into sentence meta-embeddings.

Dimensionality Reduction Semantic Textual Similarity

E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT

1 code implementation Findings of the Association for Computational Linguistics 2020 Nina Poerner, Ulli Waltinger, Hinrich Schütze

We present a novel way of injecting factual knowledge about entities into the pretrained BERT model (Devlin et al., 2019): We align Wikipedia2Vec entity vectors (Yamada et al., 2016) with BERT's native wordpiece vector space and use the aligned entity vectors as if they were wordpiece vectors.

Entity Embeddings Entity Linking +3
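
The alignment step can be pictured as fitting a linear map on vocabulary shared between the two spaces and then applying it to entity vectors. The sketch below is a rough illustration under assumed dimensions, not the released E-BERT code:

```python
# Toy illustration: least-squares linear map from an entity-embedding space into
# a wordpiece-embedding space, fitted on words present in both vocabularies.
import numpy as np

def fit_alignment(src_word_vecs: np.ndarray, tgt_wordpiece_vecs: np.ndarray):
    """Rows of both matrices correspond to the same shared words."""
    W, *_ = np.linalg.lstsq(src_word_vecs, tgt_wordpiece_vecs, rcond=None)
    return W                                # shape (d_src, d_tgt)

# Hypothetical shapes: 10k shared words, 100-dim entity space, 768-dim BERT space.
shared_src = np.random.randn(10000, 100)
shared_tgt = np.random.randn(10000, 768)
W = fit_alignment(shared_src, shared_tgt)

entity_vec = np.random.randn(100)           # stand-in for one entity vector
aligned = entity_vec @ W                    # now usable like a wordpiece vector
```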

BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance

1 code implementation ACL 2020 Timo Schick, Hinrich Schütze

In this work, we transfer this idea to pretrained language models: We introduce BERTRAM, a powerful architecture based on BERT that is capable of inferring high-quality embeddings for rare words that are suitable as input representations for deep language models.

Language Modelling Pretrained Language Models +1

Type-aware Convolutional Neural Networks for Slot Filling

no code implementations 1 Oct 2019 Heike Adel, Hinrich Schütze

In particular, we explore different ways of integrating the named entity types of the relation arguments into a neural network for relation classification, including a joint training and a structured prediction approach.

Coreference Resolution General Classification +3

Generating Multi-Sentence Abstractive Summaries of Interleaved Texts

no code implementations 25 Sep 2019 Sanjeev Kumar Karn, Francine Chen, Yan-Ying Chen, Ulli Waltinger, Hinrich Schütze

The interleaved posts are encoded hierarchically, i.e., word-to-word (words in a post) followed by post-to-post (posts in a channel).

Disentanglement

Multi-source Multi-view Transfer Learning in Neural Topic Modeling with Pretrained Topic and Word Embeddings

no code implementations 25 Sep 2019 Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze

Though word embeddings and topics are complementary representations, several past works have only used pretrained word embeddings in (neural) topic modeling to address the data sparsity problem in short texts or small collections of documents.

Information Retrieval Transfer Learning +1

Multi-view and Multi-source Transfers in Neural Topic Modeling with Pretrained Topic and Word Embeddings

no code implementations 14 Sep 2019 Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze

Though word embeddings and topics are complementary representations, several past works have only used pre-trained word embeddings in (neural) topic modeling to address the data sparsity problem in short texts or small collections of documents.

Information Retrieval Transfer Learning +1

Neural Architectures for Fine-Grained Propaganda Detection in News

no code implementations WS 2019 Pankaj Gupta, Khushbu Saxena, Usama Yaseen, Thomas Runkler, Hinrich Schütze

To address the tasks of sentence (SLC) and fragment level (FLC) propaganda detection, we explore different neural architectures (e.g., CNN, LSTM-CRF and BERT) and extract linguistic (e.g., part-of-speech, named entity, readability, sentiment, emotion, etc.

Propaganda detection

A Hierarchical Decoder with Three-level Hierarchical Attention to Generate Abstractive Summaries of Interleaved Texts

no code implementations 5 Jun 2019 Sanjeev Kumar Karn, Francine Chen, Yan-Ying Chen, Ulli Waltinger, Hinrich Schütze

Interleaved texts, where posts belonging to different threads occur in one sequence, are a common occurrence, e.g., online chat conversations.

SherLIiC: A Typed Event-Focused Lexical Inference Benchmark for Evaluating Natural Language Inference

1 code implementation ACL 2019 Martin Schmitt, Hinrich Schütze

We present SherLIiC, a testbed for lexical inference in context (LIiC), consisting of 3985 manually annotated inference rule candidates (InfCands), accompanied by (i) ~960k unlabeled InfCands, and (ii) ~190k typed textual relations between Freebase entities extracted from the large entity-linked corpus ClueWeb09.

Lexical Entailment Natural Language Inference

Analytical Methods for Interpretable Ultradense Word Embeddings

1 code implementation IJCNLP 2019 Philipp Dufter, Hinrich Schütze

In this work, we investigate three methods for making word spaces interpretable by rotation: Densifier (Rothe et al., 2016), linear SVMs and DensRay, a new method we propose.

Word Embeddings

Rare Words: A Major Problem for Contextualized Embeddings And How to Fix it by Attentive Mimicking

2 code implementations 14 Apr 2019 Timo Schick, Hinrich Schütze

Pretraining deep neural network architectures with a language modeling objective has brought large improvements for many natural language processing tasks.

Language Modelling

Attentive Mimicking: Better Word Embeddings by Attending to Informative Contexts

1 code implementation NAACL 2019 Timo Schick, Hinrich Schütze

Learning high-quality embeddings for rare words is a hard problem because of sparse context information.

Word Embeddings

Learning Semantic Representations for Novel Words: Leveraging Both Form and Context

1 code implementation 9 Nov 2018 Timo Schick, Hinrich Schütze

The general problem setting is that word embeddings are induced on an unlabeled training corpus and then a model is trained that embeds novel words into this induced embedding space.

Learning Semantic Representations Word Embeddings
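
A hedged sketch of the setting described above: embed a novel word from its surface form (character n-grams) and from its observed contexts, then place it in the pre-induced embedding space. The fixed gating weight and the dictionary lookups are simplifying assumptions, not the paper's learned model:

```python
# Simplified form + context embedding for a novel word (fixed gate, toy lookups).
import numpy as np

def char_ngrams(word: str, n_min: int = 3, n_max: int = 5):
    padded = f"<{word}>"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]

def embed_novel_word(word, contexts, word_vecs, ngram_vecs, alpha=0.5):
    """word_vecs / ngram_vecs: dicts mapping strings to vectors in the induced space."""
    # Context part: average embeddings of known words observed around the novel word.
    ctx_words = [w for ctx in contexts for w in ctx.split() if w in word_vecs]
    ctx = np.mean([word_vecs[w] for w in ctx_words], axis=0)
    # Form part: average embeddings of the word's character n-grams (if covered).
    form_parts = [ngram_vecs[g] for g in char_ngrams(word) if g in ngram_vecs]
    form = np.mean(form_parts, axis=0) if form_parts else np.zeros_like(ctx)
    # A convex combination places the novel word in the original embedding space;
    # in the paper-style setup this weighting would itself be learned.
    return alpha * form + (1 - alpha) * ctx
```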

Multilingual Embeddings Jointly Induced from Contexts and Concepts: Simple, Strong and Scalable

no code implementations 1 Nov 2018 Philipp Dufter, Mengjie Zhao, Hinrich Schütze

A simple and effective context-based multilingual embedding learner is Levy et al. (2017)'s S-ID (sentence ID) method.

Multilingual Word Embeddings

Multi-Multi-View Learning: Multilingual and Multi-Representation Entity Typing

1 code implementation EMNLP 2018 Yadollah Yaghoobzadeh, Hinrich Schütze

For representation, we consider representations based on the context distribution of the entity (i.e., on its embedding), on the entity's name (i.e., on its surface form) and on its description in Wikipedia.

Entity Typing Multiview Learning +1

Neural Relation Extraction Within and Across Sentence Boundaries

1 code implementation 11 Oct 2018 Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Bernt Andrassy, Thomas Runkler

iDepNN models the shortest and augmented dependency paths via recurrent and recursive neural networks to extract relationships within (intra-) and across (inter-) sentence boundaries.

Relation Extraction

textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior

1 code implementation ICLR 2019 Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze

We address two challenges of probabilistic topic modelling in order to better estimate the probability of a word in a given context, i.e., P(word|context): (1) No Language Structure in Context: Probabilistic topic models ignore word order by summarizing a given context as a "bag-of-word" and consequently the semantics of words in the context is lost.

Information Extraction Information Retrieval +3

Neural Transductive Learning and Beyond: Morphological Generation in the Minimal-Resource Setting

no code implementations EMNLP 2018 Katharina Kann, Hinrich Schütze

Neural state-of-the-art sequence-to-sequence (seq2seq) models often do not perform well for small training sets.

Interpretable Textual Neuron Representations for NLP

2 code implementations WS 2018 Nina Poerner, Benjamin Roth, Hinrich Schütze

Input optimization methods, such as Google Deep Dream, create interpretable representations of neurons for computer vision DNNs.

Document Informed Neural Autoregressive Topic Models with Distributional Prior

1 code implementation 15 Sep 2018 Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze

Here, we extend a neural autoregressive topic model to exploit the full context information around words in a document in a language modeling fashion.

Language Modelling Topic Models

Neural Semi-Markov Conditional Random Fields for Robust Character-Based Part-of-Speech Tagging

no code implementations NAACL 2019 Apostolos Kemos, Heike Adel, Hinrich Schütze

Character-level models of tokens have been shown to be effective at dealing with within-token noise and out-of-vocabulary words.

Part-Of-Speech Tagging

Document Informed Neural Autoregressive Topic Models

1 code implementation 11 Aug 2018 Pankaj Gupta, Florian Buettner, Hinrich Schütze

Context information around words helps in determining their actual meaning, for example "networks" used in contexts of artificial neural networks or biological neuron networks.

Language Modelling Text Categorization +1

LISA: Explaining Recurrent Neural Network Judgments via Layer-wIse Semantic Accumulation and Example to Pattern Transformation

no code implementations WS 2018 Pankaj Gupta, Hinrich Schütze

Recurrent neural networks (RNNs) are temporal networks, cumulative in nature, that have shown promising results in various natural language processing tasks.

Decision Making Relation Classification +1

News Article Teaser Tweets and How to Generate Them

2 code implementations NAACL 2019 Sanjeev Kumar Karn, Mark Buckley, Ulli Waltinger, Hinrich Schütze

In this work, we define the task of teaser generation and provide an evaluation benchmark and baseline systems for the process of generating teasers.

Replicated Siamese LSTM in Ticketing System for Similarity Learning and Retrieval in Asymmetric Texts

no code implementations COLING 2018 Pankaj Gupta, Bernt Andrassy, Hinrich Schütze

The task is challenging due to significant term mismatch in the query and ticket pairs of asymmetric lengths, where the subject is a short text but the description and solution are multi-sentence texts.

Joint Bootstrapping Machines for High Confidence Relation Extraction

1 code implementation NAACL 2018 Pankaj Gupta, Benjamin Roth, Hinrich Schütze

Semi-supervised bootstrapping techniques for relationship extraction from text iteratively expand a set of initial seed instances.

Relationship Extraction (Distant Supervised)

Fortification of Neural Morphological Segmentation Models for Polysynthetic Minimal-Resource Languages

no code implementations NAACL 2018 Katharina Kann, Manuel Mager, Ivan Meza-Ruiz, Hinrich Schütze

Morphological segmentation for polysynthetic languages is challenging, because a word may consist of many individual morphemes and training data can be extremely scarce.

Cross-Lingual Transfer Data Augmentation

Neural Architectures for Open-Type Relation Argument Extraction

no code implementations 5 Mar 2018 Benjamin Roth, Costanza Conforti, Nina Poerner, Sanjeev Karn, Hinrich Schütze

In this work, we introduce the task of Open-Type Relation Argument Extraction (ORAE): Given a corpus, a query entity Q and a knowledge base relation (e.g., "Q authored notable work with title X"), the model has to extract an argument of non-standard entity type (entities that cannot be extracted by a standard named entity tagger, e.g. X: the title of a book or a work of art) from the corpus.

Question Answering

Deep Temporal-Recurrent-Replicated-Softmax for Topical Trends over Time

no code implementations NAACL 2018 Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Bernt Andrassy

We also introduce a metric (named SPAN) to quantify the capability of a dynamic topic model to capture word evolution in topics over time.

Topic Models

Impact of Coreference Resolution on Slot Filling

no code implementations 26 Oct 2017 Heike Adel, Hinrich Schütze

In this paper, we demonstrate the importance of coreference resolution for natural language processing on the example of the TAC Slot Filling shared task.

Coreference Resolution Slot Filling

Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms

1 code implementation TACL 2018 Wenpeng Yin, Hinrich Schütze

We hypothesize that this is because the attention in CNNs has been mainly implemented as attentive pooling (i.e., it is applied to pooling) rather than as attentive convolution (i.e., it is integrated into convolution).

Natural Language Inference Representation Learning +1

Corpus-level Fine-grained Entity Typing

no code implementations 7 Aug 2017 Yadollah Yaghoobzadeh, Heike Adel, Hinrich Schütze

This paper addresses the problem of corpus-level entity typing, i.e., inferring from a large corpus that an entity is a member of a class such as "food" or "artist".

Entity Typing Knowledge Base Completion

Unlabeled Data for Morphological Generation With Character-Based Sequence-to-Sequence Models

no code implementations WS 2017 Katharina Kann, Hinrich Schütze

We present a semi-supervised way of training a character-based encoder-decoder recurrent neural network for morphological reinflection, the task of generating one inflected word form from another.

Past, Present, Future: A Computational Investigation of the Typology of Tense in 1000 Languages

no code implementations EMNLP 2017 Ehsaneddin Asgari, Hinrich Schütze

We present SuperPivot, an analysis method for low-resource languages that occur in a superparallel corpus, i.e., in a corpus that contains an order of magnitude more languages than parallel corpora currently in use.

One-Shot Neural Cross-Lingual Transfer for Paradigm Completion

no code implementations ACL 2017 Katharina Kann, Ryan Cotterell, Hinrich Schütze

We present a novel cross-lingual transfer method for paradigm completion, the task of mapping a lemma to its inflected forms, using a neural encoder-decoder model, the state of the art for the monolingual task.

Cross-Lingual Transfer One-Shot Learning

Comparative Study of CNN and RNN for Natural Language Processing

4 code implementations 7 Feb 2017 Wenpeng Yin, Katharina Kann, Mo Yu, Hinrich Schütze

Deep neural networks (DNNs) have revolutionized the field of natural language processing (NLP).

Task-Specific Attentive Pooling of Phrase Alignments Contributes to Sentence Matching

no code implementations EACL 2017 Wenpeng Yin, Hinrich Schütze

This work studies comparatively two typical sentence matching tasks: textual entailment (TE) and answer selection (AS), observing that weaker phrase alignments are more critical in TE, while stronger phrase alignments deserve more attention in AS.

Answer Selection Natural Language Inference +1

Joint Semantic Synthesis and Morphological Analysis of the Derived Word

no code implementations TACL 2018 Ryan Cotterell, Hinrich Schütze

Since morphology obeys the principle of compositionality, the semantics of the word can be systematically derived from the meaning of its parts.

Additive models Morphological Analysis

Noise Mitigation for Neural Entity Typing and Relation Extraction

no code implementations EACL 2017 Yadollah Yaghoobzadeh, Heike Adel, Hinrich Schütze

For the second noise type, we propose ways to improve the integration of noisy entity type predictions into relation extraction.

Entity Typing Multi-Label Learning +1

Exploring Different Dimensions of Attention for Uncertainty Detection

no code implementations EACL 2017 Heike Adel, Hinrich Schütze

Neural networks with attention have proven effective for many natural language processing tasks.

Neural Multi-Source Morphological Reinflection

no code implementations EACL 2017 Katharina Kann, Ryan Cotterell, Hinrich Schütze

We explore the task of multi-source morphological reinflection, which generalizes the standard, single-source version.

TAG

Corpus-level Fine-grained Entity Typing Using Contextual Information

no code implementations EMNLP 2015 Yadollah Yaghoobzadeh, Hinrich Schütze

This paper addresses the problem of corpus-level entity typing, i.e., inferring from a large corpus that an entity is a member of a class such as "food" or "artist".

Entity Typing Knowledge Base Completion +1

Simple Question Answering by Attentive Convolutional Neural Network

no code implementations COLING 2016 Wenpeng Yin, Mo Yu, Bing Xiang, Bo-Wen Zhou, Hinrich Schütze

In fact selection, we match the subject entity in a fact candidate with the entity mention in the question by a character-level convolutional neural network (char-CNN), and match the predicate in that fact with the question by a word-level CNN (word-CNN).

Entity Linking Fact Selection +1

Single-Model Encoder-Decoder with Explicit Morphological Representation for Reinflection

1 code implementation ACL 2016 Katharina Kann, Hinrich Schütze

Morphological reinflection is the task of generating a target form given a source form, a source tag and a target tag.

TAG

Combining Recurrent and Convolutional Neural Networks for Relation Classification

no code implementations NAACL 2016 Ngoc Thang Vu, Heike Adel, Pankaj Gupta, Hinrich Schütze

This paper investigates two different neural architectures for the task of relation classification: convolutional neural networks and recurrent neural networks.

Classification General Classification +1

Why and How to Pay Different Attention to Phrase Alignments of Different Intensities

no code implementations 23 Apr 2016 Wenpeng Yin, Hinrich Schütze

We address the problems of identifying phrase alignments of flexible granularity and pooling alignments of different intensities for these tasks.

Answer Selection Natural Language Inference +1

Discriminative Phrase Embedding for Paraphrase Identification

no code implementations HLT 2015 Wenpeng Yin, Hinrich Schütze

This work on the paraphrase identification task contributes, on one hand, to expanding deep learning embeddings to include continuous and discontinuous linguistic phrases.

Paraphrase Identification

Online Updating of Word Representations for Part-of-Speech Tagging

no code implementations EMNLP 2015 Wenpeng Yin, Tobias Schnabel, Hinrich Schütze

We propose online unsupervised domain adaptation (DA), which is performed incrementally as data comes in and is applicable when batch DA is not possible.

Part-Of-Speech Tagging POS +1

Comparing Convolutional Neural Networks to Traditional Models for Slot Filling

no code implementations NAACL 2016 Heike Adel, Benjamin Roth, Hinrich Schütze

We address relation classification in the context of slot filling, the task of finding and evaluating fillers like "Steve Jobs" for the slot X in "X founded Apple".

Classification General Classification +2

ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs

8 code implementations TACL 2016 Wenpeng Yin, Hinrich Schütze, Bing Xiang, Bo-Wen Zhou

(ii) We propose three attention schemes that integrate mutual influence between sentences into CNN; thus, the representation of each sentence takes into consideration its counterpart.

Answer Selection Natural Language Inference +1
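
The "mutual influence" idea can be sketched as an attention matrix between the two sentences' feature maps; 1/(1 + Euclidean distance) is one simple choice of match score. This is a simplified illustration, not the full ABCNN architecture:

```python
# Simplified attention between two sentences' feature maps (not the full ABCNN).
import numpy as np

def attention_matrix(f0: np.ndarray, f1: np.ndarray) -> np.ndarray:
    """f0: (len0, d), f1: (len1, d) per-word feature maps."""
    dists = np.linalg.norm(f0[:, None, :] - f1[None, :, :], axis=-1)
    return 1.0 / (1.0 + dists)              # A[i, j]: match score of word i and word j

f0 = np.random.randn(7, 50)                 # hypothetical sentence-0 features
f1 = np.random.randn(9, 50)                 # hypothetical sentence-1 features
A = attention_matrix(f0, f1)

# Counterpart-aware representations: each word attends to a normalized weighted
# sum over the other sentence, which a convolution layer could then consume.
f0_aware = (A / A.sum(axis=1, keepdims=True)) @ f1      # (7, 50)
f1_aware = (A.T / A.T.sum(axis=1, keepdims=True)) @ f0  # (9, 50)
```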

Learning Meta-Embeddings by Using Ensembles of Embedding Sets

1 code implementation 18 Aug 2015 Wenpeng Yin, Hinrich Schütze

Word embeddings -- distributed representations of words -- in deep learning are beneficial for many tasks in natural language processing (NLP).

Part-Of-Speech Tagging Word Embeddings +1

Distributional Models and Deep Learning Embeddings: Combining the Best of Both Worlds

no code implementations 19 Dec 2013 Irina Sergienya, Hinrich Schütze

There are two main approaches to the distributed representation of words: low-dimensional deep learning embeddings and high-dimensional distributional models, in which each dimension corresponds to a context word.

Deep Learning Embeddings for Discontinuous Linguistic Units

no code implementations 18 Dec 2013 Wenpeng Yin, Hinrich Schütze

Deep learning embeddings have been successfully used for many natural language processing problems.

Coreference Resolution
