Search Results for author: Heng Ji

Found 170 papers, 53 papers with code

Semi-supervised New Event Type Induction and Event Detection

no code implementations EMNLP 2020 Lifu Huang, Heng Ji

We design a Semi-Supervised Vector Quantized Variational Autoencoder framework to automatically learn a discrete latent type representation for each seen and unseen type and optimize them using seen type event annotations.

Event Detection Event Extraction

Knowledge-Enriched Natural Language Generation

1 code implementation EMNLP (ACL) 2021 Wenhao Yu, Meng Jiang, Zhiting Hu, Qingyun Wang, Heng Ji, Nazneen Rajani

Knowledge-enriched text generation poses unique challenges in modeling and learning, driving active research in several core directions, ranging from integrated modeling of neural representations and symbolic information in the sequential/hierarchical/graphical structures, learning without direct supervision due to the cost of structured annotation, efficient optimization and inference with massive and global constraints, to language grounding on multiple modalities, and generative reasoning with implicit commonsense knowledge and background knowledge.

Text Generation

Coreference by Appearance: Visually Grounded Event Coreference Resolution

no code implementations CRAC (ACL) 2021 Liming Wang, Shengyu Feng, Xudong Lin, Manling Li, Heng Ji, Shih-Fu Chang

Event coreference resolution is critical to understand events in the growing number of online news with multiple modalities including text, video, speech, etc.

Coreference Resolution Event Coreference Resolution +2

Timeline Summarization based on Event Graph Compression via Time-Aware Optimal Transport

1 code implementation EMNLP 2021 Manling Li, Tengfei Ma, Mo Yu, Lingfei Wu, Tian Gao, Heng Ji, Kathleen McKeown

Timeline Summarization identifies major events from a news collection and describes them following temporal order, with key dates tagged.

Timeline Summarization

Lifelong Event Detection with Knowledge Transfer

1 code implementation EMNLP 2021 Pengfei Yu, Heng Ji, Prem Natarajan

We focus on lifelong event detection as an exemplar case and propose a new problem formulation that is also generalizable to other IE tasks.

Event Detection Transfer Learning

The Future is not One-dimensional: Complex Event Schema Induction by Graph Modeling for Event Prediction

1 code implementation EMNLP 2021 Manling Li, Sha Li, Zhenhailong Wang, Lifu Huang, Kyunghyun Cho, Heng Ji, Jiawei Han, Clare Voss

We introduce a new concept of Temporal Complex Event Schema: a graph-based schema representation that encompasses events, arguments, temporal connections and argument relations.

Text2Mol: Cross-Modal Molecule Retrieval with Natural Language Queries

1 code implementation EMNLP 2021 Carl Edwards, ChengXiang Zhai, Heng Ji

Moreover, this can be viewed as an especially challenging cross-lingual retrieval problem by considering the molecules as a language with a very unique grammar.

Cross-Modal Retrieval

EventKE: Event-Enhanced Knowledge Graph Embedding

no code implementations Findings (EMNLP) 2021 Zixuan Zhang, Hongwei Wang, Han Zhao, Hanghang Tong, Heng Ji

Relations in most of the traditional knowledge graphs (KGs) only reflect static and factual connections, but fail to represent the dynamic activities and state changes about entities.

Knowledge Graph Embedding Knowledge Graphs

Personalized Entity Resolution with Dynamic Heterogeneous Knowledge Graph Representations

no code implementations ACL (ECNLP) 2021 Ying Lin, Han Wang, Jiangning Chen, Tong Wang, Yue Liu, Heng Ji, Yang Liu, Premkumar Natarajan

We first build a cross-source heterogeneous knowledge graph from customer purchase history and product knowledge graph to jointly learn customer and product embeddings.

Entity Resolution

CLIP-Event: Connecting Text and Images with Event Structures

no code implementations 13 Jan 2022 Manling Li, Ruochen Xu, Shuohang Wang, Luowei Zhou, Xudong Lin, Chenguang Zhu, Michael Zeng, Heng Ji, Shih-Fu Chang

Vision-language (V+L) pretraining models have achieved great success in supporting multimedia applications by understanding the alignments between images and text.

Contrastive Learning Event Extraction +1

MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding

no code implementations 20 Dec 2021 Revanth Gangi Reddy, Xilin Rui, Manling Li, Xudong Lin, Haoyang Wen, Jaemin Cho, Lifu Huang, Mohit Bansal, Avirup Sil, Shih-Fu Chang, Alexander Schwing, Heng Ji

Specifically, the task involves multi-hop questions that require reasoning over image-caption pairs to identify the grounded visual object being referred to and then predicting a span from the news body text to answer the question.

Data Augmentation Question-Answer-Generation +1

Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences

1 code implementation 10 Dec 2021 Yifan Chen, Qi Zeng, Dilek Hakkani-Tur, Di Jin, Heng Ji, Yun Yang

Transformer-based models are not efficient in processing long sequences due to the quadratic space and time complexity of the self-attention modules.

Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot Sentiment Classification

1 code implementation 5 Dec 2021 Zhenhailong Wang, Heng Ji

State-of-the-art brain-to-text systems have achieved great success in decoding language directly from brain signals using neural networks.

EEG Sentiment Analysis

Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method

1 code implementation NeurIPS 2021 Yifan Chen, Qi Zeng, Heng Ji, Yun Yang

Transformers are expensive to train due to the quadratic time and space complexity in the self-attention mechanism.

Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

1 code implementation EMNLP 2021 Yu Meng, Yunyi Zhang, Jiaxin Huang, Xuan Wang, Yu Zhang, Heng Ji, Jiawei Han

We study the problem of training named entity recognition (NER) models using only distantly-labeled data, which can be automatically obtained by matching entity mentions in the raw text with entity types in a knowledge base.

Language Modelling Named Entity Recognition +1

Corpus-based Open-Domain Event Type Induction

1 code implementation EMNLP 2021 Jiaming Shen, Yunyi Zhang, Heng Ji, Jiawei Han

As events of the same type could be expressed in multiple ways, we propose to represent each event type as a cluster of <predicate sense, object head> pairs.

Event Extraction

IMG2SMI: Translating Molecular Structure Images to Simplified Molecular-input Line-entry System

no code implementations 3 Sep 2021 Daniel Campos, Heng Ji

A large portion of chemistry literature focuses on new molecules and reactions between molecules.

Image Captioning

A Unified Transformer-based Framework for Duplex Text Normalization

no code implementations 23 Aug 2021 Tuan Manh Lai, Yang Zhang, Evelina Bakhturina, Boris Ginsburg, Heng Ji

In addition, we also create a cleaned dataset from the Spoken Wikipedia Corpora for German and report the performance of our systems on the dataset.

Data Augmentation Speech Recognition +2

Fine-grained Information Extraction from Biomedical Literature based on Knowledge-enriched Abstract Meaning Representation

no code implementations ACL 2021 Zixuan Zhang, Nikolaus Parulian, Heng Ji, Ahmed Elsayed, Skatje Myers, Martha Palmer

In this paper, we propose a novel biomedical Information Extraction (IE) model to tackle these two challenges and extract scientific entities and events from English research papers.

Event Extraction Graph Attention

Event-Centric Natural Language Processing

no code implementations ACL 2021 Muhao Chen, Hongming Zhang, Qiang Ning, Manling Li, Heng Ji, Kathleen McKeown, Dan Roth

This tutorial targets researchers and practitioners who are interested in AI technologies that help machines understand natural language text, particularly real-world events described in the text.

HySPA: Hybrid Span Generation for Scalable Text-to-Graph Extraction

1 code implementation Findings (ACL) 2021 Liliang Ren, Chenkai Sun, Heng Ji, Julia Hockenmaier

Text-to-Graph extraction aims to automatically extract information graphs consisting of mentions and types from natural language texts.

Joint Entity and Relation Extraction

Event Time Extraction and Propagation via Graph Attention Networks

1 code implementation NAACL 2021 Haoyang Wen, Yanru Qu, Heng Ji, Qiang Ning, Jiawei Han, Avi Sil, Hanghang Tong, Dan Roth

Grounding events into a precise timeline is important for natural language understanding but has received limited attention in recent work.

Graph Attention Natural Language Understanding +1

Deep Learning on Graphs for Natural Language Processing

no code implementations NAACL 2021 Lingfei Wu, Yu Chen, Heng Ji, Yunyao Li

Due to its great power in modeling non-Euclidean data like graphs or manifolds, deep learning on graph techniques (i.e., Graph Neural Networks (GNNs)) have opened a new door to solving challenging graph-related NLP problems.

graph construction Graph Representation Learning +8

Abstract Meaning Representation Guided Graph Encoding and Decoding for Joint Information Extraction

1 code implementation NAACL 2021 Zixuan Zhang, Heng Ji

The tasks of Rich Semantic Parsing, such as Abstract Meaning Representation (AMR), share similar goals with Information Extraction (IE) to convert natural language texts into structured semantic representations.

Semantic Parsing

RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System

1 code implementation NAACL 2021 Haoyang Wen, Ying Lin, Tuan Lai, Xiaoman Pan, Sha Li, Xudong Lin, Ben Zhou, Manling Li, Haoyu Wang, Hongming Zhang, Xiaodong Yu, Alexander Dong, Zhenhailong Wang, Yi Fung, Piyush Mishra, Qing Lyu, Dídac Surís, Brian Chen, Susan Windisch Brown, Martha Palmer, Chris Callison-Burch, Carl Vondrick, Jiawei Han, Dan Roth, Shih-Fu Chang, Heng Ji

We present a new information extraction system that can automatically construct temporal event graphs from a collection of news documents from multiple sources, multiple languages (English and Spanish for our experiment), and multiple data modalities (speech, text, image and video).

Coreference Resolution Event Extraction

Stage-wise Fine-tuning for Graph-to-Text Generation

1 code implementation ACL 2021 Qingyun Wang, Semih Yavuz, Victoria Lin, Heng Ji, Nazneen Rajani

Graph-to-text generation has benefited from pre-trained language models (PLMs) in achieving better performance than structured graph encoders.

Ranked #3 on Data-to-Text Generation on WebNLG (using extra training data)

Data-to-Text Generation KB-to-Language Generation +1

VAULT: VAriable Unified Long Text Representation for Machine Reading Comprehension

no code implementations ACL 2021 Haoyang Wen, Anthony Ferritto, Heng Ji, Radu Florian, Avirup Sil

Existing models on Machine Reading Comprehension (MRC) require complex model architecture for effectively modeling long texts with paragraph representation and classification, thereby making inference computationally inefficient for production use.

Machine Reading Comprehension

Learning Shared Semantic Space for Speech-to-Text Translation

2 code implementations Findings (ACL) 2021 Chi Han, Mingxuan Wang, Heng Ji, Lei Li

By projecting audio and text features to a common semantic representation, Chimera unifies MT and ST tasks and boosts the performance on ST benchmarks, MuST-C and Augmented Librispeech, to a new state-of-the-art.

Machine Translation Speech-to-Text Translation +1

Towards Robust Neural Retrieval Models with Synthetic Pre-Training

no code implementations 15 Apr 2021 Revanth Gangi Reddy, Vikas Yadav, Md Arafat Sultan, Martin Franz, Vittorio Castelli, Heng Ji, Avirup Sil

Recent work has shown that commonly available machine reading comprehension (MRC) datasets can be used to train high-performance neural information retrieval (IR) systems.

Information Retrieval Machine Reading Comprehension

Future is not One-dimensional: Graph Modeling based Complex Event Schema Induction for Event Prediction

no code implementations 13 Apr 2021 Manling Li, Sha Li, Zhenhailong Wang, Lifu Huang, Kyunghyun Cho, Heng Ji, Jiawei Han, Clare Voss

We introduce the concept of Temporal Complex Event Schema: a graph-based schema representation that encompasses events, arguments, temporal connections and argument relations.

Document-Level Event Argument Extraction by Conditional Generation

1 code implementation NAACL 2021 Sha Li, Heng Ji, Jiawei Han

On the task of argument extraction, we achieve an absolute gain of 7.6% F1 and 5.7% F1 over the next best model on the RAMS and WikiEvents datasets respectively.

Document-level Event Extraction Event Extraction +1

Personalized Entity Resolution with Dynamic Heterogeneous Knowledge Graph Representations

no code implementations 6 Apr 2021 Ying Lin, Han Wang, Jiangning Chen, Tong Wang, Yue Liu, Heng Ji, Yang Liu, Premkumar Natarajan

For example, with "add milk to my cart", a customer may refer to a certain organic product, while some customers may want to re-order products they regularly purchase.

Entity Resolution

Efficient Attentions for Long Document Summarization

1 code implementation NAACL 2021 Luyang Huang, Shuyang Cao, Nikolaus Parulian, Heng Ji, Lu Wang

The quadratic computational and memory complexities of large Transformers have limited their scalability for long document summarization.

Document Summarization

Controllable and Diverse Text Generation in E-commerce

no code implementations 23 Feb 2021 Huajie Shao, Jun Wang, Haohong Lin, Xuezhou Zhang, Aston Zhang, Heng Ji, Tarek Abdelzaher

The algorithm is injected into a Conditional Variational Autoencoder (CVAE), allowing Apex to control both (i) the order of keywords in the generated sentences (conditioned on the input keywords and their order), and (ii) the trade-off between diversity and accuracy.

Text Generation

White Paper: Challenges and Considerations for the Creation of a Large Labelled Repository of Online Videos with Questionable Content

no code implementations 25 Jan 2021 Thamar Solorio, Mahsa Shafaei, Christos Smailis, Mona Diab, Theodore Giannakopoulos, Heng Ji, Yang Liu, Rada Mihalcea, Smaranda Muresan, Ioannis Kakadiaris

This white paper presents a summary of the discussions regarding critical considerations to develop an extensive repository of online videos annotated with labels indicating questionable content.

MUSE: Textual Attributes Guided Portrait Painting Generation

1 code implementation 9 Nov 2020 Xiaodan Hu, Pengfei Yu, Kevin Knight, Heng Ji, Bo Li, Honghui Shi

Experiments show that our approach can accurately illustrate 78% textual attributes, which also help MUSE capture the subject in a more creative and expressive way.

KompaRe: A Knowledge Graph Comparative Reasoning System

no code implementations 6 Nov 2020 Lihui Liu, Boxin Du, Heng Ji, Hanghang Tong

In detail, we develop KompaRe, the first of its kind prototype system that provides comparative reasoning capability over large knowledge graphs.

Knowledge Graphs

Constrained Abstractive Summarization: Preserving Factual Consistency with Constrained Generation

2 code implementations 24 Oct 2020 Yuning Mao, Xiang Ren, Heng Ji, Jiawei Han

Despite significant progress, state-of-the-art abstractive summarization methods are still prone to hallucinate content inconsistent with the source document.

Abstractive Text Summarization Keyphrase Extraction

Global Attention for Name Tagging

no code implementations CONLL 2018 Boliang Zhang, Spencer Whitehead, Lifu Huang, Heng Ji

Many name tagging approaches use local contextual information with much success, but fail when the local context is ambiguous or limited.

Text Classification Using Label Names Only: A Language Model Self-Training Approach

1 code implementation EMNLP 2020 Yu Meng, Yunyi Zhang, Jiaxin Huang, Chenyan Xiong, Heng Ji, Chao Zhang, Jiawei Han

In this paper, we explore the potential of only using the label name of each class to train classification models on unlabeled data, without using any labeled documents.

Document Classification General Classification +3

ReviewRobot: Explainable Paper Review Generation based on Knowledge Synthesis

1 code implementation INLG (ACL) 2020 Qingyun Wang, Qi Zeng, Lifu Huang, Kevin Knight, Heng Ji, Nazneen Fatema Rajani

To assist the human review process, we build a novel ReviewRobot to automatically assign a review score and write comments for multiple categories such as novelty and meaningful comparison.

Review Generation

A Survey of Knowledge-Enhanced Text Generation

3 code implementations 9 Oct 2020 Wenhao Yu, Chenguang Zhu, Zaitang Li, Zhiting Hu, Qingyun Wang, Heng Ji, Meng Jiang

To address this issue, researchers have considered incorporating various forms of knowledge beyond the input text into the generation models.

Text Generation

Minimize Exposure Bias of Seq2Seq Models in Joint Entity and Relation Extraction

1 code implementation Findings of the Association for Computational Linguistics 2020 Ranran Haoran Zhang, Qianying Liu, Aysa Xuemo Fan, Heng Ji, Daojian Zeng, Fei Cheng, Daisuke Kawahara, Sadao Kurohashi

We propose a novel Sequence-to-Unordered-Multi-Tree (Seq2UMTree) model to minimize the effects of exposure bias by limiting the decoding length to three within a triplet and removing the order among triplets.

Joint Entity and Relation Extraction

GAIA: A Fine-grained Multimedia Knowledge Extraction System

no code implementations ACL 2020 Manling Li, Alireza Zareian, Ying Lin, Xiaoman Pan, Spencer Whitehead, Brian Chen, Bo Wu, Heng Ji, Shih-Fu Chang, Clare Voss, Daniel Napierski, Marjorie Freedman

We present the first comprehensive, open source multimedia knowledge extraction system that takes a massive stream of unstructured, heterogeneous multimedia data from various sources and languages as input, and creates a coherent, structured knowledge base, indexing entities, relations, and events, following a rich, fine-grained ontology.

A Joint Neural Model for Information Extraction with Global Features

no code implementations ACL 2020 Ying Lin, Heng Ji, Fei Huang, Lingfei Wu

OneIE performs end-to-end IE in four stages: (1) Encoding a given sentence as contextualized word representations; (2) Identifying entity mentions and event triggers as nodes; (3) Computing label scores for all nodes and their pairwise links using local classifiers; (4) Searching for the globally optimal graph with a beam decoder.

Cross-media Structured Common Space for Multimedia Event Extraction

no code implementations ACL 2020 Manling Li, Alireza Zareian, Qi Zeng, Spencer Whitehead, Di Lu, Heng Ji, Shih-Fu Chang

We introduce a new task, MultiMedia Event Extraction (M2E2), which aims to extract events and their arguments from multimedia documents.

Event Extraction

Training with Streaming Annotation

no code implementations 11 Feb 2020 Tongtao Zhang, Heng Ji, Shih-Fu Chang, Marjorie Freedman

In this paper, we address a practical scenario where training data is released in a sequence of small-scale batches and annotation in earlier phases has lower quality than the later counterparts.

Event Extraction

An Attentive Fine-Grained Entity Typing Model with Latent Type Representation

no code implementations IJCNLP 2019 Ying Lin, Heng Ji

In addition, we propose a two-step mention-aware attention mechanism to enable the model to focus on important words in mentions and contexts.

Entity Typing Type prediction +1

Cross-lingual Structure Transfer for Relation and Event Extraction

no code implementations IJCNLP 2019 Ananya Subburathinam, Di Lu, Heng Ji, Jonathan May, Shih-Fu Chang, Avirup Sil, Clare Voss

The identification of complex semantic structures such as events and entity relations, already a challenging Information Extraction task, is doubly difficult from sources written in under-resourced and under-annotated languages.

Event Extraction Relation Extraction

Cross-lingual Joint Entity and Word Embedding to Improve Entity Linking and Parallel Sentence Mining

no code implementations WS 2019 Xiaoman Pan, Thamme Gowda, Heng Ji, Jonathan May, Scott Miller

Because this multilingual common space directly relates the semantics of contextual words in the source language to that of entities in the target language, we leverage it for unsupervised cross-lingual entity linking.

Cross-Lingual Entity Linking Entity Linking

Low-Resource Name Tagging Learned with Weakly Labeled Data

1 code implementation IJCNLP 2019 Yixin Cao, Zikun Hu, Tat-Seng Chua, Zhiyuan Liu, Heng Ji

Name tagging in low-resource languages or domains suffers from inadequate training data.

TAG

Raw-to-End Name Entity Recognition in Social Media

1 code implementation 14 Aug 2019 Liyuan Liu, Zihan Wang, Jingbo Shang, Dandong Yin, Heng Ji, Xiang Ren, Shaowen Wang, Jiawei Han

Our model neither requires the conversion from character sequences to word sequences, nor assumes tokenizer can correctly detect all word boundaries.

Named Entity Recognition NER

Keep Meeting Summaries on Topic: Abstractive Multi-Modal Meeting Summarization

no code implementations ACL 2019 Manling Li, Lingyu Zhang, Heng Ji, Richard J. Radke

Transcripts of natural, multi-person meetings differ significantly from documents like news articles, which can make Natural Language Generation models for generating summaries unfocused.

Meeting Summarization Text Generation

Cross-lingual NIL Entity Clustering for Low-resource Languages

no code implementations WS 2019 Kevin Blissett, Heng Ji

Clustering unlinkable entity mentions across documents in multiple languages (cross-lingual NIL Clustering) is an important task as part of Entity Discovery and Linking (EDL).

Biomedical Event Extraction based on Knowledge-driven Tree-LSTM

no code implementations NAACL 2019 Diya Li, Lifu Huang, Heng Ji, Jiawei Han

Event extraction for the biomedical domain is more challenging than that in the general news domain since it requires broader acquisition of domain-specific knowledge and deeper understanding of complex contexts.

Entity Linking Event Extraction

Multilingual Entity, Relation, Event and Human Value Extraction

no code implementations NAACL 2019 Manling Li, Ying Lin, Joseph Hoover, Spencer Whitehead, Clare Voss, Morteza Dehghani, Heng Ji

This paper demonstrates a state-of-the-art end-to-end multilingual (English, Russian, and Ukrainian) knowledge extraction system that can perform entity discovery and linking, relation extraction, event extraction, and coreference.

Event Extraction Relation Extraction

PaperRobot: Incremental Draft Generation of Scientific Ideas

2 code implementations ACL 2019 Qingyun Wang, Lifu Huang, Zhiying Jiang, Kevin Knight, Heng Ji, Mohit Bansal, Yi Luan

We present a PaperRobot who performs as an automatic research assistant by (1) conducting deep understanding of a large collection of human-written papers in a target domain and constructing comprehensive background knowledge graphs (KGs); (2) creating new ideas by predicting links from the background KGs, by combining graph attention and contextual text attention; (3) incrementally writing some key elements of a new paper based on memory-attention networks: from the input title along with predicted related entities to generate a paper abstract, from the abstract to generate conclusion and future work, and finally from future work to generate a title for a follow-on paper.

Graph Attention Knowledge Graphs +4

A Grounded Unsupervised Universal Part-of-Speech Tagger for Low-Resource Languages

1 code implementation NAACL 2019 Ronald Cardenas, Ying Lin, Heng Ji, Jonathan May

We also show extrinsically that incorporating our POS tagger into a name tagger leads to state-of-the-art tagging performance in Sinhalese and Kinyarwanda, two languages with nearly no labeled POS data available.

Decipherment Part-Of-Speech Tagging +2

Improving Question Answering with External Knowledge

1 code implementation WS 2019 Xiaoman Pan, Kai Sun, Dian Yu, Jianshu Chen, Heng Ji, Claire Cardie, Dong Yu

We focus on multiple-choice question answering (QA) tasks in subject areas such as science, where we require both broad background knowledge and the facts from the given subject-area reference corpus.

Question Answering

Visualizing Group Dynamics based on Multiparty Meeting Understanding

no code implementations EMNLP 2018 Ni Zhang, Tongtao Zhang, Indrani Bhattacharya, Heng Ji, Rich Radke

Group discussions are usually aimed at sharing opinions, reaching consensus and making good decisions based on group knowledge.

Decision Making Opinion Mining +1

Genre Separation Network with Adversarial Training for Cross-genre Relation Extraction

no code implementations EMNLP 2018 Ge Shi, Chong Feng, Lifu Huang, Boliang Zhang, Heng Ji, Lejian Liao, He-Yan Huang

Relation Extraction suffers from a dramatic performance decrease when training a model on one genre and directly applying it to a new genre, due to the distinct feature distributions.

Feature Engineering Relation Extraction +1

Incorporating Background Knowledge into Video Description Generation

no code implementations EMNLP 2018 Spencer Whitehead, Heng Ji, Mohit Bansal, Shih-Fu Chang, Clare Voss

We develop an approach that uses video meta-data to retrieve topically related news documents for a video and extracts the events and named entities from these documents.

Text Generation Video Captioning +1

Creative Language Encoding under Censorship

no code implementations COLING 2018 Heng Ji, Kevin Knight

People often create obfuscated language for online communication to avoid Internet censorship, share sensitive information, express strong sentiment or emotion, plan for secret actions, trade illegal products, or simply hold interesting conversations.

Seq2RDF: An end-to-end application for deriving Triples from Natural Language Text

3 code implementations 4 Jul 2018 Yue Liu, Tongtao Zhang, Zhicheng Liang, Heng Ji, Deborah L. McGuinness

Inspired by recent successes in neural machine translation, we treat the triples within a given knowledge graph as an independent graph language and propose an encoder-decoder framework with an attention mechanism that leverages knowledge graph embeddings.

Knowledge Graph Embeddings Translation

Visual Attention Model for Name Tagging in Multimodal Social Media

no code implementations ACL 2018 Di Lu, Leonardo Neves, Vitor Carvalho, Ning Zhang, Heng Ji

Every day, billions of multimodal posts containing both images and text are shared on social media sites such as Snapchat, Twitter or Instagram.

Natural Language Understanding Question Answering

Platforms for Non-speakers Annotating Names in Any Language

no code implementations ACL 2018 Ying Lin, Cash Costello, Boliang Zhang, Di Lu, Heng Ji, James Mayfield, Paul McNamee

We demonstrate two annotation platforms that allow an English speaker to annotate names for any language without knowing the language.

A Multi-lingual Multi-task Architecture for Low-resource Sequence Labeling

1 code implementation ACL 2018 Ying Lin, Shengqi Yang, Veselin Stoyanov, Heng Ji

We propose a multi-lingual multi-task architecture to develop supervised models with a minimal amount of labeled data for sequence labeling.

Abstractive Text Summarization Machine Translation +2

ELISA-EDL: A Cross-lingual Entity Extraction, Linking and Localization System

no code implementations NAACL 2018 Boliang Zhang, Ying Lin, Xiaoman Pan, Di Lu, Jonathan May, Kevin Knight, Heng Ji

We demonstrate ELISA-EDL, a state-of-the-art re-trainable system to extract entity mentions from low-resource languages, link them to external English knowledge bases, and visualize locations related to disaster topics on a world heatmap.

Entity Extraction using GAN Entity Linking +1

Chengyu Cloze Test

1 code implementation WS 2018 Zhiying Jiang, Boliang Zhang, Lifu Huang, Heng Ji

We present a neural recommendation model for Chengyu, which is a special type of Chinese idiom.

Cloze Test

Paper Abstract Writing through Editing Mechanism

2 code implementations ACL 2018 Qingyun Wang, Zhi-Hao Zhou, Lifu Huang, Spencer Whitehead, Boliang Zhang, Heng Ji, Kevin Knight

We present a paper abstract writing system based on an attentive neural sequence-to-sequence model that can take a title as input and automatically generate an abstract.

Paper generation

Event Extraction with Generative Adversarial Imitation Learning

no code implementations 21 Apr 2018 Tongtao Zhang, Heng Ji

We propose a new method for the event extraction (EE) task based on an imitation learning framework, specifically, inverse reinforcement learning (IRL) via a generative adversarial network (GAN).

Event Extraction Feature Engineering +1

Entity-aware Image Caption Generation

no code implementations EMNLP 2018 Di Lu, Spencer Whitehead, Lifu Huang, Heng Ji, Shih-Fu Chang

Current image captioning approaches generate descriptions which lack specific information, such as named entities that are involved in the images.

Image Captioning

Multi-lingual Common Semantic Space Construction via Cluster-consistent Word Embedding

no code implementations EMNLP 2018 Lifu Huang, Kyunghyun Cho, Boliang Zhang, Heng Ji, Kevin Knight

We construct a multilingual common semantic space based on distributional semantics, where words from multiple languages are projected into a shared space to enable knowledge and resource transfer across languages.

Word Alignment

Exploiting Task-Oriented Resources to Learn Word Embeddings for Clinical Abbreviation Expansion

no code implementations WS 2015 Yue Liu, Tao Ge, Kusum S. Mathews, Heng Ji, Deborah L. McGuinness

In the medical domain, identifying and expanding abbreviations in clinical texts is a vital task for both better human and machine understanding.

Word Embeddings

Open Relation Extraction and Grounding

no code implementations IJCNLP 2017 Dian Yu, Lifu Huang, Heng Ji

Previous open Relation Extraction (open RE) approaches mainly rely on linguistic patterns and constraints to extract important relational triples from large-scale corpora.

Relation Extraction Slot Filling

Embracing Non-Traditional Linguistic Resources for Low-resource Language Name Tagging

no code implementations IJCNLP 2017 Boliang Zhang, Di Lu, Xiaoman Pan, Ying Lin, Halidanmu Abudukelimu, Heng Ji, Kevin Knight

Current supervised name tagging approaches are inadequate for most low-resource languages due to the lack of annotated data and actionable linguistic knowledge.

Relation Classification Word Embeddings

Learning Phrase Embeddings from Paraphrases with GRUs

no code implementations WS 2017 Zhihao Zhou, Lifu Huang, Heng Ji

Learning phrase representations has been widely explored in many Natural Language Processing (NLP) tasks (e.g., Sentiment Analysis, Machine Translation) and has shown promising improvements.

Machine Translation Sentiment Analysis +1

Acquiring Background Knowledge to Improve Moral Value Prediction

no code implementations 16 Sep 2017 Ying Lin, Joe Hoover, Morteza Dehghani, Marlon Mooijman, Heng Ji

In this paper, we address the problem of detecting expressions of moral values in tweets using content analysis.

Value prediction

Identifying and Tracking Sentiments and Topics from Social Media Texts during Natural Disasters

no code implementations EMNLP 2017 Min Yang, Jincheng Mei, Heng Ji, Wei Zhao, Zhou Zhao, Xiaojun Chen

We study the problem of identifying the topics and sentiments and tracking their shifts from social media texts in different geographical regions during emergencies and disasters.

Topic Models

Improving Slot Filling Performance with Attentive Neural Networks on Dependency Structures

no code implementations EMNLP 2017 Lifu Huang, Avirup Sil, Heng Ji, Radu Florian

Slot Filling (SF) aims to extract the values of certain types of attributes (or slots, such as person:cities_of_residence) for a given entity from a large collection of source documents.

Relation Extraction Slot Filling

Zero-Shot Transfer Learning for Event Extraction

1 code implementation ACL 2018 Lifu Huang, Heng Ji, Kyunghyun Cho, Clare R. Voss

Most previous event extraction studies have relied heavily on features derived from annotated event mentions, thus cannot be applied to new event types without annotation effort.

Event Extraction Transfer Learning

List-only Entity Linking

no code implementations ACL 2017 Ying Lin, Chin-Yew Lin, Heng Ji

Traditional Entity Linking (EL) technologies rely on rich structures and properties in the target knowledge base (KB).

Entity Linking

Cross-lingual Name Tagging and Linking for 282 Languages

no code implementations ACL 2017 Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight, Heng Ji

The ambitious goal of this work is to develop a cross-lingual name tagging and linking framework for 282 languages that exist in Wikipedia.

Translation

Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach

1 code implementation EMNLP 2017 Liyuan Liu, Xiang Ren, Qi Zhu, Shi Zhi, Huan Gui, Heng Ji, Jiawei Han

These annotations, referred to as heterogeneous supervision, often conflict with each other, which brings a new challenge to the original relation extraction task: how to infer the true label from noisy labels for a given instance.

Relation Extraction Representation Learning

Bitext Name Tagging for Cross-lingual Entity Annotation Projection

no code implementations COLING 2016 Dongxu Zhang, Boliang Zhang, Xiaoman Pan, Xiaocheng Feng, Heng Ji, Weiran Xu

Instead of directly relying on word alignment results, this framework combines the advantages of rule-based methods and deep learning methods in two steps: first, it generates a high-confidence entity annotation set on the IL side with strict searching methods; second, it uses this high-confidence set to weakly supervise model training.

Named Entity Recognition NER +1

CoType: Joint Extraction of Typed Entities and Relations with Knowledge Bases

2 code implementations 27 Oct 2016 Xiang Ren, Zeqiu Wu, Wenqi He, Meng Qu, Clare R. Voss, Heng Ji, Tarek F. Abdelzaher, Jiawei Han

We propose a novel domain-independent framework, called CoType, that runs a data-driven text segmentation algorithm to extract entity mentions, and jointly embeds entity mentions, relation mentions, text features and type labels into two low-dimensional spaces (for entity and relation mentions respectively), where, in each space, objects whose types are close will also have similar representations.

Joint Entity and Relation Extraction Text Segmentation

Aligning Coordinated Text Streams through Burst Information Network Construction and Decipherment

no code implementations 27 Sep 2016 Tao Ge, Qing Dou, Xiaoman Pan, Heng Ji, Lei Cui, Baobao Chang, Zhifang Sui, Ming Zhou

We introduce a novel Burst Information Network (BINet) representation that can display the most important information and illustrate the connections among bursty entities, events and keywords in the corpus.

Decipherment Translation

Label Noise Reduction in Entity Typing by Heterogeneous Partial-Label Embedding

3 code implementations 17 Feb 2016 Xiang Ren, Wenqi He, Meng Qu, Clare R. Voss, Heng Ji, Jiawei Han

Current systems of fine-grained entity typing use distant supervision in conjunction with existing knowledge bases to assign categories (type labels) to entity mentions.

Entity Typing Semantic Similarity +1

Leveraging Deep Neural Networks and Knowledge Graphs for Entity Disambiguation

no code implementations 28 Apr 2015 Hongzhao Huang, Larry Heck, Heng Ji

Entity Disambiguation aims to link mentions of ambiguous entities to a knowledge base (e.g., Wikipedia).

Entity Disambiguation Knowledge Graphs