EMNLP 2018

SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing

EMNLP 2018 google/sentencepiece

We perform a validation experiment of NMT on English-Japanese machine translation, and find that it is possible to achieve comparable accuracy to direct subword training from raw sentences.

MACHINE TRANSLATION

OpenKE: An Open Toolkit for Knowledge Embedding

EMNLP 2018 thunlp/OpenKE

We release an open toolkit for knowledge embedding (OpenKE), which provides a unified framework and various fundamental models to embed knowledge graphs into a continuous low-dimensional space.

INFORMATION RETRIEVAL, KNOWLEDGE GRAPHS, QUESTION ANSWERING, REPRESENTATION LEARNING

Magnitude: A Fast, Efficient Universal Vector Embedding Utility Package

EMNLP 2018 plasticityai/magnitude

Vector space embedding models like word2vec, GloVe, fastText, and ELMo are extremely popular representations in natural language processing (NLP) applications.

WORD EMBEDDINGS

TRANX: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation

EMNLP 2018 pcyin/tranX

We present TRANX, a transition-based neural semantic parser that maps natural language (NL) utterances into formal meaning representations (MRs).

CODE GENERATION, SEMANTIC PARSING

Juman++: A Morphological Analysis Toolkit for Scriptio Continua

EMNLP 2018 ku-nlp/jumanpp

We present a three-part toolkit for developing morphological analyzers for languages without natural word boundaries.

ART ANALYSIS, LANGUAGE MODELLING, MORPHOLOGICAL ANALYSIS, PART-OF-SPEECH TAGGING, TOKENIZATION

KT-Speech-Crawler: Automatic Dataset Construction for Speech Recognition from YouTube Videos

EMNLP 2018 EgorLakomkin/KTSpeechCrawler

In this paper, we describe KT-Speech-Crawler: an approach for automatic dataset construction for speech recognition by crawling YouTube videos.

SPEECH RECOGNITION

CytonMT: an Efficient Neural Machine Translation Open-source Toolkit Implemented in C++

EMNLP 2018 arthurxlw/cytonMt

This paper presents an open-source neural machine translation toolkit named CytonMT (https://github.com/arthurxlw/cytonMt).

MACHINE TRANSLATION

SyntaViz: Visualizing Voice Queries through a Syntax-Driven Hierarchical Ontology

EMNLP 2018 Comcast/SyntaViz

This paper describes SyntaViz, a visualization interface specifically designed for analyzing natural-language queries that were created by users of a voice-enabled product.

SENTIMENT ANALYSIS, TOPIC MODELS