Search Results for author: Lori Levin

Found 37 papers, 7 papers with code

Pre-tokenization of Multi-word Expressions in Cross-lingual Word Embeddings

no code implementations EMNLP 2020 Naoki Otani, Satoru Ozaki, Xingyuan Zhao, Yucen Li, Micaelah St Johns, Lori Levin

We propose a simple method for word translation of MWEs to and from English in ten languages: we first compile lists of MWEs in each language and then tokenize the MWEs as single tokens before training word embeddings.

Cross-Lingual Word Embeddings Translation +2

Automatic Interlinear Glossing for Under-Resourced Languages Leveraging Translations

no code implementations COLING 2020 Xingyuan Zhao, Satoru Ozaki, Antonios Anastasopoulos, Graham Neubig, Lori Levin

Interlinear Glossed Text (IGT) is a widely used format for encoding linguistic information in language documentation projects and scholarly papers.

Cross-Lingual Transfer TAG

A Resource for Computational Experiments on Mapudungun

1 code implementation LREC 2020 Mingjun Duan, Carlos Fasola, Sai Krishna Rallabandi, Rodolfo M. Vega, Antonios Anastasopoulos, Lori Levin, Alan W. black

We present a resource for computational experiments on Mapudungun, a polysynthetic indigenous language spoken in Chile with upwards of 200 thousand speakers.

Machine Translation Speech Recognition +2

Using Interlinear Glosses as Pivot in Low-Resource Multilingual Machine Translation

no code implementations7 Nov 2019 Zhong Zhou, Lori Levin, David R. Mortensen, Alex Waibel

Firstly, we pool IGT for 1, 497 languages in ODIN (54, 545 glosses) and 70, 918 glosses in Arapaho and train a gloss-to-target NMT system from IGT to English, with a BLEU score of 25. 94.

Machine Translation Translation

The ARIEL-CMU Systems for LoReHLT18

no code implementations24 Feb 2019 Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard Hovy, Alan W. black, Jaime Carbonell, Graham V. Horwood, Shabnam Tafreshi, Mona Diab, Efsun S. Kayi, Noura Farra, Kathleen McKeown

This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for the tasks Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech (SF Text and Speech).

Machine Translation Translation

DeepCx: A transition-based approach for shallow semantic parsing with complex constructional triggers

no code implementations EMNLP 2018 Jesse Dunietz, Jaime Carbonell, Lori Levin

This paper introduces the surface construction labeling (SCL) task, which expands the coverage of Shallow Semantic Parsing (SSP) to include frames triggered by complex constructions.

Semantic Parsing

Annotation Schemes for Surface Construction Labeling

no code implementations COLING 2018 Lori Levin

In this talk I will describe the interaction of linguistics and language technologies in Surface Construction Labeling (SCL) from the perspective of corpus annotation tasks such as definiteness, modality, and causality.

Frame Semantic Parsing

URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors

no code implementations EACL 2017 Patrick Littell, David R. Mortensen, Ke Lin, Katherine Kairis, Carlisle Turner, Lori Levin

We introduce the URIEL knowledge base for massively multilingual NLP and the lang2vec utility, which provides information-rich vector identifications of languages drawn from typological, geographical, and phylogenetic databases and normalized to have straightforward and consistent formats, naming, and semantics.

Language Identification Language Modelling +1

Automatically Tagging Constructions of Causation and Their Slot-Fillers

no code implementations TACL 2017 Jesse Dunietz, Lori Levin, Jaime Carbonell

Semantic parsing becomes difficult in the face of the wide variety of linguistic realizations that causation can take on.

Semantic Parsing

Named Entity Recognition for Linguistic Rapid Response in Low-Resource Languages: Sorani Kurdish and Tajik

no code implementations COLING 2016 Patrick Littell, Kartik Goyal, David R. Mortensen, Alexa Little, Chris Dyer, Lori Levin

This paper describes our construction of named-entity recognition (NER) systems in two Western Iranian languages, Sorani Kurdish and Tajik, as a part of a pilot study of {``}Linguistic Rapid Response{''} to potential emergency humanitarian relief situations.

Humanitarian Named Entity Recognition +1

PanPhon: A Resource for Mapping IPA Segments to Articulatory Feature Vectors

1 code implementation COLING 2016 David R. Mortensen, Patrick Littell, Akash Bharadwaj, Kartik Goyal, Chris Dyer, Lori Levin

This paper contributes to a growing body of evidence that{---}when coupled with appropriate machine-learning techniques{--}linguistically motivated, information-rich representations can outperform one-hot encodings of linguistic data.

NER

Polyglot Neural Language Models: A Case Study in Cross-Lingual Phonetic Representation Learning

no code implementations NAACL 2016 Yulia Tsvetkov, Sunayana Sitaram, Manaal Faruqui, Guillaume Lample, Patrick Littell, David Mortensen, Alan W. black, Lori Levin, Chris Dyer

We introduce polyglot language models, recurrent neural network models trained to predict symbol sequences in many different languages using shared representations of symbols and conditioning on typological information about the language to be predicted.

Representation Learning

Bridge-Language Capitalization Inference in Western Iranian: Sorani, Kurmanji, Zazaki, and Tajik

no code implementations LREC 2016 Patrick Littell, David R. Mortensen, Kartik Goyal, Chris Dyer, Lori Levin

In Sorani Kurdish, one of the most useful orthographic features in named-entity recognition {--} capitalization {--} is absent, as the language{'}s Perso-Arabic script does not make a distinction between uppercase and lowercase letters.

Named Entity Recognition

Unsupervised POS Induction with Word Embeddings

no code implementations HLT 2015 Chu-Cheng Lin, Waleed Ammar, Chris Dyer, Lori Levin

Unsupervised word embeddings have been shown to be valuable as features in supervised learning problems; however, their role in unsupervised problems has been less thoroughly explored.

POS Word Embeddings

Use of Modality and Negation in Semantically-Informed Syntactic MT

no code implementations5 Feb 2015 Kathryn Baker, Michael Bloodgood, Bonnie J. Dorr, Chris Callison-Burch, Nathaniel W. Filardo, Christine Piatko, Lori Levin, Scott Miller

We apply our MN annotation scheme to statistical machine translation using a syntactic framework that supports the inclusion of semantic annotations.

Machine Translation Translation

A Modality Lexicon and its use in Automatic Tagging

no code implementations17 Oct 2014 Kathryn Baker, Michael Bloodgood, Bonnie J. Dorr, Nathaniel W. Filardo, Lori Levin, Christine Piatko

Specifically, we describe the construction of a modality annotation scheme, a modality lexicon, and two automated modality taggers that were built using the lexicon and annotation scheme.

Machine Translation Translation

Semantically-Informed Syntactic Machine Translation: A Tree-Grafting Approach

no code implementations24 Sep 2014 Kathryn Baker, Michael Bloodgood, Chris Callison-Burch, Bonnie J. Dorr, Nathaniel W. Filardo, Lori Levin, Scott Miller, Christine Piatko

We describe a unified and coherent syntactic framework for supporting a semantically-informed syntactic approach to statistical machine translation.

Machine Translation Translation

Morphological parsing of Swahili using crowdsourced lexical resources

no code implementations LREC 2014 Patrick Littell, Kaitlyn Price, Lori Levin

We describe a morphological analyzer for the Swahili language, written in an extension of XFST/LEXC intended for the easy declaration of morphophonological patterns and importation of lexical resources.

Machine Translation

A Unified Annotation Scheme for the Semantic/Pragmatic Components of Definiteness

no code implementations LREC 2014 Archna Bhatia, M Simons, y, Lori Levin, Yulia Tsvetkov, Chris Dyer, Jordan Bender

We present a definiteness annotation scheme that captures the semantic, pragmatic, and discourse information, which we call communicative functions, associated with linguistic descriptions such as {``}a story about my speech{''}, {``}the story{''}, {``}every time I give it{''}, {``}this slideshow{''}.

Machine Translation

Resources for the Detection of Conventionalized Metaphors in Four Languages

no code implementations LREC 2014 Lori Levin, Teruko Mitamura, Brian MacWhinney, Davida Fromm, Jaime Carbonell, Weston Feely, Robert Frederking, Anatole Gershman, Carlos Ramirez

The extraction rules operate on the output of a dependency parser and identify the grammatical configurations (such as a verb with a prepositional phrase complement) that are likely to contain conventional metaphors.

The CMU METAL Farsi NLP Approach

1 code implementation LREC 2014 Weston Feely, Mehdi Manshadi, Robert Frederking, Lori Levin

While many high-quality tools are available for analyzing major languages such as English, equivalent freely-available tools for important but lower-resourced languages such as Farsi are more difficult to acquire and integrate into a useful NLP front end.

Dependency Parsing

Cannot find the paper you are looking for? You can Submit a new open access paper.