Search Results for author: Naoki Otani

Found 13 papers, 4 papers with code

Pre-tokenization of Multi-word Expressions in Cross-lingual Word Embeddings

no code implementations • EMNLP 2020 • Naoki Otani, Satoru Ozaki, Xingyuan Zhao, Yucen Li, Micaelah St Johns, Lori Levin

We propose a simple method for word translation of MWEs to and from English in ten languages: we first compile lists of MWEs in each language and then tokenize the MWEs as single tokens before training word embeddings.

Cross-Lingual Word Embeddings, Translation +2

LITE: Intent-based Task Representation Learning Using Weak Supervision

1 code implementation • NAACL 2022 • Naoki Otani, Michael Gamon, Sujay Kumar Jauhar, Mei Yang, Sri Raghu Malireddi, Oriana Riva

Users write to-dos as personal notes to themselves, about things they need to complete, remember or organize.

Management, Multi-Task Learning +1

Contextualized Word Vector-based Methods for Discovering Semantic Differences with No Training nor Word Alignment

no code implementations • 19 May 2023 • Ryo Nagata, Hiroya Takamura, Naoki Otani, Yoshifumi Kawasaki

In this paper, we propose methods for discovering semantic differences in words appearing in two corpora based on the norms of contextualized word vectors.

Word Alignment

Construction Grammar Provides Unique Insight into Neural Language Models

no code implementations • 4 Feb 2023 • Leonie Weissweiler, Taiqi He, Naoki Otani, David R. Mortensen, Lori Levin, Hinrich Schütze

Construction Grammar (CxG) has recently been used as the basis for probing studies that have investigated the performance of large pretrained language models (PLMs) with respect to the structure and meaning of constructions.

Position

Neural Polysynthetic Language Modelling

no code implementations • 11 May 2020 • Lane Schwartz, Francis Tyers, Lori Levin, Christo Kirov, Patrick Littell, Chi-kiu Lo, Emily Prud'hommeaux, Hyunji Hayley Park, Kenneth Steimel, Rebecca Knowles, Jeffrey Micher, Lonny Strunk, Han Liu, Coleman Haley, Katherine J. Zhang, Robbie Jimmerson, Vasilisa Andriyanets, Aldrian Obaja Muis, Naoki Otani, Jong Hyuk Park, Zhisong Zhang

In the literature, languages like Finnish or Turkish are held up as extreme examples of complexity that challenge common modelling assumptions.

Language Modelling, Lemmatization +1

What A Sunny Day ☔: Toward Emoji-Sensitive Irony Detection

no code implementations • WS 2019 • Shirley Anugrah Hayati, Aditi Chaudhary, Naoki Otani, Alan W. Black

Irony detection is an important task with applications in identification of online abuse and harassment.

Toward Comprehensive Understanding of a Sentiment Based on Human Motives

1 code implementation • ACL 2019 • Naoki Otani, Eduard Hovy

In sentiment detection, the natural language processing community has focused on determining holders, facets, and valences, but has paid little attention to the reasons for sentiment decisions.

Transfer Learning

The ARIEL-CMU Systems for LoReHLT18

no code implementations • 24 Feb 2019 • Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard Hovy, Alan W. Black, Jaime Carbonell, Graham V. Horwood, Shabnam Tafreshi, Mona Diab, Efsun S. Kayi, Noura Farra, Kathleen McKeown

This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for the tasks Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech (SF Text and Speech).

Machine Translation, Translation

Unsupervised Cross-lingual Transfer of Word Embedding Spaces

1 code implementation • EMNLP 2018 • Ruochen Xu, Yiming Yang, Naoki Otani, Yuexin Wu

Supervised methods for this problem rely on the availability of cross-lingual supervision, either using parallel corpora or bilingual lexicons as the labeled data for training, which may not be available for many low resource languages.

Bilingual Lexicon Induction, Cross-Lingual Transfer +4