Search Results for author: Kumiko Tanaka-Ishii

Found 14 papers, 1 paper with code

Strahler Number of Natural Language Sentences in Comparison with Random Trees

no code implementations · 6 Jul 2023 · Kumiko Tanaka-Ishii, Akira Tanaka

The Strahler number was originally proposed to characterize the complexity of river bifurcation and has found various applications.

Sentence
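
The Strahler number has a simple recursive definition on trees. The sketch below is a minimal Python illustration, assuming a nested-list tree encoding that is not from the paper.

```python
# Minimal sketch: Strahler number of an ordered tree encoded as nested
# lists of children (a leaf is []). This encoding is an illustrative
# assumption, not the paper's data format.
def strahler(children):
    if not children:
        return 1  # a leaf has Strahler number 1
    nums = [strahler(c) for c in children]
    top = max(nums)
    # The number grows only when two or more children tie at the maximum.
    return top + 1 if nums.count(top) >= 2 else top

# A small balanced example: ((a b) (c (d e))) -> 3
tree = [[[], []], [[], [[], []]]]
print(strahler(tree))
```

Chain-like trees keep a small Strahler number however deep they grow, which is what makes the measure an index of branching complexity rather than of size.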

Extraction of Templates from Phrases Using Sequence Binary Decision Diagrams

no code implementations · 28 Jan 2020 · Daiki Hirano, Kumiko Tanaka-Ishii, Andrew Finch

The extraction of templates such as "regard X as Y" from a set of related phrases requires the identification of their internal structures.
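
The paper builds sequence binary decision diagrams for this; purely as intuition, here is a much simpler hedged sketch that abstracts the differing spans of two phrases into variable slots. The phrases and the `template` helper are hypothetical examples, not the paper's method.

```python
# Rough illustration only -- not the paper's sequence-BDD construction.
# Shared token runs of two phrases are kept; differing runs become slots.
from difflib import SequenceMatcher

def template(a, b):
    out, slot = [], 0
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
        else:
            out.append(f"X{slot}")  # abstract the differing span into a slot
            slot += 1
    return out

p1 = "regard him as a friend".split()
p2 = "regard the plan as hopeless".split()
print(" ".join(template(p1, p2)))  # -> regard X0 as X1
```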

Evaluating Computational Language Models with Scaling Properties of Natural Language

no code implementations · CL 2019 · Shuntaro Takahashi, Kumiko Tanaka-Ishii

Statistical mechanical analyses have revealed that natural language text is characterized by scaling properties, which quantify the global structure in the vocabulary population and the long memory of a text.

Text Generation

Word Familiarity and Frequency

no code implementations · 9 Jun 2018 · Kumiko Tanaka-Ishii, Hiroshi Terada

Word frequency is assumed to correlate with word familiarity, but the strength of this correlation has not been thoroughly investigated.
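
As a sketch of what such an investigation measures, one can correlate log corpus frequency against familiarity ratings. Every number in the snippet below is an invented placeholder, not the paper's data.

```python
# Sketch of the measurement: Pearson correlation between log corpus
# frequency and familiarity ratings. All values are invented placeholders.
from math import log
from statistics import correlation  # Python 3.10+

freq = {"the": 69971, "time": 1598, "house": 591, "zebra": 7}
familiarity = {"the": 6.8, "time": 6.5, "house": 6.4, "zebra": 5.9}  # hypothetical 1-7 scale

words = sorted(freq)
x = [log(freq[w]) for w in words]
y = [familiarity[w] for w in words]
print(f"Pearson r (log frequency vs. familiarity) = {correlation(x, y):.3f}")
```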

Assessing Language Models with Scaling Properties

no code implementations · 24 Apr 2018 · Shuntaro Takahashi, Kumiko Tanaka-Ishii

Five such tests are considered, with the first two accounting for the vocabulary population and the other three for the long memory of natural language.

Taylor's law for Human Linguistic Sequences

1 code implementation · ACL 2018 · Tatsuru Kobayashi, Kumiko Tanaka-Ishii

Taylor's law describes the fluctuation characteristics of a system in which the variance of an event within a time span grows as a power law of the mean.

Time Series · Time Series Analysis
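
Concretely, the law takes the form σ ∝ μ^α over per-window word counts, with α = 0.5 for an i.i.d. process and larger values indicating burstiness. Below is a minimal estimation sketch; the window size and toy text are arbitrary choices made for illustration.

```python
# Sketch of a Taylor-exponent estimate: count each word in fixed-size
# windows, then fit the slope of log(std) against log(mean) by least
# squares. Window size and toy text are arbitrary choices.
from math import log

def taylor_exponent(tokens, window=20):
    windows = [tokens[i:i + window] for i in range(0, len(tokens) - window + 1, window)]
    pts = []
    for w in set(tokens):
        counts = [win.count(w) for win in windows]
        mu = sum(counts) / len(counts)
        var = sum((c - mu) ** 2 for c in counts) / len(counts)
        if mu > 0 and var > 0:
            pts.append((log(mu), 0.5 * log(var)))  # log sigma = 0.5 * log var
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    return sum((x - mx) * (y - my) for x, y in pts) / sum((x - mx) ** 2 for x, _ in pts)

tokens = ("the cat sat on the mat and the dog lay on the rug " * 50).split()
print(f"estimated Taylor exponent: {taylor_exponent(tokens):.2f}")
```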

Long-Range Correlation Underlying Childhood Language and Generative Models

no code implementations · 11 Dec 2017 · Kumiko Tanaka-Ishii

Long-range correlation, a property of time series exhibiting long-term memory, is mainly studied in the statistical physics domain and has been reported to exist in natural language.

Time Series · Time Series Analysis
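
A common diagnostic, shown below in simplified form, is the autocorrelation function at increasing lags: long memory appears as a slow, power-law-like decay. The toy i.i.d. series here shows no correlation by construction, which is exactly the baseline natural language is reported to depart from.

```python
# Simplified diagnostic: the autocorrelation function of a numerical
# series at increasing lags. For an i.i.d. series (as below) it sits
# near zero; long memory would show a slow, power-law-like decay.
import random

def autocorrelation(series, lag):
    n = len(series) - lag
    mu = sum(series) / len(series)
    var = sum((x - mu) ** 2 for x in series) / len(series)
    cov = sum((series[i] - mu) * (series[i + lag] - mu) for i in range(n)) / n
    return cov / var

# Toy binary indicator series (a stand-in for, e.g., rare-word occurrences).
random.seed(0)
series = [1 if random.random() < 0.1 else 0 for _ in range(10000)]
for lag in (1, 10, 100):
    print(f"lag {lag:>3}: {autocorrelation(series, lag):+.4f}")
```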

Do Neural Nets Learn Statistical Laws behind Natural Language?

no code implementations · 16 Jul 2017 · Shuntaro Takahashi, Kumiko Tanaka-Ishii

Specifically, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf's law and Heaps' law, two representative statistical properties underlying natural language.

Language Modelling
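
For reference, Zipf's law states that the frequency of the rank-r word scales as r^(−α) (α ≈ 1 in natural text), and Heaps' law that vocabulary size grows as n^β with β < 1; both are straight lines in log-log coordinates. Below is a rough measurement sketch, with corpus.txt as a placeholder path for any plain-text corpus.

```python
# Both laws are straight lines in log-log space; the slopes are the
# exponents. "corpus.txt" is a placeholder for any plain-text corpus.
from collections import Counter
from math import log

def slope(pts):
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    return sum((x - mx) * (y - my) for x, y in pts) / sum((x - mx) ** 2 for x, _ in pts)

tokens = open("corpus.txt", encoding="utf-8").read().split()

# Zipf: log frequency vs. log rank.
freqs = sorted(Counter(tokens).values(), reverse=True)
zipf = [(log(r), log(f)) for r, f in enumerate(freqs, start=1)]

# Heaps: log vocabulary size vs. log number of tokens read so far.
seen, heaps = set(), []
for n, w in enumerate(tokens, start=1):
    seen.add(w)
    if n % 1000 == 0:
        heaps.append((log(n), log(len(seen))))

print(f"Zipf exponent  ~ {-slope(zipf):.2f}")   # ~1 for natural text
print(f"Heaps exponent ~ {slope(heaps):.2f}")   # < 1 for natural text
```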

Upper Bound of Entropy Rate Revisited: A New Extrapolation of Compressed Large-Scale Corpora

no code implementations · WS 2016 · Ryosuke Takahira, Kumiko Tanaka-Ishii, Łukasz Dębowski

The article presents entropy rate estimates for six human languages, obtained using large, state-of-the-art corpora of up to 7.8 gigabytes.
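
The approach can be sketched compactly: compress ever larger prefixes of a text, record the code length per character, and extrapolate the decreasing estimates to infinite length. The snippet below is only a simplified illustration; bz2 and corpus.txt are stand-ins for the paper's stronger compressors and corpora, and the paper fits a specific extrapolation function to such points.

```python
# Simplified version of the measurement: compress growing prefixes and
# record the code length per character. bz2 and "corpus.txt" are
# illustrative stand-ins only.
import bz2

def rate_bits_per_char(text):
    return 8 * len(bz2.compress(text.encode("utf-8"))) / len(text)

text = open("corpus.txt", encoding="utf-8").read()
for n in (10_000, 100_000, 1_000_000):
    if n <= len(text):
        print(f"n = {n:>9,}: {rate_bits_per_char(text[:n]):.3f} bits/char")
# The estimates fall as n grows; the entropy rate is their extrapolated limit.
```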
