Search Results for author: Hidetaka Kamigaito

Found 56 papers, 19 papers with code

Top-Down RST Parsing Utilizing Granularity Levels in Documents

1 code implementation • 3 Apr 2020 • Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata

To obtain better discourse dependency trees, we need to improve the accuracy of RST trees at the upper parts of the structures.

Ranked #3 on Discourse Parsing on RST-DT

Discourse Parsing Relation

Syntactically Look-Ahead Attention Network for Sentence Compression

1 code implementation • 4 Feb 2020 • Hidetaka Kamigaito, Manabu Okumura

Sentence compression is the task of compressing a long sentence into a short one by deleting redundant words.

Ranked #1 on Sentence Compression on Google Dataset

Informativeness Sentence +1

A Simple and Strong Baseline for End-to-End Neural RST-style Discourse Parsing

1 code implementation • 15 Oct 2022 • Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata

To promote and further develop RST-style discourse parsing models, we need a strong baseline that can be regarded as a reference for reporting reliable experimental results.

Ranked #1 on Discourse Parsing on Instructional-DT (Instr-DT)

Discourse Parsing

Towards Table-to-Text Generation with Numerical Reasoning

1 code implementation • ACL 2021 • Lya Hulliyyatus Suadaa, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura, Hiroya Takamura

In summary, our contributions are (1) a new dataset for numerical table-to-text generation using pairs of a table and a paragraph of a table description with richer inference from scientific papers, and (2) a table-to-text generation framework enriched with numerical reasoning.

Descriptive Table-to-Text Generation

Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning

1 code implementation • 21 Jun 2022 • Hidetaka Kamigaito, Katsuhiko Hayashi

To solve this problem, we theoretically analyzed NS loss to assist hyperparameter tuning and understand the better use of the NS loss in KGE learning.

Knowledge Graph Embedding

SODA: Story Oriented Dense Video Captioning Evaluation Framework

1 code implementation • ECCV 2020 • Soichiro Fujita, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata

This paper proposes a new evaluation framework, Story Oriented Dense video cAptioning evaluation framework (SODA), for measuring the performance of video story description systems.

Dense Video Captioning

Character-based Thai Word Segmentation with Multiple Attentions

1 code implementation • RANLP 2021 • Thodsaporn Chay-intr, Hidetaka Kamigaito, Manabu Okumura

These models estimate word boundaries from a character sequence.

Ranked #2 on Thai Word Segmentation on BEST-2010

Segmentation Thai Word Segmentation

Generating Weather Comments from Meteorological Simulations

1 code implementation • EACL 2021 • Soichiro Murakami, Sora Tanaka, Masatsugu Hangyo, Hidetaka Kamigaito, Kotaro Funakoshi, Hiroya Takamura, Manabu Okumura

The task of generating weather-forecast comments from meteorological simulations has the following requirements: (i) the changes in numerical values for various physical quantities need to be considered, (ii) the weather comments should be dependent on delivery time and area information, and (iii) the comments should provide useful information for users.

Informativeness

Unified Interpretation of Softmax Cross-Entropy and Negative Sampling: With Case Study for Knowledge Graph Embedding

1 code implementation • ACL 2021 • Hidetaka Kamigaito, Katsuhiko Hayashi

In knowledge graph embedding, the theoretical relationship between the softmax cross-entropy and negative sampling loss functions has not been investigated.

Ranked #14 on Link Prediction on FB15k-237

Knowledge Graph Embedding Link Prediction +1

Bidirectional Transformer Reranker for Grammatical Error Correction

1 code implementation • 22 May 2023 • Ying Zhang, Hidetaka Kamigaito, Manabu Okumura

Pre-trained seq2seq models have achieved state-of-the-art results in the grammatical error correction task.

Grammatical Error Correction Language Modelling +2

Table and Image Generation for Investigating Knowledge of Entities in Pre-trained Vision and Language Models

1 code implementation • 3 Jun 2023 • Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

This task consists of two parts: the first is to generate a table containing knowledge about an entity and its related image, and the second is to generate an image from an entity with a caption and a table containing related knowledge of the entity.

Image Generation

Model-based Subsampling for Knowledge Graph Completion

1 code implementation • 17 Sep 2023 • Xincan Feng, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

Subsampling is effective in Knowledge Graph Embedding (KGE) for reducing overfitting caused by the sparsity in Knowledge Graph (KG) datasets.

Knowledge Graph Completion Knowledge Graph Embedding

An Empirical Study of Building a Strong Baseline for Constituency Parsing

1 code implementation • ACL 2018 • Jun Suzuki, Sho Takase, Hidetaka Kamigaito, Makoto Morishita, Masaaki Nagata

This paper investigates the construction of a strong baseline based on general purpose sequence-to-sequence models for constituency parsing.

Ranked #16 on Constituency Parsing on Penn Treebank

Abstractive Text Summarization Constituency Parsing +3

Can we obtain significant success in RST discourse parsing by using Large Language Models?

1 code implementation • 8 Mar 2024 • Aru Maekawa, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura

Recently, decoder-only pre-trained large language models (LLMs), with several tens of billion parameters, have significantly impacted a wide range of natural language processing (NLP) tasks.

Discourse Parsing

LATTE: Lattice ATTentive Encoding for Character-based Word Segmentation

2 code implementations • Journal of Natural Language Processing 2023 • Thodsaporn Chay-intr, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura

Our model employs the lattice structure to handle segmentation alternatives and utilizes graph neural networks along with an attention mechanism to attentively extract multi-granularity representation from the lattice for complementing character representations.

Ranked #1 on Chinese Word Segmentation on CTB6 (using extra training data)

Chinese Word Segmentation Japanese Word Segmentation +2

A Simple and Effective Usage of Word Clusters for CBOW Model

1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Yukun Feng, Chenlong Hu, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

We propose a simple and effective method for incorporating word clusters into the Continuous Bag-of-Words (CBOW) model.

Word Embeddings

Cross-lingual Contextualized Phrase Retrieval

1 code implementation • 25 Mar 2024 • Huayang Li, Deng Cai, Zhi Qu, Qu Cui, Hidetaka Kamigaito, Lemao Liu, Taro Watanabe

In our work, we propose a new task formulation of dense retrieval, cross-lingual contextualized phrase retrieval, which aims to augment cross-lingual applications by addressing polysemy using context information.

Contrastive Learning Language Modelling +4

Automatic Pyramid Evaluation Exploiting EDU-based Extractive Reference Summaries

no code implementations • EMNLP 2018 • Tsutomu Hirao, Hidetaka Kamigaito, Masaaki Nagata

This paper tackles automation of the pyramid method, a reliable manual evaluation framework.

Semantic Textual Similarity

Higher-Order Syntactic Attention Network for Longer Sentence Compression

no code implementations • NAACL 2018 • Hidetaka Kamigaito, Katsuhiko Hayashi, Tsutomu Hirao, Masaaki Nagata

To solve this problem, we propose a higher-order syntactic attention network (HiSAN) that can handle higher-order dependency features as an attention distribution on LSTM hidden states.

Ranked #3 on Sentence Compression on Google Dataset

Informativeness Machine Translation +2

Unsupervised Word Alignment by Agreement Under ITG Constraint

no code implementations • EMNLP 2016 • Hidetaka Kamigaito, Akihiro Tamura, Hiroya Takamura, Manabu Okumura, Eiichiro Sumita

Machine Translation Word Alignment

Supervised Attention for Sequence-to-Sequence Constituency Parsing

no code implementations • IJCNLP 2017 • Hidetaka Kamigaito, Katsuhiko Hayashi, Tsutomu Hirao, Hiroya Takamura, Manabu Okumura, Masaaki Nagata

The sequence-to-sequence (Seq2Seq) model has been successfully applied to machine translation (MT).

Constituency Parsing Machine Translation +3

Hierarchical Back-off Modeling of Hiero Grammar based on Non-parametric Bayesian Model

no code implementations • EMNLP 2015 • Hidetaka Kamigaito, Taro Watanabe, Hiroya Takamura, Manabu Okumura, Eiichiro Sumita

Machine Translation

Unsupervised Word Alignment Using Frequency Constraint in Posterior Regularized EM

no code implementations • EMNLP 2014 • Hidetaka Kamigaito, Taro Watanabe, Hiroya Takamura, Manabu Okumura

Machine Translation Word Alignment

Context-aware Neural Machine Translation with Coreference Information

no code implementations • WS 2019 • Takumi Ohtani, Hidetaka Kamigaito, Masaaki Nagata, Manabu Okumura

We present neural machine translation models for translating a sentence in a text by using a graph-based encoder which can consider coreference relations provided within the text explicitly.

Machine Translation Sentence +1

A Simple and Effective Method for Injecting Word-Level Information into Character-Aware Neural Language Models

no code implementations • CONLL 2019 • Yukun Feng, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

Our injection method can also be used together with previous methods.

Language Modelling

Split or Merge: Which is Better for Unsupervised RST Parsing?

no code implementations • IJCNLP 2019 • Naoki Kobayashi, Tsutomu Hirao, Kengo Nakamura, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata

The first one builds the optimal tree in terms of a dissimilarity score function that is defined for splitting a text span into smaller ones.

Discourse-Aware Hierarchical Attention Network for Extractive Single-Document Summarization

no code implementations • RANLP 2019 • Tatsuya Ishigaki, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

To incorporate the information of a discourse tree structure into the neural network-based summarizers, we propose a discourse-aware neural extractive summarizer which can explicitly take into account the discourse dependency tree structure of the source document.

Document Summarization Sentence

Neural text normalization leveraging similarities of strings and sounds

no code implementations • COLING 2020 • Riku Kawamura, Tatsuya Aoki, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

We propose neural models that can normalize text by considering the similarities of word strings and sounds.

Pointing to Subwords for Generating Function Names in Source Code

no code implementations • COLING 2020 • Shogo Fujita, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

We tackle the task of automatically generating a function name from source code.

Metric-Type Identification for Multi-Level Header Numerical Tables in Scientific Papers

no code implementations • EACL 2021 • Lya Hulliyyatus Suadaa, Hidetaka Kamigaito, Manabu Okumura, Hiroya Takamura

Numerical tables are widely used to present experimental results in scientific papers.

Metric-Type Identification Vocal Bursts Type Prediction

Improving Neural RST Parsing Model with Silver Agreement Subtrees

no code implementations • NAACL 2021 • Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata

We then pre-train a neural RST parser with the obtained silver data and fine-tune it on the RST-DT.

Ranked #2 on Discourse Parsing on RST-DT (using extra training data)

Discourse Parsing Relation

An Empirical Study of Generating Texts for Search Engine Advertising

no code implementations • NAACL 2021 • Hidetaka Kamigaito, Peinan Zhang, Hiroya Takamura, Manabu Okumura

Although there are many studies on neural language generation (NLG), few trials are put into the real world, especially in the advertising domain.

Text Generation

One-class Text Classification with Multi-modal Deep Support Vector Data Description

no code implementations • EACL 2021 • Chenlong Hu, Yukun Feng, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

This work presents multi-modal deep SVDD (mSVDD) for one-class text classification.

Text Classification

A New Surprise Measure for Extracting Interesting Relationships between Persons

no code implementations • EACL 2021 • Hidetaka Kamigaito, Jingun Kwon, Young-In Song, Manabu Okumura

We therefore propose a method for extracting interesting relationships between persons from natural language texts by focusing on their surprisingness.

Hierarchical Trivia Fact Extraction from Wikipedia Articles

no code implementations • COLING 2020 • Jingun Kwon, Hidetaka Kamigaito, Young-In Song, Manabu Okumura

Recently, automatic trivia fact extraction has attracted much research interest.

Why does Negative Sampling not Work Well? Analysis of Convexity in Negative Sampling

no code implementations • 29 Sep 2021 • Hidetaka Kamigaito, Katsuhiko Hayashi

On the other hand, properties of the NS loss function that are considered important for learning, such as the relationship between the noise distribution and the number of negative samples, have not been investigated theoretically.

Computational Efficiency Knowledge Graph Embedding

Fusing Label Embedding into BERT: An Efficient Improvement for Text Classification

no code implementations • Findings (ACL) 2021 • Yijin Xiong, Yukun Feng, Hao Wu, Hidetaka Kamigaito, Manabu Okumura

Text Classification

A Language Model-based Generative Classifier for Sentence-level Discourse Parsing

no code implementations • EMNLP 2021 • Ying Zhang, Hidetaka Kamigaito, Manabu Okumura

Discourse segmentation and sentence-level discourse parsing play important roles for various NLP tasks to consider textual coherence.

Discourse Segmentation Language Modelling +2

Considering Nested Tree Structure in Sentence Extractive Summarization with Pre-trained Transformer

no code implementations • EMNLP 2021 • Jingun Kwon, Naoki Kobayashi, Hidetaka Kamigaito, Manabu Okumura

Sentence extractive summarization shortens a document by selecting sentences for a summary while preserving its important contents.

Ranked #4 on Extractive Text Summarization on CNN / Daily Mail

Extractive Text Summarization Sentence

Improving Character-Aware Neural Language Model by Warming up Character Encoder under Skip-gram Architecture

no code implementations • RANLP 2021 • Yukun Feng, Chenlong Hu, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

Character-aware neural language models can capture the relationship between words by exploiting character-level information and are particularly effective for languages with rich morphology.

Language Modelling

Making Your Tweets More Fancy: Emoji Insertion to Texts

no code implementations • RANLP 2021 • Jingun Kwon, Naoki Kobayashi, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

The results demonstrate that the position of emojis in texts is a good clue to boost the performance of emoji label prediction.

Position

Abstractive Document Summarization with Word Embedding Reconstruction

no code implementations • RANLP 2021 • Jingyi You, Chenlong Hu, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

Neural sequence-to-sequence (Seq2Seq) models and BERT have achieved substantial improvements in abstractive document summarization (ADS) without and with pre-training, respectively.

Document Summarization Word Embeddings

Generic Mechanism for Reducing Repetitions in Encoder-Decoder Models

no code implementations • RANLP 2021 • Ying Zhang, Hidetaka Kamigaito, Tatsuya Aoki, Hiroya Takamura, Manabu Okumura

Encoder-decoder models have been commonly used for many tasks such as machine translation and response generation.

Machine Translation Response Generation +2

Aspect-based Analysis of Advertising Appeals for Search Engine Advertising

no code implementations • NAACL (ACL) 2022 • Soichiro Murakami, Peinan Zhang, Sho Hoshino, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

Writing an ad text that attracts people and persuades them to click or act is essential for the success of search engine advertising.

Generating Repetitions with Appropriate Repeated Words

1 code implementation • NAACL 2022 • Toshiki Kawamoto, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura

A repetition is a response that repeats words in the previous speaker's utterance in a dialogue.

Language Modelling

Joint Learning-based Heterogeneous Graph Attention Network for Timeline Summarization

no code implementations • NAACL 2022 • Jingyi You, Dongyuan Li, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura

Previous studies on the timeline summarization (TLS) task ignored the information interaction between sentences and dates, and adopted pre-defined unlearnable representations for them.

Event Detection Graph Attention +1

Subsampling for Knowledge Graph Embedding Explained

no code implementations • 13 Sep 2022 • Hidetaka Kamigaito, Katsuhiko Hayashi

In this article, we explain the recent advance of subsampling methods in knowledge graph embedding (KGE) starting from the original one used in word2vec.

Knowledge Graph Embedding

Does Pre-trained Language Model Actually Infer Unseen Links in Knowledge Graph Completion?

no code implementations • 15 Nov 2023 • Yusuke Sakai, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

Knowledge Graph Completion (KGC) is a task that infers unseen relationships between entities in a KG.

Language Modelling Memorization

Generating Diverse Translation with Perturbed kNN-MT

no code implementations • 14 Feb 2024 • Yuto Nishida, Makoto Morishita, Hidetaka Kamigaito, Taro Watanabe

Generating multiple translation candidates would enable users to choose the one that satisfies their needs.

Machine Translation Translation

Evaluating Image Review Ability of Vision Language Models

no code implementations • 19 Feb 2024 • Shigeki Saito, Kazuki Hayashi, Yusuke Ide, Yusuke Sakai, Kazuma Onishi, Toma Suzuki, Seiji Gobara, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

Large-scale vision language models (LVLMs) are language models that are capable of processing images and text inputs by a single model.

Image Captioning

Centroid-Based Efficient Minimum Bayes Risk Decoding

no code implementations • 17 Feb 2024 • Hiroyuki Deguchi, Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe, Hideki Tanaka, Masao Utiyama

Minimum Bayes risk (MBR) decoding achieved state-of-the-art translation performance by using COMET, a neural metric that has a high correlation with human evaluation.

Translation

Do LLMs Implicitly Determine the Suitable Text Difficulty for Users?

1 code implementation • 22 Feb 2024 • Seiji Gobara, Hidetaka Kamigaito, Taro Watanabe

Experimental results on the Stack-Overflow dataset and the TSCC dataset, including multi-turn conversation show that LLMs can implicitly handle text difficulty between user input and its generated response.

Question Answering

Artwork Explanation in Large-scale Vision Language Models

no code implementations • 29 Feb 2024 • Kazuki Hayashi, Yusuke Sakai, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

To address this issue, we propose a new task: the artwork explanation generation task, along with its evaluation dataset and metric for quantitatively assessing the understanding and utilization of knowledge about artworks.

Explanation Generation Text Generation

Distilling Named Entity Recognition Models for Endangered Species from Large Language Models

no code implementations • 13 Mar 2024 • Jesse Atuhurra, Seiveright Cargill Dujohn, Hidetaka Kamigaito, Hiroyuki Shindo, Taro Watanabe

Natural language processing (NLP) practitioners are leveraging large language models (LLM) to create structured datasets from semi-structured and unstructured data sources such as patents, papers, and theses, without having domain-specific knowledge.

In-Context Learning Knowledge Distillation +5

Revealing Trends in Datasets from the 2022 ACL and EMNLP Conferences

no code implementations • 31 Mar 2024 • Jesse Atuhurra, Hidetaka Kamigaito

NLP systems are on par or, in some cases, better than humans at accomplishing specific tasks.

Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair

no code implementations • 18 Apr 2024 • Yusuke Sakai, Mana Makinae, Hidetaka Kamigaito, Taro Watanabe

In Simultaneous Machine Translation (SiMT) systems, training with a simultaneous interpretation (SI) corpus is an effective method for achieving high-quality yet low-latency systems.
