1 code implementation • 17 Oct 2024 • Catarina G. Belem, Pouya Pezeshkpour, Hayate Iso, Seiji Maekawa, Nikita Bhutani, Estevam Hruschka
When evaluating 5 LLMs on our benchmarks, we observe that on average, up to 75% of the content in LLM-generated summaries is hallucinated, with hallucinations more likely to occur towards the end of the summaries.
no code implementations • 15 Oct 2024 • Seiji Maekawa, Hayate Iso, Nikita Bhutani
While retrieval-augmented generation (RAG) models excel in accessing information from large document collections, they struggle with complex tasks that require aggregation and reasoning over information spanning across multiple documents--what we call holistic reasoning.
no code implementations • 2 Jun 2024 • Eser Kandogan, Sajjadur Rahman, Nikita Bhutani, Dan Zhang, Rafael Li Chen, Kushan Mitra, Sairam Gurajada, Pouya Pezeshkpour, Hayate Iso, Yanlin Feng, Hannah Kim, Chen Shen, Jin Wang, Estevam Hruschka
Large Language Models (LLMs) have showcased remarkable capabilities surpassing conventional NLP challenges, creating opportunities for production use cases.
1 code implementation • 27 Feb 2024 • Ayana Niwa, Hayate Iso
We introduce AmbigNLG, a novel task designed to tackle the challenge of task ambiguity in instructions for Natural Language Generation (NLG).
1 code implementation • 21 Feb 2024 • Seiji Maekawa, Hayate Iso, Sairam Gurajada, Nikita Bhutani
We demonstrate the efficacy of our finer-grained metric and insights through an adaptive retrieval system that selectively employs retrieval and recall based on the frequencies of entities and relations in the question.
1 code implementation • 10 Nov 2023 • Pouya Pezeshkpour, Hayate Iso, Thom Lake, Nikita Bhutani, Estevam Hruschka
We meticulously craft this benchmark to cater to a wide array of HR tasks, including matching and explaining resumes to job descriptions, extracting skills and experiences from resumes, and editing resumes.
1 code implementation • 20 Sep 2023 • Haopeng Zhang, Hayate Iso, Sairam Gurajada, Nikita Bhutani
Text editing is a crucial task of modifying text to better align with user intents.
1 code implementation • 14 Sep 2023 • Yunshu Wu, Hayate Iso, Pouya Pezeshkpour, Nikita Bhutani, Estevam Hruschka
Large Language Models (LLMs) have shown promising performance in summary evaluation tasks, yet they face challenges such as high computational costs and the Lost-in-the-Middle problem where important information in the middle of long documents is often overlooked.
1 code implementation • 21 Dec 2022 • Bosung Kim, Hayate Iso, Nikita Bhutani, Estevam Hruschka, Ndapa Nakashole, Tom Mitchell
We propose a novel framework, ZETT (ZEro-shot Triplet extraction by Template infilling), that aligns the task objective to the pre-training objective of generative transformers to generalize to unseen relations.
Ranked #1 on Zero-shot Relation Triplet Extraction on FewRel
1 code implementation • 16 Nov 2022 • Hayate Iso, Xiaolan Wang, Yoshi Suhara
To tackle the difficulty in collecting customer and professional review pairs, we develop a non-parallel training framework, Noisy Pairing and Partial Supervision (NAPA), which trains a stylized opinion summarization system from non-parallel customer and professional review sets.
1 code implementation • 15 Nov 2022 • Hayate Iso
Lexically constrained text generation is one of the constrained text generation tasks, which aims to generate text that covers all the given constraint lexicons.
1 code implementation • Findings (ACL) 2022 • Hayate Iso, Xiaolan Wang, Stefanos Angelidis, Yoshihiko Suhara
Opinion summarization focuses on generating summaries that reflect popular subjective information expressed in multiple online reviews.
1 code implementation • 14 Jun 2021 • Shogo Ujiie, Hayate Iso, Eiji Aramaki
We introduce BioCoM, a contrastive learning framework for biomedical entity linking that uses only two resources: a small-sized dictionary and a large number of raw biomedical articles.
no code implementations • NAACL (BioNLP) 2021 • Shogo Ujiie, Hayate Iso, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki
Disease name recognition and normalization, which is generally called biomedical entity linking, is a fundamental process in biomedical text mining.
1 code implementation • Findings (EMNLP) 2021 • Hayate Iso, Xiaolan Wang, Yoshihiko Suhara, Stefanos Angelidis, Wang-Chiew Tan
We found that text autoencoders tend to generate overly generic summaries from simply averaged latent vectors due to an unexpected $L_2$-norm shrinkage in the aggregated latent vectors, which we refer to as summary vector degeneration.
Ranked #1 on Unsupervised Opinion Summarization on Amazon
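The norm shrinkage mentioned in the abstract above can be illustrated with a minimal sketch. It is not the paper's code: the latent vectors here are hypothetical i.i.d. Gaussian samples standing in for encoded reviews, and the helper names (`l2_norm`, `mean_vector`, `norm_shrinkage`) are made up for illustration. By the triangle inequality, the norm of an average can never exceed the average of the norms, and for vectors pointing in different directions it is typically much smaller.

```python
import math
import random


def l2_norm(vec):
    """Euclidean (L2) norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def mean_vector(vectors):
    """Element-wise average of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def norm_shrinkage(num_vectors=8, dim=64, seed=0):
    """Return (average input norm, norm of the averaged vector)."""
    rng = random.Random(seed)
    # Assumption: i.i.d. Gaussian samples stand in for review latent vectors.
    vectors = [
        [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(num_vectors)
    ]
    avg_input_norm = sum(l2_norm(v) for v in vectors) / num_vectors
    return avg_input_norm, l2_norm(mean_vector(vectors))


avg_input_norm, averaged_norm = norm_shrinkage()
print(f"mean of input norms:     {avg_input_norm:.3f}")
print(f"norm of averaged vector: {averaged_norm:.3f}")
```

In this toy setting the averaged vector has a noticeably smaller norm than a typical input, which is the kind of effect the abstract calls summary vector degeneration; how strongly it appears in a real autoencoder depends on the learned latent space.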
1 code implementation • ACL 2020 • Hayate Iso, Chao Qiao, Hang Li
We propose a novel text editing task, referred to as fact-based text editing, in which the goal is to revise a given document to better describe the facts in a knowledge base (e.g., several triples).
Ranked #1 on Fact-based Text Editing on WebEdit
2 code implementations • ACL 2019 • Hayate Iso, Yui Uehara, Tatsuya Ishigaki, Hiroshi Noji, Eiji Aramaki, Ichiro Kobayashi, Yusuke Miyao, Naoaki Okazaki, Hiroya Takamura
We propose a data-to-text generation model with two modules, one for tracking and the other for text generation.
no code implementations • WS 2017 • Ryo Takeuchi, Hayate Iso, Kaoru Ito, Shoko Wakamiya, Eiji Aramaki
Based on these results, we can infer that social sensors can reliably detect unseasonal and local disease events under certain conditions, just as they can for seasonal or global events.
no code implementations • 8 May 2017 • Hayate Iso, Shoko Wakamiya, Eiji Aramaki
Nowadays, geographic information related to Twitter is crucially important for fine-grained applications.
no code implementations • COLING 2016 • Hayate Iso, Shoko Wakamiya, Eiji Aramaki
Because of the increasing popularity of social media, much information has been shared on the internet, enabling social media users to understand various real world events.