no code implementations • EMNLP (NLP4ConvAI) 2021 • Jin Qu, Kazuma Hashimoto, Wenhao Liu, Caiming Xiong, Yingbo Zhou
Compared with DNNC, our proposed method is more efficient in both training and serving, since it is based on entailment between the query utterance and the labels rather than between the query and all the training examples.
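A minimal sketch of this label-entailment idea (not the paper's exact implementation): score each intent by treating the query utterance as the premise and a natural-language label description as the hypothesis under an NLI scorer, then pick the highest-scoring label. The `entailment_score` callable below is a hypothetical stand-in for any off-the-shelf NLI model.

```python
from typing import Callable, Dict

def classify_intent(utterance: str,
                    label_descriptions: Dict[str, str],
                    entailment_score: Callable[[str, str], float]) -> str:
    """Pick the intent whose description is most entailed by the utterance.

    Unlike example-based matching, this scores only |labels| premise-hypothesis
    pairs per query, not one pair per training example.
    """
    scores = {
        label: entailment_score(utterance, description)
        for label, description in label_descriptions.items()
    }
    return max(scores, key=scores.get)

if __name__ == "__main__":
    # Toy usage with a dummy word-overlap scorer (replace with a real NLI model).
    dummy = lambda premise, hypothesis: float(len(set(premise.split()) & set(hypothesis.split())))
    labels = {"book_flight": "the user wants to book a flight",
              "check_balance": "the user wants to check an account balance"}
    print(classify_intent("please book me a flight to tokyo", labels, dummy))
```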
no code implementations • ACL 2022 • Govardana Sachithanandam Ramachandran, Kazuma Hashimoto, Caiming Xiong
Furthermore, we demonstrate sample efficiency: our method, trained on only 20% of the data, is comparable to the current state-of-the-art method trained on 100% of the data on two out of three evaluation metrics.
no code implementations • NLP4ConvAI (ACL) 2022 • JianGuo Zhang, Kazuma Hashimoto, Yao Wan, Zhiwei Liu, Ye Liu, Caiming Xiong, Philip Yu
Pre-trained Transformer-based models were reported to be robust in intent classification.
no code implementations • EMNLP 2020 • Semih Yavuz, Kazuma Hashimoto, Wenhao Liu, Nitish Shirish Keskar, Richard Socher, Caiming Xiong
The concept of Dialogue Act (DA) is universal across different task-oriented dialogue domains - the act of "request" carries the same speaker intention whether it is for restaurant reservation or flight booking.
no code implementations • 16 Nov 2023 • Kazuma Hashimoto, Karthik Raman, Michael Bendersky
Unlike the previous work, we introduce a novel labeling method, incremental utility, which estimates how much incremental knowledge is brought into the LLMs by a demonstration.
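One way to read "incremental utility" concretely (a hedged sketch, not the paper's exact labeling formula): measure how much a candidate demonstration improves the LLM's score of the gold output over the zero-shot prompt. `llm_logprob` is a hypothetical callable returning the model's log-probability of an output given a prompt.

```python
from typing import Callable, List, Tuple

def incremental_utility(query: str,
                        gold_output: str,
                        demonstration: str,
                        llm_logprob: Callable[[str, str], float]) -> float:
    """Utility gain from prepending one demonstration to the prompt."""
    zero_shot = llm_logprob(query, gold_output)
    one_shot = llm_logprob(demonstration + "\n" + query, gold_output)
    return one_shot - zero_shot

def rank_demonstrations(query: str,
                        gold_output: str,
                        candidates: List[str],
                        llm_logprob: Callable[[str, str], float]) -> List[Tuple[str, float]]:
    """Label candidate demonstrations by incremental utility, highest first."""
    scored = [(d, incremental_utility(query, gold_output, d, llm_logprob)) for d in candidates]
    return sorted(scored, key=lambda x: x[1], reverse=True)
```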
no code implementations • 14 Sep 2023 • Lingyu Gao, Aditi Chaudhary, Krishna Srinivasan, Kazuma Hashimoto, Karthik Raman, Michael Bendersky
In-context learning (ICL), i.e., showing LLMs only a few task-specific demonstrations, has led to downstream gains with no task-specific fine-tuning required.
no code implementations • 19 May 2023 • Aditi Chaudhary, Karthik Raman, Krishna Srinivasan, Kazuma Hashimoto, Mike Bendersky, Marc Najork
While our experiments demonstrate that these modifications help improve the performance of QGen techniques, we also find that QGen approaches struggle to capture the full nuance of the relevance label space and, as a result, the generated queries are not faithful to the desired relevance label.
no code implementations • 21 Dec 2022 • Kazuma Hashimoto, Iftekhar Naim, Karthik Raman
Sequence labeling is a core task in text understanding for IE/IR systems.
no code implementations • 29 Sep 2022 • Kazuma Hashimoto, Karthik Raman
GROOT works by training a generative sequential labeling model to match the decoder output distribution with that of the (black-box) reward function.
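One way to realize "matching the decoder distribution to the reward" (a hedged sketch, not GROOT's exact objective): sample candidate label sequences, turn their black-box rewards into a target distribution, and minimize the cross-entropy between the model's probabilities over those candidates and that target.

```python
import math
from typing import Callable, List

def distribution_matching_loss(candidates: List[str],
                               model_logprob: Callable[[str], float],
                               reward: Callable[[str], float],
                               temperature: float = 1.0) -> float:
    """Cross-entropy between a reward-induced target distribution and the model."""
    # Target distribution: softmax of (black-box) rewards over the sampled candidates.
    scaled = [reward(c) / temperature for c in candidates]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    target = [e / z for e in exps]
    # Model distribution restricted to (and renormalized over) the same candidates.
    logps = [model_logprob(c) for c in candidates]
    lm = max(logps)
    mexps = [math.exp(lp - lm) for lp in logps]
    mz = sum(mexps)
    model = [e / mz for e in mexps]
    # Minimizing H(target, model) w.r.t. model parameters drives the matching.
    return -sum(t * math.log(p + 1e-12) for t, p in zip(target, model))
```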
no code implementations • Findings (NAACL) 2022 • Haopeng Zhang, Semih Yavuz, Wojciech Kryscinski, Kazuma Hashimoto, Yingbo Zhou
Abstractive summarization systems leveraging pre-training language models have achieved superior results on benchmark datasets.
no code implementations • ACL 2022 • Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou, Nitish Shirish Keskar, Caiming Xiong
Fusion-in-Decoder (FiD) (Izacard and Grave, 2020) is a generative question answering (QA) model that leverages passage retrieval with a pre-trained transformer and pushed the state of the art on single-hop QA.
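A schematic of the FiD computation pattern, sketched under simplifying assumptions with hypothetical `encode` and `decode` callables standing in for a pre-trained encoder-decoder: each retrieved passage is encoded independently together with the question, and the decoder attends over the concatenated encoder states to generate the answer.

```python
from typing import Callable, List

Vector = List[float]

def fusion_in_decoder(question: str,
                      passages: List[str],
                      encode: Callable[[str], List[Vector]],
                      decode: Callable[[List[Vector]], str]) -> str:
    """Encode each (question, passage) pair separately, fuse them in the decoder."""
    fused: List[Vector] = []
    for passage in passages:
        # Independent encoding keeps encoder cost linear in the number of passages.
        fused.extend(encode(f"question: {question} context: {passage}"))
    # The decoder attends over the concatenation of all encoded states at once,
    # which is where evidence from different passages gets fused.
    return decode(fused)
```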
no code implementations • Findings (ACL) 2022 • Tong Niu, Kazuma Hashimoto, Yingbo Zhou, Caiming Xiong
When finetuned on a single rich-resource language pair, be it English-centered or not, our model is able to match the performance of the ones finetuned on all language pairs under the same data budget with less than a 2.0-point decrease in accuracy.
1 code implementation • 23 Mar 2022 • Tian Xie, Xinyi Yang, Angela S. Lin, Feihong Wu, Kazuma Hashimoto, Jin Qu, Young Mo Kang, Wenpeng Yin, Huan Wang, Semih Yavuz, Gang Wu, Michael Jones, Richard Socher, Yingbo Zhou, Wenhao Liu, Caiming Xiong
At the core of the struggle is the need to script every single turn of interactions between the bot and the human user.
no code implementations • 16 Mar 2022 • Karthik Raman, Iftekhar Naim, Jiecao Chen, Kazuma Hashimoto, Kiran Yalasangi, Krishna Srinivasan
Pretrained, large, generative language models (LMs) have had great success in a wide range of sequence tagging and structured prediction tasks.
no code implementations • SpaNLP (ACL) 2022 • Man Luo, Kazuma Hashimoto, Semih Yavuz, Zhiwei Liu, Chitta Baral, Yingbo Zhou
Among several interesting findings, it is important to highlight that (1) the generative readers perform better in long context QA, (2) the extractive readers perform better in short context while also showing better out-of-domain generalization, and (3) the encoder of encoder-decoder PrLMs (e.g., T5) turns out to be a strong extractive reader and outperforms the standard choice of encoder-only PrLMs (e.g., RoBERTa).
1 code implementation • Findings (EMNLP) 2021 • Ye Liu, Kazuma Hashimoto, Yingbo Zhou, Semih Yavuz, Caiming Xiong, Philip S. Yu
In this work, we propose Dense Hierarchical Retrieval (DHR), a hierarchical framework that can generate accurate dense representations of passages by utilizing both macroscopic semantics in the document and microscopic semantics specific to each passage.
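A minimal numpy sketch of the two-level retrieval flow (document-level, then passage-level); the embedding matrices and query vectors here are stand-ins for the dense encoders a framework like DHR would learn.

```python
import numpy as np

def hierarchical_retrieve(query_doc_vec: np.ndarray,
                          query_psg_vec: np.ndarray,
                          doc_matrix: np.ndarray,   # (n_docs, d) document embeddings
                          psg_matrix: np.ndarray,   # (n_psgs, d) passage embeddings
                          psg_to_doc: np.ndarray,   # (n_psgs,) parent document id per passage
                          top_docs: int = 10,
                          top_psgs: int = 5) -> np.ndarray:
    """Retrieve passages only from the top-scoring documents."""
    # Stage 1: coarse, document-level (macroscopic) retrieval.
    doc_ids = np.argsort(doc_matrix @ query_doc_vec)[::-1][:top_docs]
    # Stage 2: fine, passage-level (microscopic) retrieval restricted to those documents.
    candidate_ids = np.where(np.isin(psg_to_doc, doc_ids))[0]
    scores = psg_matrix[candidate_ids] @ query_psg_vec
    return candidate_ids[np.argsort(scores)[::-1][:top_psgs]]
```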
2 code implementations • SpaNLP (ACL) 2022 • Mingfei Gao, Zeyuan Chen, Nikhil Naik, Kazuma Hashimoto, Caiming Xiong, Ran Xu
We propose a novel framework to conduct field extraction from forms with unlabeled data.
1 code implementation • ACL 2022 • Xi Ye, Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou, Caiming Xiong
We present RnG-KBQA, a Rank-and-Generate approach for KBQA, which remedies the coverage issue with a generation model while preserving a strong generalization capability.
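The rank-and-generate pipeline, sketched with hypothetical components: a ranker scores enumerated candidate logical forms, and a generator then composes the final logical form conditioned on the question plus the top-ranked candidates, which is how coverage gaps in the candidate set can be remedied.

```python
from typing import Callable, List

def rng_kbqa(question: str,
             candidate_logical_forms: List[str],
             rank: Callable[[str, str], float],
             generate: Callable[[str], str],
             top_k: int = 5) -> str:
    """Rank enumerated candidates, then generate the final logical form."""
    ranked = sorted(candidate_logical_forms,
                    key=lambda lf: rank(question, lf),
                    reverse=True)[:top_k]
    # The generator can copy from, edit, or go beyond the ranked candidates.
    prompt = question + " ; candidates: " + " | ".join(ranked)
    return generate(prompt)
```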
1 code implementation • 8 Jun 2021 • JianGuo Zhang, Kazuma Hashimoto, Yao Wan, Zhiwei Liu, Ye Liu, Caiming Xiong, Philip S. Yu
Pre-trained Transformer-based models were reported to be robust in intent classification.
1 code implementation • NAACL 2021 • Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, Alan W Black, Ruslan Salakhutdinov
Document grounded generation is the task of using the information provided in a document to improve text generation.
1 code implementation • 10 Mar 2021 • Govardana Sachithanandam Ramachandran, Kazuma Hashimoto, Caiming Xiong
This method gives guarantees on the dialogue policy's performance and also learns to shape rewards according to the intentions behind human responses, rather than just mimicking demonstration data; this, coupled with batch RL, helps improve the overall sample efficiency of the framework.
no code implementations • 28 Dec 2020 • Keisuke Shirai, Kazuma Hashimoto, Akiko Eriguchi, Takashi Ninomiya, Shinsuke Mori
In this paper, we propose to suppress an arbitrary type of error by training the text generation model in a reinforcement learning framework, where we use a trainable reward function that is capable of discriminating between references and sentences containing the targeted type of error.
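A compact sketch of the training signal described here (hedged; the sampler, reward model, and exact RL estimator are placeholders): a trainable reward model is fit to tell references apart from sentences containing the targeted error type, and its score on a sampled generation is used as the reward in a REINFORCE-style update.

```python
from typing import Callable

def reinforce_objective(prompt: str,
                        sample: Callable[[str], str],              # draw a generation from the model
                        seq_logprob: Callable[[str, str], float],  # log p_model(generation | prompt)
                        reward_model: Callable[[str], float],      # high for reference-like, low for error-laden text
                        baseline: float = 0.0) -> float:
    """Return the REINFORCE pseudo-loss for one sampled generation.

    Minimizing this quantity with respect to the model parameters pushes
    probability mass toward generations the learned reward model prefers,
    i.e. away from the targeted error type.
    """
    generation = sample(prompt)
    advantage = reward_model(generation) - baseline
    return -advantage * seq_logprob(prompt, generation)
```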
1 code implementation • EMNLP 2020 • Jian-Guo Zhang, Kazuma Hashimoto, Wenhao Liu, Chien-Sheng Wu, Yao Wan, Philip S. Yu, Richard Socher, Caiming Xiong
Intent detection is one of the core components of goal-oriented dialog systems, and detecting out-of-scope (OOS) intents is also a practically important skill.
2 code implementations • ICLR 2021 • Shiyang Li, Semih Yavuz, Kazuma Hashimoto, Jia Li, Tong Niu, Nazneen Rajani, Xifeng Yan, Yingbo Zhou, Caiming Xiong
Dialogue state trackers have made significant progress on benchmark datasets, but their generalization capability to novel and realistic scenarios beyond the held-out conversations is less understood.
Ranked #2 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.1 (using extra training data)
1 code implementation • WS 2019 • Kazuma Hashimoto, Raffaella Buschiazzo, James Bradbury, Teresa Marshall, Richard Socher, Caiming Xiong
We build and evaluate translation models for seven target languages from English, with several different copy mechanisms and an XML-constrained beam search.
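A hedged illustration of the XML constraint alone (not the full beam search or the copy mechanisms): closing tags are only allowed when they match the innermost open tag, and a hypothesis may only finish once no tags remain open, which can be enforced by inspecting the tag stack of each partial hypothesis.

```python
import re
from typing import List

TAG_RE = re.compile(r"<(/?)([A-Za-z][\w-]*)\s*/?>")

def open_tag_stack(partial_translation: str) -> List[str]:
    """Return the stack of XML tags still open in a partial hypothesis."""
    stack: List[str] = []
    for match in TAG_RE.finditer(partial_translation):
        if match.group(0).endswith("/>"):
            continue                      # self-closing tag, nothing to track
        closing, name = match.group(1), match.group(2)
        if closing:
            if stack and stack[-1] == name:
                stack.pop()               # well-formed close
        else:
            stack.append(name)
    return stack

def closing_tag_allowed(partial_translation: str, tag: str) -> bool:
    """During beam search, only permit </tag> if it closes the innermost open tag."""
    stack = open_tag_stack(partial_translation)
    return bool(stack) and stack[-1] == tag

def may_finish(partial_translation: str) -> bool:
    """A hypothesis may emit EOS only when no tags remain open."""
    return not open_tag_stack(partial_translation)
```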
no code implementations • 17 Jun 2020 • Andre Esteva, Anuprit Kale, Romain Paulus, Kazuma Hashimoto, Wenpeng Yin, Dragomir Radev, Richard Socher
The COVID-19 global pandemic has resulted in international efforts to understand, track, and mitigate the disease, yielding a significant corpus of COVID-19 and SARS-CoV-2-related publications across scientific disciplines.
no code implementations • 27 Feb 2020 • Lichao Sun, Kazuma Hashimoto, Wenpeng Yin, Akari Asai, Jia Li, Philip Yu, Caiming Xiong
An increasing body of literature claims that deep neural networks are brittle when dealing with maliciously created adversarial examples.
2 code implementations • ICLR 2020 • Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, Caiming Xiong
Answering questions that require multi-hop reasoning at web-scale necessitates retrieving multiple evidence documents, one of which often has little lexical or semantic relationship to the question.
Ranked #26 on Question Answering on HotpotQA
1 code implementation • Joint Conference on Lexical and Computational Semantics 2020 • Jian-Guo Zhang, Kazuma Hashimoto, Chien-Sheng Wu, Yao Wan, Philip S. Yu, Richard Socher, Caiming Xiong
Dialog state tracking (DST) is a core component in task-oriented dialog systems.
Ranked #4 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.0
no code implementations • CL 2019 • Akiko Eriguchi, Kazuma Hashimoto, Yoshimasa Tsuruoka
In those NMT models, sentences are simply treated as sequences of words without any internal structure.
1 code implementation • 10 Sep 2018 • Akari Asai, Akiko Eriguchi, Kazuma Hashimoto, Yoshimasa Tsuruoka
Given a target language without RC training data and a pivot language with RC training data (e.g., English), our method leverages existing RC resources in the pivot language by combining a competitive RC model in the pivot language with an attentive Neural Machine Translation (NMT) model.
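A hedged sketch of the pivot-language pipeline (the translation, RC, and alignment components are hypothetical callables, and the span back-projection is a simplification): translate the passage and question into the pivot language, answer with the pivot-language RC model, then project the answer span back via the NMT model's attention weights.

```python
from typing import Callable, List, Tuple

def pivot_rc(question_tgt: str,
             passage_tgt: str,
             translate: Callable[[str], Tuple[str, List[List[float]]]],  # (pivot text, attention matrix)
             rc_answer_span: Callable[[str, str], Tuple[int, int]]        # answer span over pivot tokens
             ) -> str:
    """Answer a target-language question using a pivot-language RC model."""
    # Translate question and passage into the pivot language (e.g., English).
    question_pvt, _ = translate(question_tgt)
    passage_pvt, attention = translate(passage_tgt)   # attention[i][j]: pivot token i -> target token j
    # Run the pivot-language RC model to get an answer span over pivot tokens.
    start, end = rc_answer_span(question_pvt, passage_pvt)
    # Project the span back to target-language tokens via the attention weights.
    tgt_tokens = passage_tgt.split()
    aligned = sorted({max(range(len(row)), key=row.__getitem__)
                      for row in attention[start:end + 1]})
    return " ".join(tgt_tokens[j] for j in aligned if j < len(tgt_tokens))
```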
1 code implementation • NAACL 2019 • Kazuma Hashimoto, Yoshimasa Tsuruoka
A major obstacle in reinforcement learning-based sentence generation is the large action space whose size is equal to the vocabulary size of the target-side language.
no code implementations • EMNLP 2017 • Kazuma Hashimoto, Yoshimasa Tsuruoka
This paper presents a novel neural machine translation model which jointly learns translation and source-side latent graph representations of sentences.
no code implementations • WS 2016 • Kazuma Hashimoto, Akiko Eriguchi, Yoshimasa Tsuruoka
This paper describes our UT-KAY system that participated in the Workshop on Asian Translation 2016.
no code implementations • WS 2016 • Akiko Eriguchi, Kazuma Hashimoto, Yoshimasa Tsuruoka
This paper reports our systems (UT-AKY) submitted to the 3rd Workshop on Asian Translation 2016 (WAT'16) and their results in the English-to-Japanese translation task.
2 code implementations • EMNLP 2017 • Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, Richard Socher
Transfer and multi-task learning have traditionally focused on either a single source-target pair or very few, similar tasks.
Ranked #3 on Chunking on Penn Treebank
no code implementations • WS 2016 • Yusuke Watanabe, Kazuma Hashimoto, Yoshimasa Tsuruoka
Recently, recurrent neural networks have been shown to be successful on a variety of NLP tasks such as caption generation; however, the existing domain adaptation techniques are limited to (1) tuning the model parameters on the target dataset after training on the source dataset, or (2) designing the network to have dual outputs, one for the source domain and the other for the target domain.
1 code implementation • ACL 2016 • Akiko Eriguchi, Kazuma Hashimoto, Yoshimasa Tsuruoka
Most of the existing Neural Machine Translation (NMT) models focus on the conversion of sequential data and do not directly use syntactic information.
no code implementations • ACL 2016 • Kazuma Hashimoto, Yoshimasa Tsuruoka
We present a novel method for jointly learning compositional and non-compositional phrase embeddings by adaptively weighting both types of embeddings using a compositionality scoring function.
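The adaptive weighting can be written as a convex combination, sketched below with numpy; the composition function, the scoring function, and the phrase-specific vector are illustrative placeholders for the components the paper learns jointly.

```python
import numpy as np

def phrase_embedding(word_vec_a: np.ndarray,
                     word_vec_b: np.ndarray,
                     noncomp_vec: np.ndarray,
                     scoring_weights: np.ndarray) -> np.ndarray:
    """Blend compositional and non-compositional embeddings of a two-word phrase."""
    # Compositional view: built from the word embeddings (here, a simple average).
    compositional = 0.5 * (word_vec_a + word_vec_b)
    # Compositionality score alpha in (0, 1) from a scoring function
    # (here, a logistic function of the concatenated word vectors).
    features = np.concatenate([word_vec_a, word_vec_b])
    alpha = 1.0 / (1.0 + np.exp(-scoring_weights @ features))
    # High alpha -> trust the compositional view; low alpha -> trust the phrase-specific vector.
    return alpha * compositional + (1.0 - alpha) * noncomp_vec
```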
no code implementations • CONLL 2015 • Kazuma Hashimoto, Pontus Stenetorp, Makoto Miwa, Yoshimasa Tsuruoka
We present a novel learning method for word embeddings designed for relation classification.