1 code implementation • CRAC (ACL) 2021 • Shubham Toshniwal, Patrick Xia, Sam Wiseman, Karen Livescu, Kevin Gimpel
While coreference resolution is defined independently of dataset domain, most models for performing coreference resolution do not transfer well to unseen domains.
2 code implementations • 26 Feb 2021 • Shubham Toshniwal, Sam Wiseman, Karen Livescu, Kevin Gimpel
Motivated by this issue, we consider the task of language modeling for the game of chess.
no code implementations • 1 Jan 2021 • Shubham Toshniwal, Sam Wiseman, Karen Livescu, Kevin Gimpel
Motivated by this issue, we consider the task of language modeling for the game of chess.
2 code implementations • EMNLP 2020 • Shubham Toshniwal, Sam Wiseman, Allyson Ettinger, Karen Livescu, Kevin Gimpel
Long document coreference resolution remains a challenging task due to the large memory and runtime requirements of current models.
Ranked #2 on Coreference Resolution on OntoNotes
1 code implementation • WS 2020 • Shubham Toshniwal, Haoyue Shi, Bowen Shi, Lingyu Gao, Karen Livescu, Kevin Gimpel
Many natural language processing (NLP) tasks involve reasoning with textual spans, including question answering, entity recognition, and coreference resolution.
1 code implementation • ACL 2020 • Shubham Toshniwal, Allyson Ettinger, Kevin Gimpel, Karen Livescu
We propose PeTra, a memory-augmented neural network designed to track entities in its memory slots.
Ranked #1 on Coreference Resolution on GAP (F1 metric)
3 code implementations • 21 Feb 2019 • Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon
Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models.
no code implementations • 27 Jul 2018 • Shubham Toshniwal, Anjuli Kannan, Chung-Cheng Chiu, Yonghui Wu, Tara N. Sainath, Karen Livescu
In this paper, we compare a suite of past methods and some of our own proposed methods for using unpaired text data to improve encoder-decoder models.
no code implementations • 17 Jul 2018 • Kalpesh Krishna, Shubham Toshniwal, Karen Livescu
Previous work has shown that neural encoder-decoder speech recognition can be improved with hierarchical multitask learning, where auxiliary tasks are added at intermediate layers of a deep encoder.
no code implementations • 6 Nov 2017 • Shubham Toshniwal, Tara N. Sainath, Ron J. Weiss, Bo Li, Pedro Moreno, Eugene Weinstein, Kanishka Rao
Training a conventional automatic speech recognition (ASR) system to support multiple languages is challenging because the sub-word unit, lexicon and word inventories are typically language specific.
1 code implementation • NAACL 2018 • Trang Tran, Shubham Toshniwal, Mohit Bansal, Kevin Gimpel, Karen Livescu, Mari Ostendorf
In conversational speech, the acoustic signal provides cues that help listeners disambiguate difficult parses.
no code implementations • 5 Apr 2017 • Shubham Toshniwal, Hao Tang, Liang Lu, Karen Livescu
We hypothesize that using intermediate representations as auxiliary supervision at lower levels of deep networks may be a good way of combining the advantages of end-to-end training and more traditional pipeline approaches.
1 code implementation • 20 Oct 2016 • Shubham Toshniwal, Karen Livescu
We propose an attention-enabled encoder-decoder model for the problem of grapheme-to-phoneme conversion.