no code implementations • ICLR 2019 • Xilai Li, Yingbo Zhou, Tianfu Wu, Richard Socher, Caiming Xiong
During structure learning, the model optimizes for the best structure for the current task.
no code implementations • EMNLP (NLP4ConvAI) 2021 • Jin Qu, Kazuma Hashimoto, Wenhao Liu, Caiming Xiong, Yingbo Zhou
Compared with DNNC, our proposed method is more efficient in both training and serving, since it is based on entailment between the query utterance and the labels rather than between the query and all training examples.
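As an illustration only (not the paper's exact model or label format), label-entailment intent classification can be prototyped with an off-the-shelf NLI model through the Hugging Face zero-shot-classification pipeline; the intent label set below is hypothetical:

```python
# Illustration only: score entailment between the query utterance and each candidate
# intent label with an off-the-shelf NLI model (not the paper's model or prompt format).
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

utterance = "I need to move money from checking to savings"
intent_labels = ["transfer money", "check balance", "report lost card"]  # hypothetical intents

result = classifier(utterance, candidate_labels=intent_labels)
print(result["labels"][0], result["scores"][0])  # top-scoring intent and its score
```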
no code implementations • 23 May 2023 • Srijan Bansal, Semih Yavuz, Bo Pang, Meghana Bhat, Yingbo Zhou
Question-answering (QA) tasks often investigate specific question types, knowledge domains, or reasoning skills, leading to specialized models catering to specific categories of QA tasks.
no code implementations • 18 May 2023 • Can Qin, Shu Zhang, Ning Yu, Yihao Feng, Xinyi Yang, Yingbo Zhou, Huan Wang, Juan Carlos Niebles, Caiming Xiong, Silvio Savarese, Stefano Ermon, Yun Fu, Ran Xu
Visual generative foundation models such as Stable Diffusion show promise in navigating these goals, especially when prompted with arbitrary languages.
no code implementations • 12 May 2023 • Ye Liu, Semih Yavuz, Rui Meng, Dragomir Radev, Caiming Xiong, Yingbo Zhou
It comprises two central pillars: (1) We parse the question of varying complexity into an intermediate representation, named H-expression, which is composed of simple questions as the primitives and symbolic operations representing the relationships among them; (2) To execute the resulting H-expressions, we design a hybrid executor, which integrates the deterministic rules to translate the symbolic operations with a drop-in neural reader network to answer each decomposed simple question.
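A minimal sketch of the idea, with hypothetical names (HExpr, neural_reader) and toy answers rather than the paper's actual implementation: primitive simple questions go to a neural reader, while symbolic operations are executed by deterministic rules.

```python
# Minimal sketch of an H-expression-style representation and a hybrid executor.
# Class and function names here are hypothetical illustrations, not the paper's code.
from dataclasses import dataclass
from typing import Union

@dataclass
class HExpr:
    op: str     # symbolic operation, e.g. "COMPARE_GT", "SUM"
    args: list  # sub-expressions or primitive simple questions

def neural_reader(question: str) -> float:
    """Stand-in for a drop-in neural reader that answers a simple question."""
    toy_answers = {"population of A": 8.4e6, "population of B": 3.9e6}
    return toy_answers.get(question, 0.0)

def execute(node: Union[HExpr, str]):
    # Primitives (simple questions) go to the neural reader;
    # symbolic operations are executed with deterministic rules.
    if isinstance(node, str):
        return neural_reader(node)
    values = [execute(arg) for arg in node.args]
    if node.op == "SUM":
        return sum(values)
    if node.op == "COMPARE_GT":
        return values[0] > values[1]
    raise ValueError(f"unknown op: {node.op}")

# "Is A more populous than B?" decomposed into two simple questions plus one comparison.
print(execute(HExpr("COMPARE_GT", ["population of A", "population of B"])))  # True
```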
2 code implementations • 3 May 2023 • Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
In this study, we attempt to render the training of LLMs for program synthesis more efficient by unifying four key components: (1) model architectures, (2) learning methods, (3) infill sampling, and, (4) data distributions.
no code implementations • 3 Apr 2023 • Lifu Tu, Jin Qu, Semih Yavuz, Shafiq Joty, Wenhao Liu, Caiming Xiong, Yingbo Zhou
We evaluate our model's cross-lingual generalization capabilities on two conversation tasks: slot-filling and intent classification.
no code implementations • 28 Jan 2023 • Pengyu Zhang, Yingbo Zhou, Ming Hu, Xin Fu, Xian Wei, Mingsong Chen
Based on the concept of Continual Learning (CL), we prove that CyclicFL approximates existing centralized pre-training methods in terms of classification and prediction performance.
no code implementations • 17 Dec 2022 • Rui Meng, Ye Liu, Semih Yavuz, Divyansh Agarwal, Lifu Tu, Ning Yu, JianGuo Zhang, Meghana Bhat, Yingbo Zhou
Dense retrievers have made significant strides in text retrieval and open-domain question answering, even though most achievements were made possible only with large amounts of human supervision.
no code implementations • 22 Nov 2022 • Jiacheng Xu, Caiming Xiong, Silvio Savarese, Yingbo Zhou
We first investigate the vanilla best-first search (BFS) algorithm and then propose the Best-$k$ Search algorithm.
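A toy sketch of the contrast, assuming a stand-in scoring function instead of a real language model: popping the single best node recovers vanilla best-first search, while popping the top k nodes per step lets the expansions be batched.

```python
# Toy sketch of best-first search vs. a best-k variant over partial sequences.
# The scoring function is a stand-in; in practice it would be a language model's
# log-probability, and expanding k nodes at once enables batched model calls.
import heapq

def expand(seq, vocab=("a", "b", "</s>")):
    return [seq + (tok,) for tok in vocab]

def score(seq):
    return -len(seq) - 0.1 * seq.count("b")   # toy score (higher is better)

def best_k_search(k=4, max_steps=20):
    frontier = [(-score(()), ())]              # max-heap via negated scores
    completed = []
    for _ in range(max_steps):
        batch = [heapq.heappop(frontier) for _ in range(min(k, len(frontier)))]
        for _, seq in batch:                   # with k=1 this reduces to vanilla BFS
            for child in expand(seq):
                if child[-1] == "</s>":
                    completed.append(child)
                else:
                    heapq.heappush(frontier, (-score(child), child))
        if completed:
            return max(completed, key=score)
    return None

print(best_k_search(k=4))
```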
no code implementations • 9 Nov 2022 • Ye Liu, Semih Yavuz, Rui Meng, Dragomir Radev, Caiming Xiong, Yingbo Zhou
Parsing natural language questions into executable logical forms is a useful and interpretable way to perform question answering on structured data such as knowledge bases (KB) or databases (DB).
1 code implementation • 22 Oct 2022 • Lifu Tu, Caiming Xiong, Yingbo Zhou
Pre-trained multilingual language models show significant performance gains for zero-shot cross-lingual model transfer on a wide range of natural language understanding (NLU) tasks.
1 code implementation • 20 Aug 2022 • Rui Meng, Tong Wang, Xingdi Yuan, Yingbo Zhou, Daqing He
Finally, we fine-tune the model with limited data with true labels to fully adapt it to the target domain.
no code implementations • 21 Jul 2022 • Paul Kassianik, Erik Nijkamp, Bo Pang, Yingbo Zhou, Caiming Xiong
As machine learning tools progress, the inevitable question arises: How can machine learning help us write better code?
no code implementations • Findings (NAACL) 2022 • Haopeng Zhang, Semih Yavuz, Wojciech Kryscinski, Kazuma Hashimoto, Yingbo Zhou
Abstractive summarization systems leveraging pre-trained language models have achieved superior results on benchmark datasets.
no code implementations • ACL 2022 • Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou, Nitish Shirish Keskar, Caiming Xiong
Fusion-in-Decoder (FiD) (Izacard and Grave, 2020) is a generative question answering (QA) model that leverages passage retrieval with a pre-trained transformer and has pushed the state of the art on single-hop QA.
no code implementations • Findings (ACL) 2022 • Tong Niu, Kazuma Hashimoto, Yingbo Zhou, Caiming Xiong
When finetuned on a single rich-resource language pair, be it English-centered or not, our model is able to match the performance of the ones finetuned on all language pairs under the same data budget, with less than a 2.0-point decrease in accuracy.
4 code implementations • 25 Mar 2022 • Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong
To democratize this, we train and release a family of large language models of up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open-source the training library JAXFORMER.
Ranked #1 on Program Synthesis on HumanEval
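A usage sketch for the released checkpoints via Hugging Face Transformers, shown here with the smallest mono (Python) variant; the prompt and generation settings are illustrative:

```python
# Sketch of sampling from a released CODEGEN checkpoint via Hugging Face Transformers.
# Shown with the small 350M mono (Python) variant; larger checkpoints follow the same pattern.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Salesforce/codegen-350M-mono"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

prompt = "# Python function that reverses a string\ndef reverse_string(s):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```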
1 code implementation • 23 Mar 2022 • Tian Xie, Xinyi Yang, Angela S. Lin, Feihong Wu, Kazuma Hashimoto, Jin Qu, Young Mo Kang, Wenpeng Yin, Huan Wang, Semih Yavuz, Gang Wu, Michael Jones, Richard Socher, Yingbo Zhou, Wenhao Liu, Caiming Xiong
At the core of the struggle is the need to script every single turn of interactions between the bot and the human user.
no code implementations • 15 Mar 2022 • Bo Pang, Erik Nijkamp, Wojciech Kryściński, Silvio Savarese, Yingbo Zhou, Caiming Xiong
Critical to the success of a summarization model is the faithful inference of latent representations of words or tokens in the source documents.
Ranked #1 on Text Summarization on Pubmed
no code implementations • SpaNLP (ACL) 2022 • Man Luo, Kazuma Hashimoto, Semih Yavuz, Zhiwei Liu, Chitta Baral, Yingbo Zhou
Among several interesting findings, it is important to highlight that (1) the generative readers perform better in long context QA, (2) the extractive readers perform better in short context while also showing better out-of-domain generalization, and (3) the encoder of encoder-decoder PrLMs (e.g., T5) turns out to be a strong extractive reader and outperforms the standard choice of encoder-only PrLMs (e.g., RoBERTa).
1 code implementation • ICLR 2022 • Yu Bai, Song Mei, Huan Wang, Yingbo Zhou, Caiming Xiong
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly over existing approaches in several applications such as prediction intervals with improved length, minimum-volume prediction sets for multi-output regression, and label prediction sets for image classification.
1 code implementation • Findings (EMNLP) 2021 • Ye Liu, Kazuma Hashimoto, Yingbo Zhou, Semih Yavuz, Caiming Xiong, Philip S. Yu
In this work, we propose Dense Hierarchical Retrieval (DHR), a hierarchical framework that can generate accurate dense representations of passages by utilizing both macroscopic semantics in the document and microscopic semantics specific to each passage.
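A minimal two-level sketch of the idea, with random vectors standing in for learned document- and passage-level encoders (not the paper's DHR implementation): documents are ranked first, then passages within the top documents.

```python
# Minimal sketch of hierarchical dense retrieval: first rank documents by a
# document-level embedding, then rank passages within the top documents.
# Random vectors stand in for learned encoders; not the paper's DHR implementation.
import numpy as np

rng = np.random.default_rng(0)
dim = 16

docs = {  # doc_id -> list of passages
    "doc0": ["passage 0a", "passage 0b"],
    "doc1": ["passage 1a", "passage 1b", "passage 1c"],
}
doc_emb = {d: rng.normal(size=dim) for d in docs}                          # macroscopic semantics
psg_emb = {d: rng.normal(size=(len(ps), dim)) for d, ps in docs.items()}   # microscopic semantics

def retrieve(query_vec, top_docs=1, top_passages=2):
    ranked_docs = sorted(docs, key=lambda d: -doc_emb[d] @ query_vec)[:top_docs]
    candidates = []
    for d in ranked_docs:
        scores = psg_emb[d] @ query_vec
        for i in np.argsort(-scores)[:top_passages]:
            candidates.append((float(scores[i]), docs[d][i]))
    return sorted(candidates, reverse=True)

print(retrieve(rng.normal(size=dim)))
```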
1 code implementation • 21 Oct 2021 • Devansh Arpit, Huan Wang, Yingbo Zhou, Caiming Xiong
We first show that this chaotic behavior exists even along the training optimization trajectory of a single model. We then propose a simple model averaging protocol that both significantly boosts domain generalization and diminishes the impact of stochasticity by improving the rank correlation between in-domain validation accuracy and out-of-domain test accuracy, which is crucial for reliable early stopping.
Ranked #3 on Domain Generalization on DomainNet (using extra training data)
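A sketch of the simple checkpoint-averaging step, assuming hypothetical checkpoint paths and standard PyTorch state dicts; the full protocol in the paper may differ:

```python
# Sketch of averaging model weights over checkpoints collected along one training
# trajectory. Checkpoint paths are hypothetical; not the paper's exact protocol.
import torch

def average_checkpoints(paths):
    avg = None
    for p in paths:
        state = torch.load(p, map_location="cpu")
        if avg is None:
            avg = {k: v.clone().float() for k, v in state.items()}
        else:
            for k in avg:
                avg[k] += state[k].float()
    # Note: integer buffers (e.g. BatchNorm counters) end up as floats here;
    # a production version would handle them separately.
    return {k: v / len(paths) for k, v in avg.items()}

# model.load_state_dict(average_checkpoints(["ckpt_100.pt", "ckpt_200.pt", "ckpt_300.pt"]))
```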
no code implementations • 29 Sep 2021 • Bo Pang, Erik Nijkamp, Wojciech Maciej Kryscinski, Silvio Savarese, Yingbo Zhou, Caiming Xiong
Critical to the success of a summarization model is the faithful inference of latent representations of words or tokens in the source documents.
1 code implementation • 20 Sep 2021 • Aadyot Bhatnagar, Paul Kassianik, Chenghao Liu, Tian Lan, Wenzhuo Yang, Rowan Cassius, Doyen Sahoo, Devansh Arpit, Sri Subramanian, Gerald Woo, Amrita Saha, Arun Kumar Jagota, Gokulakrishnan Gopalakrishnan, Manpreet Singh, K C Krithika, Sukumar Maddineni, Daeki Cho, Bo Zong, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Steven Hoi, Huan Wang
We introduce Merlion, an open-source machine learning library for time series.
1 code implementation • ACL 2022 • Xi Ye, Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou, Caiming Xiong
We present RnG-KBQA, a Rank-and-Generate approach for KBQA, which remedies the coverage issue with a generation model while preserving a strong generalization capability.
1 code implementation • NAACL 2021 • Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, Alan W Black, Ruslan Salakhutdinov
Document grounded generation is the task of using the information provided in a document to improve text generation.
2 code implementations • ICLR 2021 • Shiyang Li, Semih Yavuz, Kazuma Hashimoto, Jia Li, Tong Niu, Nazneen Rajani, Xifeng Yan, Yingbo Zhou, Caiming Xiong
Dialogue state trackers have made significant progress on benchmark datasets, but their generalization capability to novel and realistic scenarios beyond the held-out conversations is less understood.
Ranked #2 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.1 (using extra training data)
no code implementations • EMNLP 2021 • Tong Niu, Semih Yavuz, Yingbo Zhou, Nitish Shirish Keskar, Huan Wang, Caiming Xiong
To enforce a surface form dissimilar from the input, whenever the language model emits a token contained in the source sequence, DB prevents the model from outputting the subsequent source token for the next generation step.
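A toy sketch of that constraint as a logits mask, with made-up token ids: any token that immediately follows an occurrence of the last emitted token in the source is forbidden at the next step.

```python
# Sketch of the decoding-time constraint described above: if the last emitted token
# occurs in the source sequence, forbid the token that immediately follows it in the
# source at the next step. Token ids and logits here are toy stand-ins.
import math

def block_source_continuations(logits, source_ids, last_token):
    banned = {source_ids[i + 1] for i, t in enumerate(source_ids[:-1]) if t == last_token}
    return [(-math.inf if i in banned else score) for i, score in enumerate(logits)]

source_ids = [5, 7, 9, 7, 3]             # toy source token ids
logits = [0.1] * 12                      # toy next-token scores over a 12-token vocab
masked = block_source_continuations(logits, source_ids, last_token=7)
print([i for i, s in enumerate(masked) if s == -math.inf])   # tokens 3 and 9 are blocked
```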
no code implementations • NeurIPS 2020 • Huaxiu Yao, Yingbo Zhou, Mehrdad Mahdavi, Zhenhui Li, Richard Socher, Caiming Xiong
When a new task is encountered, it constructs a meta-knowledge pathway by either utilizing the most relevant knowledge blocks or exploring new blocks.
2 code implementations • ICLR 2021 • Junwen Bai, Weiran Wang, Yingbo Zhou, Caiming Xiong
We propose Deep Autoencoding Predictive Components (DAPC) -- a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space.
Automatic Speech Recognition (ASR) +3
no code implementations • CVPR 2021 • Mingfei Gao, Yingbo Zhou, Ran Xu, Richard Socher, Caiming Xiong
Online action detection in untrimmed videos aims to identify an action as it happens, which makes it very important for real-time applications.
Ranked #4 on Online Action Detection on THUMOS'14
no code implementations • 4 May 2020 • Young Mo Kang, Yingbo Zhou
A common framework is to dynamically construct a small language model from the provided contextual mini corpus and interpolate its score with the main language model during the decoding process.
Automatic Speech Recognition (ASR) +2
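A minimal sketch of the interpolation step with toy probabilities (the exact interpolation scheme in the paper may differ):

```python
# Sketch of interpolating a small contextual language model with the main model
# during decoding, as described above. Both probability values are toy stand-ins.
import math

def interpolate(p_main, p_context, lam=0.3):
    """Linear interpolation of next-token probabilities; returns a log score."""
    return math.log((1 - lam) * p_main + lam * p_context)

# Toy next-token probabilities for a rare, context-specific word:
p_main = 1e-6          # near-zero under the general-purpose LM
p_context = 0.2        # boosted by the mini LM built from the contextual corpus
print(interpolate(p_main, p_context))   # much higher than log(1e-6)
```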
no code implementations • 8 Apr 2020 • Weiran Wang, Guangsen Wang, Aadyot Bhatnagar, Yingbo Zhou, Caiming Xiong, Richard Socher
For Switchboard, our phone-based BPE system achieves 6.8%/14.4% word error rate (WER) on the Switchboard/CallHome portion of the test set, while joint decoding achieves 6.3%/13.3% WER.
no code implementations • 1 Mar 2020 • Lichao Sun, Yingbo Zhou, Philip S. Yu, Caiming Xiong
Ensuring the privacy of sensitive data used to train modern machine learning models is of paramount importance in many areas of practice.
no code implementations • 25 Sep 2019 • Lichao Sun, Yingbo Zhou, Jia Li, Richard Socher, Philip S. Yu, Caiming Xiong
Ensuring the privacy of sensitive data used to train modern machine learning models is of paramount importance in many areas of practice.
no code implementations • 5 Jun 2019 • Lichao Sun, Yingbo Zhou, Ji Wang, Jia Li, Richard Socher, Philip S. Yu, Caiming Xiong
Privacy-preserving deep learning is crucial for deploying deep neural network based solutions, especially when the model works on data that contains sensitive information.
no code implementations • 31 Mar 2019 • Xilai Li, Yingbo Zhou, Tianfu Wu, Richard Socher, Caiming Xiong
Addressing catastrophic forgetting is one of the key challenges in continual learning where machine learning systems are trained with sequential or streaming tasks.
2 code implementations • ICLR 2019 • Ehsan Hosseini-Asl, Yingbo Zhou, Caiming Xiong, Richard Socher
In the low-resource supervised setting, the results show that our approach improves absolute performance by 14% and 4% when adapting SVHN to MNIST and vice versa, respectively, outperforming unsupervised domain adaptation methods that require a high-resource unlabeled target domain.
1 code implementation • CVPR 2018 • Luowei Zhou, Yingbo Zhou, Jason J. Corso, Richard Socher, Caiming Xiong
To address this problem, we propose an end-to-end transformer model for dense video captioning.
Ranked #8 on Video Captioning on YouCook2
no code implementations • 27 Mar 2018 • Ehsan Hosseini-Asl, Yingbo Zhou, Caiming Xiong, Richard Socher
Domain adaptation plays an important role for speech recognition models, in particular, for domains that have low resources.
no code implementations • 19 Dec 2017 • Yingbo Zhou, Caiming Xiong, Richard Socher
However, there is usually a disparity between the maximum likelihood training objective and the performance metric used in speech recognition, e.g., word error rate (WER).
Ranked #45 on Speech Recognition on LibriSpeech test-clean
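For reference, WER is the word-level edit distance between hypothesis and reference, normalized by the reference length; a small worked example:

```python
# Worked example of the word error rate (WER) metric referenced above:
# (substitutions + insertions + deletions) / reference length, via edit distance.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat mat"))  # 2 deletions / 6 words = 0.333...
```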
no code implementations • 19 Dec 2017 • Yingbo Zhou, Caiming Xiong, Richard Socher
We augment audio data through random perturbations of tempo, pitch, volume, and temporal alignment, and by adding random noise. We further investigate the effect of dropout when applied to the inputs of all layers of the network.
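A minimal numpy sketch of some of the perturbations named above (volume, additive noise, and a crude tempo change via linear resampling); pitch and temporal-alignment perturbations are omitted for brevity:

```python
# Minimal numpy sketch of the augmentations named above: random volume scaling,
# additive random noise, and a crude tempo change via resampling. Pitch and
# alignment perturbations would require a proper DSP library and are omitted.
import numpy as np

def augment(waveform: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    # Random volume perturbation.
    out = waveform * rng.uniform(0.8, 1.2)
    # Additive random noise at a small amplitude.
    out = out + rng.normal(scale=0.005, size=out.shape)
    # Tempo perturbation by linear resampling (speed up or slow down by +/-10%).
    rate = rng.uniform(0.9, 1.1)
    new_len = int(len(out) / rate)
    out = np.interp(np.linspace(0, len(out) - 1, new_len), np.arange(len(out)), out)
    return out.astype(np.float32)

rng = np.random.default_rng(0)
print(augment(np.sin(np.linspace(0, 100, 16000)), rng).shape)
```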
no code implementations • 21 May 2017 • Yingbo Zhou, Utkarsh Porwal, Roberto Konow
In this paper, we reformulated the spell correction problem as a machine translation task under the encoder-decoder framework.
no code implementations • ICLR 2018 • Devansh Arpit, Yingbo Zhou, Hung Q. Ngo, Nils Napp, Venu Govindaraju
Auto-Encoders are unsupervised models that aim to learn patterns from observed data by minimizing a reconstruction cost.
no code implementations • 4 Mar 2016 • Devansh Arpit, Yingbo Zhou, Bhargava U. Kota, Venu Govindaraju
While the authors of Batch Normalization (BN) identify and address an important problem involved in training deep networks, Internal Covariate Shift, the current solution has certain drawbacks.
no code implementations • 5 Dec 2015 • Rohit Kumar Pandey, Yingbo Zhou, Bhargava Urala Kota, Venu Govindaraju
In this paper we present a framework for secure identification using deep neural networks, and apply it to the task of template protection for face authentication.
no code implementations • 14 Jun 2015 • Rohit Pandey, Yingbo Zhou, Venu Govindaraju
In this paper we present Deep Secure Encoding: a framework for secure classification using deep neural networks, and apply it to the task of biometric template protection for faces.
no code implementations • 21 May 2015 • Devansh Arpit, Yingbo Zhou, Hung Ngo, Venu Govindaraju
While the authors of Batch Normalization (BN) identify and address an important problem involved in training deep networks, Internal Covariate Shift, the current solution has certain drawbacks.
no code implementations • NeurIPS 2014 • Yingbo Zhou, Utkarsh Porwal, Ce Zhang, Hung Q. Ngo, XuanLong Nguyen, Christopher Ré, Venu Govindaraju
Superior performance of our method is demonstrated on a challenging relation extraction task over a very large data set that has both redundant features and a sample size on the order of millions.
no code implementations • 6 May 2014 • Yingbo Zhou, Devansh Arpit, Ifeoma Nwogu, Venu Govindaraju
However, due to the greedy scheme of the layerwise training technique, the parameters of lower layers are fixed when training higher layers.
11 code implementations • 1 Jul 2013 • Ian J. Goodfellow, Dumitru Erhan, Pierre Luc Carrier, Aaron Courville, Mehdi Mirza, Ben Hamner, Will Cukierski, Yichuan Tang, David Thaler, Dong-Hyun Lee, Yingbo Zhou, Chetan Ramaiah, Fangxiang Feng, Ruifan Li, Xiaojie Wang, Dimitris Athanasakis, John Shawe-Taylor, Maxim Milakov, John Park, Radu Ionescu, Marius Popescu, Cristian Grozea, James Bergstra, Jingjing Xie, Lukasz Romaszko, Bing Xu, Zhang Chuang, Yoshua Bengio
The ICML 2013 Workshop on Challenges in Representation Learning focused on three challenges: the black box learning challenge, the facial expression recognition challenge, and the multimodal learning challenge.
Ranked #1 on Facial Expression Recognition (FER) on FER2013 (using extra training data)