1 code implementation • COLING 2022 • Mingfei Gao, Le Xue, Chetan Ramaiah, Chen Xing, Ran Xu, Caiming Xiong
Unlike previous methods that only address a fixed set of field items, our method predicts the target value for an arbitrary query based on its understanding of the layout and semantics of a form.
no code implementations • ICLR 2019 • Xilai Li, Yingbo Zhou, Tianfu Wu, Richard Socher, Caiming Xiong
During structure learning, the model optimizes for the best structure for the current task.
no code implementations • EMNLP 2020 • Semih Yavuz, Kazuma Hashimoto, Wenhao Liu, Nitish Shirish Keskar, Richard Socher, Caiming Xiong
The concept of Dialogue Act (DA) is universal across different task-oriented dialogue domains - the act of "request" carries the same speaker intention whether it is for restaurant reservation or flight booking.
no code implementations • EMNLP 2020 • Nitish Shirish Keskar, Bryan McCann, Caiming Xiong, Richard Socher
Pre-training in natural language processing makes it easier for an adversary with only query access to a victim model to reconstruct a local copy of the victim by training with gibberish input data paired with the victim's labels for that data.
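The attack loop this abstract describes is simple enough to sketch end to end. Below is a minimal, self-contained illustration under stand-in assumptions: a small scikit-learn classifier plays the victim (in the paper's setting it would be a fine-tuned language model behind a query API), and random vectors play the gibberish queries.

```python
# Minimal sketch of model extraction via gibberish queries (illustrative
# stand-ins only; not the paper's experimental setup).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# A stand-in "victim" trained on private data the adversary never sees.
X_private = rng.normal(size=(1000, 32))
y_private = (X_private[:, 0] + X_private[:, 1] > 0).astype(int)
victim = LogisticRegression().fit(X_private, y_private)

# The adversary queries the victim with gibberish (random) inputs...
X_gibberish = rng.normal(size=(5000, 32))
y_stolen = victim.predict(X_gibberish)  # query access is all that is needed

# ...and trains a local copy on the (gibberish input, victim label) pairs.
copy = LogisticRegression().fit(X_gibberish, y_stolen)

# The copy closely agrees with the victim on fresh data.
X_test = rng.normal(size=(1000, 32))
print((copy.predict(X_test) == victim.predict(X_test)).mean())
```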
no code implementations • EMNLP (NLP4ConvAI) 2021 • Jin Qu, Kazuma Hashimoto, Wenhao Liu, Caiming Xiong, Yingbo Zhou
Compared with DNNC, our proposed method is more efficient in both training and serving since it is based upon the entailment between query utterance and labels instead of all the training examples.
no code implementations • NLP4ConvAI (ACL) 2022 • JianGuo Zhang, Kazuma Hashimoto, Yao Wan, Zhiwei Liu, Ye Liu, Caiming Xiong, Philip Yu
Pre-trained Transformer-based models were reported to be robust in intent classification.
no code implementations • ACL 2022 • Govardana Sachithanandam Ramachandran, Kazuma Hashimoto, Caiming Xiong
Furthermore, we demonstrate sample efficiency: our method trained on only 20% of the data is comparable to the current state-of-the-art method trained on 100% of the data on two out of three evaluation metrics.
1 code implementation • 1 Jun 2023 • Fan Yin, Jesse Vig, Philippe Laban, Shafiq Joty, Caiming Xiong, Chien-Sheng Jason Wu
Large language models (LLMs) have shown impressive performance in following natural language instructions to solve unseen tasks.
no code implementations • 1 Jun 2023 • Shentao Yang, Shujian Zhang, Congying Xia, Yihao Feng, Caiming Xiong, Mingyuan Zhou
Aligning language models (LMs) with preferences is an important problem in natural language generation.
1 code implementation • 30 May 2023 • Philippe Laban, Jesse Vig, Wojciech Kryscinski, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu
Text simplification research has mostly focused on sentence-level simplification, even though many desirable edits - such as adding relevant background information or reordering content - may require document-level context.
1 code implementation • 23 May 2023 • Philippe Laban, Wojciech Kryściński, Divyansh Agarwal, Alexander R. Fabbri, Caiming Xiong, Shafiq Joty, Chien-Sheng Wu
To address this, we propose a new protocol for inconsistency detection benchmark creation and implement it in a 10-domain benchmark called SummEdits.
no code implementations • 18 May 2023 • Can Qin, Shu Zhang, Ning Yu, Yihao Feng, Xinyi Yang, Yingbo Zhou, Huan Wang, Juan Carlos Niebles, Caiming Xiong, Silvio Savarese, Stefano Ermon, Yun Fu, Ran Xu
Visual generative foundation models such as Stable Diffusion show promise in navigating these goals, especially when prompted with arbitrary languages.
1 code implementation • 14 May 2023 • Le Xue, Ning Yu, Shu Zhang, Junnan Li, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, Ran Xu, Juan Carlos Niebles, Silvio Savarese
Recent advancements in multimodal pre-training methods have shown promising efficacy in 3D representation learning by aligning multimodal features across 3D shapes, their 2D counterparts, and language descriptions.
Ranked #2 on 3D Point Cloud Classification on ScanObjectNN (using extra training data)
no code implementations • 12 May 2023 • Ye Liu, Semih Yavuz, Rui Meng, Dragomir Radev, Caiming Xiong, Yingbo Zhou
It comprises two central pillars: (1) We parse the question of varying complexity into an intermediate representation, named H-expression, which is composed of simple questions as the primitives and symbolic operations representing the relationships among them; (2) To execute the resulting H-expressions, we design a hybrid executor, which integrates the deterministic rules to translate the symbolic operations with a drop-in neural reader network to answer each decomposed simple question.
no code implementations • 12 May 2023 • Ziwei Fan, Zhiwei Liu, Shelby Heinecke, JianGuo Zhang, Huan Wang, Caiming Xiong, Philip S. Yu
This paper presents a novel paradigm for the Zero-Shot Item-based Recommendation (ZSIR) task, which pre-trains a model on product knowledge graph (PKG) to refine the item features from PLMs.
2 code implementations • 3 May 2023 • Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
In this study, we attempt to render the training of LLMs for program synthesis more efficient by unifying four key components: (1) model architectures, (2) learning methods, (3) infill sampling, and (4) data distributions.
no code implementations • 3 Apr 2023 • Lifu Tu, Jin Qu, Semih Yavuz, Shafiq Joty, Wenhao Liu, Caiming Xiong, Yingbo Zhou
We evaluate our model's cross-lingual generalization capabilities on two conversation tasks: slot-filling and intent classification.
1 code implementation • 17 Mar 2023 • Can Qin, Ning Yu, Chen Xing, Shu Zhang, Zeyuan Chen, Stefano Ermon, Yun Fu, Caiming Xiong, Ran Xu
Empirical results show that GlueNet can be trained efficiently and enables various capabilities beyond previous state-of-the-art models: 1) multilingual language models such as XLM-Roberta can be aligned with existing T2I models, allowing for the generation of high-quality images from captions beyond English; 2) GlueNet can align multi-modal encoders such as AudioCLIP with the Stable Diffusion model, enabling sound-to-image generation; 3) it can also upgrade the current text encoder of the latent diffusion model for challenging case generation.
1 code implementation • 16 Mar 2023 • Shu Zhang, Xinyi Yang, Yihao Feng, Can Qin, Chia-Chih Chen, Ning Yu, Zeyuan Chen, Huan Wang, Silvio Savarese, Stefano Ermon, Caiming Xiong, Ran Xu
Incorporating human feedback has been shown to be crucial to align text generated by large language models to human preferences.
no code implementations • 10 Mar 2023 • Itai Feigenbaum, Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Caiming Xiong, Devansh Arpit
We then provide an analytic average-case analysis of the PC Algorithm for causal discovery, as well as a variant of the SGS Algorithm we call UniformSGS.
1 code implementation • 7 Mar 2023 • Yixin Liu, Alexander R. Fabbri, Yilun Zhao, PengFei Liu, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev
Interpretability and efficiency are two important considerations for the adoption of neural automatic metrics.
2 code implementations • 20 Feb 2023 • Yihao Feng, Shentao Yang, Shujian Zhang, JianGuo Zhang, Caiming Xiong, Mingyuan Zhou, Huan Wang
Prior works mainly focus on adopting advanced RL techniques to train the ToD agents, while the design of the reward function is not well studied.
no code implementations • 18 Feb 2023 • Ce Zhou, Qian Li, Chen Li, Jun Yu, Yixin Liu, Guangjing Wang, Kai Zhang, Cheng Ji, Qiben Yan, Lifang He, Hao Peng, JianXin Li, Jia Wu, Ziwei Liu, Pengtao Xie, Caiming Xiong, Jian Pei, Philip S. Yu, Lichao Sun
This study provides a comprehensive review of recent research advancements, challenges, and opportunities for PFMs in text, image, graph, as well as other data modalities.
no code implementations • 17 Feb 2023 • Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Xiang 'Anthony' Chen, Caiming Xiong
In a second usability study, we developed and implemented a reading exercise with 95 novice news readers to measure exposure to coverage diversity.
1 code implementation • 15 Feb 2023 • Aadyot Bhatnagar, Huan Wang, Caiming Xiong, Yu Bai
We prove that our methods achieve near-optimal strongly adaptive regret for all interval lengths simultaneously, and approximately valid coverage.
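For readers unfamiliar with the coverage guarantee being targeted, a minimal split conformal baseline is sketched below: this static recipe is what "valid coverage" means, while the paper's contribution is achieving it adaptively online under distribution shift. All names and data here are illustrative.

```python
# Split conformal baseline: calibrate a residual quantile on held-out data,
# then form intervals that cover the true label ~(1 - alpha) of the time.
# This is a static baseline, not the paper's strongly adaptive procedure.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.5, size=2000)

X_train, X_cal, X_test = X[:1000], X[1000:1500], X[1500:]
y_train, y_cal, y_test = y[:1000], y[1000:1500], y[1500:]

model = LinearRegression().fit(X_train, y_train)

# Calibration: finite-sample-corrected quantile of absolute residuals.
alpha = 0.1
scores = np.abs(y_cal - model.predict(X_cal))
level = np.ceil((len(scores) + 1) * (1 - alpha)) / len(scores)
q = np.quantile(scores, level)

# Intervals [pred - q, pred + q] achieve ~90% empirical coverage.
pred = model.predict(X_test)
coverage = ((y_test >= pred - q) & (y_test <= pred + q)).mean()
print(f"empirical coverage: {coverage:.2%}")
```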
no code implementations • 2 Feb 2023 • Fan Chen, Huan Wang, Caiming Xiong, Song Mei, Yu Bai
However, the fundamental limits for learning in revealing POMDPs are much less understood, with existing lower bounds being rather preliminary and having substantial gaps from the current best upper bounds.
1 code implementation • 25 Jan 2023 • Devansh Arpit, Matthew Fernandez, Chenghao Liu, Weiran Yao, Wenzhuo Yang, Paul Josel, Shelby Heinecke, Eric Hu, Huan Wang, Steven Hoi, Caiming Xiong, Kun Zhang, Juan Carlos Niebles
We introduce the Salesforce CausalAI Library, an open-source library for causal analysis using observational data.
no code implementations • 6 Jan 2023 • Manli Shu, Le Xue, Ning Yu, Roberto Martín-Martín, Juan Carlos Niebles, Caiming Xiong, Ran Xu
By plugging our proposed modules into the state-of-the-art transformer-based 3D detector, we improve the previous best results on both benchmarks, with the largest improvement margin on small objects.
1 code implementation • 19 Dec 2022 • Ning Yu, Chia-Chih Chen, Zeyuan Chen, Rui Meng, Gang Wu, Paul Josel, Juan Carlos Niebles, Caiming Xiong, Ran Xu
Graphic layout designs play an essential role in visual communication.
2 code implementations • 15 Dec 2022 • Yixin Liu, Alexander R. Fabbri, PengFei Liu, Yilun Zhao, Linyong Nan, Ruilin Han, Simeng Han, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev
4) We evaluate existing automatic metrics using the collected human annotations across evaluation protocols and demonstrate how our benchmark leads to more statistically stable and significant results.
1 code implementation • CVPR 2023 • Le Xue, Mingfei Gao, Chen Xing, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, Ran Xu, Juan Carlos Niebles, Silvio Savarese
Then, ULIP learns a 3D representation space aligned with the common image-text space, using a small number of automatically synthesized triplets.
Ranked #3 on Training-free 3D Point Cloud Classification on ModelNet40 (using extra training data)
no code implementations • 22 Nov 2022 • Jiacheng Xu, Caiming Xiong, Silvio Savarese, Yingbo Zhou
We first investigate the vanilla best-first search (BFS) algorithm and then propose the Best-$k$ Search algorithm.
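The vanilla best-first search mentioned here always expands the single highest-scoring partial hypothesis; the Best-$k$ variant pops the $k$ best frontier nodes per step instead. A minimal sketch, with a toy log-probability table standing in for the language model:

```python
# Sketch of vanilla best-first search over decoding hypotheses: always
# expand the highest-scoring partial sequence. The toy log-prob table is a
# stand-in for a language model; Best-k would pop k nodes per iteration.
import heapq

# Hypothetical next-token log-probabilities, keyed by the last token.
LOGPROBS = {
    "<s>": {"the": -0.5, "a": -1.2},
    "the": {"cat": -0.4, "dog": -1.0, "</s>": -2.0},
    "a":   {"cat": -0.9, "</s>": -1.5},
    "cat": {"sat": -0.3, "</s>": -0.8},
    "dog": {"</s>": -0.2},
    "sat": {"</s>": -0.1},
}

def best_first_search(max_pops=50):
    # Max-heap via negated scores; items are (neg_logprob, sequence).
    frontier = [(0.0, ["<s>"])]
    while frontier and max_pops > 0:
        neg_score, seq = heapq.heappop(frontier)
        max_pops -= 1
        if seq[-1] == "</s>":
            return seq, -neg_score          # first completed hypothesis
        for tok, lp in LOGPROBS.get(seq[-1], {}).items():
            heapq.heappush(frontier, (neg_score - lp, seq + [tok]))
    return None, float("-inf")

seq, score = best_first_search()
print(" ".join(seq), round(score, 3))   # <s> the cat sat </s> -1.3
```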
no code implementations • 14 Nov 2022 • Yiyuan Li, Tong Che, Yezhen Wang, Zhengbao Jiang, Caiming Xiong, Snigdha Chaturvedi
In this work, we propose Symmetrical Prompt Enhancement (SPE), a continuous prompt-based method for factual probing in PLMs that leverages the symmetry of the task by constructing symmetrical prompts for subject and object prediction.
1 code implementation • 11 Nov 2022 • Alexander R. Fabbri, Prafulla Kumar Choubey, Jesse Vig, Chien-Sheng Wu, Caiming Xiong
We propose to use sentence-compression data to train the post-editing model to take a summary with extrinsic entity errors marked with special tokens and output a compressed, well-formed summary with those errors removed.
1 code implementation • 9 Nov 2022 • Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Xiang 'Anthony' Chen, Caiming Xiong
There are many potential benefits to news readers accessing diverse sources.
no code implementations • 9 Nov 2022 • Ye Liu, Semih Yavuz, Rui Meng, Dragomir Radev, Caiming Xiong, Yingbo Zhou
Parsing natural language questions into executable logical forms is a useful and interpretable way to perform question answering on structured data such as knowledge bases (KB) or databases (DB).
no code implementations • 23 Oct 2022 • Xiangyu Peng, Chen Xing, Prafulla Kumar Choubey, Chien-Sheng Wu, Caiming Xiong
In this way, SESoM inherits the superior generalization of model ensemble approaches and simultaneously captures the sample-specific competence of each source prompt.
1 code implementation • 22 Oct 2022 • Lifu Tu, Caiming Xiong, Yingbo Zhou
Pre-trained multilingual language models show significant performance gains for zero-shot cross-lingual model transfer on a wide range of natural language understanding (NLU) tasks.
1 code implementation • 6 Oct 2022 • Zhoujun Cheng, Tianbao Xie, Peng Shi, Chengzu Li, Rahul Nadkarni, Yushi Hu, Caiming Xiong, Dragomir Radev, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu
We propose Binder, a training-free neural-symbolic framework that maps the task input to a program, which (1) allows binding a unified API of language model (LM) functionalities to a programming language (e.g., SQL, Python) to extend its grammar coverage and thus tackle more diverse questions, (2) adopts an LM as both the program parser and the underlying model called by the API during execution, and (3) requires only a few in-context exemplar annotations.
Ranked #3 on Semantic Parsing on WikiTableQuestions
1 code implementation • 2 Sep 2022 • Simeng Han, Hailey Schoelkopf, Yilun Zhao, Zhenting Qi, Martin Riddell, Luke Benson, Lucy Sun, Ekaterina Zubova, Yujie Qiao, Matthew Burtell, David Peng, Jonathan Fan, Yixin Liu, Brian Wong, Malcolm Sailor, Ansong Ni, Linyong Nan, Jungo Kasai, Tao Yu, Rui Zhang, Shafiq Joty, Alexander R. Fabbri, Wojciech Kryscinski, Xi Victoria Lin, Caiming Xiong, Dragomir Radev
We present FOLIO, a human-annotated, open-domain, and logically complex and diverse dataset for reasoning in natural language (NL), equipped with first-order logic (FOL) annotations.
no code implementations • 7 Aug 2022 • Yongjun Chen, Jia Li, Zhiwei Liu, Nitish Shirish Keskar, Huan Wang, Julian McAuley, Caiming Xiong
Due to the dynamics of users' interests and model updates during training, considering randomly sampled items from a user's non-interacted item set as negatives can be uninformative.
no code implementations • 21 Jul 2022 • Paul Kassianik, Erik Nijkamp, Bo Pang, Yingbo Zhou, Caiming Xiong
As machine learning tools progress, the inevitable question arises: How can machine learning help us write better code?
no code implementations • 6 Jun 2022 • Runyu Zhang, Qinghua Liu, Huan Wang, Caiming Xiong, Na Li, Yu Bai
Next, we show that this framework instantiated with the Optimistic Follow-The-Regularized-Leader (OFTRL) algorithm at each state (and smooth value updates) can find an $\widetilde{\mathcal{O}}(T^{-5/6})$ approximate NE in $T$ iterations, and a similar algorithm with a slightly modified value update rule achieves a faster $\widetilde{\mathcal{O}}(T^{-1})$ convergence rate.
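For reference, the OFTRL update being instantiated can be written, at a single state with reward estimates $g_s$, stepsize $\eta$, regularizer $R$, and the standard one-step-recency prediction (notation ours, not the paper's), as

$$x_{t+1} \;=\; \operatorname*{arg\,max}_{x \in \Delta(\mathcal{A})} \;\; \eta \Big\langle x,\; \sum_{s=1}^{t} g_s + g_t \Big\rangle \;-\; R(x),$$

where the extra $g_t$ term is the optimistic prediction of the next reward; plain FTRL omits it, and this optimism is what enables rates faster than the generic $\widetilde{\mathcal{O}}(T^{-1/2})$.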
1 code implementation • 31 May 2022 • Wenzhuo Yang, Jia Li, Caiming Xiong, Steven C. H. Hoi
Counterfactual explanation is an important Explainable AI technique to explain machine learning predictions.
no code implementations • ACL 2022 • Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou, Nitish Shirish Keskar, Caiming Xiong
Fusion-in-Decoder (FiD) (Izacard and Grave, 2020) is a generative question answering (QA) model that leverages passage retrieval with a pre-trained transformer and pushed the state of the art on single-hop QA.
no code implementations • Findings (ACL) 2022 • Tong Niu, Kazuma Hashimoto, Yingbo Zhou, Caiming Xiong
When finetuned on a single rich-resource language pair, be it English-centered or not, our model is able to match the performance of the ones finetuned on all language pairs under the same data budget with less than a 2.0-point decrease in accuracy.
1 code implementation • 13 May 2022 • Philippe Laban, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
Precisely assessing the progress in natural language generation (NLG) tasks is challenging, and human evaluation to establish a preference in a model's output over another is often necessary.
no code implementations • Findings (NAACL) 2022 • Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Wenhao Liu, Caiming Xiong
Question generation (QGen) models are often evaluated with standardized NLG metrics that are based on n-gram overlap.
1 code implementation • CVPR 2022 • Shu Zhang, Ran Xu, Caiming Xiong, Chetan Ramaiah
Current contrastive learning frameworks focus on leveraging a single supervisory signal to learn representations, which limits the efficacy on unseen data and downstream tasks.
1 code implementation • Findings (NAACL) 2022 • Ehsan Hosseini-Asl, Wenhao Liu, Caiming Xiong
Our evaluation results on the single-task polarity prediction show that our approach outperforms the previous state-of-the-art (based on BERT) on average performance by a large margin in both few-shot and full-shot settings.
1 code implementation • 5 Apr 2022 • Yongjun Chen, Jia Li, Caiming Xiong
A generator, as an auxiliary model, is trained jointly with the discriminator to sample plausible alternative next items and will be thrown out after training.
1 code implementation • 25 Mar 2022 • Zhiwei Liu, Yongjun Chen, Jia Li, Man Luo, Philip S. Yu, Caiming Xiong
However, existing methods all construct views by adopting augmentation from data perspectives, while we argue that 1) optimal data augmentation methods are hard to devise, 2) data augmentation methods destroy sequential correlations, and 3) data augmentation fails to incorporate comprehensive self-supervised signals.
4 code implementations • 25 Mar 2022 • Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong
To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER.
Ranked #1 on Program Synthesis on HumanEval
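The released CODEGEN checkpoints can be exercised with a few lines of Hugging Face Transformers. A minimal sketch, assuming the publicly released Salesforce/codegen-350M-mono checkpoint (the larger variants expose the same interface):

```python
# Minimal program-synthesis sketch with a released CODEGEN checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Salesforce/codegen-350M-mono"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

# Natural-language prompt plus a function signature to complete.
prompt = "# Return the n-th Fibonacci number\ndef fib(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```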
1 code implementation • 23 Mar 2022 • Tian Xie, Xinyi Yang, Angela S. Lin, Feihong Wu, Kazuma Hashimoto, Jin Qu, Young Mo Kang, Wenpeng Yin, Huan Wang, Semih Yavuz, Gang Wu, Michael Jones, Richard Socher, Yingbo Zhou, Wenhao Liu, Caiming Xiong
At the core of the struggle is the need to script every single turn of interactions between the bot and the human user.
no code implementations • ACL 2022 • Wenpeng Yin, Jia Li, Caiming Xiong
This work defines a new learning paradigm, ConTinTin (Continual Learning from Task Instructions), in which a system should learn a sequence of new tasks one by one, with each task explained by a piece of textual instruction.
no code implementations • 15 Mar 2022 • Bo Pang, Erik Nijkamp, Wojciech Kryściński, Silvio Savarese, Yingbo Zhou, Caiming Xiong
Critical to the success of a summarization model is the faithful inference of latent representations of words or tokens in the source documents.
Ranked #1 on Text Summarization on Pubmed
2 code implementations • 28 Feb 2022 • Liang Qiu, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
Extracting structure information from dialogue data can help us better understand user and system behaviors.
1 code implementation • ICLR 2022 • Yu Bai, Song Mei, Huan Wang, Yingbo Zhou, Caiming Xiong
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly over existing approaches in several applications such as prediction intervals with improved length, minimum-volume prediction sets for multi-output regression, and label prediction sets for image classification.
1 code implementation • 5 Feb 2022 • Yongjun Chen, Zhiwei Liu, Jia Li, Julian McAuley, Caiming Xiong
Specifically, we introduce a latent variable to represent users' intents and learn the distribution function of the latent variable via clustering.
5 code implementations • 28 Jan 2022 • Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
Furthermore, performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision.
Ranked #4 on Image Captioning on nocaps-val-out-domain
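BLIP checkpoints are likewise available through Hugging Face Transformers. A minimal captioning sketch, assuming the Salesforce/blip-image-captioning-base checkpoint and a sample COCO image URL:

```python
# Minimal image-captioning sketch with a released BLIP checkpoint.
import requests
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

name = "Salesforce/blip-image-captioning-base"
processor = BlipProcessor.from_pretrained(name)
model = BlipForConditionalGeneration.from_pretrained(name)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(out[0], skip_special_tokens=True))
```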
1 code implementation • 16 Jan 2022 • Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu
Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases.
Ranked #1 on Task-Oriented Dialogue Systems on KVRET
1 code implementation • 12 Jan 2022 • Zohreh Ovaisi, Shelby Heinecke, Jia Li, Yongfeng Zhang, Elena Zheleva, Caiming Xiong
Robust machine learning is an increasingly important topic that focuses on developing models resilient to various forms of imperfect data.
1 code implementation • NAACL 2022 • Alexander R. Fabbri, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
Factual consistency is an essential quality of text summarization models in practical settings.
1 code implementation • 15 Dec 2021 • Mingfei Gao, Le Xue, Chetan Ramaiah, Chen Xing, Ran Xu, Caiming Xiong
Unlike previous methods that only address a fixed set of field items, our method predicts the target value for an arbitrary query based on its understanding of the layout and semantics of a form.
no code implementations • 20 Nov 2021 • Wenpeng Yin, Shelby Heinecke, Jia Li, Nitish Shirish Keskar, Michael Jones, Shouzhong Shi, Stanislav Georgiev, Kurt Milich, Joseph Esposito, Caiming Xiong
The distribution gap between training datasets and data encountered in production is well acknowledged.
1 code implementation • 18 Nov 2021 • Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, Ran Xu, Wenhao Liu, Caiming Xiong
To enlarge the set of base classes, we propose a method to automatically generate pseudo bounding-box annotations of diverse objects from large-scale image-caption pairs.
1 code implementation • Findings (EMNLP) 2021 • Ye Liu, Kazuma Hashimoto, Yingbo Zhou, Semih Yavuz, Caiming Xiong, Philip S. Yu
In this work, we propose Dense Hierarchical Retrieval (DHR), a hierarchical framework that can generate accurate dense representations of passages by utilizing both macroscopic semantics in the document and microscopic semantics specific to each passage.
1 code implementation • 21 Oct 2021 • Devansh Arpit, Huan Wang, Yingbo Zhou, Caiming Xiong
We first show that this chaotic behavior exists even along the training optimization trajectory of a single model, and propose a simple model averaging protocol that both significantly boosts domain generalization and diminishes the impact of stochasticity by improving the rank correlation between the in-domain validation accuracy and out-domain test accuracy, which is crucial for reliable early stopping.
Ranked #3 on Domain Generalization on DomainNet (using extra training data)
no code implementations • 19 Oct 2021 • Bram Wallace, Devansh Arpit, Huan Wang, Caiming Xiong
Pretraining convolutional neural networks via self-supervision, and applying them in transfer learning, is an incredibly fast-growing field that is rapidly and iteratively improving performance across practically all image domains.
no code implementations • 19 Oct 2021 • Devansh Arpit, Aadyot Bhatnagar, Huan Wang, Caiming Xiong
Wasserstein autoencoder (WAE) shows that matching two distributions is equivalent to minimizing a simple autoencoder (AE) loss under the constraint that the latent space of this AE matches a pre-specified prior distribution.
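Concretely, the WAE objective referred to here (Tolstikhin et al., 2018) is the AE reconstruction loss plus a penalty tying the aggregate posterior $Q_Z$ to the prior $P_Z$ (notation follows that paper, not this abstract):

$$\mathcal{L}_{\mathrm{WAE}} \;=\; \inf_{Q(Z \mid X)} \; \mathbb{E}_{P_X}\, \mathbb{E}_{Q(Z \mid X)} \big[\, c\big(X, G(Z)\big) \,\big] \;+\; \lambda\, \mathcal{D}_Z\big(Q_Z, P_Z\big),$$

where $c$ is a reconstruction cost, $G$ the decoder, and $\mathcal{D}_Z$ a divergence such as MMD or an adversarial penalty.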
no code implementations • 19 Oct 2021 • Anthony Meng Huat Tiong, Junnan Li, Guosheng Lin, Boyang Li, Caiming Xiong, Steven C. H. Hoi
ICCL interpolates two images from a class-agnostic sampler and a class-aware sampler, and trains the model such that the representation of the interpolative image can be used to retrieve the centroids for both source classes.
Ranked #20 on Long-tail Learning on CIFAR-10-LT (ρ=10)
1 code implementation • Findings (NAACL) 2022 • Lidiya Murakhovs'ka, Chien-Sheng Wu, Philippe Laban, Tong Niu, Wenhao Liu, Caiming Xiong
Asking good questions is an essential ability for both human and machine intelligence.
1 code implementation • ACL 2022 • Prakhar Gupta, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
Fact-checking is an essential tool to mitigate the spread of misinformation and disinformation.
no code implementations • 11 Oct 2021 • Zahra Fatemi, Chen Xing, Wenhao Liu, Caiming Xiong
However, given the limited size of the gender-neutral data and its potential distributional mismatch with the original pre-training data, catastrophic forgetting would occur during the second-phase pre-training.
1 code implementation • 8 Oct 2021 • Le Xue, Mingfei Gao, Zeyuan Chen, Caiming Xiong, Ran Xu
We propose a novel framework to evaluate the robustness of transformer-based form field extraction methods via form attacks.
2 code implementations • SpaNLP (ACL) 2022 • Mingfei Gao, Zeyuan Chen, Nikhil Naik, Kazuma Hashimoto, Caiming Xiong, Ran Xu
We propose a novel framework to conduct field extraction from forms with unlabeled data.
no code implementations • 29 Sep 2021 • Zhiwei Liu, Yongjun Chen, Jia Li, Man Luo, Philip S. Yu, Caiming Xiong
However, existing methods all construct views by adopting augmentation from data perspectives, while we argue that 1) optimal data augmentation methods are hard to devise, 2) data augmentation methods destroy sequential correlations, and 3) data augmentation fails to incorporate comprehensive self-supervised signals.
no code implementations • 29 Sep 2021 • Bo Pang, Erik Nijkamp, Wojciech Maciej Kryscinski, Silvio Savarese, Yingbo Zhou, Caiming Xiong
Critical to the success of a summarization model is the faithful inference of latent representations of words or tokens in the source documents.
no code implementations • 23 Sep 2021 • Yongjun Chen, Jia Li, Chenghao Liu, Chenxi Li, Markus Anderle, Julian McAuley, Caiming Xiong
However, properly integrating them into user interest models is challenging since attribute dynamics can be diverse, e.g., time-interval-aware or periodic patterns.
1 code implementation • 20 Sep 2021 • Aadyot Bhatnagar, Paul Kassianik, Chenghao Liu, Tian Lan, Wenzhuo Yang, Rowan Cassius, Doyen Sahoo, Devansh Arpit, Sri Subramanian, Gerald Woo, Amrita Saha, Arun Kumar Jagota, Gokulakrishnan Gopalakrishnan, Manpreet Singh, K C Krithika, Sukumar Maddineni, Daeki Cho, Bo Zong, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Steven Hoi, Huan Wang
We introduce Merlion, an open-source machine learning library for time series.
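A minimal anomaly-detection sketch in the style of Merlion's documented quick-start follows; the CSV file and split point are hypothetical, and API details may differ across library versions:

```python
# Sketch of Merlion's default anomaly-detection workflow (hypothetical CSV;
# API per the library's quick-start and may vary across versions).
import pandas as pd
from merlion.utils import TimeSeries
from merlion.models.defaults import DefaultDetector, DefaultDetectorConfig

df = pd.read_csv("my_metric.csv", index_col=0, parse_dates=True)
train_df, test_df = df.iloc[:800], df.iloc[800:]

model = DefaultDetector(DefaultDetectorConfig())
model.train(train_data=TimeSeries.from_pd(train_df))

# Post-processed anomaly labels: nonzero scores flag anomalous timestamps.
labels = model.get_anomaly_label(time_series=TimeSeries.from_pd(test_df))
print(labels.to_pd().head())
```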
1 code implementation • ACL 2022 • Xi Ye, Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou, Caiming Xiong
We present RnG-KBQA, a Rank-and-Generate approach for KBQA, which remedies the coverage issue with a generation model while preserving a strong generalization capability.
1 code implementation • 14 Aug 2021 • Zhiwei Liu, Yongjun Chen, Jia Li, Philip S. Yu, Julian McAuley, Caiming Xiong
In this paper, we investigate the application of contrastive Self-Supervised Learning (SSL) to the sequential recommendation, as a way to alleviate some of these issues.
4 code implementations • NeurIPS 2021 • Junnan Li, Ramprasaath R. Selvaraju, Akhilesh Deepak Gotmare, Shafiq Joty, Caiming Xiong, Steven Hoi
Most existing methods employ a transformer-based multimodal encoder to jointly model visual tokens (region-based image features) and word tokens.
Ranked #3 on Zero-Shot Cross-Modal Retrieval on COCO 2014
no code implementations • NeurIPS 2021 • Pan Zhou, Caiming Xiong, Xiao-Tong Yuan, Steven Hoi
Although intuitive, such a naive label assignment strategy cannot reveal the underlying semantic similarity between a query and its positives and negatives, and impairs performance, since some negatives are semantically similar to the query or even share the same semantic class as the query.
1 code implementation • Findings (ACL) 2021 • Wenpeng Yin, Dragomir Radev, Caiming Xiong
It has been studied intensively in the past few years thanks to the availability of large-scale labeled datasets.
1 code implementation • 10 Jun 2021 • Eric Zhao, Alexander R. Trott, Caiming Xiong, Stephan Zheng
We study the problem of training a principal in a multi-agent general-sum game using reinforcement learning (RL).
no code implementations • NeurIPS 2021 • Yu Bai, Song Mei, Huan Wang, Caiming Xiong
Estimating the data uncertainty in regression tasks is often done by learning a quantile function or a prediction interval of the true label conditioned on the input.
no code implementations • NeurIPS 2021 • Tengyang Xie, Nan Jiang, Huan Wang, Caiming Xiong, Yu Bai
This offline result is the first that matches the sample complexity lower bound in this setting, and resolves a recent open question in offline RL.
1 code implementation • 8 Jun 2021 • JianGuo Zhang, Kazuma Hashimoto, Yao Wan, Zhiwei Liu, Ye Liu, Caiming Xiong, Philip S. Yu
Pre-trained Transformer-based models were reported to be robust in intent classification.
1 code implementation • NeurIPS 2021 • Ryan Theisen, Huan Wang, Lav R. Varshney, Caiming Xiong, Richard Socher
Moreover, we show that by varying the temperature of the learned flow models, we can generate synthetic datasets that closely resemble standard benchmark datasets, but with almost any desired Bayes error.
1 code implementation • ACL 2021 • Keyang Xu, Tongzheng Ren, Shikun Zhang, Yihao Feng, Caiming Xiong
Deployed real-world machine learning applications are often subject to uncontrolled and even potentially malicious inputs.
no code implementations • NAACL 2021 • Erik Nijkamp, Bo Pang, Ying Nian Wu, Caiming Xiong
We introduce Self-CRItic Pretraining Transformers (SCRIPT) for representation learning of text.
1 code implementation • Findings (ACL) 2021 • Chien-Sheng Wu, Linqing Liu, Wenhao Liu, Pontus Stenetorp, Caiming Xiong
In this paper, we aim to improve abstractive dialogue summarization quality and, at the same time, enable granularity control.
2 code implementations • 18 May 2021 • Wojciech Kryściński, Nazneen Rajani, Divyansh Agarwal, Caiming Xiong, Dragomir Radev
The majority of available text summarization datasets include short-form source documents that lack long-range causal and temporal dependencies, and often contain strong layout and stylistic biases.
1 code implementation • ACL 2022 • Chien-Sheng Wu, Andrea Madotto, Wenhao Liu, Pascale Fung, Caiming Xiong
This paper introduces QAConv, a new question answering (QA) dataset that uses conversations as a knowledge source.
no code implementations • 3 May 2021 • Congying Xia, Caiming Xiong, Philip Yu
PSN consists of two identical subnetworks with the same structure but different weights: an action network and an object network.
1 code implementation • NAACL 2021 • Bailin Wang, Wenpeng Yin, Xi Victoria Lin, Caiming Xiong
Moreover, explicitly modeling compositions using PCFG leads to better exploration of unseen programs, thus generating more diverse data.
1 code implementation • 1 Apr 2021 • Linyong Nan, Chiachun Hsieh, Ziming Mao, Xi Victoria Lin, Neha Verma, Rui Zhang, Wojciech Kryściński, Nick Schoelkopf, Riley Kong, Xiangru Tang, Murori Mutuma, Ben Rosand, Isabel Trindade, Renusree Bandaru, Jacob Cunningham, Caiming Xiong, Dragomir Radev
Existing table question answering datasets contain abundant factual questions that primarily evaluate the query and schema comprehension capability of a system, but they fail to include questions that require complex reasoning and integration of information due to the constraint of the associated short-form answers.
1 code implementation • 10 Mar 2021 • Govardana Sachithanandam Ramachandran, Kazuma Hashimoto, Caiming Xiong
This method gives guarantees on the dialogue policy's performance and also learns to shape rewards according to the intentions behind human responses, rather than just mimicking demonstration data; this, coupled with batch RL, improves the overall sample efficiency of the framework.
1 code implementation • CVPR 2021 • Hanqing Wang, Wenguan Wang, Wei Liang, Caiming Xiong, Jianbing Shen
Recently, numerous algorithms have been developed to tackle the problem of vision-language navigation (VLN), i.e., requiring an agent to navigate 3D environments by following linguistic instructions.
no code implementations • NeurIPS 2021 • Yu Bai, Chi Jin, Huan Wang, Caiming Xiong
Real world applications such as economics and policy making often involve solving multi-agent games with two unique features: (1) The agents are inherently asymmetric and partitioned into leaders and followers; (2) The agents have different reward functions, thus the game is general-sum.
no code implementations • 22 Feb 2021 • Rachel Luo, Aadyot Bhatnagar, Yu Bai, Shengjia Zhao, Huan Wang, Caiming Xiong, Silvio Savarese, Stefano Ermon, Edward Schmerling, Marco Pavone
In this work, we propose the local calibration error (LCE) to span the gap between average and individual reliability.
no code implementations • 15 Feb 2021 • Yu Bai, Song Mei, Huan Wang, Caiming Xiong
Modern machine learning models with high accuracy are often miscalibrated -- the predicted top probability does not reflect the actual accuracy, and tends to be over-confident.
1 code implementation • EACL 2021 • Tianxing He, Bryan McCann, Caiming Xiong, Ehsan Hosseini-Asl
In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks.
2 code implementations • NAACL 2021 • Karan Goel, Nazneen Rajani, Jesse Vig, Samson Tan, Jason Wu, Stephan Zheng, Caiming Xiong, Mohit Bansal, Christopher Ré
Despite impressive performance on standard benchmarks, deep neural networks are often brittle when deployed in real-world systems.
1 code implementation • ICCV 2021 • Junnan Li, Caiming Xiong, Steven C.H. Hoi
In contrast to most existing methods, we combat noise by learning robust representation.
no code implementations • 1 Jan 2021 • Yu Bai, Tengyu Ma, Huan Wang, Caiming Xiong
In this paper, we propose Neural Rank Preserving Transforms (NRPT), a new post-calibration method that adjusts the output probabilities of a trained classifier using a calibrator of higher capacity, while maintaining its prediction accuracy.
no code implementations • 1 Jan 2021 • Junnan Li, Caiming Xiong, Steven Hoi
In contrast to most existing methods, we combat noise by learning robust representation.
no code implementations • 1 Jan 2021 • Eric Zhao, Alexander R Trott, Caiming Xiong, Stephan Zheng
Policies for real-world multi-agent problems, such as optimal taxation, can be learned in multi-agent simulations with AI agents that emulate humans.
no code implementations • 1 Jan 2021 • Devansh Arpit, Aadyot Bhatnagar, Huan Wang, Caiming Xiong
Quantitatively, we show that our algorithm achieves a new state-of-the-art FID of 54.36 on CIFAR-10, and performs competitively with existing models on CelebA in terms of FID score.
no code implementations • 1 Jan 2021 • Devansh Arpit, Huan Wang, Caiming Xiong, Richard Socher, Yoshua Bengio
Disjoint Manifold Separation: Neural Bayes allows us to formulate an objective which can optimally label samples from disjoint manifolds present in the support of a continuous distribution.
no code implementations • 1 Jan 2021 • Chien-Sheng Wu, Linqing Liu, Wenhao Liu, Pontus Stenetorp, Caiming Xiong
2) A simple strategy to control the granularity of the final summary.
1 code implementation • EMNLP 2021 • Han Guo, Nazneen Fatema Rajani, Peter Hase, Mohit Bansal, Caiming Xiong
With the availability of the fast influence functions, we demonstrate their usefulness in four applications.
no code implementations • 28 Dec 2020 • Stanislaw Jastrzebski, Devansh Arpit, Oliver Astrand, Giancarlo Kerg, Huan Wang, Caiming Xiong, Richard Socher, Kyunghyun Cho, Krzysztof Geras
The early phase of training a deep neural network has a dramatic effect on the local curvature of the loss function.
2 code implementations • Findings of the Association for Computational Linguistics 2020 • Xi Victoria Lin, Richard Socher, Caiming Xiong
We present BRIDGE, a powerful sequential architecture for modeling dependencies between natural language questions and relational databases in cross-DB semantic parsing.
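The core idea is easy to sketch: the question and the schema are linearized into one tagged sequence, and cell values that literally overlap the question are anchored to their columns ("bridging"). The serializer below is a loose, hypothetical illustration of that input encoding, not the paper's exact format:

```python
# Loose sketch of a BRIDGE-style input encoding: question + schema in one
# sequence, with question-matching cell values anchored to their columns.
# Tag tokens and details are illustrative only.
def serialize(question, schema, cell_values):
    parts, q = [question], question.lower()
    for table, columns in schema.items():
        parts.append(f"[T] {table}")
        for col in columns:
            parts.append(f"[C] {col}")
            # "Bridging": anchor values that literally appear in the question.
            for v in cell_values.get(f"{table}.{col}", []):
                if v.lower() in q:
                    parts.append(f"[V] {v}")
    return " ".join(parts)

schema = {"singer": ["name", "country"], "concert": ["year", "venue"]}
values = {"singer.country": ["France", "Japan"]}
print(serialize("How many singers are from France?", schema, values))
# -> How many singers are from France? [T] singer [C] name [C] country
#    [V] France [T] concert [C] year [C] venue
```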
no code implementations • 16 Dec 2020 • Chen Xing, Wenhao Liu, Caiming Xiong
According to recent studies and our empirical observations, one possible reason is that some easy-to-fit patterns in the training data, such as frequently co-occurring word combinations, dominate and harm pre-training, making it hard for the model to fit more complex information.
1 code implementation • 8 Dec 2020 • Junxian He, Wojciech Kryściński, Bryan McCann, Nazneen Rajani, Caiming Xiong
Our approach enables users to control multiple aspects of generated summaries by interacting with the summarization system through textual input in the form of a set of keywords or descriptive prompts.
1 code implementation • 7 Dec 2020 • Govardana Sachithanandam Ramachandran, Ivan Brugere, Lav R. Varshney, Caiming Xiong
Similarly, social networks within universities and organizations may enable certain groups to more easily access people with valuable information or influence.
no code implementations • 3 Dec 2020 • Genta Indra Winata, Guangsen Wang, Caiming Xiong, Steven Hoi
One crucial challenge of real-world multilingual speech recognition is the long-tailed distribution problem, where some resource-rich languages like English have abundant training data, but a long tail of low-resource languages have varying amounts of limited training data.
3 code implementations • ICCV 2021 • Junnan Li, Caiming Xiong, Steven Hoi
CoMatch jointly learns two representations of the training data, their class probabilities and low-dimensional embeddings.
no code implementations • 6 Nov 2020 • Hiroaki Hayashi, Wojciech Kryściński, Bryan McCann, Nazneen Rajani, Caiming Xiong
To overcome this problem, we introduce a new task of disentangled paper summarization, which seeks to generate separate summaries for the paper contributions and the context of the work, making it easier to identify the key findings shared in articles.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Chien-Sheng Wu, Steven Hoi, Caiming Xiong
We present and investigate two self-supervised objectives: preserving latent consistency and modeling conversational behavior.
no code implementations • EMNLP 2020 • Chien-Sheng Wu, Caiming Xiong
This paper investigates pre-trained language models to find out which model intrinsically carries the most informative representation for task-oriented dialogue tasks.
1 code implementation • EMNLP 2020 • Jian-Guo Zhang, Kazuma Hashimoto, Wenhao Liu, Chien-Sheng Wu, Yao Wan, Philip S. Yu, Richard Socher, Caiming Xiong
Intent detection is one of the core components of goal-oriented dialog systems, and detecting out-of-scope (OOS) intents is also a practically important skill.
2 code implementations • ICLR 2021 • Shiyang Li, Semih Yavuz, Kazuma Hashimoto, Jia Li, Tong Niu, Nazneen Rajani, Xifeng Yan, Yingbo Zhou, Caiming Xiong
Dialogue state trackers have made significant progress on benchmark datasets, but their generalization capability to novel and realistic scenarios beyond the held-out conversations is less understood.
Ranked #2 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.1 (using extra training data)
no code implementations • EMNLP 2021 • Tong Niu, Semih Yavuz, Yingbo Zhou, Nitish Shirish Keskar, Huan Wang, Caiming Xiong
To enforce a surface form dissimilar from the input, whenever the language model emits a token contained in the source sequence, DB prevents the model from outputting the subsequent source token for the next generation step.
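The deterministic core of that blocking rule is easy to sketch; in practice DB applies it to the logits inside the generation loop (and, per the paper, probabilistically). Names and the toy source sentence below are ours:

```python
# Core of the blocking rule: if the last generated token occurs in the
# source, ban the source token that immediately follows each occurrence at
# the next decoding step, so the model must paraphrase rather than copy.
def blocked_next_tokens(source_tokens, generated_tokens):
    if not generated_tokens:
        return set()
    last = generated_tokens[-1]
    return {source_tokens[i + 1]
            for i in range(len(source_tokens) - 1)
            if source_tokens[i] == last}

source = "the quick brown fox jumps over the lazy dog".split()
print(blocked_next_tokens(source, ["a", "quick"]))  # {'brown'}
print(blocked_next_tokens(source, ["the"]))         # {'quick', 'lazy'}
```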
no code implementations • NeurIPS 2020 • Huaxiu Yao, Yingbo Zhou, Mehrdad Mahdavi, Zhenhui Li, Richard Socher, Caiming Xiong
When a new task is encountered, it constructs a meta-knowledge pathway by either utilizing the most relevant knowledge blocks or exploring new blocks.
no code implementations • 18 Oct 2020 • Nazneen Fatema Rajani, Ben Krause, Wenpeng Yin, Tong Niu, Richard Socher, Caiming Xiong
Interpretability techniques in NLP have mainly focused on understanding individual predictions using attention visualization or gradient-based saliency maps over tokens.
no code implementations • 12 Oct 2020 • Yu Bai, Minshuo Chen, Pan Zhou, Tuo Zhao, Jason D. Lee, Sham Kakade, Huan Wang, Caiming Xiong
A common practice in meta-learning is to perform a train-validation split (train-val method) where the prior adapts to the task on one split of the data, and the resulting predictor is evaluated on another split.
no code implementations • NeurIPS 2020 • Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, Weinan E
The result shows that (1) the escaping time of both SGD and ADAM depends on the Radon measure of the basin positively and the heaviness of gradient noise negatively; (2) for the same basin, SGD enjoys smaller escaping time than ADAM, mainly because (a) the geometry adaptation in ADAM via adaptively scaling each gradient coordinate well diminishes the anisotropic structure in gradient noise and results in larger Radon measure of a basin; (b) the exponential gradient average in ADAM smooths its gradient and leads to lighter gradient noise tails than SGD.
2 code implementations • ICLR 2021 • Junwen Bai, Weiran Wang, Yingbo Zhou, Caiming Xiong
We propose Deep Autoencoding Predictive Components (DAPC) -- a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space.
1 code implementation • EMNLP 2020 • Wenpeng Yin, Nazneen Fatema Rajani, Dragomir Radev, Richard Socher, Caiming Xiong
We demonstrate that this framework enables a pretrained entailment model to work well on new entailment domains in a few-shot setting, and show its effectiveness as a unified solver for several downstream NLP tasks such as question answering and coreference resolution when the end-task annotations are limited.
1 code implementation • EMNLP 2020 • Yifan Gao, Chien-Sheng Wu, Jingjing Li, Shafiq Joty, Steven C. H. Hoi, Caiming Xiong, Irwin King, Michael R. Lyu
Based on the learned EDU and entailment representations, we either reply to the user our final decision "yes/no/irrelevant" of the initial question, or generate a follow-up question to inquiry more information.
1 code implementation • ICLR 2021 • Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, Bailin Wang, Yi Chern Tan, Xinyi Yang, Dragomir Radev, Richard Socher, Caiming Xiong
We present GraPPa, an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data.
Ranked #8 on Semantic Parsing on Spider
no code implementations • Findings of the Association for Computational Linguistics 2020 • Congying Xia, Caiming Xiong, Philip Yu, Richard Socher
In this paper, we focus on generating training examples for few-shot intents in the realistic imbalanced scenario.
1 code implementation • ICLR 2021 • Junnan Li, Caiming Xiong, Steven C. H. Hoi
We propose momentum prototypes (MoPro), a simple contrastive learning method that achieves online label noise correction, out-of-distribution sample removal, and representation learning.
Ranked #13 on Image Classification on WebVision-1000
no code implementations • ACL 2020 • Jichuan Zeng, Xi Victoria Lin, Caiming Xiong, Richard Socher, Michael R. Lyu, Irwin King, Steven C. H. Hoi
Natural language interfaces to databases (NLIDB) democratize end user access to relational data.
5 code implementations • 24 Jul 2020 • Alexander R. Fabbri, Wojciech Kryściński, Bryan McCann, Caiming Xiong, Richard Socher, Dragomir Radev
The scarcity of comprehensive up-to-date studies on evaluation metrics for text summarization and the lack of consensus regarding evaluation protocols continue to inhibit progress.
2 code implementations • NAACL 2021 • Linyong Nan, Dragomir Radev, Rui Zhang, Amrit Rau, Abhinand Sivaprasad, Chiachun Hsieh, Xiangru Tang, Aadit Vyas, Neha Verma, Pranav Krishna, Yangxiaokang Liu, Nadia Irwanto, Jessica Pan, Faiaz Rahman, Ahmad Zaidi, Mutethia Mutuma, Yasin Tarabar, Ankit Gupta, Tao Yu, Yi Chern Tan, Xi Victoria Lin, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani
Data-to-Text annotations can be a costly process, especially when dealing with tables which are the major source of structured data and contain nontrivial structures.
1 code implementation • ACL 2020 • Yifan Gao, Chien-Sheng Wu, Shafiq Joty, Caiming Xiong, Richard Socher, Irwin King, Michael Lyu, Steven C. H. Hoi
The goal of conversational machine reading is to answer user questions given a knowledge base text which may require asking clarification questions.
1 code implementation • NeurIPS 2020 • Pan Zhou, Caiming Xiong, Richard Socher, Steven C. H. Hoi
Then we propose a theory-inspired path-regularized DARTS that consists of two key modules: (i) a differential group-structured sparse binary gate introduced for each operation to avoid unfair competition among operations, and (ii) a path-depth-wise regularization used to incite search exploration for deep architectures that often converge slower than shallow ones as shown in our theory and are not well explored during the search.
2 code implementations • ICLR 2021 • Jesse Vig, Ali Madani, Lav R. Varshney, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani
Transformer architectures have proven to learn useful representations for protein classification and generation tasks.
1 code implementation • WS 2019 • Kazuma Hashimoto, Raffaella Buschiazzo, James Bradbury, Teresa Marshall, Richard Socher, Caiming Xiong
We build and evaluate translation models for seven target languages from English, with several different copy mechanisms and an XML-constrained beam search.
no code implementations • NeurIPS 2020 • Minshuo Chen, Yu Bai, Jason D. Lee, Tuo Zhao, Huan Wang, Caiming Xiong, Richard Socher
When the trainable network is the quadratic Taylor model of a wide two-layer network, we show that neural representation can achieve improved sample complexities compared with the raw input: For learning a low-rank degree-$p$ polynomial ($p \geq 4$) in $d$ dimension, neural representation requires only $\tilde{O}(d^{\lceil p/2 \rceil})$ samples, while the best-known sample complexity upper bound for the raw input is $\tilde{O}(d^{p-1})$.
no code implementations • CVPR 2021 • Mingfei Gao, Yingbo Zhou, Ran Xu, Richard Socher, Caiming Xiong
Online action detection in untrimmed videos aims to identify an action as it happens, which makes it very important for real-time applications.
Ranked #4 on Online Action Detection on THUMOS'14
1 code implementation • 26 May 2020 • Yifan Gao, Chien-Sheng Wu, Shafiq Joty, Caiming Xiong, Richard Socher, Irwin King, Michael R. Lyu, Steven C. H. Hoi
The goal of conversational machine reading is to answer user questions given a knowledge base text which may require asking clarification questions.
2 code implementations • ICLR 2021 • Junnan Li, Pan Zhou, Caiming Xiong, Steven C. H. Hoi
This paper presents Prototypical Contrastive Learning (PCL), an unsupervised representation learning method that addresses the fundamental limitations of instance-wise contrastive learning.
Ranked #5 on Contrastive Learning on imagenet-1k
1 code implementation • ACL 2020 • Tianlu Wang, Xi Victoria Lin, Nazneen Fatema Rajani, Bryan McCann, Vicente Ordonez, Caiming Xiong
Word embeddings derived from human-generated corpora inherit strong gender bias which can be further amplified by downstream models.
2 code implementations • ACL 2020 • Nazneen Fatema Rajani, Rui Zhang, Yi Chern Tan, Stephan Zheng, Jeremy Weiss, Aadit Vyas, Abhijit Gupta, Caiming Xiong, Richard Socher, Dragomir Radev
Our framework learns to generate explanations of how the physical simulation will causally evolve so that an agent or a human can easily reason about a solution using those interpretable descriptions.
1 code implementation • EMNLP 2020 • Yue Wang, Shafiq Joty, Michael R. Lyu, Irwin King, Caiming Xiong, Steven C. H. Hoi
By contrast, in this work, we propose VD-BERT, a simple yet effective framework of unified vision-dialog Transformer that leverages the pretrained BERT language models for Visual Dialog tasks.
1 code implementation • EMNLP 2020 • Chien-Sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
The underlying difference of linguistic patterns between general text and task-oriented dialogue makes existing pre-trained language models less useful in practice.
no code implementations • 8 Apr 2020 • Weiran Wang, Guangsen Wang, Aadyot Bhatnagar, Yingbo Zhou, Caiming Xiong, Richard Socher
For Switchboard, our phone-based BPE system achieves 6.8%/14.4% word error rate (WER) on the Switchboard/CallHome portion of the test set while joint decoding achieves 6.3%/13.3% WER.