no code implementations • DeeLIO (ACL) 2022 • Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen
In this work, we investigate whether there are more effective strategies for judiciously selecting in-context examples (relative to random sampling) that better leverage GPT-3’s in-context learning capabilities. Inspired by the recent success of leveraging a retrieval module to augment neural networks, we propose to retrieve examples that are semantically-similar to a test query sample to formulate its corresponding prompt.
Natural Language Understanding • Open-Domain Question Answering • +1
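The retrieval recipe lends itself to a short sketch. Below is a minimal, hypothetical implementation of the idea: embed the candidate pool and the test query with an off-the-shelf sentence encoder, then use the k nearest neighbors as in-context demonstrations. The encoder choice, prompt template, and all names are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of retrieval-based prompt construction: pick the k training
# examples most similar to the query as in-context demonstrations.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice

def build_prompt(query, pool_inputs, pool_labels, k=4):
    pool_emb = encoder.encode(pool_inputs, normalize_embeddings=True)
    q_emb = encoder.encode([query], normalize_embeddings=True)[0]
    sims = pool_emb @ q_emb          # cosine similarity on normalized vectors
    nearest = np.argsort(-sims)[:k]  # indices of the k most similar examples
    demos = [f"Input: {pool_inputs[i]}\nOutput: {pool_labels[i]}" for i in nearest]
    return "\n\n".join(demos) + f"\n\nInput: {query}\nOutput:"
```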
1 code implementation • 1 Jul 2022 • Yizhe Zhang, Suraj Mishra, Peixian Liang, Hao Zheng, Danny Z. Chen
We aim to quantitatively measure the practical usability of medical image segmentation models: to what extent, how often, and on which samples a model's predictions can be used/trusted.
no code implementations • 2 Jun 2022 • Peixian Liang, Yizhe Zhang, Yifan Ding, Jianxu Chen, Chinedu S. Madukoma, Tim Weninger, Joshua D. Shrout, Danny Z. Chen
We observe that probability maps by DL semantic segmentation models can be used to generate many possible instance candidates, and accurate instance segmentation can be achieved by selecting from them a set of "optimized" candidates as output instances.
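A minimal sketch of the candidate-generate-then-select idea, under stated assumptions: thresholding the semantic probability map at several levels produces a pool of connected-component instance candidates, and a greedy pass keeps high-scoring, weakly overlapping ones. The mean-probability score and the greedy rule are illustrative stand-ins for the paper's optimized selection procedure.

```python
import numpy as np
from scipy import ndimage

def candidate_instances(prob_map, thresholds=(0.3, 0.5, 0.7, 0.9)):
    """Generate instance candidates from a semantic probability map."""
    candidates = []
    for t in thresholds:
        labeled, n = ndimage.label(prob_map > t)   # connected components
        for i in range(1, n + 1):
            mask = labeled == i
            candidates.append((prob_map[mask].mean(), mask))  # score = mean prob
    return candidates

def select_instances(candidates, max_overlap=0.2):
    """Greedily keep high-scoring candidates with little mutual overlap."""
    selected = []
    for score, mask in sorted(candidates, key=lambda c: -c[0]):
        if all((mask & m).sum() / min(mask.sum(), m.sum()) < max_overlap
               for _, m in selected):
            selected.append((score, mask))
    return [m for _, m in selected]
```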
no code implementations • 23 Mar 2022 • Yizhe Zhang, Deng Cai
We demonstrate that MemSizer provides an improved balance between efficiency and accuracy over the vanilla transformer and other efficient transformer variants in three typical sequence generation tasks, including machine translation, abstractive text summarization, and language modeling.
no code implementations • 18 Mar 2022 • Shikib Mehri, Jinho Choi, Luis Fernando D'Haro, Jan Deriu, Maxine Eskenazi, Milica Gasic, Kallirroi Georgila, Dilek Hakkani-Tur, Zekang Li, Verena Rieser, Samira Shaikh, David Traum, Yi-Ting Yeh, Zhou Yu, Yizhe Zhang, Chen Zhang
This is a report on the NSF Future Directions Workshop on Automatic Evaluation of Dialog.
no code implementations • CVPR 2022 • Hyojin Park, Alan Yessenbayev, Tushar Singhal, Navin Kumar Adhikari, Yizhe Zhang, Shubhankar Mangesh Borse, Hong Cai, Nilesh Prasad Pandey, Fei Yin, Frank Mayer, Balaji Calidas, Fatih Porikli
Such a deployment scheme best utilizes the available processing power on the smartphone and enables real-time operation of our adaptive video segmentation algorithm.
no code implementations • 12 Dec 2021 • Zhisong Zhang, Yizhe Zhang, Bill Dolan
Nevertheless, due to the incompatibility between absolute positional encoding and insertion-based generation schemes, it needs to refresh the encoding of every token in the generated partial hypothesis at each step, which could be costly.
no code implementations • 3 Nov 2021 • Shubhankar Borse, Hong Cai, Yizhe Zhang, Fatih Porikli
While deeply supervised networks are common in recent literature, they typically impose the same learning objective on all transitional layers despite their varying representation powers.
Ranked #9 on Semantic Segmentation on NYU Depth v2
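For context, a minimal PyTorch sketch of the standard deep-supervision setup this work critiques: auxiliary heads at every transitional stage optimize one and the same segmentation loss, regardless of each stage's representation power. Module names are illustrative, not the paper's architecture.

```python
import torch.nn as nn
import torch.nn.functional as F

class DeeplySupervisedSeg(nn.Module):
    def __init__(self, backbone_stages, channels, n_classes):
        super().__init__()
        self.stages = nn.ModuleList(backbone_stages)
        self.heads = nn.ModuleList(nn.Conv2d(c, n_classes, 1) for c in channels)

    def forward(self, x, target=None):
        loss, logits = 0.0, None
        for stage, head in zip(self.stages, self.heads):
            x = stage(x)
            logits = head(x)
            if target is not None:
                up = F.interpolate(logits, size=target.shape[-2:],
                                   mode="bilinear", align_corners=False)
                loss = loss + F.cross_entropy(up, target)  # same objective at every depth
        return logits, loss
```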
no code implementations • 24 Oct 2021 • Hong Cai, Janarbek Matai, Shubhankar Borse, Yizhe Zhang, Amin Ansari, Fatih Porikli
In order to enable such knowledge distillation across two different visual tasks, we introduce a small, trainable network that translates the predicted depth map to a semantic segmentation map, which can then be supervised by the teacher network.
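A minimal sketch of the cross-task distillation mechanism, with hypothetical module names: a small trainable translator maps the student's predicted depth map to segmentation logits, which a frozen segmentation teacher then supervises through a soft-label KL loss.

```python
import torch.nn as nn
import torch.nn.functional as F

class DepthToSeg(nn.Module):
    """Tiny trainable translator from a 1-channel depth map to class logits."""
    def __init__(self, n_classes, width=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, n_classes, 1))

    def forward(self, depth):
        return self.net(depth)

def distill_loss(depth_pred, teacher_seg_logits, translator, T=2.0):
    """Soft-label distillation of the translated depth against the teacher."""
    seg_logits = translator(depth_pred)
    return F.kl_div(F.log_softmax(seg_logits / T, dim=1),
                    F.softmax(teacher_seg_logits / T, dim=1),
                    reduction="batchmean") * T * T
```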
no code implementations • 24 Oct 2021 • Yizhe Zhang, Shubhankar Borse, Hong Cai, Ying Wang, Ning Bi, Xiaoyun Jiang, Fatih Porikli
More specifically, by measuring the perceptual consistency between the predicted segmentation and the available ground truth on a nearby frame and combining it with the segmentation confidence, we can accurately assess the classification correctness on each pixel.
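A rough sketch of how the two per-pixel signals could be fused, under loose assumptions: agreement with ground truth propagated from a nearby frame (the warp is assumed precomputed, e.g. via optical flow) stands in for the paper's perceptual-consistency measure and is blended with softmax confidence.

```python
def pixel_correctness(seg_logits, warped_gt, alpha=0.5):
    """Hypothetical fusion of consistency and confidence signals.

    seg_logits: (C, H, W) logits on the current frame.
    warped_gt:  (H, W) ground-truth labels warped from a nearby frame.
    """
    probs = seg_logits.softmax(dim=0)
    confidence, pred = probs.max(dim=0)
    consistency = (pred == warped_gt).float()  # crude stand-in for perceptual consistency
    return alpha * consistency + (1 - alpha) * confidence
```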
1 code implementation • 24 Oct 2021 • Yizhe Zhang, Shubhankar Borse, Hong Cai, Fatih Porikli
Since inconsistency mainly arises from the model's uncertainty in its output, we propose an adaptation scheme where the model learns from its own segmentation decisions as it streams a video, which allows producing more confident and temporally consistent labeling for similarly-looking pixels across frames.
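A minimal sketch of the streaming self-adaptation idea under stated assumptions: each incoming frame's high-confidence predictions serve as pseudo-labels for one gradient step, so later frames receive more confident, temporally stable labels. The threshold and update schedule are illustrative, not the paper's.

```python
import torch.nn.functional as F

def adapt_on_stream(model, frames, optimizer, conf_thresh=0.9):
    preds = []
    for frame in frames:                  # frame: (1, 3, H, W)
        logits = model(frame)
        probs = logits.softmax(dim=1)
        conf, pseudo = probs.max(dim=1)   # per-pixel confidence and label
        mask = conf > conf_thresh         # trust only confident pixels
        if mask.any():
            loss = F.cross_entropy(logits.permute(0, 2, 3, 1)[mask],
                                   pseudo[mask])
            optimizer.zero_grad()
            loss.backward()               # one self-training step per frame
            optimizer.step()
        preds.append(pseudo.detach())
    return preds
```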
no code implementations • Findings (ACL) 2021 • Zeqiu Wu, Michel Galley, Chris Brockett, Yizhe Zhang, Bill Dolan
The advent of large pre-trained language models has made it possible to make high-quality predictions on how to add or change a sentence in a document.
1 code implementation • 14 May 2021 • Yizhe Zhang, Siqi Sun, Xiang Gao, Yuwei Fang, Chris Brockett, Michel Galley, Jianfeng Gao, Bill Dolan
We propose a framework that alleviates this data constraint by jointly training a grounded generator and document retriever on the language model signal.
2 code implementations • ACL 2022 • Tianyu Liu, Yizhe Zhang, Chris Brockett, Yi Mao, Zhifang Sui, Weizhu Chen, Bill Dolan
Large pretrained generative models like GPT-3 often suffer from hallucinating non-existent or incorrect content, which undermines their potential merits in real applications.
1 code implementation • 16 Apr 2021 • Xiang Gao, Yizhe Zhang, Michel Galley, Bill Dolan
To alleviate this risk, we propose an adversarial training approach to learn a robust model, ATT (Adversarial Turing Test), that discriminates machine-generated responses from human-written replies.
1 code implementation • CVPR 2021 • Shubhankar Borse, Ying Wang, Yizhe Zhang, Fatih Porikli
We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network, which efficiently learns the degree of parametric transformations between estimated and target boundaries.
Ranked #11 on Semantic Segmentation on NYU Depth v2
1 code implementation • EMNLP 2021 • Jungo Kasai, Hao Peng, Yizhe Zhang, Dani Yogatama, Gabriel Ilharco, Nikolaos Pappas, Yi Mao, Weizhu Chen, Noah A. Smith
Specifically, we propose a swap-then-finetune procedure: in an off-the-shelf pretrained transformer, we replace the softmax attention with its linear-complexity recurrent alternative and then finetune.
Ranked #1 on Machine Translation on WMT2017 Chinese-English
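The swap step admits a compact sketch. Below, softmax attention is replaced by a linear-complexity kernel attention; the elu(x)+1 feature map is a common stand-in (the paper instead learns a small MLP feature map), and this non-causal form is for illustration only.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """O(n) drop-in alternative to softmax attention. q, k, v: (B, H, N, D)."""
    q, k = F.elu(q) + 1, F.elu(k) + 1                # kernel feature map
    kv = torch.einsum("bhnd,bhne->bhde", k, v)       # sum over positions once
    z = 1.0 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + eps)
    return torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)
```

After the swap, the model is finetuned end to end on the downstream task, exactly as one would finetune the unmodified pretrained transformer.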
1 code implementation • 2 Mar 2021 • Ramakanth Pasunuru, Asli Celikyilmaz, Michel Galley, Chenyan Xiong, Yizhe Zhang, Mohit Bansal, Jianfeng Gao
The progress in Query-focused Multi-Document Summarization (QMDS) has been limited by the lack of sufficient large-scale high-quality training datasets.
1 code implementation • 17 Jan 2021 • Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen
Inspired by the recent success of leveraging a retrieval module to augment large-scale neural network models, we propose to retrieve examples that are semantically-similar to a test sample to formulate its corresponding prompt.
no code implementations • 2 Jan 2021 • Ping Yu, Ruiyi Zhang, Yang Zhao, Yizhe Zhang, Chunyuan Li, Changyou Chen
Data augmentation has been widely used to improve deep neural networks in many research fields, such as computer vision.
no code implementations • 1 Jan 2021 • Liqun Chen, Yizhe Zhang, Dianqi Li, Chenyang Tao, Dong Wang, Lawrence Carin
There has been growing interest in representation learning for text data, based on theoretical arguments and empirical evidence.
no code implementations • 21 Dec 2020 • Deng Cai, Yizhe Zhang, Yichen Huang, Wai Lam, Bill Dolan
We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding: Given a multi-sentence narrative, decide whether there exist any semantic discrepancies in the narrative flow.
no code implementations • 17 Dec 2020 • Hongxiao Wang, Hao Zheng, Jianxu Chen, Lin Yang, Yizhe Zhang, Danny Z. Chen
Second, we devise an effective data selection policy for judiciously sampling the generated images: (1) to make the generated training set better cover the dataset, the clusters that are underrepresented in the original training set are covered more; (2) to make the training process more effective, we identify and oversample the images of "hard cases" in the data for which annotated training data may be scarce.
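A rough sketch of the first ingredient of the selection policy, with assumed names: cluster the original training images in some embedding space, then weight generated images inversely to how well their cluster is already represented, so underrepresented clusters get sampled more.

```python
import numpy as np
from sklearn.cluster import KMeans

def sampling_weights(train_emb, gen_emb, n_clusters=10):
    """Weight generated images toward clusters the training set undercovers."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(train_emb)
    counts = np.bincount(km.labels_, minlength=n_clusters)  # cluster coverage
    gen_clusters = km.predict(gen_emb)
    w = 1.0 / (counts[gen_clusters] + 1.0)  # rarer cluster -> higher weight
    return w / w.sum()
```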
no code implementations • EMNLP 2020 • Guoyin Wang, Chunyuan Li, Jianqiao Li, Hao Fu, Yuh-Chen Lin, Liqun Chen, Yizhe Zhang, Chenyang Tao, Ruiyi Zhang, Wenlin Wang, Dinghan Shen, Qian Yang, Lawrence Carin
An extension is further proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
1 code implementation • NAACL 2021 • Dianqi Li, Yizhe Zhang, Hao Peng, Liqun Chen, Chris Brockett, Ming-Ting Sun, Bill Dolan
Adversarial examples expose the vulnerabilities of natural language processing (NLP) models, and can be used to evaluate and improve their robustness.
1 code implementation • EMNLP 2020 • Xiang Gao, Yizhe Zhang, Michel Galley, Chris Brockett, Bill Dolan
Particularly, our ranker outperforms the conventional dialog perplexity baseline with a large margin on predicting Reddit feedback.
no code implementations • 14 Aug 2020 • Siyang Yuan, Ke Bai, Liqun Chen, Yizhe Zhang, Chenyang Tao, Chunyuan Li, Guoyin Wang, Ricardo Henao, Lawrence Carin
Cross-domain alignment between image objects and text sequences is key to many visual-language tasks, and it poses a fundamental challenge to both computer vision and natural language processing.
no code implementations • NeurIPS 2020 • Yash Bhalgat, Yizhe Zhang, Jamie Lin, Fatih Porikli
We show how this decomposition can be applied to 2D and 3D kernels as well as the fully-connected layers.
1 code implementation • ACL 2020 • Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan
We present a large, tunable neural conversational response generation model, DialoGPT (dialogue generative pre-trained transformer).
no code implementations • ACL 2020 • Pengyu Cheng, Martin Renqiang Min, Dinghan Shen, Christopher Malon, Yizhe Zhang, Yitong Li, Lawrence Carin
Learning disentangled representations of natural language is essential for many NLP tasks, e.g., conditional text generation, style transfer, personalized dialogue systems, etc.
1 code implementation • EMNLP 2020 • Yizhe Zhang, Guoyin Wang, Chunyuan Li, Zhe Gan, Chris Brockett, Bill Dolan
Large-scale pre-trained language models, such as BERT and GPT-2, have achieved excellent performance in language representation learning and free-form text generation.
1 code implementation • 1 May 2020 • Zeqiu Wu, Michel Galley, Chris Brockett, Yizhe Zhang, Xiang Gao, Chris Quirk, Rik Koncel-Kedziorski, Jianfeng Gao, Hannaneh Hajishirzi, Mari Ostendorf, Bill Dolan
Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process, often resulting in uninteresting responses.
no code implementations • Findings (ACL) 2020 • Yu Cheng, Zhe Gan, Yizhe Zhang, Oussama Elachqar, Dianqi Li, Jingjing Liu
To realize high-quality style transfer with natural context preservation, we propose a Context-Aware Style Transfer (CAST) model, which uses two separate encoders for each input sentence and its surrounding context.
1 code implementation • EMNLP 2020 • Chunyuan Li, Xiang Gao, Yuan Li, Baolin Peng, Xiujun Li, Yizhe Zhang, Jianfeng Gao
We hope that our first pre-trained big VAE language model itself and results can help the NLP community renew the interests of deep generative models in the era of large-scale pre-training, and make these principled methods more practical.
1 code implementation • ICLR 2020 • Xinjie Fan, Yizhe Zhang, Zhendong Wang, Mingyuan Zhou
To stabilize this method, we adapt to contextual generation of categorical sequences a policy gradient estimator, which evaluates a set of correlated Monte Carlo (MC) rollouts for variance control.
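The generic variance-control idea behind scoring several rollouts of the same context can be sketched as follows; note that this leave-one-out baseline is a standard stand-in, not the paper's exact correlated-rollout estimator.

```python
def rollout_policy_gradient(log_probs, rewards):
    """log_probs, rewards: (K,) tensors, one entry per MC rollout of one context.

    Each rollout's advantage is measured against the leave-one-out mean
    reward of the other rollouts, a common variance-reduction baseline.
    """
    K = rewards.shape[0]
    baseline = (rewards.sum() - rewards) / (K - 1)
    advantages = rewards - baseline
    return -(advantages.detach() * log_probs).mean()  # loss to minimize
```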
1 code implementation • ACL 2020 • Yichen Huang, Yizhe Zhang, Oussama Elachqar, Yu Cheng
Missing sentence generation (or sentence infilling) fosters a wide range of applications in natural language generation, such as document auto-completion and meeting note expansion.
Natural Language Processing • Natural Language Understanding • +1
1 code implementation • EACL 2021 • Woon Sang Cho, Yizhe Zhang, Sudha Rao, Asli Celikyilmaz, Chenyan Xiong, Jianfeng Gao, Mengdi Wang, Bill Dolan
In the SL stage, a single-document question generator is trained.
7 code implementations • 1 Nov 2019 • Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan
We present a large, tunable neural conversational response generation model, DialoGPT (dialogue generative pre-trained transformer).
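For reference, a short usage sketch with the released Hugging Face checkpoint, following the pattern on the microsoft/DialoGPT-medium model card (turns are concatenated with the EOS token as a separator):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Encode one user turn, terminated by EOS, then generate the bot's reply.
history = tok.encode("Does money buy happiness?" + tok.eos_token,
                     return_tensors="pt")
reply_ids = model.generate(history, max_length=200,
                           pad_token_id=tok.eos_token_id)
print(tok.decode(reply_ids[0, history.shape[-1]:], skip_special_tokens=True))
```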
no code implementations • WS 2019 • Woon Sang Cho, Yizhe Zhang, Sudha Rao, Chris Brockett, Sungjin Lee
A preliminary step towards this goal is to generate a question that captures common concepts of multiple documents.
no code implementations • 11 Sep 2019 • Shuyang Dai, Yu Cheng, Yizhe Zhang, Zhe Gan, Jingjing Liu, Lawrence Carin
Recent unsupervised approaches to domain adaptation primarily focus on minimizing the gap between the source and the target domains through refining the feature generator, in order to learn a better alignment between the two domains.
1 code implementation • IJCNLP 2019 • Xiang Gao, Yizhe Zhang, Sungjin Lee, Michel Galley, Chris Brockett, Jianfeng Gao, Bill Dolan
This structure allows the system to generate stylized relevant responses by sampling in the neighborhood of the conversation model prediction, and continuously control the style level.
no code implementations • WS 2019 • Xinnuo Xu, Yizhe Zhang, Lars Liden, Sungjin Lee
Although the data-driven approaches of some recent bot building platforms make it possible for a wide range of users to easily create dialogue systems, those platforms don't offer tools for quickly identifying which log dialogues contain problems.
no code implementations • 31 Aug 2019 • Bryan Xia, Yuan Gong, Yizhe Zhang, Christian Poellabauer
Recent efforts have shown promising results for person re-identification by designing part-based architectures to allow a neural network to learn discriminative representations from semantically coherent parts.
1 code implementation • IJCNLP 2019 • Dianqi Li, Yizhe Zhang, Zhe Gan, Yu Cheng, Chris Brockett, Ming-Ting Sun, Bill Dolan
These data may demonstrate domain shift, which impedes the benefits of utilizing such data for training.
no code implementations • ACL 2019 • Vighnesh Leonardo Shiv, Chris Quirk, Anshuman Suri, Xiang Gao, Khuram Shahid, Nithya Govindarajan, Yizhe Zhang, Jianfeng Gao, Michel Galley, Chris Brockett, Tulasi Menon, Bill Dolan
The Intelligent Conversation Engine: Code and Pre-trained Systems (Microsoft Icecaps) is an upcoming open-source natural language processing repository.
no code implementations • 7 Jun 2019 • Yizhe Zhang, Michael T. C. Ying, Danny Z. Chen
An ablation study confirms the effectiveness of our proposed learning scheme for medical images.
no code implementations • ACL 2019 • Liqun Chen, Guoyin Wang, Chenyang Tao, Dinghan Shen, Pengyu Cheng, Xinyuan Zhang, Wenlin Wang, Yizhe Zhang, Lawrence Carin
Constituting highly informative network embeddings is an important tool for network analysis.
1 code implementation • 13 Mar 2019 • Yizhe Zhang, Xiang Gao, Sungjin Lee, Chris Brockett, Michel Galley, Jianfeng Gao, Bill Dolan
Generating responses that are consistent with the dialogue context is one of the central challenges in building engaging conversational agents.
no code implementations • 28 Feb 2019 • Yizhe Zhang, Lin Yang, Hao Zheng, Peixian Liang, Colleen Mangold, Raquel G. Loreto, David. P. Hughes, Danny Z. Chen
To better mimic human visual perception, we think it is desirable for the deep learning model to be able to perceive not only raw images but also SP images.
no code implementations • NAACL 2019 • Xiang Gao, Sungjin Lee, Yizhe Zhang, Chris Brockett, Michel Galley, Jianfeng Gao, Bill Dolan
In this paper, we propose a SpaceFusion model to jointly optimize diversity and relevance that essentially fuses the latent space of a sequence-to-sequence model and that of an autoencoder model by leveraging novel regularization terms.
Ranked #1 on Dialogue Generation on Reddit (multi-ref)
no code implementations • ACL 2019 • Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Jianfeng Gao, Lawrence Carin
Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation with latent variables.
no code implementations • ICLR 2019 • Liqun Chen, Yizhe Zhang, Ruiyi Zhang, Chenyang Tao, Zhe Gan, Haichao Zhang, Bai Li, Dinghan Shen, Changyou Chen, Lawrence Carin
Sequence-to-sequence models are commonly trained via maximum likelihood estimation (MLE).
no code implementations • 15 Jan 2019 • Peixian Liang, Jianxu Chen, Hao Zheng, Lin Yang, Yizhe Zhang, Danny Z. Chen
The cascade decoder structure aims to conduct more effective decoding of hierarchically encoded features and is more compatible with common encoders than the known decoders.
1 code implementation • 10 Dec 2018 • Hao Zheng, Yizhe Zhang, Lin Yang, Peixian Liang, Zhuo Zhao, Chaoli Wang, Danny Z. Chen
In this paper, we propose a new ensemble learning framework for 3D biomedical image segmentation that combines the merits of 2D and 3D models.
Ranked #1 on Cardiovascular MR Segmentation on HVSMR 2016
no code implementations • WS 2019 • Woon Sang Cho, Pengchuan Zhang, Yizhe Zhang, Xiujun Li, Michel Galley, Chris Brockett, Mengdi Wang, Jianfeng Gao
Generating coherent and cohesive long-form texts is a challenging task.
no code implementations • 27 Sep 2018 • Woon Sang Cho, Pengchuan Zhang, Yizhe Zhang, Xiujun Li, Mengdi Wang, Jianfeng Gao
Generating coherent and cohesive long-form texts is a challenging problem in natural language generation.
no code implementations • 27 Sep 2018 • Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Lawrence Carin
Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation.
1 code implementation • NeurIPS 2018 • Liqun Chen, Shuyang Dai, Chenyang Tao, Dinghan Shen, Zhe Gan, Haichao Zhang, Yizhe Zhang, Lawrence Carin
However, the discrete nature of text hinders the application of GAN to text-generation tasks.
4 code implementations • NeurIPS 2018 • Yizhe Zhang, Michel Galley, Jianfeng Gao, Zhe Gan, Xiujun Li, Chris Brockett, Bill Dolan
Responses generated by neural conversational models tend to lack informativeness and diversity.
2 code implementations • ICML 2018 • Yunchen Pu, Shuyang Dai, Zhe Gan, Wei-Yao Wang, Guoyin Wang, Yizhe Zhang, Ricardo Henao, Lawrence Carin
Distinct from most existing approaches, that only learn conditional distributions, the proposed model aims to learn a joint distribution of multiple random variables (domains).
no code implementations • 2 Jun 2018 • Lin Yang, Yizhe Zhang, Zhuo Zhao, Hao Zheng, Peixian Liang, Michael T. C. Ying, Anil T. Ahuja, Danny Z. Chen
In recent years, deep learning (DL) methods have become powerful tools for biomedical image segmentation.
2 code implementations • ACL 2018 • Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Chunyuan Li, Ricardo Henao, Lawrence Carin
Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations.
Ranked #1 on Named Entity Recognition on CoNLL 2000
2 code implementations • ACL 2018 • Guoyin Wang, Chunyuan Li, Wenlin Wang, Yizhe Zhang, Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin
Word embeddings are effective intermediate representations for capturing semantic regularities between words, when learning the representations of text sequences.
Ranked #11 on Text Classification on DBpedia
no code implementations • 1 Feb 2018 • Peixian Liang, Jianxu Chen, Pavel A. Brodskiy, Qinfeng Wu, Yejia Zhang, Yizhe Zhang, Lin Yang, Jeremiah J. Zartman, Danny Z. Chen
A key to analyzing spatial-temporal patterns of $Ca^{2+}$ signal waves is to accurately align the pouches across image sequences.
no code implementations • ICLR 2018 • Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Ricardo Henao, Lawrence Carin
In this paper, we conduct an extensive comparative study between Simple Word Embeddings-based Models (SWEMs), with no compositional parameters, relative to employing word embeddings within RNN/CNN-based models.
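A minimal sketch of a SWEM-style encoder under stated assumptions: the text representation is nothing but parameter-free average and max pooling over word embeddings, feeding a linear classifier.

```python
import torch
import torch.nn as nn

class SWEMConcat(nn.Module):
    """Simple Word-Embedding Model: pooling only, no compositional parameters."""
    def __init__(self, vocab_size, emb_dim, n_classes):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.clf = nn.Linear(2 * emb_dim, n_classes)

    def forward(self, token_ids):               # (B, L)
        e = self.emb(token_ids)                 # (B, L, D)
        pooled = torch.cat([e.mean(dim=1), e.max(dim=1).values], dim=-1)
        return self.clf(pooled)                 # (B, n_classes)
```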
no code implementations • 15 Nov 2017 • Wenlin Wang, Yunchen Pu, Vinay Kumar Verma, Kai Fan, Yizhe Zhang, Changyou Chen, Piyush Rai, Lawrence Carin
We present a deep generative model for learning to predict classes not seen at training time.
no code implementations • 21 Sep 2017 • Dinghan Shen, Yizhe Zhang, Ricardo Henao, Qinliang Su, Lawrence Carin
A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives.
1 code implementation • NeurIPS 2017 • Zhe Gan, Liqun Chen, Wei-Yao Wang, Yunchen Pu, Yizhe Zhang, Hao Liu, Chunyuan Li, Lawrence Carin
The generators are designed to learn the two-way conditional distributions between the two domains, while the discriminators implicitly define a ternary discriminative function, which is trained to distinguish real data pairs and two kinds of fake data pairs.
Image-to-Image Translation • Semi-Supervised Image Classification • +1
no code implementations • 4 Sep 2017 • Changyou Chen, Wenlin Wang, Yizhe Zhang, Qinliang Su, Lawrence Carin
However, there has been little theoretical analysis of the impact of minibatch size on the algorithm's convergence rate.
4 code implementations • NeurIPS 2017 • Yizhe Zhang, Dinghan Shen, Guoyin Wang, Zhe Gan, Ricardo Henao, Lawrence Carin
Learning latent representations from long text sequences is an important first step in many natural language processing applications.
no code implementations • 15 Jun 2017 • Lin Yang, Yizhe Zhang, Jianxu Chen, Si-Yuan Zhang, Danny Z. Chen
Image segmentation is a fundamental problem in biomedical image analysis.
1 code implementation • ICML 2017 • Yizhe Zhang, Zhe Gan, Kai Fan, Zhi Chen, Ricardo Henao, Dinghan Shen, Lawrence Carin
We propose a framework for generating realistic text via adversarial training.
no code implementations • ICML 2017 • Yizhe Zhang, Changyou Chen, Zhe Gan, Ricardo Henao, Lawrence Carin
A framework is proposed to improve the sampling efficiency of stochastic gradient MCMC, based on Hamiltonian Monte Carlo.
no code implementations • NeurIPS 2016 • Changyou Chen, Nan Ding, Chunyuan Li, Yizhe Zhang, Lawrence Carin
In this paper we develop theory to show that while the bias and MSE of an SG-MCMC algorithm depend on the staleness of stochastic gradients, its estimation variance (relative to the expected estimate, based on a prescribed number of samples) is independent of it.
no code implementations • NeurIPS 2016 • Jianxu Chen, Lin Yang, Yizhe Zhang, Mark Alber, Danny Z. Chen
Segmentation of 3D images is a fundamental problem in biomedical image analysis.
no code implementations • NeurIPS 2016 • Yizhe Zhang, Xiangyu Wang, Changyou Chen, Ricardo Henao, Kai Fan, Lawrence Carin
We unify slice sampling and Hamiltonian Monte Carlo (HMC) sampling, demonstrating their connection via the Hamilton-Jacobi equation from Hamiltonian mechanics.
no code implementations • 16 Dec 2015 • Yizhe Zhang, Ricardo Henao, Lawrence Carin, Jianling Zhong, Alexander J. Hartemink
When learning a hidden Markov model (HMM), sequential observations can often be complemented by real-valued summary response variables generated from the path of hidden states.