no code implementations • Findings (ACL) 2022 • Hao Cheng, Zhihua Zhang
The Conditional Masked Language Model (CMLM) is a strong baseline for non-autoregressive translation (NAT).
1 code implementation • 30 May 2023 • Zelalem Gero, Chandan Singh, Hao Cheng, Tristan Naumann, Michel Galley, Jianfeng Gao, Hoifung Poon
Extracting patient information from unstructured text is a critical task in health decision-support and clinical research.
no code implementations • 23 May 2023 • Yu Zhang, Hao Cheng, Zhihong Shen, Xiaodong Liu, Ye-Yi Wang, Jianfeng Gao
Scientific literature understanding tasks have gained significant attention due to their potential to accelerate scientific discovery.
no code implementations • 4 May 2023 • Kaixin Ma, Hao Cheng, Yu Zhang, Xiaodong Liu, Eric Nyberg, Jianfeng Gao
Our approach outperforms recent self-supervised retrievers in zero-shot evaluations and achieves state-of-the-art fine-tuned retrieval performance on NQ, HotpotQA and OTT-QA.
Ranked #4 on Question Answering on HotpotQA
no code implementations • 3 May 2023 • Hao Cheng, Meng Zhang, Liangyou Li, Qun Liu, Zhihua Zhang
Utilizing a pivot language effectively can significantly improve low-resource machine translation.
no code implementations • 3 May 2023 • Hao Cheng, Meng Zhang, Weixuan Wang, Liangyou Li, Qun Liu, Zhihua Zhang
Automatic summarization and machine translation evaluation metrics can be used for length-controllable machine translation, but they are not necessarily suitable or accurate.
no code implementations • 28 Apr 2023 • Jinhao Duan, Quanfu Fan, Hao Cheng, Xiaoshuang Shi, Kaidi Xu
In this paper, we introduce Temporal Adversarial Augmentation (TA), a novel video augmentation technique that utilizes temporal attention.
1 code implementation • 19 Apr 2023 • Pan Lu, Baolin Peng, Hao Cheng, Michel Galley, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Jianfeng Gao
Chameleon synthesizes programs by composing various tools (e.g., LLMs, off-the-shelf vision models, web search engines, Python functions, and heuristic-based modules) for accomplishing complex reasoning tasks.
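As a rough illustration of this tool-composition idea, here is a minimal, hypothetical sketch: the tool names, the shared-context convention, and the fixed plan are invented for illustration and are not Chameleon's actual planner or modules.

```python
from typing import Callable, Dict, List

# Hypothetical tools: each maps a running context dict to an updated context dict.
def web_search(ctx: dict) -> dict:
    ctx["evidence"] = f"search results for: {ctx['question']}"  # stub
    return ctx

def solution_generator(ctx: dict) -> dict:
    ctx["answer"] = f"answer derived from [{ctx.get('evidence', '')}]"  # stub
    return ctx

TOOLS: Dict[str, Callable[[dict], dict]] = {
    "web_search": web_search,
    "solution_generator": solution_generator,
}

def run_program(plan: List[str], question: str) -> dict:
    """Execute a tool sequence, threading a shared context through each module."""
    ctx = {"question": question}
    for name in plan:
        ctx = TOOLS[name](ctx)
    return ctx

# In Chameleon the plan itself is synthesized by an LLM planner; here it is fixed.
print(run_program(["web_search", "solution_generator"], "Who proposed ResNet?"))
```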
no code implementations • 28 Mar 2023 • Sanxing Chen, Hao Cheng, Xiaodong Liu, Jian Jiao, Yangfeng Ji, Jianfeng Gao
Learning transferable representation of knowledge graphs (KGs) is challenging due to the heterogeneous, multi-relational nature of graph structures.
1 code implementation • 27 Feb 2023 • Mengmeng Liu, Hao Cheng, Lin Chen, Hellward Broszio, Jiangtao Li, Runjiang Zhao, Monika Sester, Michael Ying Yang
Trajectory prediction for autonomous driving must continuously reason about the motion stochasticity of road agents and comply with scene constraints.
no code implementations • 24 Feb 2023 • Baolin Peng, Michel Galley, Pengcheng He, Hao Cheng, Yujia Xie, Yu Hu, Qiuyuan Huang, Lars Liden, Zhou Yu, Weizhu Chen, Jianfeng Gao
Large language models (LLMs), such as ChatGPT, are able to generate human-like, fluent responses for many downstream tasks, e.g., task-oriented dialog and question answering.
no code implementations • 15 Feb 2023 • Weicheng Zhang, Hao Cheng, Fatema T. Johora, Monika Sester
Predicting trajectories of pedestrians based on goal information in highly interactive scenes is a crucial step toward Intelligent Transportation Systems and Autonomous Driving.
no code implementations • 6 Feb 2023 • Yunshuang Yuan, Hao Cheng, Michael Ying Yang, Monika Sester
Safety is critical for autonomous driving, and one aspect of improving safety is to accurately capture the uncertainties of the perception system, especially knowing the unknown.
1 code implementation • 3 Feb 2023 • Lanqing Guo, Siyu Huang, Ding Liu, Hao Cheng, Bihan Wen
It is still challenging for the deep shadow removal model to exploit the global contextual correlation between shadow and non-shadow regions.
Ranked #1 on Shadow Removal on Adjusted ISTD
no code implementations • 21 Dec 2022 • Zonglin Yang, Li Dong, Xinya Du, Hao Cheng, Erik Cambria, Xiaodong Liu, Jianfeng Gao, Furu Wei
To this end, we propose a new task, which is to induce natural language rules from natural language facts, and create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language.
2 code implementations • 22 Oct 2022 • Kaixin Ma, Hao Cheng, Xiaodong Liu, Eric Nyberg, Jianfeng Gao
We propose a novel open-domain question answering (ODQA) framework for answering single/multi-hop questions across heterogeneous knowledge sources.
no code implementations • 22 Oct 2022 • Zhiying Xu, Jiafan Xu, Hongding Peng, Wei Wang, Xiaoliang Wang, Haoran Wan, Haipeng Dai, Yixu Xu, Hao Cheng, Kun Wang, Guihai Chen
Deep learning models rely on highly optimized tensor libraries for efficient inference on heterogeneous hardware.
no code implementations • 11 Oct 2022 • Hao Cheng, Hao Fang, Xiaodong Liu, Jianfeng Gao
Given their effectiveness on knowledge-intensive natural language processing tasks, dense retrieval models have become increasingly popular.
no code implementations • 30 Sep 2022 • Wenjie Li, Qiaolin Xia, Hao Cheng, Kouyin Xue, Shu-Tao Xia
As an emerging secure learning paradigm in leveraging cross-silo private data, vertical federated learning (VFL) is expected to improve advertising models by enabling the joint learning of complementary user attributes privately owned by the advertiser and the publisher.
no code implementations • 26 Sep 2022 • Hao Cheng, Pu Zhao, Yize Li, Xue Lin, James Diffenderfer, Ryan Goldhahn, Bhavya Kailkhura
Recently, Diffenderfer and Kailkhura proposed a new paradigm for learning compact yet highly accurate binary neural networks simply by pruning and quantizing randomly weighted full-precision neural networks.
1 code implementation • 16 Sep 2022 • Hao Cheng, Mengmeng Liu, Lin Chen, Hellward Broszio, Monika Sester, Michael Ying Yang
Our model achieves performance on par with the state-of-the-art models at a much higher prediction speed tested on multiple open datasets.
1 code implementation • 30 Aug 2022 • Sheng Zhang, Hao Cheng, Jianfeng Gao, Hoifung Poon
We present a bi-encoder framework for named entity recognition (NER), which applies contrastive learning to map candidate text spans and entity types into the same vector representation space.
Ranked #1 on Named Entity Recognition (NER) on BC5CDR
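As a minimal sketch of the bi-encoder objective, assuming pre-computed span and type embeddings and an InfoNCE-style loss (the encoders, negatives, and temperature here are illustrative, not the paper's exact recipe):

```python
import torch
import torch.nn.functional as F

def contrastive_ner_loss(span_emb, type_emb, labels, tau: float = 0.07):
    """InfoNCE over candidate spans and entity types in a shared vector space.

    span_emb: (num_spans, d) embeddings of candidate text spans
    type_emb: (num_types, d) embeddings of entity types
    labels:   (num_spans,) index of the gold type for each span
    """
    span_emb = F.normalize(span_emb, dim=-1)
    type_emb = F.normalize(type_emb, dim=-1)
    logits = span_emb @ type_emb.t() / tau  # similarity of every span to every type
    return F.cross_entropy(logits, labels)

# Toy usage with random embeddings: 4 spans, 3 entity types.
loss = contrastive_ner_loss(torch.randn(4, 16), torch.randn(3, 16),
                            torch.tensor([0, 2, 1, 0]))
print(loss.item())
```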
1 code implementation • 2 Jul 2022 • Zeqiu Wu, Ryu Parish, Hao Cheng, Sewon Min, Prithviraj Ammanabrolu, Mari Ostendorf, Hannaneh Hajishirzi
In an information-seeking conversation, a user converses with an agent to ask a series of questions that can often be under- or over-specified.
no code implementations • 15 Jun 2022 • Wenyu Jiang, Yuxin Ge, Hao Cheng, Mingcai Chen, Shuai Feng, Chongjun Wang
We propose a novel method, READ (Reconstruction Error Aggregated Detector), to unify inconsistencies from the classifier and the autoencoder.
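A minimal sketch of the aggregation idea, assuming the detector standardizes and sums the classifier's inverted confidence and the autoencoder's reconstruction error; READ's actual aggregation is more involved, so treat this as illustrative only:

```python
import numpy as np

def ood_score(probs: np.ndarray, recon_error: np.ndarray) -> np.ndarray:
    """Aggregate two inconsistency signals into one OOD score (higher = more OOD).

    probs:       (n, num_classes) softmax outputs of the classifier
    recon_error: (n,) per-sample autoencoder reconstruction error
    """
    msp = probs.max(axis=1)  # max softmax probability (in-distribution confidence)
    z_conf = (msp.mean() - msp) / (msp.std() + 1e-8)   # low confidence -> high score
    z_rec = (recon_error - recon_error.mean()) / (recon_error.std() + 1e-8)
    return z_conf + z_rec

probs = np.random.dirichlet(np.ones(10), size=5)
print(ood_score(probs, np.random.rand(5)))
```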
no code implementations • 31 May 2022 • Wenjie Li, Qiaolin Xia, Junfeng Deng, Hao Cheng, Jiangming Liu, Kouying Xue, Yong Cheng, Shu-Tao Xia
As an emerging secure learning paradigm in leveraging cross-agency private data, vertical federated learning (VFL) is expected to improve advertising models by enabling the joint learning of complementary user attributes privately owned by the advertiser and the publisher.
1 code implementation • 24 May 2022 • Bo-Ru Lu, Yushi Hu, Hao Cheng, Noah A. Smith, Mari Ostendorf
Human conversations can evolve in many different ways, creating challenges for automatic understanding and summarization.
1 code implementation • 20 May 2022 • Weizhi Wang, Li Dong, Hao Cheng, Haoyu Song, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei
With the visually-augmented context, VaLM uses a visual knowledge fusion layer to enable multimodal grounded language modeling by attending to both text context and visual knowledge in images.
2 code implementations • 19 May 2022 • Hongxin Wei, Renchunzi Xie, Hao Cheng, Lei Feng, Bo An, Yixuan Li
Our method is motivated by the analysis that the norm of the logits keeps increasing during training, leading to overconfident outputs.
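A minimal sketch of logit normalization in this spirit: apply the cross-entropy to L2-normalized logits so their norm cannot grow unboundedly (the temperature value is an illustrative hyperparameter):

```python
import torch
import torch.nn.functional as F

def logit_norm_loss(logits, target, tau: float = 0.04):
    """Cross-entropy on L2-normalized logits, decoupling logit magnitude from training."""
    norm = logits.norm(p=2, dim=-1, keepdim=True) + 1e-7
    return F.cross_entropy(logits / (norm * tau), target)

loss = logit_norm_loss(torch.randn(8, 10), torch.randint(0, 10, (8,)))
print(loss.item())
```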
no code implementations • 17 Feb 2022 • Da Yin, Li Dong, Hao Cheng, Xiaodong Liu, Kai-Wei Chang, Furu Wei, Jianfeng Gao
As the capacity of pre-trained language models grows, there is a rising need for more knowledgeable natural language processing (NLP) models with advanced functionalities, including providing and making flexible use of encyclopedic and commonsense knowledge.
no code implementations • 4 Feb 2022 • Yang Liu, Hao Cheng, Kun Zhang
When label noise transition depends on each instance, the problem of identifying the instance-dependent noise transition matrix becomes substantially more challenging.
no code implementations • 15 Dec 2021 • Robert Tinn, Hao Cheng, Yu Gu, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon
Overall, domain-specific vocabulary and pretraining facilitate more robust models for fine-tuning.
no code implementations • 15 Dec 2021 • Sheng Zhang, Hao Cheng, Shikhar Vashishth, Cliff Wong, Jinfeng Xiao, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon
Zero-shot entity linking has emerged as a promising direction for generalizing to new entities, but it still requires example gold entity mentions during training and canonical descriptions for all entities, both of which are rarely available outside of Wikipedia.
no code implementations • 6 Dec 2021 • Mingcai Chen, Hao Cheng, Yuntao Du, Ming Xu, Wenyu Jiang, Chongjun Wang
We show that our method successfully alleviates the damage of both label noise and confirmation bias.
Ranked #2 on Image Classification on mini WebVision 1.0
2 code implementations • 6 Dec 2021 • Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, Xuedong Huang
In particular, we focus on the task of Commonsense Reasoning, demonstrating that the proposed external attention mechanism can augment existing transformer models and significantly improve the model's reasoning capabilities.
Ranked #2 on Common Sense Reasoning on CommonsenseQA (using extra training data)
1 code implementation • 4 Nov 2021 • Subhabrata Mukherjee, Xiaodong Liu, Guoqing Zheng, Saghar Hosseini, Hao Cheng, Greg Yang, Christopher Meek, Ahmed Hassan Awadallah, Jianfeng Gao
We demonstrate that while recent models reach human performance when they have access to large amounts of labeled data, there is a huge gap in performance in the few-shot setting for most tasks.
1 code implementation • 1 Nov 2021 • Fanxu Meng, Hao Cheng, Jiaxin Zhuang, Ke Li, Xing Sun
In this paper, we aim to remedy this problem and propose to remove the residual connection in a vanilla ResNet equivalently by a reserving and merging (RM) operation on ResBlock.
2 code implementations • ICLR 2022 • Jiaheng Wei, Zhaowei Zhu, Hao Cheng, Tongliang Liu, Gang Niu, Yang Liu
These observations require us to rethink the treatment of noisy labels, and we hope the availability of these two datasets will facilitate the development and evaluation of future solutions for learning with noisy labels.
1 code implementation • 18 Oct 2021 • Hao Cheng, Zhaowei Zhu, Xing Sun, Yang Liu
Designing robust loss functions is popular in learning with noisy labels, but existing designs do not explicitly consider the overfitting property of deep neural networks (DNNs).
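For context, a representative robust loss of the kind this line of work builds on is the generalized cross entropy (GCE), which interpolates between cross-entropy and MAE; the sketch below is that standard loss, shown for illustration, not the paper's proposed design:

```python
import torch
import torch.nn.functional as F

def gce_loss(logits, target, q: float = 0.7):
    """Generalized cross entropy: (1 - p_y^q) / q; approaches robust MAE as q -> 1."""
    p_y = F.softmax(logits, dim=-1).gather(1, target.unsqueeze(1)).squeeze(1)
    return ((1 - p_y.pow(q)) / q).mean()

print(gce_loss(torch.randn(8, 10), torch.randint(0, 10, (8,))).item())
```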
1 code implementation • ACL 2022 • Kaixin Ma, Hao Cheng, Xiaodong Liu, Eric Nyberg, Jianfeng Gao
The retriever-reader framework is popular for open-domain question answering (ODQA) due to its ability to use explicit knowledge.
no code implementations • 29 Sep 2021 • Zhaowei Zhu, Zihao Dong, Hao Cheng, Yang Liu
In this paper, given good representations, we propose a universally applicable and training-free solution to detect noisy labels.
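One simple, training-free instantiation of this idea, assuming fixed "good" features and a k-NN label-consistency vote (a sketch in the spirit of the paper, not its exact scoring rule):

```python
import numpy as np

def flag_noisy_labels(features: np.ndarray, labels: np.ndarray, k: int = 10) -> np.ndarray:
    """Flag samples whose label disagrees with the majority vote of their k nearest neighbors."""
    feats = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-8)
    sims = feats @ feats.T
    np.fill_diagonal(sims, -np.inf)             # exclude self-matches
    nn_idx = np.argsort(-sims, axis=1)[:, :k]   # k nearest neighbors by cosine similarity
    noisy = np.empty(len(labels), dtype=bool)
    for i, idx in enumerate(nn_idx):
        noisy[i] = np.bincount(labels[idx]).argmax() != labels[i]
    return noisy

X, y = np.random.randn(100, 32), np.random.randint(0, 5, size=100)
print(flag_noisy_labels(X, y).sum(), "samples flagged as possibly mislabeled")
```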
1 code implementation • 26 Sep 2021 • Hao Cheng, YuFei Wang, Haoliang Li, Alex C. Kot, Bihan Wen
In this work, we propose a novel Disentangled Feature Representation framework, dubbed DFR, for few-shot learning applications.
1 code implementation • 23 Sep 2021 • Yunshuang Yuan, Hao Cheng, Monika Sester
Sharing collective perception messages (CPMs) between vehicles is investigated to decrease occlusions and thus improve the perception accuracy and safety of autonomous driving.
1 code implementation • EMNLP 2021 • Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf
Task-oriented conversational systems often use dialogue state tracking to represent the user's intentions, which involves filling in values of pre-defined slots.
Ranked #1 on Dialogue State Tracking on MULTIWOZ 2.1 (Joint Goal Acc metric)
no code implementations • 13 Sep 2021 • YuFei Wang, Haoliang Li, Hao Cheng, Bihan Wen, Lap-Pui Chau, Alex C. Kot
Domain generalization aims to learn an invariant model that can generalize well to the unseen target domain.
no code implementations • 25 Jun 2021 • Yu Wang, Jinchao Li, Tristan Naumann, Chenyan Xiong, Hao Cheng, Robert Tinn, Cliff Wong, Naoto Usuyama, Richard Rogahn, Zhihong Shen, Yang Qin, Eric Horvitz, Paul N. Bennett, Jianfeng Gao, Hoifung Poon
A prominent case in point is the explosion of the biomedical literature on COVID-19, which swelled to hundreds of thousands of papers in a matter of months.
1 code implementation • 1 Jun 2021 • Hao Cheng, Kim-Hui Yap, Bihan Wen
Recent image classification algorithms, by learning deep features from large-scale datasets, have achieved significantly better results compared to classic feature-based approaches.
no code implementations • 30 May 2021 • Hao Cheng, Ping Wang, Chun Qi
As important data carriers, the rapidly increasing number of multimedia videos often brings many duplicate and near-duplicate videos into top search results.
no code implementations • 9 May 2021 • Hao Cheng, Li Feng, Hailong Liu, Takatsugu Hirayama, Hiroshi Murase, Monika Sester
Intersections where vehicles are permitted to turn and interact with vulnerable road users (VRUs) like pedestrians and cyclists are among the most challenging locations for automated and accurate recognition of road users' behavior.
no code implementations • 21 Apr 2021 • Kaidi Xu, Chenan Wang, Hao Cheng, Bhavya Kailkhura, Xue Lin, Ryan Goldhahn
To tackle the susceptibility of deep neural networks to adversarial examples, adversarial training has been proposed, which provides a notion of robustness through an inner maximization problem, computing first-order adversaries embedded within the outer minimization of the training loss.
2 code implementations • 19 Apr 2021 • Yuting Gao, Jia-Xin Zhuang, Shaohui Lin, Hao Cheng, Xing Sun, Ke Li, Chunhua Shen
Specifically, we find the final embedding obtained by the mainstream SSL methods contains the most fruitful information, and propose to distill the final embedding to maximally transmit a teacher's knowledge to a lightweight model by constraining the last embedding of the student to be consistent with that of the teacher.
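A minimal sketch of constraining the student's last embedding to be consistent with the teacher's, using a plain cosine objective (projection heads and the full training recipe are omitted):

```python
import torch
import torch.nn.functional as F

def embedding_distill_loss(student_emb, teacher_emb):
    """Align the student's final embedding with the frozen teacher's (1 - cosine)."""
    s = F.normalize(student_emb, dim=-1)
    t = F.normalize(teacher_emb.detach(), dim=-1)  # no gradient through the teacher
    return (1 - (s * t).sum(dim=-1)).mean()

print(embedding_distill_loss(torch.randn(16, 128), torch.randn(16, 128)).item())
```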
1 code implementation • NAACL 2021 • Lis Pereira, Xiaodong Liu, Hao Cheng, Hoifung Poon, Jianfeng Gao, Ichiro Kobayashi
We present a simple yet effective Targeted Adversarial Training (TAT) algorithm to improve adversarial training for natural language understanding.
no code implementations • 19 Jan 2021 • Huixiang Luo, Hao Cheng, Fanxu Meng, Yuting Gao, Ke Li, Mengdan Zhang, Xing Sun
Pseudo-labeling (PL) and Data Augmentation-based Consistency Training (DACT) are two approaches widely used in Semi-Supervised Learning (SSL) methods.
no code implementations • 14 Jan 2021 • Yanjun Li, Bihan Wen, Hao Cheng, Yoram Bresler
In this paper, we propose a supervised dimensionality reduction method that learns linear embeddings jointly for two feature vectors representing data of different modalities or data from distinct types of entities.
no code implementations • ACL 2021 • Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao
To date, most recent work under the retrieval-reader framework for open-domain QA focuses exclusively on either extractive or generative readers.
Ranked #1 on Open-Domain Question Answering on TriviaQA
no code implementations • 1 Jan 2021 • Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, Nicola De Cao, Edouard Grave, Ikuya Yamada, Sonse Shimaoka, Masatoshi Suzuki, Shumpei Miyawaki, Shun Sato, Ryo Takahashi, Jun Suzuki, Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz, Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Wen-tau Yih
We review the EfficientQA competition from NeurIPS 2020.
1 code implementation • 10 Dec 2020 • Enwei Zhang, Xinyang Jiang, Hao Cheng, AnCong Wu, Fufu Yu, Ke Li, Xiaowei Guo, Feng Zheng, Wei-Shi Zheng, Xing Sun
Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on the selected training batch, with no regard to the performance on samples outside the batch.
no code implementations • COLING 2020 • Chao Tian, Yifei Wang, Hao Cheng, Yijiang Lian, Zhihua Zhang
In this paper we propose a unified approach for supporting different generation manners of machine translation, including autoregressive, semi-autoregressive, and refinement-based non-autoregressive models.
2 code implementations • 30 Oct 2020 • Hao Cheng, Wentong Liao, Xuejiao Tang, Michael Ying Yang, Monika Sester, Bodo Rosenhahn
In our framework, first, the spatial context between agents is explored by using self-attention architectures.
2 code implementations • NAACL 2021 • Hao Cheng, Xiaodong Liu, Lis Pereira, YaoLiang Yu, Jianfeng Gao
Theoretically, we provide a connection of two recent methods, Jacobian Regularization and Virtual Adversarial Training, under this framework.
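Both methods can be read as penalizing output change under small input perturbations. A minimal sketch of a regularizer in that family, using a random rather than adversarial perturbation for brevity:

```python
import torch
import torch.nn.functional as F

def smoothness_regularizer(model, x, eps: float = 1e-2):
    """KL divergence between predictions on x and on a slightly perturbed x."""
    with torch.no_grad():
        p = F.softmax(model(x), dim=-1)            # reference prediction
    noise = eps * F.normalize(torch.randn_like(x), dim=-1)
    q_log = F.log_softmax(model(x + noise), dim=-1)
    return F.kl_div(q_log, p, reduction="batchmean")

model = torch.nn.Linear(20, 5)
print(smoothness_regularizer(model, torch.randn(8, 20)).item())
```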
1 code implementation • ICLR 2021 • Hao Cheng, Zhaowei Zhu, Xingyu Li, Yifei Gong, Xing Sun, Yang Liu
This high-quality sample sieve allows us to treat clean examples and the corrupted ones separately in training a DNN solution, and such a separation is shown to be advantageous in the instance-dependent noise setting.
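A minimal sketch of a loss-based sample sieve, using the common small-loss heuristic as the criterion (the paper's actual sieve uses a confidence-regularized loss rather than the raw loss):

```python
import torch
import torch.nn.functional as F

def sieve_batch(logits, labels, keep_ratio: float = 0.7):
    """Split a batch into likely-clean and likely-corrupted subsets by per-sample loss."""
    losses = F.cross_entropy(logits, labels, reduction="none")
    k = max(1, int(keep_ratio * len(losses)))
    clean = torch.zeros(len(losses), dtype=torch.bool)
    clean[torch.topk(-losses, k).indices] = True   # k smallest losses are kept as clean
    return clean, ~clean

clean, corrupted = sieve_batch(torch.randn(32, 10), torch.randint(0, 10, (32,)))
print(clean.sum().item(), "kept as clean;", corrupted.sum().item(), "set aside")
```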
1 code implementation • NeurIPS 2020 • Fanxu Meng, Hao Cheng, Ke Li, Huixiang Luo, Xiaowei Guo, Guangming Lu, Xing Sun
Through extensive experiments, we demonstrate that SWP is more effective compared to the previous FP-based methods and achieves a state-of-the-art pruning ratio on the CIFAR-10 and ImageNet datasets without an obvious accuracy drop.
2 code implementations • CVPR 2021 • Jinpeng Wang, Yuting Gao, Ke Li, Yiqi Lin, Andy J. Ma, Hao Cheng, Pai Peng, Feiyue Huang, Rongrong Ji, Xing Sun
Then we force the model to pull the feature of the distracting video and the feature of the original video closer, so that the model is explicitly restricted to resist the background influence, focusing more on the motion changes.
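A minimal sketch of the idea: build a "distracting" video by blending in a static frame, then pull its feature toward the original video's feature (the blending rule, toy encoder, and loss are illustrative stand-ins for the paper's exact augmentation):

```python
import torch
import torch.nn.functional as F

class ToyEncoder(torch.nn.Module):
    """Toy video encoder: average over frames and pixels, then a linear projection."""
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(3, 64)
    def forward(self, v):                       # v: (batch, frames, channels, h, w)
        return self.proj(v.mean(dim=(1, 3, 4)))

def background_consistency_loss(encoder, video, alpha: float = 0.3):
    """Blend a static frame into every frame, then minimize 1 - cosine similarity."""
    distracted = (1 - alpha) * video + alpha * video[:, :1]  # broadcast one static frame
    f_orig = F.normalize(encoder(video), dim=-1)
    f_dist = F.normalize(encoder(distracted), dim=-1)
    return (1 - (f_orig * f_dist).sum(dim=-1)).mean()

print(background_consistency_loss(ToyEncoder(), torch.randn(2, 8, 3, 16, 16)).item())
```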
1 code implementation • ECCV 2020 • Shizhen Zhao, Changxin Gao, Jun Zhang, Hao Cheng, Chuchu Han, Xinyang Jiang, Xiaowei Guo, Wei-Shi Zheng, Nong Sang, Xing Sun
In the conventional person Re-ID setting, it is widely assumed that each cropped person image contains a single individual.
no code implementations • 31 Jul 2020 • Yu Gu, Robert Tinn, Hao Cheng, Michael Lucas, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon
In this paper, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models.
Ranked #1 on Named Entity Recognition on JNLPBA
no code implementations • 15 Jul 2020 • Hao Cheng, Bao-Hua Sun, Li-Hua Zhu, Tian-Xiao Li, Guang-Shuai Li, Cong-Bo Li, Xiao-Guang Wu, Yun Zheng
The LaBr$_3$(Ce) detector has attracted much attention in recent years for its resolution and efficiency, which are superior to those of other scintillating materials.
no code implementations • 14 Jul 2020 • Hao Cheng, Joey Tianyi Zhou, Wee Peng Tay, Bihan Wen
Graph Neural Networks (GNNs) have demonstrated superior performance in many challenging applications, including few-shot learning tasks.
1 code implementation • 15 Jun 2020 • Hao Cheng, Wentong Liao, Michael Ying Yang, Bodo Rosenhahn, Monika Sester
Trajectory prediction is critical for applications of planning safe future movements and remains challenging even for the next few seconds in urban mixed traffic.
1 code implementation • ACL 2020 • Hao Cheng, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
We address the problem of extractive question answering using document-level distant supervision, pairing questions and relevant documents with answer strings.
1 code implementation • 26 Apr 2020 • Hao Cheng, Fanxu Meng, Ke Li, Yuting Gao, Guangming Lu, Xing Sun, Rongrong Ji
To gain a universal improvement on both valid and invalid filters, we compensate grafting with distillation (\textbf{Cultivation}) to overcome the drawback of grafting.
3 code implementations • 20 Apr 2020 • Xiaodong Liu, Hao Cheng, Pengcheng He, Weizhu Chen, Yu Wang, Hoifung Poon, Jianfeng Gao
In natural language processing (NLP), pre-training large neural language models such as BERT has demonstrated impressive gains in generalization for a variety of tasks, with further improvement from adversarial fine-tuning.
Ranked #4 on Natural Language Inference on ANLI test (using extra training data)
3 code implementations • ACL 2020 • Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao
We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models.
1 code implementation • 14 Feb 2020 • Hao Cheng, Wentong Liao, Michael Ying Yang, Monika Sester, Bodo Rosenhahn
At inference time, we combine the past context and motion information of the target agent with samples of the latent variables to predict multiple realistic future trajectories.
2 code implementations • CVPR 2020 • Fanxu Meng, Hao Cheng, Ke Li, Zhixin Xu, Rongrong Ji, Xing Sun, Guangming Lu
To better perform the grafting process, we develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks.
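A minimal sketch of the two ingredients, assuming a histogram-based estimate of filter-weight entropy and a sigmoid weighting between two networks' filters (the bin count and scale are illustrative hyperparameters):

```python
import torch

def filter_entropy(w: torch.Tensor, bins: int = 10) -> float:
    """Approximate a filter's information by the entropy of its weight histogram."""
    hist = torch.histc(w.flatten(), bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * p.log()).sum())

def graft(w_self: torch.Tensor, w_other: torch.Tensor, scale: float = 100.0) -> torch.Tensor:
    """Adaptively average two networks' filters, trusting the higher-entropy one more."""
    gap = filter_entropy(w_self) - filter_entropy(w_other)
    a = torch.sigmoid(torch.tensor(scale * gap))
    return a * w_self + (1 - a) * w_other

w1, w2 = torch.randn(64, 3, 3, 3), torch.randn(64, 3, 3, 3)
print(graft(w1, w2).shape)
```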
1 code implementation • 3 Dec 2019 • Fengxiang Yang, Ke Li, Zhun Zhong, Zhiming Luo, Xing Sun, Hao Cheng, Xiaowei Guo, Feiyue Huang, Rongrong Ji, Shaozi Li
This procedure encourages that the selected training samples can be both clean and miscellaneous, and that the two models can promote each other iteratively.
Ranked #9 on Unsupervised Domain Adaptation on Market to Duke
no code implementations • 14 Oct 2019 • Hao Cheng, Xiaoqing Yang, Zang Li, Yanghua Xiao, Yu-Cheng Lin
Deep neural networks have been widely used in text classification.
no code implementations • 27 Jul 2019 • Yi Zhang, Cheng Zeng, Hao Cheng, Chongjun Wang, Lei Zhang
The quality of data collected from different channels is inconsistent, and some of it may not benefit prediction.
no code implementations • CVPR 2019 • Hao Cheng, Dongze Lian, Bowen Deng, Shenghua Gao, Tao Tan, Yanlin Geng
We propose a new learning paradigm, Local to Global Learning (LGL), for Deep Neural Networks (DNNs) to improve the performance of classification problems.
1 code implementation • NAACL 2019 • Hao Cheng, Hao Fang, Mari Ostendorf
Characterizing these differences can be useful in human-computer interaction, as well as analysis of human-human conversations.
no code implementations • 5 Nov 2018 • Hao Cheng, Ming-Wei Chang, Kenton Lee, Ankur Parikh, Michael Collins, Kristina Toutanova
We study approaches to improve fine-grained short answer Question Answering models by integrating coarse-grained data annotated for paragraph-level relevance and show that coarsely annotated data can bring significant performance gains.
no code implementations • ECCV 2018 • Hao Cheng, Dongze Lian, Shenghua Gao, Yanlin Geng
Inspired by the pioneering work on the information bottleneck principle for analyzing Deep Neural Networks (DNNs), we design an information-plane-based framework to evaluate the capability of DNNs for image classification tasks. This framework not only helps us understand the capability of DNNs, but also helps us choose a network that reaches higher classification accuracy more efficiently.
no code implementations • NAACL 2018 • Hao Fang, Hao Cheng, Maarten Sap, Elizabeth Clark, Ari Holtzman, Yejin Choi, Noah A. Smith, Mari Ostendorf
We present Sounding Board, a social chatbot that won the 2017 Amazon Alexa Prize.
1 code implementation • EMNLP 2017 • Hao Cheng, Hao Fang, Mari Ostendorf
We develop a novel factored neural model that learns comment embeddings in an unsupervised way leveraging the structure of distributional context in online discussion forums.
no code implementations • 16 Aug 2016 • Hao Fang, Hao Cheng, Mari Ostendorf
Many social media platforms offer a mechanism for readers to react to comments, both positively and negatively, which in aggregate can be thought of as community endorsement.
1 code implementation • EMNLP 2016 • Hao Cheng, Hao Fang, Xiaodong He, Jianfeng Gao, Li Deng
We develop a novel bi-directional attention model for dependency parsing, which learns to agree on headword predictions from the forward and backward parsing directions.
Ranked #4 on Chinese Dependency Parsing on Chinese Pennbank
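A minimal sketch of the agreement idea: the forward and backward parsers each produce a distribution over candidate headwords per token, and a symmetric KL term encourages the two directions to agree (parser internals are elided):

```python
import torch
import torch.nn.functional as F

def agreement_loss(fwd_logits, bwd_logits):
    """Symmetric KL between forward and backward headword distributions.

    Both tensors: (num_tokens, num_candidate_heads).
    """
    p_log = F.log_softmax(fwd_logits, dim=-1)
    q_log = F.log_softmax(bwd_logits, dim=-1)
    return 0.5 * (F.kl_div(q_log, p_log.exp(), reduction="batchmean")
                  + F.kl_div(p_log, q_log.exp(), reduction="batchmean"))

print(agreement_loss(torch.randn(7, 7), torch.randn(7, 7)).item())
```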
no code implementations • IJCNLP 2015 • Jacob Devlin, Hao Cheng, Hao Fang, Saurabh Gupta, Li Deng, Xiaodong He, Geoffrey Zweig, Margaret Mitchell
Two recent approaches have achieved state-of-the-art results in image captioning.
no code implementations • NeurIPS 2013 • Özlem Aslan, Hao Cheng, Xinhua Zhang, Dale Schuurmans
Latent variable prediction models, such as multi-layer networks, impose auxiliary latent variables between inputs and outputs to allow automatic inference of implicit features useful for prediction.
no code implementations • 26 Sep 2013 • Hao Cheng, Xinhua Zhang, Dale Schuurmans
Although many convex relaxations of clustering have been proposed in the past decade, current formulations remain restricted to spherical Gaussian or discriminative models and are susceptible to imbalanced clusters.