no code implementations • 28 Nov 2023 • Cong Wei, Yang Chen, Haonan Chen, Hexiang Hu, Ge Zhang, Jie Fu, Alan Ritter, Wenhu Chen
Existing information retrieval (IR) models often assume a homogeneous format, limiting their applicability to diverse user needs, such as searching for images with text descriptions, searching for a news article with a headline image, or finding a similar photo with a query image.
1 code implementation • 27 Nov 2023 • Xiang Yue, Yuansheng Ni, Kai Zhang, Tianyu Zheng, Ruoqi Liu, Ge Zhang, Samuel Stevens, Dongfu Jiang, Weiming Ren, Yuxuan Sun, Cong Wei, Botao Yu, Ruibin Yuan, Renliang Sun, Ming Yin, Boyuan Zheng, Zhenzhu Yang, Yibo Liu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen
We introduce MMMU: a new benchmark designed to evaluate multimodal models on massive multi-discipline tasks demanding college-level subject knowledge and deliberate reasoning.
1 code implementation • 4 Oct 2023 • Xichen Pan, Li Dong, Shaohan Huang, Zhiliang Peng, Wenhu Chen, Furu Wei
Text-to-image (T2I) and vision-language-to-image (VL2I) generation have recently made significant strides.
2 code implementations • 2 Oct 2023 • Max Ku, Tianle Li, Kai Zhang, Yujie Lu, Xingyu Fu, Wenwen Zhuang, Wenhu Chen
Recently, a myriad of conditional image generation and editing models have been developed to serve different downstream tasks, including text-to-image generation, text-guided image editing, subject-driven image generation, control-guided image generation, etc.
1 code implementation • 1 Oct 2023 • Zekun Moore Wang, Zhongyuan Peng, Haoran Que, Jiaheng Liu, Wangchunshu Zhou, Yuhan Wu, Hongcheng Guo, Ruitong Gan, Zehao Ni, Man Zhang, Zhaoxiang Zhang, Wanli Ouyang, Ke Xu, Wenhu Chen, Jie Fu, Junran Peng
The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters.
1 code implementation • 1 Oct 2023 • Dongfu Jiang, Yishan Li, Ge Zhang, Wenhao Huang, Bill Yuchen Lin, Wenhu Chen
To quantitatively assess our metric, we evaluate its correlation with human ratings on 5 held-in datasets and 2 held-out datasets, and show that TIGERScore achieves the highest overall Spearman's correlation with human ratings across these datasets, significantly outperforming other metrics.
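As a minimal sketch of the correlation protocol described above (not TIGERScore's own code; the scores below are made up), Spearman's correlation between a metric's outputs and human ratings can be computed with scipy:

```python
# Minimal sketch: correlating an automatic metric's scores with human ratings.
# Both lists are hypothetical; the actual evaluation spans 5 held-in and
# 2 held-out datasets.
from scipy.stats import spearmanr

metric_scores = [0.91, 0.45, 0.78, 0.32, 0.66]  # hypothetical metric outputs
human_ratings = [5, 2, 4, 1, 3]                 # hypothetical human judgments

rho, p_value = spearmanr(metric_scores, human_ratings)
print(f"Spearman's rho = {rho:.3f} (p = {p_value:.3f})")
```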
1 code implementation • 15 Sep 2023 • Zihao Deng, Yinghao Ma, Yudong Liu, Rongchen Guo, Ge Zhang, Wenhu Chen, Wenhao Huang, Emmanouil Benetos
Large Language Models (LLMs) have shown immense potential in multimodal applications, yet the convergence of textual and musical domains remains relatively unexplored.
1 code implementation • 11 Sep 2023 • Xiang Yue, Xingwei Qu, Ge Zhang, Yao Fu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen
The MAmmoTH models are trained on MathInstruct, our meticulously curated instruction tuning dataset.
no code implementations • 5 Sep 2023 • Yubo Wang, Xueguang Ma, Wenhu Chen
Large language models (LLMs), such as ChatGPT, are capable of generating human-like responses for various downstream tasks, such as task-oriented dialogue and question answering.
1 code implementation • 29 Jun 2023 • Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi Li, Ge Zhang, Si Liu, Roger Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenhu Chen, Wei Xue, Yike Guo
We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method achieving state-of-the-art performance on various lyrics transcription datasets, even in challenging genres such as rock and metal.
no code implementations • 22 Jun 2023 • Tianle Li, Max Ku, Cong Wei, Wenhu Chen
In this work, we aspire to fill the void and propose two novel subject-driven sub-tasks, i.e., Subject Replacement and Subject Addition.
1 code implementation • NeurIPS 2023 • Ruibin Yuan, Yinghao Ma, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Le Zhuo, Yiqi Liu, Jiawen Huang, Zeyue Tian, Binyue Deng, Ningzhi Wang, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Roger Dannenberg, Wenhu Chen, Gus Xia, Wei Xue, Si Liu, Shi Wang, Ruibo Liu, Yike Guo, Jie Fu
This is evident in the limited work on deep music representations, the scarcity of large-scale datasets, and the absence of a universal and community-driven benchmark.
1 code implementation • NeurIPS 2023 • Kai Zhang, Lingbo Mo, Wenhu Chen, Huan Sun, Yu Su
To address this issue, we introduce MagicBrush (https://osu-nlp-group.github.io/MagicBrush/), the first large-scale, manually annotated dataset for instruction-guided real image editing that covers diverse scenarios: single-turn, multi-turn, mask-provided, and mask-free editing.
1 code implementation • 31 May 2023 • Yizhi Li, Ruibin Yuan, Ge Zhang, Yinghao Ma, Xingran Chen, Hanzhi Yin, Chenghua Lin, Anton Ragni, Emmanouil Benetos, Norbert Gyenge, Roger Dannenberg, Ruibo Liu, Wenhu Chen, Gus Xia, Yemin Shi, Wenhao Huang, Yike Guo, Jie Fu
To address this research gap, we propose an acoustic Music undERstanding model with large-scale self-supervised Training (MERT), which incorporates teacher models to provide pseudo labels in the masked language modelling (MLM) style acoustic pre-training.
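A minimal sketch of MLM-style acoustic pre-training with teacher pseudo-labels, in the spirit of MERT but not the authors' implementation (the shapes, masking scheme, and tiny student network are all illustrative assumptions):

```python
# Sketch: a frozen teacher assigns each audio frame a discrete pseudo-label;
# the student is trained to predict those labels at masked positions only.
import torch
import torch.nn as nn

batch, frames, dim, codebook = 4, 250, 256, 512
features = torch.randn(batch, frames, dim)                   # acoustic frame features
pseudo_labels = torch.randint(0, codebook, (batch, frames))  # from a frozen teacher

mask = torch.rand(batch, frames) < 0.15   # mask ~15% of frames
masked = features.clone()
masked[mask] = 0.0                        # simple zero-out masking

student = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, codebook))
logits = student(masked)                  # (batch, frames, codebook)

# MLM-style loss: predict the teacher's pseudo-labels at masked frames.
loss = nn.functional.cross_entropy(logits[mask], pseudo_labels[mask])
loss.backward()
```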
no code implementations • 23 May 2023 • Alfonso Amayuelas, Liangming Pan, Wenhu Chen, William Wang
This paper investigates the capabilities of Large Language Models (LLMs) in the context of understanding their own knowledge and measuring their uncertainty.
1 code implementation • 23 May 2023 • Yikang Pan, Liangming Pan, Wenhu Chen, Preslav Nakov, Min-Yen Kan, William Yang Wang
In this paper, we comprehensively investigate the potential misuse of modern Large Language Models (LLMs) for generating credible-sounding misinformation and its subsequent impact on information-intensive applications, particularly Open-Domain Question Answering (ODQA) systems.
1 code implementation • 23 May 2023 • Siqi Liu, Weixi Feng, Tsu-Jui Fu, Wenhu Chen, William Yang Wang
Making image retrieval methods practical for real-world search applications requires significant progress in dataset scales, entity comprehension, and multimodal information fusion.
no code implementations • 22 May 2023 • Zekun Wang, Ge Zhang, Kexin Yang, Ning Shi, Wangchunshu Zhou, Shaochun Hao, Guangzheng Xiong, Yizhi Li, Mong Yuan Sim, Xiuying Chen, Qingqing Zhu, Zhenzhu Yang, Adam Nik, Qi Liu, Chenghua Lin, Shi Wang, Ruibo Liu, Wenhu Chen, Ke Xu, Dayiheng Liu, Yike Guo, Jie Fu
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP, aimed at addressing limitations in existing frameworks while aligning with the ultimate goals of artificial intelligence.
2 code implementations • 21 May 2023 • Wenhu Chen, Ming Yin, Max Ku, Pan Lu, Yixin Wan, Xueguang Ma, Jianyu Xu, Xinyi Wang, Tony Xia
We evaluate a wide spectrum of 16 large language and code models with different prompting strategies like Chain-of-Thoughts and Program-of-Thoughts.
Ranked #1 on TheoremQA
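To illustrate the two prompting strategies named above, here are hedged, illustrative templates (the exact prompts used in the TheoremQA evaluation may differ):

```python
# Illustrative Chain-of-Thoughts vs. Program-of-Thoughts prompts; both the
# question and the templates are made up for demonstration.
question = "A fair coin is flipped 5 times. What is the probability of exactly 3 heads?"

chain_of_thought = (
    f"Q: {question}\n"
    "A: Let's think step by step. There are 2^5 = 32 equally likely outcomes. "
    "Exactly 3 heads occurs in C(5,3) = 10 of them, so the probability is 10/32 = 5/16."
)

program_of_thought = (
    f"Q: {question}\n"
    "# A: write a Python program whose final value of `ans` is the answer.\n"
    "from math import comb\n"
    "ans = comb(5, 3) / 2**5\n"
)
```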
1 code implementation • 2 May 2023 • Tianle Li, Xueguang Ma, Alex Zhuang, Yu Gu, Yu Su, Wenhu Chen
On GrailQA and WebQSP, our model is also on par with other fully-trained models.
1 code implementation • 20 Dec 2022 • Fangyu Liu, Julian Martin Eisenschlos, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Wenhu Chen, Nigel Collier, Yasemin Altun
Compared with a SOTA model finetuned on more than 28k data points, DePlot+LLM with just one-shot prompting achieves a 24.0% improvement over the finetuned SOTA on human-written queries from the chart QA task.
Ranked #1 on Chart Question Answering on ChartQA
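A hedged sketch of the two-stage DePlot+LLM pipeline described above: first translate the chart image into a linearized table, then let an LLM reason over it with one-shot prompting. Both functions here are hypothetical stand-ins, not a real API:

```python
# Hypothetical stand-ins for the plot-to-table stage and the LLM stage.
def deplot_to_table(chart_image_path: str) -> str:
    # Stand-in for the plot-to-table model; returns a linearized table.
    return "Year | Sales\n2021 | 40\n2022 | 55"

def llm_answer(table: str, question: str) -> str:
    # Stand-in for a one-shot-prompted LLM call over the extracted table.
    prompt = f"Table:\n{table}\n\nQuestion: {question}\nAnswer:"
    return prompt  # a real system would send this prompt to an LLM

table = deplot_to_table("chart.png")
print(llm_answer(table, "By how much did sales grow from 2021 to 2022?"))
```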
3 code implementations • 22 Nov 2022 • Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen
By combining PoT with self-consistency decoding, we can achieve SoTA performance on all math problem datasets and near-SoTA performance on financial datasets.
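A minimal sketch of Program-of-Thoughts with self-consistency decoding, as one reading of the method rather than the authors' code (`sample_program` is a hypothetical stand-in for sampling a candidate program from an LLM at temperature > 0):

```python
# PoT + self-consistency: sample several programs, execute each, and take a
# majority vote over the resulting answers.
from collections import Counter

def sample_program(question: str) -> str:
    # Placeholder: a real system would sample this from an LLM; it ignores
    # `question` and always returns the same program here.
    return "from math import comb\nans = comb(5, 3) / 2**5"

def pot_self_consistency(question: str, n_samples: int = 20) -> float:
    answers = []
    for _ in range(n_samples):
        scope: dict = {}
        try:
            exec(sample_program(question), scope)  # run the generated program
            answers.append(scope["ans"])           # program stores its answer in `ans`
        except Exception:
            continue                               # skip programs that crash
    return Counter(answers).most_common(1)[0][0]   # majority vote

print(pot_self_consistency("P(exactly 3 heads in 5 fair flips)?"))
```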
1 code implementation • 20 Nov 2022 • Xichen Pan, Pengda Qin, Yuhong Li, Hui Xue, Wenhu Chen
Conditioned diffusion models have demonstrated state-of-the-art text-to-image synthesis capacity.
Ranked #1 on Story Visualization on Pororo
no code implementations • 13 Oct 2022 • Shiyang Li, Jianshu Chen, Yelong Shen, Zhiyu Chen, Xinlu Zhang, Zekun Li, Hong Wang, Jing Qian, Baolin Peng, Yi Mao, Wenhu Chen, Xifeng Yan
Integrating free-text explanations into in-context learning of large language models (LLMs) has been shown to elicit strong reasoning capabilities along with reasonable explanations.
1 code implementation • 13 Oct 2022 • Wenhu Chen
Specifically, we evaluated LLMs on popular table QA and fact verification datasets like WikiTableQuestion, FetaQA, TabFact, and FEVEROUS and found that LLMs are competent at complex reasoning over table structures, though these models are not pre-trained on any table corpus.
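One common way to hand a table to an LLM for table QA is to linearize it as text and append the question; the paper's actual serialization format may differ from this minimal sketch:

```python
# Minimal sketch: serialize a table as markdown and build a QA prompt.
def table_to_prompt(header, rows, question):
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(map(str, row)) + " |" for row in rows]
    return "\n".join(lines) + f"\n\nQuestion: {question}\nAnswer:"

prompt = table_to_prompt(
    ["City", "Population"],
    [["Waterloo", 121436], ["Kitchener", 256885]],
    "Which city has the larger population?",
)
print(prompt)
```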
1 code implementation • 9 Oct 2022 • Zekun Li, Wenhu Chen, Shiyang Li, Hong Wang, Jing Qian, Xifeng Yan
Experimental results on the MultiWOZ dataset demonstrate that training a model on the simulated dialogues leads to even better performance than using the same amount of human-generated dialogues under the challenging low-resource settings, with as few as 85 dialogues as a seed.
no code implementations • 6 Oct 2022 • Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen
While language models store a massive amount of world knowledge implicitly in their parameters, even very large models often fail to encode information about rare entities and events, while incurring huge computational costs.
no code implementations • 29 Sep 2022 • Wenhu Chen, Hexiang Hu, Chitwan Saharia, William W. Cohen
To further evaluate the capabilities of the model, we introduce EntityDrawBench, a new benchmark that evaluates image generation for diverse entities, from frequent to rare, across multiple object categories including dogs, foods, landmarks, birds, and characters.
Ranked #3 on Text-to-Image Generation on COCO
no code implementations • 1 Jul 2022 • Wenhu Chen, William W. Cohen, Michiel de Jong, Nitish Gupta, Alessandro Presta, Pat Verga, John Wieting
In this position paper, we propose a new approach to generating a type of knowledge base (KB) from text, based on question generation and entity linking.
no code implementations • Findings (ACL) 2022 • Kai Nakamura, Sharon Levy, Yi-Lin Tuan, Wenhu Chen, William Yang Wang
A pressing challenge in current dialogue systems is to successfully converse with users on topics with information distributed across different modalities.
no code implementations • 10 Apr 2022 • Wenhu Chen, Pat Verga, Michiel de Jong, John Wieting, William Cohen
Retrieval-augmented language models have recently become the standard for knowledge-intensive tasks.
1 code implementation • 15 Oct 2021 • Liangming Pan, Wenhu Chen, Min-Yen Kan, William Yang Wang
We curate both human-written and model-generated false documents that we inject into the evidence corpus of QA models and assess the impact on the performance of these systems.
no code implementations • Findings (EMNLP) 2021 • Shiyang Li, Semih Yavuz, Wenhu Chen, Xifeng Yan
Task-adaptive pre-training (TAPT) and self-training (ST) have emerged as the major semi-supervised approaches to improve natural language understanding (NLU) tasks with massive amounts of unlabeled data.
1 code implementation • EMNLP 2021 • Zhiyu Chen, Wenhu Chen, Charese Smiley, Sameena Shah, Iana Borova, Dylan Langdon, Reema Moussa, Matt Beane, Ting-Hao Huang, Bryan Routledge, William Yang Wang
In contrast to existing tasks in the general domain, the finance domain involves complex numerical reasoning and understanding of heterogeneous representations.
Ranked #3 on Question Answering on FinQA
1 code implementation • 13 Aug 2021 • Wenhu Chen, Xinyi Wang, William Yang Wang
Many facts evolve over time.
1 code implementation • NeurIPS 2021 • Yi-Lin Tuan, Connor Pryor, Wenhu Chen, Lise Getoor, William Yang Wang
To gain insights into the reasoning process of a generation model, we propose a new method, local explanation of response generation (LERG) that regards the explanations as the mutual interaction of segments in input and output sentences.
1 code implementation • NeurIPS 2021 • Xinyi Wang, Wenhu Chen, Michael Saxon, William Yang Wang
Although deep learning models have driven state-of-the-art performance on a wide array of tasks, they are prone to spurious correlations that should not be learned as predictive clues.
1 code implementation • ACL 2021 • Vardaan Pahuja, Yu Gu, Wenhu Chen, Mehdi Bahrami, Lei Liu, Wei-Peng Chen, Yu Su
Knowledge bases (KBs) and text often contain complementary knowledge: KBs store structured knowledge that can support long range reasoning, while text stores more comprehensive and timely knowledge in an unstructured way.
1 code implementation • ACL 2021 • Liangming Pan, Wenhu Chen, Wenhan Xiong, Min-Yen Kan, William Yang Wang
However, for each new domain that requires fact verification, creating a dataset by manually writing claims and linking them to their supporting evidence is expensive.
1 code implementation • NAACL 2021 • Liangming Pan, Wenhu Chen, Wenhan Xiong, Min-Yen Kan, William Yang Wang
Obtaining training data for multi-hop question answering (QA) is time-consuming and resource-intensive.
1 code implementation • ICLR 2021 • Wenhu Chen, Ming-Wei Chang, Eva Schlinger, William Wang, William W. Cohen
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Ranked #1 on Question Answering on OTT-QA
no code implementations • 16 Oct 2020 • Yilin Shen, Wenhu Chen, Hongxia Jin
We design a Dirichlet Prior RNN to model high-order uncertainty; it degenerates to a standard softmax layer during RNN model training.
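A hedged sketch of the Dirichlet-prior idea (not the paper's code): interpret non-negative network outputs as Dirichlet concentration parameters, whose mean recovers a softmax-like prediction while the total concentration signals confidence:

```python
# Sketch: alpha are hypothetical Dirichlet concentrations for 3 classes.
# The Dirichlet mean alpha_i / alpha_0 acts like a softmax output; a small
# total concentration alpha_0 is one common proxy for high uncertainty.
import numpy as np

alpha = np.array([8.0, 1.5, 0.5])      # hypothetical concentration parameters
alpha0 = alpha.sum()                   # total evidence

expected_probs = alpha / alpha0        # point prediction (Dirichlet mean)
uncertainty = len(alpha) / alpha0      # larger when evidence alpha0 is small

print(expected_probs, round(uncertainty, 3))
```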
1 code implementation • EMNLP 2020 • Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
We propose knowledge-grounded pre-training (KGPT), which consists of two parts: 1) a general knowledge-grounded generation model to generate knowledge-enriched text, and 2) a pre-training paradigm on a massive knowledge-grounded text corpus crawled from the web.
Ranked #9 on KG-to-Text Generation on WebNLG 2.0 (Unconstrained)
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Zhiyu Chen, Wenhu Chen, Hanwen Zha, Xiyou Zhou, Yunkai Zhang, Sairam Sundaresan, William Yang Wang
Provided only with the table, existing models struggle to produce controllable and high-fidelity logical generations.
1 code implementation • ACL 2020 • Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
To facilitate the study of the proposed logical NLG problem, we use the existing TabFact dataset (Chen et al., 2019), featured with a wide range of logical/symbolic inferences, as our testbed, and propose new automatic metrics to evaluate the fidelity of generation models w.r.t. logical inference.
2 code implementations • Findings of the Association for Computational Linguistics 2020 • Wenhu Chen, Hanwen Zha, Zhiyu Chen, Wenhan Xiong, Hong Wang, William Wang
We propose a hybrid model that combines heterogeneous information to find the answer.
Ranked #4 on Question Answering on HybridQA
1 code implementation • CVPR 2020 • Jingzhou Liu, Wenhu Chen, Yu Cheng, Zhe Gan, Licheng Yu, Yiming Yang, Jingjing Liu
We introduce a new task, Video-and-Language Inference, for joint multimodal understanding of video and text.
2 code implementations • 8 Jan 2020 • Pengda Qin, Xin Wang, Wenhu Chen, Chunyun Zhang, Weiran Xu, William Yang Wang
Large-scale knowledge graphs (KGs) are shown to become more important in current information systems.
1 code implementation • 8 Oct 2019 • Wenhu Chen, Zhe Gan, Linjie Li, Yu Cheng, William Wang, Jingjing Liu
To design a more powerful NMN architecture for practical use, we propose Meta Module Network (MMN) centered on a novel meta module, which can take in function recipes and morph into diverse instance modules dynamically.
1 code implementation • ICLR 2020 • Wenhu Chen, Hongmin Wang, Jianshu Chen, Yunkai Zhang, Hong Wang, Shiyang Li, Xiyou Zhou, William Yang Wang
To this end, we construct a large-scale dataset called TabFact with 16k Wikipedia tables as the evidence for 118k human-annotated natural language statements, which are labeled as either ENTAILED or REFUTED.
Ranked #9 on Table-based Fact Verification on TabFact
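To make the dataset description above concrete, here is an illustrative record shape for table-based fact verification; the field names are hypothetical and TabFact's real schema differs:

```python
# Hypothetical record shape for an ENTAILED/REFUTED statement over a table.
example = {
    "table": {
        "header": ["Player", "Goals"],
        "rows": [["A. Smith", 12], ["B. Jones", 7]],
    },
    "statement": "A. Smith scored more goals than B. Jones.",
    "label": "ENTAILED",  # or "REFUTED"
}
```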
2 code implementations • NeurIPS 2019 • Shiyang Li, Xiaoyong Jin, Yao Xuan, Xiyou Zhou, Wenhu Chen, Yu-Xiang Wang, Xifeng Yan
Time series forecasting is an important problem across many domains, including predictions of solar plant energy output, electricity consumption, and traffic congestion.
1 code implementation • ACL 2019 • Zhiyu Chen, Hanwen Zha, Honglei Liu, Wenhu Chen, Xifeng Yan, Yu Su
Pre-trained embeddings such as word embeddings and sentence embeddings are fundamental tools facilitating a wide range of downstream NLP tasks.
2 code implementations • ACL 2019 • Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang
Semantically controlled neural response generation in limited domains has achieved strong performance.
Ranked #5 on Data-to-Text Generation on MULTIWOZ 2.1
2 code implementations • ACL 2020 • Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
Neural-based end-to-end approaches to natural language generation (NLG) from structured data or knowledge are data-hungry, making their adoption for real-world applications difficult with limited data.
1 code implementation • NAACL 2019 • Wenhu Chen, Yu Su, Yilin Shen, Zhiyu Chen, Xifeng Yan, William Wang
In deep neural networks, a pre-defined vocabulary is required to vectorize text inputs.
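A minimal sketch of the vocabulary-based vectorization the snippet alludes to, showing the out-of-vocabulary problem that motivates moving beyond a fixed vocabulary (the toy vocabulary is an assumption):

```python
# Any word outside the pre-defined vocabulary collapses to a single <unk> id,
# losing its identity entirely.
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3}

def vectorize(tokens):
    return [vocab.get(t, vocab["<unk>"]) for t in tokens]

print(vectorize(["the", "cat", "sat"]))     # [1, 2, 3]
print(vectorize(["the", "ocelot", "sat"]))  # [1, 0, 3] -- "ocelot" is lost
```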
no code implementations • ICLR 2019 • Wenhu Chen, Yilin Shen, Hongxia Jin, William Wang
With the recent rapid development of deep learning, deep neural networks have been widely adopted in many real-life applications.
no code implementations • 24 Aug 2018 • Wenhu Chen, Guanlin Li, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou
Then, we interpret sequence-to-sequence learning as learning a transductive model to transform the source local latent distributions to match their corresponding target distributions.
1 code implementation • EMNLP 2018 • Wenhu Chen, Jianshu Chen, Yu Su, Xin Wang, Dong Yu, Xifeng Yan, William Yang Wang
Then, we pre-train a state tracker for the source language as a teacher, which is able to exploit easy-to-access parallel data.
no code implementations • NAACL 2018 • Wenhu Chen, Guanlin Li, Shuo Ren, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou
In order to alleviate data sparsity and overfitting problems in maximum likelihood estimation (MLE) for sequence prediction tasks, we propose the Generative Bridging Network (GBN), in which a novel bridge module is introduced to assist the training of the sequence prediction model (the generator network).
no code implementations • ACL 2018 • Shuo Ren, Wenhu Chen, Shujie Liu, Mu Li, Ming Zhou, Shuai Ma
Neural Machine Translation (NMT) performs poorly on the low-resource language pair $(X, Z)$, especially when $Z$ is a rare language.
2 code implementations • ACL 2018 • Xin Wang, Wenhu Chen, Yuan-Fang Wang, William Yang Wang
Though impressive results have been achieved in visual captioning, the task of generating abstract stories from photo streams is still a little-tapped problem.
Ranked #13 on Visual Storytelling on VIST
no code implementations • NAACL 2018 • Wenhu Chen, Wenhan Xiong, Xifeng Yan, William Wang
Inferring missing links in knowledge graphs (KGs) has attracted a lot of attention from the research community.
no code implementations • CVPR 2018 • Xin Wang, Wenhu Chen, Jiawei Wu, Yuan-Fang Wang, William Yang Wang
Video captioning is the task of automatically generating a textual description of the actions in a video.
1 code implementation • 16 Nov 2016 • Wenhu Chen, Aurelien Lucchi, Thomas Hofmann
We here propose a novel way of using such textual data by artificially generating missing visual information.
1 code implementation • AMTA 2016 • Wenhu Chen, Evgeny Matusov, Shahram Khadivi, Jan-Thorsten Peter
In this paper, we propose an effective way for biasing the attention mechanism of a sequence-to-sequence neural machine translation (NMT) model towards the well-studied statistical word alignment models.