no code implementations • EMNLP 2020 • Qingfu Zhu, Wei-Nan Zhang, Ting Liu, William Yang Wang
Open-domain dialogue generation suffers from the data insufficiency problem due to the vast size of potential responses.
no code implementations • ECCV 2020 • Tsu-Jui Fu, Xin Eric Wang, Matthew F. Peterson, Scott T. Grafton, Miguel P. Eckstein, William Yang Wang
In particular, we present a model-agnostic adversarial path sampler (APS) that learns to sample challenging paths that force the navigator to improve based on the navigation performance.
no code implementations • LREC 2022 • Alex Mei, Anisha Kabir, Rukmini Bapat, John Judge, Tony Sun, William Yang Wang
Neural text summarization has shown great potential in recent years.
no code implementations • 13 Jan 2025 • Weixi Feng, Chao Liu, Sifei Liu, William Yang Wang, Arash Vahdat, Weili Nie
In addition, we introduce a learnable module to interpolate text embeddings so that users can control semantics in specific frames and obtain smooth object transitions.
no code implementations • 22 Dec 2024 • Jundong Xu, Hao Fei, Meng Luo, Qian Liu, Liangming Pan, William Yang Wang, Preslav Nakov, Mong-Li Lee, Wynne Hsu
In the context of large language models (LLMs), current advanced reasoning methods have made impressive strides in various reasoning tasks.
1 code implementation • 18 Dec 2024 • Xiaobao Wu, Liangming Pan, Yuxi Xie, Ruiwen Zhou, Shuai Zhao, Yubo Ma, Mingzhe Du, Rui Mao, Anh Tuan Luu, William Yang Wang
Data contamination hinders fair LLM evaluation by introducing test data into newer models' training sets.
no code implementations • 15 Dec 2024 • Shengqiong Wu, Hao Fei, Liangming Pan, William Yang Wang, Shuicheng Yan, Tat-Seng Chua
Our framework systematically addresses potential issues in both visual and textual inputs by verifying and integrating perception-level information with cognition-level commonsense knowledge, ensuring more reliable outputs.
1 code implementation • 12 Dec 2024 • Ruiwen Zhou, Wenyue Hua, Liangming Pan, Sitao Cheng, Xiaobao Wu, En Yu, William Yang Wang
This paper introduces RuleArena, a novel and challenging benchmark designed to evaluate the ability of large language models (LLMs) to follow complex, real-world rules in reasoning.
no code implementations • 27 Nov 2024 • Tiffany Zhu, Kexun Zhang, William Yang Wang
The impressive essay writing and problem-solving capabilities of large language models (LLMs) like OpenAI's ChatGPT have opened up new avenues in education.
1 code implementation • 20 Nov 2024 • Mingyu Jin, Weidi Luo, Sitao Cheng, Xinyi Wang, Wenyue Hua, Ruixiang Tang, William Yang Wang, Yongfeng Zhang
Large Language Models (LLMs) have demonstrated strong performance in handling complex tasks requiring both extensive knowledge and reasoning abilities.
1 code implementation • 29 Oct 2024 • Kexun Zhang, Shang Zhou, Danqing Wang, William Yang Wang, Lei LI
To scale up inference efficiently with limited compute, it is crucial to find an optimal allocation for sample compute budgets: Which sampling configurations (model, temperature, language, etc.)
no code implementations • 17 Oct 2024 • Mian Zhang, Xianjun Yang, Xinlu Zhang, Travis Labrum, Jamie C. Chiu, Shaun M. Eack, Fei Fang, William Yang Wang, Zhiyu Zoey Chen
There is a significant gap between patient needs and available mental health support today.
no code implementations • 15 Oct 2024 • Wenda Xu, Rujun Han, Zifeng Wang, Long T. Le, Dhruv Madeka, Lei LI, William Yang Wang, Rishabh Agarwal, Chen-Yu Lee, Tomas Pfister
To address these limitations, we introduce Speculative Knowledge Distillation (SKD), a novel approach that leverages cooperation between student and teacher models to generate high-quality training data on-the-fly while aligning with the student's inference-time distribution.
1 code implementation • 12 Oct 2024 • Yuxi Xie, Anirudh Goyal, Xiaobao Wu, Xunjian Yin, Xiao Xu, Min-Yen Kan, Liangming Pan, William Yang Wang
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally during the generation process.
1 code implementation • 10 Oct 2024 • Gyuwan Kim, Yang Li, Evangelia Spiliopoulou, Jie Ma, Miguel Ballesteros, William Yang Wang
In this paper, we introduce EM-MIA, a novel MIA method for LLMs that iteratively refines membership scores and prefix scores via an expectation-maximization algorithm, leveraging the duality that the estimates of these scores can be improved by each other.
1 code implementation • 10 Oct 2024 • Sitao Cheng, Liangming Pan, Xunjian Yin, Xinyi Wang, William Yang Wang
To support this investigation, we introduce ECHOQA, a benchmark spanning scientific, factual, and commonsense knowledge.
no code implementations • 9 Oct 2024 • Juhyun Oh, Eunsu Kim, Jiseon Kim, Wenda Xu, Inha Cha, William Yang Wang, Alice Oh
Our factor-level analysis reveals a substantial discrepancy between human and LLM preferences in generation tasks, whereas LLMs show strong alignment with human preferences in evaluation tasks.
1 code implementation • 8 Oct 2024 • Jiachen Li, Qian Long, Jian Zheng, Xiaofeng Gao, Robinson Piramuthu, Wenhu Chen, William Yang Wang
In this paper, we focus on enhancing a diffusion-based text-to-video (T2V) model during the post-training phase by distilling a highly capable consistency model from a pretrained T2V model.
2 code implementations • 6 Oct 2024 • Xunjian Yin, Xinyi Wang, Liangming Pan, Xiaojun Wan, William Yang Wang
The rapid advancement of large language models (LLMs) has significantly enhanced the capabilities of AI-driven agents across various tasks.
no code implementations • 29 Sep 2024 • Tiffany Zhu, Iain Weissburg, Kexun Zhang, William Yang Wang
As AI advances in text generation, human trust in AI-generated content remains constrained by biases that go beyond concerns of accuracy.
no code implementations • 29 Aug 2024 • Yi-Lin Tuan, William Yang Wang
Beyond maximum likelihood estimation (MLE), the standard language model (LM) objective that raises the probabilities of good examples, many studies have explored methods that also penalize bad examples to improve the quality of the output distribution, including unlikelihood training, exponential maximizing average treatment effect (ExMATE), and direct preference optimization (DPO).
1 code implementation • 29 Jul 2024 • Canyu Chen, Baixiang Huang, Zekun Li, Zhaorun Chen, Shiyang Lai, Xiongxiao Xu, Jia-Chen Gu, Jindong Gu, Huaxiu Yao, Chaowei Xiao, Xifeng Yan, William Yang Wang, Philip Torr, Dawn Song, Kai Shu
Then, we find that editing attacks can inject both types of misinformation into LLMs, and the effectiveness is particularly high for commonsense misinformation injection.
no code implementations • 22 Jul 2024 • Michael Saxon, Ari Holtzman, Peter West, William Yang Wang, Naomi Saphra
Modern language models (LMs) pose a new challenge in capability assessment.
no code implementations • 20 Jul 2024 • Xinyi Wang, Antonis Antoniades, Yanai Elazar, Alfonso Amayuelas, Alon Albalak, Kexun Zhang, William Yang Wang
Furthermore, while model performance improves across all tasks as LLM size increases, only factual question answering shows an increase in memorization, whereas machine translation and reasoning tasks exhibit greater generalization, producing more novel outputs.
1 code implementation • 19 Jul 2024 • Rujun Han, Yuhao Zhang, Peng Qi, Yumo Xu, Jenyuan Wang, Lan Liu, William Yang Wang, Bonan Min, Vittorio Castelli
Question answering based on retrieval augmented generation (RAG-QA) is an important research topic in NLP and has a wide range of real-world applications.
1 code implementation • 8 Jul 2024 • Luke Yoffe, Alfonso Amayuelas, William Yang Wang
To enhance Large Language Model (LLM) capabilities, multi-agent debates have been introduced, where multiple LLMs discuss solutions to a problem over several rounds of debate.
1 code implementation • 6 Jul 2024 • Zekun Li, Xianjun Yang, Kyuri Choi, Wanrong Zhu, Ryan Hsieh, HyeonJung Kim, Jin Hyuk Lim, Sungyoung Ji, Byungju Lee, Xifeng Yan, Linda Ruth Petzold, Stephen D. Wilson, Woosang Lim, William Yang Wang
The results highlight the high difficulty of these tasks and the significant performance gap among models.
1 code implementation • 2 Jul 2024 • Qiucheng Wu, Handong Zhao, Michael Saxon, Trung Bui, William Yang Wang, Yang Zhang, Shiyu Chang
One understudied capability in VLMs is visual spatial planning -- the ability to comprehend the spatial arrangements of objects and devise action plans to achieve desired outcomes in visual scenes.
no code implementations • 24 Jun 2024 • Aditya Sharma, Michael Saxon, William Yang Wang
We present LoCoVQA, a dynamic benchmark generator for evaluating long-context extractive reasoning in vision language models (VLMs).
no code implementations • 21 Jun 2024 • Kyle Wong, Alfonso Amayuelas, Liangming Pan, William Yang Wang
To explain this behavior, we perform a further analysis and find that contrary to preexisting beliefs, the correlation between reasoning ability and code correction ability is weak.
1 code implementation • 19 Jun 2024 • Danqing Wang, Antonis Antoniades, Kha-Dinh Luong, Edwin Zhang, Mert Kosan, Jiachen Li, Ambuj Singh, William Yang Wang, Lei LI
RLHEX provides a flexible framework to incorporate different human-designed principles into the counterfactual explanation generation process, aligning these explanations with domain expertise.
1 code implementation • 18 Jun 2024 • Wenda Xu, Jiachen Li, William Yang Wang, Lei LI
Direct alignment from preferences (DAP) has emerged as a promising paradigm for aligning large language models (LLMs) to human desiderata from pre-collected, offline preference datasets.
no code implementations • 16 Jun 2024 • Yujie Lu, Dongfu Jiang, Wenhu Chen, William Yang Wang, Yejin Choi, Bill Yuchen Lin
Recent breakthroughs in vision-language models (VLMs) emphasize the necessity of benchmarking human preferences in real-world multimodal interactions.
1 code implementation • 12 Jun 2024 • Xuehai He, Weixi Feng, Kaizhi Zheng, Yujie Lu, Wanrong Zhu, Jiachen Li, Yue Fan, JianFeng Wang, Linjie Li, Zhengyuan Yang, Kevin Lin, William Yang Wang, Lijuan Wang, Xin Eric Wang
Multimodal Large Language Models (MLLMs) demonstrate the emerging abilities of "world models" -- interpreting and reasoning about complex real-world dynamics.
1 code implementation • 12 Jun 2024 • Weixi Feng, Jiachen Li, Michael Saxon, Tsu-Jui Fu, Wenhu Chen, William Yang Wang
Video generation has many unique challenges beyond those of image generation.
no code implementations • 11 Jun 2024 • Xingyu Fu, Muyu He, Yujie Lu, William Yang Wang, Dan Roth
We present a novel task and benchmark for evaluating the ability of text-to-image (T2I) generation models to produce images that align with commonsense in real life, which we call Commonsense-T2I.
no code implementations • 30 May 2024 • Xinlu Zhang, Zhiyu Zoey Chen, Xi Ye, Xianjun Yang, Lichang Chen, William Yang Wang, Linda Ruth Petzold
First, coding data tuning enhances the overall reasoning capabilities of LLMs across different model families and scales.
1 code implementation • 29 May 2024 • Jiachen Li, Weixi Feng, Tsu-Jui Fu, Xinyi Wang, Sugato Basu, Wenhu Chen, William Yang Wang
In this work, we aim to break the quality bottleneck of a video consistency model (VCM) to achieve both fast and high-quality video generation.
2 code implementations • 28 May 2024 • Xiaobao Wu, Thong Nguyen, Delvin Ce Zhang, William Yang Wang, Anh Tuan Luu
We further propose a novel Embedding Transport Plan (ETP) method.
1 code implementation • 23 May 2024 • Yujie Lu, Xiujun Li, Tsu-Jui Fu, Miguel Eckstein, William Yang Wang
The rapid progress in Multimodal Large Language Models (MLLMs) has significantly advanced their ability to process and understand complex visual and textual information.
1 code implementation • 2 May 2024 • Zhiyu Zoey Chen, Jing Ma, Xinlu Zhang, Nan Hao, An Yan, Armineh Nourbakhsh, Xianjun Yang, Julian McAuley, Linda Petzold, William Yang Wang
In the fast-evolving domain of artificial intelligence, large language models (LLMs) such as GPT-3 and GPT-4 are revolutionizing the landscapes of finance, healthcare, and law: domains characterized by their reliance on professional expertise, challenging data acquisition, high stakes, and stringent regulatory compliance.
no code implementations • 23 Apr 2024 • Wanrong Zhu, Jennifer Healey, Ruiyi Zhang, William Yang Wang, Tong Sun
Recent advancements in instruction-following models have made user interactions with models more user-friendly and efficient, broadening their applicability.
1 code implementation • 11 Apr 2024 • Haotian Zhang, Haoxuan You, Philipp Dufter, BoWen Zhang, Chen Chen, Hong-You Chen, Tsu-Jui Fu, William Yang Wang, Shih-Fu Chang, Zhe Gan, Yinfei Yang
While Ferret seamlessly integrates regional understanding into the Large Language Model (LLM) to facilitate its referring and grounding capability, it has certain limitations: it is constrained by the pre-trained fixed visual encoder and fails to perform well on broader tasks.
Ranked #150 on Visual Question Answering on MM-Vet
1 code implementation • 5 Apr 2024 • Michael Saxon, Fatima Jahara, Mahsa Khoshnoodi, Yujie Lu, Aditya Sharma, William Yang Wang
With advances in the quality of text-to-image (T2I) models has come interest in benchmarking their prompt faithfulness -- the semantic coherence of generated images to the prompts they were conditioned on.
no code implementations • 1 Apr 2024 • Yi-Lin Tuan, Xilun Chen, Eric Michael Smith, Louis Martin, Soumya Batra, Asli Celikyilmaz, William Yang Wang, Daniel M. Bikel
As large language models (LLMs) become easily accessible nowadays, the trade-off between safety and helpfulness can significantly impact user experience.
no code implementations • 17 Mar 2024 • Michael Saxon, Yiran Luo, Sharon Levy, Chitta Baral, Yezhou Yang, William Yang Wang
Benchmarks of the multilingual capabilities of text-to-image (T2I) models compare generated images prompted in a test language to an expected image distribution over a concept set.
no code implementations • 16 Mar 2024 • Jiachen Li, Weixi Feng, Wenhu Chen, William Yang Wang
By distilling a latent consistency model (LCM) from a pre-trained teacher latent diffusion model (LDM), LCD facilitates the generation of high-fidelity images within merely 2 to 4 inference steps.
1 code implementation • 29 Feb 2024 • Xiaobao Wu, Liangming Pan, William Yang Wang, Anh Tuan Luu
Knowledge editing injects knowledge updates into language models to keep them correct and up-to-date.
2 code implementations • 28 Feb 2024 • Kexun Zhang, Yee Man Choi, Zhenqiao Song, Taiqi He, William Yang Wang, Lei LI
On the contrary, we observe that 2000 endangered languages, though without a large corpus, have a grammar book or a dictionary.
1 code implementation • 26 Feb 2024 • Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang
A major factor in the recent success of large language models is the use of enormous and ever-growing text datasets for unsupervised pre-training.
1 code implementation • 18 Feb 2024 • Wenda Xu, Guanglei Zhu, Xuandong Zhao, Liangming Pan, Lei LI, William Yang Wang
We discovered that this contradiction is due to LLMs' bias in evaluating their own output.
1 code implementation • 5 Feb 2024 • Xinyi Wang, Alfonso Amayuelas, Kexun Zhang, Liangming Pan, Wenhu Chen, William Yang Wang
To understand how pre-training with a next-token prediction objective contributes to the emergence of such reasoning capability, we propose that we can view an LM as deriving new conclusions by aggregating indirect reasoning paths seen at pre-training time.
1 code implementation • 30 Jan 2024 • Xuandong Zhao, Xianjun Yang, Tianyu Pang, Chao Du, Lei LI, Yu-Xiang Wang, William Yang Wang
In this paper, we propose the weak-to-strong jailbreaking attack, an efficient method to attack aligned LLMs to produce harmful text.
no code implementations • 24 Jan 2024 • Iain Xie Weissburg, Mehir Arora, Xinyi Wang, Liangming Pan, William Yang Wang
As the number of accepted papers at AI and ML conferences reaches into the thousands, it has become unclear how researchers access and read research publications.
1 code implementation • 5 Dec 2023 • Alon Albalak, Liangming Pan, Colin Raffel, William Yang Wang
The data used to pretrain large language models has a decisive impact on a model's downstream performance, which has led to a large body of work on data selection methods that aim to automatically determine the most suitable data to use for pretraining.
1 code implementation • 29 Nov 2023 • Xiujun Li, Yujie Lu, Zhe Gan, Jianfeng Gao, William Yang Wang, Yejin Choi
Recent multimodal large language models (MLLMs) have shown promising instruction-following capabilities on vision-language tasks.
no code implementations • 15 Nov 2023 • Wenda Xu, Daniel Deutsch, Mara Finkelstein, Juraj Juraska, Biao Zhang, Zhongtao Liu, William Yang Wang, Lei LI, Markus Freitag
Recent large language models (LLMs) are leveraging human feedback to improve their generation quality.
no code implementations • 2 Nov 2023 • Xinlu Zhang, Yujie Lu, Weizhi Wang, An Yan, Jun Yan, Lianke Qin, Heng Wang, Xifeng Yan, William Yang Wang, Linda Ruth Petzold
Automatically evaluating vision-language tasks is challenging, especially when it comes to reflecting human judgments due to limitations in accounting for fine-grained details.
1 code implementation • 24 Oct 2023 • Xianjun Yang, Liangming Pan, Xuandong Zhao, Haifeng Chen, Linda Petzold, William Yang Wang, Wei Cheng
The burgeoning capabilities of advanced large language models (LLMs) such as ChatGPT have led to an increase in synthetic content generation with implications across a variety of sectors, including media, cybersecurity, public discourse, and education.
1 code implementation • 19 Oct 2023 • Deepak Nathani, David Wang, Liangming Pan, William Yang Wang
Language Models (LMs) have shown impressive performance in various natural language tasks.
no code implementations • 14 Oct 2023 • Jiachen Li, Qiaozi Gao, Michael Johnston, Xiaofeng Gao, Xuehai He, Suhaila Shakiah, Hangjie Shi, Reza Ghanadan, William Yang Wang
In this work, we tackle the problem of training a robot to understand multimodal prompts, interleaving vision signals with text descriptions.
1 code implementation • 14 Oct 2023 • Alex Mei, Sharon Levy, William Yang Wang
As large language models are integrated into society, robustness toward a suite of prompts is increasingly important to maintain reliability in a high-variance environment. Robustness evaluations must comprehensively encapsulate the various settings in which a user may invoke an intelligent system.
1 code implementation • 11 Oct 2023 • Zhiyu Chen, Yujie Lu, William Yang Wang
Mental illness remains one of the most critical public health issues of our time, due to the severe scarcity and limited accessibility of professionals.
no code implementations • 9 Oct 2023 • Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, William Yang Wang, Alessandro Sordoni
To encourage a more structural generation of CoT steps, we propose a hierarchical generation scheme: we let the LM generate a planning token at the start of each reasoning step, intuitively serving as a high-level plan of the current step, and add their embeddings to the model parameters.
1 code implementation • 8 Oct 2023 • Xianjun Yang, Kexun Zhang, Haifeng Chen, Linda Petzold, William Yang Wang, Wei Cheng
We then modify the previous zero-shot text detection method, DetectGPT (Mitchell et al., 2023) by utilizing a surrogate white-box model to estimate the probability of the rightmost tokens, allowing us to identify code snippets generated by language models.
no code implementations • 4 Oct 2023 • Xianjun Yang, Xiao Wang, Qi Zhang, Linda Petzold, William Yang Wang, Xun Zhao, Dahua Lin
This study serves as a clarion call for a collective effort to overhaul and fortify the safety of open-source LLMs against malicious attackers.
2 code implementations • 29 Sep 2023 • Tsu-Jui Fu, Wenze Hu, Xianzhi Du, William Yang Wang, Yinfei Yang, Zhe Gan
Extensive experimental results demonstrate that expressive instructions are crucial to instruction-based image editing, and our MGIE can lead to a notable improvement in automatic metrics and human evaluation while maintaining competitive inference efficiency.
1 code implementation • 6 Aug 2023 • Liangming Pan, Michael Saxon, Wenda Xu, Deepak Nathani, Xinyi Wang, William Yang Wang
Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tasks.
1 code implementation • 12 Jul 2023 • Raphael Schumann, Wanrong Zhu, Weixi Feng, Tsu-Jui Fu, Stefan Riezler, William Yang Wang
In this work, we propose VELMA, an embodied LLM agent that uses a verbalization of the trajectory and of visual environment observations as contextual prompt for the next action.
1 code implementation • 2 Jun 2023 • Michael Saxon, William Yang Wang
We propose "Conceptual Coverage Across Languages" (CoCo-CroLa), a technique for benchmarking the degree to which any generative text-to-image system provides multilingual parity to its training language in terms of tangible nouns.
no code implementations • 30 May 2023 • Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henghui Zhu, Yuhao Zhang, Alexander Hanbo Li, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Dan Roth, Bing Xiang
The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge.
1 code implementation • 27 May 2023 • Xianjun Yang, Wei Cheng, Yue Wu, Linda Petzold, William Yang Wang, Haifeng Chen
However, this progress also presents a significant challenge in detecting the origin of a given text, and current research on detection methods lags behind the rapid evolution of LLMs.
1 code implementation • NeurIPS 2023 • Weixi Feng, Wanrong Zhu, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Xuehai He, Sugato Basu, Xin Eric Wang, William Yang Wang
When combined with a downstream image generation model, LayoutGPT outperforms text-to-image models/systems by 20-40% and achieves performance comparable to that of human users in designing visual layouts for numerical and spatial correctness.
1 code implementation • NeurIPS 2023 • Kexun Zhang, Danqing Wang, Jingtao Xia, William Yang Wang, Lei LI
To address these challenges, we propose ALGO, a framework that synthesizes Algorithmic programs with LLM-Generated Oracles to guide the generation and verify their correctness.
1 code implementation • 23 May 2023 • Yikang Pan, Liangming Pan, Wenhu Chen, Preslav Nakov, Min-Yen Kan, William Yang Wang
In this paper, we comprehensively investigate the potential misuse of modern Large Language Models (LLMs) for generating credible-sounding misinformation and its subsequent impact on information-intensive applications, particularly Open-Domain Question Answering (ODQA) systems.
1 code implementation • 23 May 2023 • SiQi Liu, Weixi Feng, Tsu-Jui Fu, Wenhu Chen, William Yang Wang
Making image retrieval methods practical for real-world search applications requires significant progress in dataset scales, entity comprehension, and multimodal information fusion.
1 code implementation • 23 May 2023 • Vaishnavi Himakunthala, Andy Ouyang, Daniel Rose, Ryan He, Alex Mei, Yujie Lu, Chinmay Sonar, Michael Saxon, William Yang Wang
Despite exciting recent results showing vision-language systems' capacity to reason about images using natural language, their capacity for video reasoning remains under-explored.
no code implementations • 23 May 2023 • Tsu-Jui Fu, Wenhan Xiong, Yixin Nie, Jingyu Liu, Barlas Oğuz, William Yang Wang
To address this T3H task, we propose Compositional Cross-modal Human (CCH).
Ranked #1 on Text-to-3D-Human Generation on SHHQ
1 code implementation • 23 May 2023 • Shuo Zhang, Liangming Pan, Junzhou Zhao, William Yang Wang
Large language models often necessitate grounding on external knowledge to generate faithful and reliable answers.
2 code implementations • 23 May 2023 • Wenda Xu, Danqing Wang, Liangming Pan, Zhenqiao Song, Markus Freitag, William Yang Wang, Lei LI
By harnessing both explicit human instruction and the implicit knowledge of GPT-4, we fine-tune a text evaluation metric based on LLaMA, producing both a score for generated text and a human-readable diagnostic report.
1 code implementation • 22 May 2023 • Liangming Pan, Xiaobao Wu, Xinyuan Lu, Anh Tuan Luu, William Yang Wang, Min-Yen Kan, Preslav Nakov
Fact-checking real-world claims often requires collecting multiple pieces of evidence and applying complex multi-step reasoning.
1 code implementation • 20 May 2023 • Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
We also introduce a self-refinement module, which utilizes the symbolic solver's error messages to revise symbolic formalizations.
no code implementations • 18 May 2023 • Wanrong Zhu, Xinyi Wang, Yujie Lu, Tsu-Jui Fu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
We conduct a series of experiments to compare the common edits made by humans and GPT-k, evaluate the performance of GPT-k in prompting T2I, and examine factors that may influence this process.
1 code implementation • NeurIPS 2023 • Yujie Lu, Xianjun Yang, Xiujun Li, Xin Eric Wang, William Yang Wang
Existing automatic evaluation on text-to-image synthesis can only provide an image-text matching score, without considering the object-level compositionality, which results in poor correlation with human judgments.
1 code implementation • 18 May 2023 • Xuehai He, Weixi Feng, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang
Diffusion models, such as Stable Diffusion, have shown incredible performance on text-to-image generation.
no code implementations • 18 May 2023 • Avani Tanna, Michael Saxon, Amr El Abbadi, William Yang Wang
Voice conversion (VC) models have demonstrated impressive few-shot conversion quality on the clean, native speech populations they're trained on.
no code implementations • 3 May 2023 • Daniel Rose, Vaishnavi Himakunthala, Andy Ouyang, Ryan He, Alex Mei, Yujie Lu, Michael Saxon, Chinmay Sonar, Diba Mirza, William Yang Wang
Recent advances in large language models elicit reasoning in a chain-of-thought that allows models to decompose problems in a human-like fashion.
1 code implementation • 2 May 2023 • Yujie Lu, Pan Lu, Zhiyu Chen, Wanrong Zhu, Xin Eric Wang, William Yang Wang
The key challenges of MPP are to ensure the informativeness, temporal coherence, and accuracy of plans across modalities.
1 code implementation • NeurIPS 2023 • Wanrong Zhu, Jack Hessel, Anas Awadalla, Samir Yitzhak Gadre, Jesse Dodge, Alex Fang, Youngjae Yu, Ludwig Schmidt, William Yang Wang, Yejin Choi
We release Multimodal C4, an augmentation of the popular text-only C4 corpus with images interleaved.
no code implementations • 9 Mar 2023 • Alex Mei, Michael Saxon, Shiyu Chang, Zachary C. Lipton, William Yang Wang
We conduct a broad literature survey, identifying many clusters of similar conceptions of transparency, tying each back to our north star with analysis of how it furthers or hinders our ideal AI transparency goals.
1 code implementation • 5 Feb 2023 • Kexun Zhang, Xianjun Yang, William Yang Wang, Lei LI
Diffusion models show promising generation capability for a variety of data.
1 code implementation • NeurIPS 2023 • Alon Albalak, Colin Raffel, William Yang Wang
In this work, we focus on Few-shot Learning with Auxiliary Data (FLAD), a training paradigm that assumes access to auxiliary data during few-shot learning in hopes of improving generalization.
1 code implementation • NeurIPS 2023 • Xinyi Wang, Wanrong Zhu, Michael Saxon, Mark Steyvers, William Yang Wang
This study aims to examine the in-context learning phenomenon through a Bayesian lens, viewing real-world LLMs as latent variable models.
1 code implementation • 25 Jan 2023 • Kung-Hsiang Huang, Siffi Singh, Xiaofei Ma, Wei Xiao, Feng Nan, Nicholas Dingwall, William Yang Wang, Kathleen McKeown
Missing information is a common issue of dialogue summarization where some information in the reference summaries is not covered in the generated summaries.
2 code implementations • 21 Jan 2023 • Shuaichen Chang, Jun Wang, Mingwen Dong, Lin Pan, Henghui Zhu, Alexander Hanbo Li, Wuwei Lan, Sheng Zhang, Jiarong Jiang, Joseph Lilien, Steve Ash, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Bing Xiang
Neural text-to-SQL models have achieved remarkable performance in translating natural language questions into SQL queries.
1 code implementation • 20 Dec 2022 • Yi-Lin Tuan, Alon Albalak, Wenda Xu, Michael Saxon, Connor Pryor, Lise Getoor, William Yang Wang
Despite their widespread adoption, neural conversation models have yet to exhibit natural chat capabilities with humans.
1 code implementation • 19 Dec 2022 • Kaiser Sun, Peng Qi, Yuhao Zhang, Lan Liu, William Yang Wang, Zhiheng Huang
We show that, with consistent tokenization, the model performs better in both in-domain and out-of-domain datasets, with a notable average of +1.7 F2 gain when a BART model is trained on SQuAD and evaluated on 8 QA datasets.
1 code implementation • 19 Dec 2022 • Alex Mei, Sharon Levy, William Yang Wang
Users' physical safety is an increasing concern as the market for intelligent systems continues to grow, where unconstrained systems may recommend users dangerous actions that can lead to serious injury.
1 code implementation • 19 Dec 2022 • Wenda Xu, Xian Qian, Mingxuan Wang, Lei LI, William Yang Wang
In this paper, we propose SESCORE2, a self-supervised approach for training a model-based metric for text generation evaluation.
no code implementations • 17 Dec 2022 • Jifan Chen, Yuhao Zhang, Lan Liu, Rui Dong, Xinchi Chen, Patrick Ng, William Yang Wang, Zhiheng Huang
There has been great progress in unifying various table-to-text tasks using a single encoder-decoder model trained via multi-task learning (Xie et al., 2022).
1 code implementation • 9 Dec 2022 • Weixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, William Yang Wang
In this work, we improve the compositional skills of T2I models, specifically more accurate attribute binding and better image compositions.
no code implementations • 29 Nov 2022 • Jiachen Li, Edwin Zhang, Ming Yin, Qinxun Bai, Yu-Xiang Wang, William Yang Wang
Behavior constrained policy optimization has been demonstrated to be a successful paradigm for tackling Offline Reinforcement Learning.
1 code implementation • CVPR 2023 • Tsu-Jui Fu, Licheng Yu, Ning Zhang, Cheng-Yang Fu, Jong-Chyi Su, William Yang Wang, Sean Bell
Inspired by this, we introduce a novel task, text-guided video completion (TVC), which requests the model to generate a video from partial frames guided by an instruction.
Ranked #3 on Video Prediction on BAIR Robot Pushing
no code implementations • 25 Oct 2022 • Gyuwan Kim, Jinhyuk Lee, Barlas Oguz, Wenhan Xiong, Yizhe Zhang, Yashar Mehdad, William Yang Wang
Building dense retrievers requires a series of standard procedures, including training and validating neural models and creating indexes for efficient search.
no code implementations • 21 Oct 2022 • Matthew Ho, Aditya Sharma, Justin Chang, Michael Saxon, Sharon Levy, Yujie Lu, William Yang Wang
As large language models (LLMs) grow larger and more sophisticated, assessing their "reasoning" capabilities in natural language grows more challenging.
no code implementations • 21 Oct 2022 • Josiah Ross, Luke Yoffe, Alon Albalak, William Yang Wang
Transfer learning is an exciting area of Natural Language Processing that has the potential to both improve model performance and increase data efficiency.
no code implementations • 19 Oct 2022 • Xuehai He, Diji Yang, Weixi Feng, Tsu-Jui Fu, Arjun Akula, Varun Jampani, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang
Prompt tuning is a new few-shot transfer learning technique that only tunes the learnable prompt for pre-trained vision and language models such as CLIP.
no code implementations • 18 Oct 2022 • Sharon Levy, Emily Allaway, Melanie Subbiah, Lydia Chilton, Desmond Patton, Kathleen McKeown, William Yang Wang
Understanding what constitutes safe text is an important issue in natural language processing and can often prevent the deployment of models deemed harmful and unsafe.
1 code implementation • 18 Oct 2022 • Weixi Feng, Tsu-Jui Fu, Yujie Lu, William Yang Wang
Vision-and-Language Navigation (VLN) is a task to guide an embodied agent moving to a target position using language instructions.
no code implementations • 17 Oct 2022 • Alex Mei, Anisha Kabir, Sharon Levy, Melanie Subbiah, Emily Allaway, John Judge, Desmond Patton, Bruce Bimber, Kathleen McKeown, William Yang Wang
An increasingly prevalent problem for intelligent technologies is text safety, as uncontrolled systems may generate recommendations to their users that lead to injury or life-threatening consequences.
1 code implementation • 12 Oct 2022 • Xiyang Hu, Xinchi Chen, Peng Qi, Deguang Kong, Kunlun Liu, William Yang Wang, Zhiheng Huang
Multilingual information retrieval (IR) is challenging since annotated training data is costly to obtain in many languages.
no code implementations • 11 Oct 2022 • An Yan, Jiacheng Li, Wanrong Zhu, Yujie Lu, William Yang Wang, Julian McAuley
However, the application of its text encoder solely for text understanding has been less explored.
1 code implementation • 10 Oct 2022 • Wenda Xu, YiLin Tuan, Yujie Lu, Michael Saxon, Lei Li, William Yang Wang
Is it possible to build a general and automatic natural language generation (NLG) evaluation metric?
1 code implementation • 7 Oct 2022 • Zhiyu Chen, Shiyang Li, Charese Smiley, Zhiqiang Ma, Sameena Shah, William Yang Wang
With the recent advance in large pre-trained language models, researchers have achieved record performances in NLP tasks that mostly focus on language pattern matching.
Ranked #2 on Question Answering on ConvFinQA
1 code implementation • NeurIPS 2023 • Zih-Yun Chiu, Yi-Lin Tuan, William Yang Wang, Michael C. Yip
In this work, we present Knowledge-Grounded RL (KGRL), an RL paradigm fusing multiple knowledge policies and aiming for human-like efficiency and flexibility.
no code implementations • 7 Oct 2022 • Yi-Lin Tuan, Zih-Yun Chiu, William Yang Wang
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data that involves multiple sub-components in a flexible and interpretable fashion.
1 code implementation • 7 Oct 2022 • Wanrong Zhu, An Yan, Yujie Lu, Wenda Xu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
Recent advances in text-to-image synthesis make it possible to visualize machine imaginations for a given context.
no code implementations • 10 Sep 2022 • Yujie Lu, Huiliang Zhang, Ping Nie, Weixi Feng, Wenda Xu, Xin Eric Wang, William Yang Wang
In this paper, we propose Unseen Discrepancy Anticipating Vision and Language Navigation (DAVIS), which learns to generalize to unseen environments by encouraging test-time visual consistency.
1 code implementation • CVPR 2023 • Tsu-Jui Fu, Linjie Li, Zhe Gan, Kevin Lin, William Yang Wang, Lijuan Wang, Zicheng Liu
Masked visual modeling (MVM) has been recently proven effective for visual pre-training.
Ranked #1 on Video Question Answering on LSMDC-MC
1 code implementation • 10 Jun 2022 • Xinyi Wang, Michael Saxon, Jiachen Li, Hongyang Zhang, Kun Zhang, William Yang Wang
While machine learning models rapidly advance the state-of-the-art on various real-world tasks, out-of-domain (OOD) generalization remains a challenging problem given the vulnerability of these models to spurious correlations.
no code implementations • 6 Jun 2022 • Yujie Lu, Weixi Feng, Wanrong Zhu, Wenda Xu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
Procedural planning aims to implement complex high-level goals by decomposition into sequential simpler low-level steps.
1 code implementation • LREC 2022 • Samhita Honnavalli, Aesha Parekh, Lily Ou, Sophie Groenwold, Sharon Levy, Vicente Ordonez, William Yang Wang
Our results show that GPT-2 amplifies bias by considering women as junior and men as senior more often than the ground truth in both domains.
1 code implementation • 12 May 2022 • Alon Albalak, Yi-Lin Tuan, Pegah Jandaghi, Connor Pryor, Luke Yoffe, Deepak Ramachandran, Lise Getoor, Jay Pujara, William Yang Wang
Task transfer, transferring knowledge contained in related tasks, holds the promise of reducing the quantity of labeled data required to fine-tune language models.
no code implementations • Findings (NAACL) 2022 • Zhiyu Chen, Bing Liu, Seungwhan Moon, Chinnadhurai Sankar, Paul Crook, William Yang Wang
We also propose two new models, SimpleToDPlus and Combiner, for the proposed task.
no code implementations • Findings (ACL) 2022 • Kai Nakamura, Sharon Levy, Yi-Lin Tuan, Wenhu Chen, William Yang Wang
A pressing challenge in current dialogue systems is to successfully converse with users on topics with information distributed across different modalities.
1 code implementation • NAACL 2022 • Yujie Lu, Wanrong Zhu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
Human brains integrate linguistic and perceptual information simultaneously to understand natural language, and hold the critical ability to render imaginations.
no code implementations • COLING 2022 • Wanrong Zhu, Bo Pang, Ashish V. Thapliyal, William Yang Wang, Radu Soricut
Dense video captioning aims to identify the events of interest in an input video, and generate descriptive captions for each event.
Ranked #4 on Dense Video Captioning on ViTT (CIDEr metric, using extra training data)
1 code implementation • Findings (ACL) 2022 • Yi-Lin Tuan, Sajjad Beygi, Maryam Fazel-Zarandi, Qiaozi Gao, Alessandra Cervone, William Yang Wang
Our proposed method allows a single transformer model to directly walk on a large-scale knowledge graph to generate responses.
1 code implementation • 26 Jan 2022 • Alon Albalak, Sharon Levy, William Yang Wang
Open-retrieval question answering systems are generally trained and tested on large datasets in well-established domains.
no code implementations • 16 Dec 2021 • Michael Saxon, Xinyi Wang, Wenda Xu, William Yang Wang
Building natural language inference (NLI) benchmarks that are both challenging for modern techniques and free from shortcut biases is difficult.
no code implementations • 2 Dec 2021 • Wenqiao Zhang, Xin Eric Wang, Siliang Tang, Haizhou Shi, Haocheng Shi, Jun Xiao, Yueting Zhuang, William Yang Wang
Such a setting can help explain the decisions of captioning models and prevent the model from hallucinating object words in its description.
1 code implementation • 24 Nov 2021 • Tsu-Jui Fu, Linjie Li, Zhe Gan, Kevin Lin, William Yang Wang, Lijuan Wang, Zicheng Liu
Further, unlike previous studies that found pre-training tasks on video inputs (e.g., masked frame modeling) not very effective, we design a new pre-training task, Masked Visual-token Modeling (MVM), for better video modeling.
Ranked #21 on Zero-Shot Video Retrieval on DiDeMo
no code implementations • 22 Oct 2021 • Yujie Lu, Ping Nie, Shengyu Zhang, Ming Zhao, Ruobing Xie, William Yang Wang, Yi Ren
However, existing work is primarily built upon pre-defined retrieval channels, including User-CF (U2U), Item-CF (I2I), and Embedding-based Retrieval (U2I), and thus has access only to the limited correlations between users and items that derive solely from partial information about latent interactions.
no code implementations • 22 Oct 2021 • Jiachen Li, Shuo Cheng, Zhenyu Liao, Huayan Wang, William Yang Wang, Qinxun Bai
Improving the sample efficiency of reinforcement learning algorithms requires effective exploration.
1 code implementation • 15 Oct 2021 • Liangming Pan, Wenhu Chen, Min-Yen Kan, William Yang Wang
We curate both human-written and model-generated false documents that we inject into the evidence corpus of QA models and assess the impact on the performance of these systems.
1 code implementation • EMNLP (ACL) 2021 • Sharon Levy, Kevin Mo, Wenhan Xiong, William Yang Wang
In this work, we present such a system for the emergent domain of COVID-19.
1 code implementation • 6 Oct 2021 • Wenda Xu, Michael Saxon, Misha Sra, William Yang Wang
This is a particularly notable issue in the medical domain, where laypeople are often confused by medical text online.
1 code implementation • EMNLP 2021 • Alex Jones, William Yang Wang, Kyle Mahowald
We verify some of our linguistic findings by looking at the effect of morphological segmentation on English-Inuktitut alignment, in addition to examining the effect of word order agreement on isomorphism for 66 zero-shot language pairs from a different corpus.
1 code implementation • NLP4ConvAI (ACL) 2022 • Alon Albalak, Varun Embar, Yi-Lin Tuan, Lise Getoor, William Yang Wang
Existing research studies on cross-sentence relation extraction in long-form multi-party conversations aim to improve relation extraction without considering the explainability of such methods.
Ranked #7 on Dialog Relation Extraction on DialogRE
1 code implementation • EMNLP 2021 • Zhiyu Chen, Wenhu Chen, Charese Smiley, Sameena Shah, Iana Borova, Dylan Langdon, Reema Moussa, Matt Beane, Ting-Hao Huang, Bryan Routledge, William Yang Wang
In contrast to existing tasks in the general domain, the finance domain includes complex numerical reasoning and understanding of heterogeneous representations.
Ranked #4 on Question Answering on FinQA
1 code implementation • 13 Aug 2021 • Wenhu Chen, Xinyi Wang, William Yang Wang
Many facts evolve over time.
no code implementations • ACL 2021 • Qingfu Zhu, Wei-Nan Zhang, Ting Liu, William Yang Wang
Generating open-domain conversational responses in the desired style usually suffers from the lack of p