1 code implementation • 20 Jan 2025 • Zhiqi Li, Guo Chen, Shilong Liu, Shihao Wang, Vibashan VS, Yishen Ji, Shiyi Lan, Hao Zhang, Yilin Zhao, Subhashree Radhakrishnan, Nadine Chang, Karan Sapra, Amala Sanjay Deshmukh, Tuomas Rintamaki, Matthieu Le, Ilia Karmanov, Lukas Voegtle, Philipp Fischer, De-An Huang, Timo Roman, Tong Lu, Jose M. Alvarez, Bryan Catanzaro, Jan Kautz, Andrew Tao, Guilin Liu, Zhiding Yu
Recently, promising progress has been made by open-source vision-language models (VLMs) in bringing their capabilities closer to those of proprietary frontier models.
no code implementations • 14 Oct 2024 • Han Wang, Yilin Zhao, Dian Li, Xiaohan Wang, Gang Liu, Xuguang Lan, Hui Wang
Humor is a culturally nuanced aspect of human language that presents challenges for understanding and generation, requiring participants to possess good creativity and strong associative thinking.
2 code implementations • 28 Aug 2024 • Min Shi, Fuxiao Liu, Shihao Wang, Shijia Liao, Subhashree Radhakrishnan, Yilin Zhao, De-An Huang, Hongxu Yin, Karan Sapra, Yaser Yacoob, Humphrey Shi, Bryan Catanzaro, Andrew Tao, Jan Kautz, Zhiding Yu, Guilin Liu
The ability to accurately interpret complex visual information is a crucial topic of multimodal large language models (MLLMs).
no code implementations • 15 Jul 2024 • Xiaohan Wang, Dian Li, Yilin Zhao, Sinbadliu, Hui Wang
Training on solution paths is also hindered by the high cost of expert annotations and the difficulty of generalizing to new tools.
1 code implementation • arXiv 2024 • Lin Xu, Yilin Zhao, Daquan Zhou, Zhijie Lin, See Kiong Ng, Jiashi Feng
PLLaVA achieves new state-of-the-art performance on modern benchmark datasets for both video question-answer and captioning tasks.
no code implementations • 12 Nov 2023 • Yilin Zhao, Xinbin Yuan, ShangHua Gao, Zhijie Lin, Qibin Hou, Jiashi Feng, Daquan Zhou
For MoV, we use text-to-speech (TTS) algorithms with a variety of pre-defined tones and automatically select the best-matching one based on the user-provided text description.
no code implementations • 27 Oct 2023 • Yilin Zhao, Hai Zhao, Sufeng Duan
Multi-choice Machine Reading Comprehension (MRC) is a major and challenging task in which machines answer questions according to the provided options.
1 code implementation • ACL 2022 • Yilin Zhao, Hai Zhao, Libin Shen, Yinggong Zhao
As a broad and major category in machine reading comprehension (MRC), the generalized goal of discriminative MRC is answer prediction from the given materials.
1 code implementation • Findings (ACL) 2021 • Kuicai Dong, Yilin Zhao, Aixin Sun, Jung-jae Kim, XiaoLi Li
Both the DocOIE dataset and the DocIE model are publicly released.
Ranked #1 on Open Information Extraction on DocOIE-transportation
1 code implementation • 7 Dec 2020 • Yilin Zhao, Zhuosheng Zhang, Hai Zhao
Thus we propose a novel reference-based knowledge enhancement model called Reference Knowledgeable Network (RekNet), which simulates human reading strategies to refine critical information from the passage and quote explicit knowledge when necessary.