1 code implementation • 17 Feb 2025 • Ling Yang, Xinchen Zhang, Ye Tian, Chenming Shang, Minghao Xu, Wentao Zhang, Bin Cui
The remarkable success of the autoregressive paradigm has driven significant advances in Multimodal Large Language Models (MLLMs), with powerful models like Show-o, Transfusion, and Emu3 achieving notable progress in unified image understanding and generation.
no code implementations • 12 Feb 2025 • Zhao Wang, Hao Wen, Lingting Zhu, Chenming Shang, Yujiu Yang, Qi Dou
In the first stage, we develop a base model capable of integrating the source character with the target scene using pose guidance.
1 code implementation • 25 Oct 2024 • Hengyuan Zhang, Chenming Shang, Sizhe Wang, Dongdong Zhang, Feng Yao, Renliang Sun, Yiyao Yu, Yujiu Yang, Furu Wei
Although fine-tuning Large Language Models (LLMs) with multilingual data can rapidly enhance their multilingual capabilities, these models still exhibit a performance gap between the dominant language (e.g., English) and non-dominant ones due to the imbalance of training data across languages.
1 code implementation • CVPR 2024 • Chenming Shang, Shiji Zhou, Hengyuan Zhang, Xinzhe Ni, Yujiu Yang, Yuwang Wang
Concept Bottleneck Models (CBMs) map the black-box visual representations extracted by deep neural networks onto a set of interpretable concepts and use the concepts to make predictions, enhancing the transparency of the decision-making process.
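As a rough illustration of the concept-bottleneck idea summarized above (a minimal sketch, not the authors' implementation; the backbone, feature dimension, and concept/class counts are assumed), a CBM can be expressed as a model whose label prediction depends only on an intermediate layer of interpretable concept activations:

```python
# Minimal concept bottleneck sketch (illustrative only; dimensions are assumed).
import torch
import torch.nn as nn

class ConceptBottleneckModel(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int,
                 num_concepts: int, num_classes: int):
        super().__init__()
        self.backbone = backbone                                 # black-box feature extractor
        self.concept_head = nn.Linear(feat_dim, num_concepts)    # features -> concept scores
        self.label_head = nn.Linear(num_concepts, num_classes)   # labels predicted from concepts only

    def forward(self, x):
        feats = self.backbone(x)                                 # opaque visual representation
        concepts = torch.sigmoid(self.concept_head(feats))       # interpretable concept activations
        logits = self.label_head(concepts)                       # decision depends only on concepts
        return concepts, logits
```

Because the final prediction is a function of the concept activations alone, inspecting (or intervening on) those activations exposes the basis of each decision.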
no code implementations • 13 Apr 2024 • Chenming Shang, Hengyuan Zhang, Hao Wen, Yujiu Yang
Multimodal deep neural networks, represented by CLIP, have enabled rich downstream applications owing to their excellent performance, making an understanding of CLIP's decision-making process an essential research topic.
1 code implementation • 12 Mar 2024 • Hengyuan Zhang, Zitao Liu, Chenming Shang, Dawei Li, Yong Jiang
However, the inherent black-box nature of deep learning techniques often poses a hurdle for teachers to fully embrace the model's prediction results.
1 code implementation • 11 Mar 2024 • Hengyuan Zhang, Zitao Liu, Shuyan Huang, Chenming Shang, Bojun Zhan, Yong Jiang
Knowledge tracing (KT) aims to estimate students' knowledge mastery based on their historical interactions.
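To make the task concrete, the sketch below shows a minimal recurrent knowledge-tracing model in the spirit of deep knowledge tracing (a hedged illustration, not the paper's method; the embedding scheme and hidden size are assumptions):

```python
# Minimal knowledge-tracing sketch (illustrative; not the paper's model).
import torch
import torch.nn as nn

class SimpleKT(nn.Module):
    """Predicts the probability of answering each skill correctly from past interactions."""
    def __init__(self, num_skills: int, hidden_dim: int = 64):
        super().__init__()
        self.num_skills = num_skills
        # Each interaction (skill, correctness) is encoded as one of 2 * num_skills ids.
        self.interaction_emb = nn.Embedding(2 * num_skills, hidden_dim)
        self.rnn = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_skills)

    def forward(self, skills, correct):
        # skills, correct: (batch, seq_len) LongTensors; correct holds 0/1 responses.
        x = self.interaction_emb(skills + self.num_skills * correct)
        h, _ = self.rnn(x)
        return torch.sigmoid(self.out(h))  # per-step mastery estimate for every skill
```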
no code implementations • 9 Jun 2023 • Hengyuan Zhang, Dawei Li, Yanran Li, Chenming Shang, Chufan Shi, Yong Jiang
The standard definition generation task requires automatically producing monolingual definitions (e.g., English definitions for English words), but ignores that the generated definitions may also contain words that are unfamiliar to language learners.
no code implementations • 3 May 2020 • Kai Zhang, Shuhang Gu, Radu Timofte, Taizhang Shang, Qiuju Dai, Shengchen Zhu, Tong Yang, Yandong Guo, Younghyun Jo, Sejong Yang, Seon Joo Kim, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Jing Liu, Kwangjin Yoon, Taegyun Jeon, Kazutoshi Akita, Takeru Ooba, Norimichi Ukita, Zhipeng Luo, Yuehan Yao, Zhenyu Xu, Dongliang He, Wenhao Wu, Yukang Ding, Chao Li, Fu Li, Shilei Wen, Jianwei Li, Fuzhi Yang, Huan Yang, Jianlong Fu, Byung-Hoon Kim, JaeHyun Baek, Jong Chul Ye, Yuchen Fan, Thomas S. Huang, Junyeop Lee, Bokyeung Lee, Jungki Min, Gwantae Kim, Kanghyu Lee, Jaihyun Park, Mykola Mykhailych, Haoyu Zhong, Yukai Shi, Xiaojun Yang, Zhijing Yang, Liang Lin, Tongtong Zhao, Jinjia Peng, Huibing Wang, Zhi Jin, Jiahao Wu, Yifu Chen, Chenming Shang, Huanrong Zhang, Jeongki Min, Hrishikesh P. S, Densen Puthussery, Jiji C. V
This paper reviews the NTIRE 2020 challenge on perceptual extreme super-resolution with focus on proposed solutions and results.