Search Results for author: Chenming Shang

Found 9 papers, 5 papers with code

HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation

1 code implementation • 17 Feb 2025 • Ling Yang, Xinchen Zhang, Ye Tian, Chenming Shang, Minghao Xu, Wentao Zhang, Bin Cui

The remarkable success of the autoregressive paradigm has driven significant advances in Multimodal Large Language Models (MLLMs), with powerful models like Show-o, Transfusion, and Emu3 achieving notable progress in unified image understanding and generation.

AnyCharV: Bootstrap Controllable Character Video Generation with Fine-to-Coarse Guidance

no code implementations • 12 Feb 2025 • Zhao Wang, Hao Wen, Lingting Zhu, Chenming Shang, Yujiu Yang, Qi Dou

In the first stage, we develop a base model capable of integrating the source character with the target scene using pose guidance.

Video Generation

ShifCon: Enhancing Non-Dominant Language Capabilities with a Shift-based Contrastive Framework

1 code implementation • 25 Oct 2024 • Hengyuan Zhang, Chenming Shang, Sizhe Wang, Dongdong Zhang, Feng Yao, Renliang Sun, Yiyao Yu, Yujiu Yang, Furu Wei

Although fine-tuning Large Language Models (LLMs) with multilingual data can rapidly enhance their multilingual capabilities, LLMs still exhibit a performance gap between the dominant language (e.g., English) and non-dominant ones due to the imbalance of training data across languages.

Contrastive Learning

Incremental Residual Concept Bottleneck Models

1 code implementation • CVPR 2024 • Chenming Shang, Shiji Zhou, Hengyuan Zhang, Xinzhe Ni, Yujiu Yang, Yuwang Wang

Concept Bottleneck Models (CBMs) map the black-box visual representations extracted by deep neural networks onto a set of interpretable concepts and use the concepts to make predictions, enhancing the transparency of the decision-making process.

Decision Making • Descriptive
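The CBM abstract above describes a two-stage prediction pipeline: project black-box visual representations onto a set of interpretable concept scores, then make the final prediction from those concepts alone. A minimal NumPy sketch of that data flow, with all dimensions and weight matrices hypothetical (they are not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a 512-d backbone representation,
# 10 named concepts, 3 output classes.
d_repr, n_concepts, n_classes = 512, 10, 3

# Concept projection: maps the black-box representation to concept scores.
W_concept = rng.normal(size=(n_concepts, d_repr))
# Label head: predicts the class from the concept scores alone.
W_label = rng.normal(size=(n_classes, n_concepts))

def concept_bottleneck(x):
    """Predict through an interpretable bottleneck: x -> concepts -> label."""
    concepts = W_concept @ x        # interpretable intermediate scores
    logits = W_label @ concepts     # the label sees only the concepts
    return concepts, logits

x = rng.normal(size=d_repr)
concepts, logits = concept_bottleneck(x)
print(concepts.shape, logits.shape)  # (10,) (3,)
```

Because the label head sees only the concept scores, each prediction can be inspected concept by concept; that restriction is the source of the transparency the abstract mentions.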

Understanding Multimodal Deep Neural Networks: A Concept Selection View

no code implementations • 13 Apr 2024 • Chenming Shang, Hengyuan Zhang, Hao Wen, Yujiu Yang

Multimodal deep neural networks, represented by CLIP, have generated rich downstream applications owing to their excellent performance, making the decision-making process of CLIP an essential research topic.

Decision Making

Assisting Language Learners: Automated Trans-Lingual Definition Generation via Contrastive Prompt Learning

no code implementations • 9 Jun 2023 • Hengyuan Zhang, Dawei Li, Yanran Li, Chenming Shang, Chufan Shi, Yong Jiang

The standard definition generation task requires automatically producing monolingual definitions (e.g., English definitions for English words), but ignores that the generated definitions may themselves contain words unfamiliar to language learners.

Machine Translation • Prompt Learning
