no code implementations • 12 May 2025 • Yuanhang Yang, Chaozheng Wang, Jing Li
Sparse Mixture of Experts (MoE) architectures have emerged as a promising approach for scaling Transformer models.
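A minimal sketch of the general technique this entry builds on, a sparse MoE layer with top-k token routing, is shown below. It is an illustrative implementation of the standard idea, not the method proposed in the paper, and all names are my own.

```python
# Minimal sketch of a sparse Mixture-of-Experts layer with top-k routing.
# Illustrative of the general technique only, not the paper's method.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # token-to-expert gating scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model); each token is processed by its top-k experts only
        gate_logits = self.router(x)
        weights, indices = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out
```

Because only top_k of the experts run per token, compute grows sub-linearly with the number of experts, which is the scaling appeal noted above.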
no code implementations • 19 Apr 2025 • Man Ho Lam, Chaozheng Wang, Jen-tse Huang, Michael R. Lyu
Large Language Models (LLMs) have recently demonstrated strong capabilities in code-related tasks, yet their robustness in code comprehension and reasoning remains insufficiently explored.
no code implementations • 6 Mar 2025 • Yu Pan, Chaozheng Wang, Zekai Wu, Qifan Wang, Min Zhang, Zenglin Xu
Addressing this concern, we introduce fully identical initialization (IDInit), a novel method that preserves identity in both the main and sub-stem layers of residual networks.
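As a rough illustration of identity-preserving initialization (the general idea behind IDInit, not its exact procedure), the sketch below sets each weight matrix of a residual branch to a padded identity so every layer maps its input to itself at initialization; the helper name and setup are assumptions of mine.

```python
# Sketch: initialize linear layers to (padded) identity so they preserve their
# input at init. Illustrative of the general idea only; IDInit's actual scheme
# (e.g., its treatment of sub-stem and non-square layers) differs in detail.
import torch
import torch.nn as nn

def identity_init_(layer: nn.Linear) -> None:
    with torch.no_grad():
        layer.weight.zero_()
        n = min(layer.weight.shape)
        layer.weight[:n, :n] = torch.eye(n)  # padded identity for rectangular weights
        if layer.bias is not None:
            layer.bias.zero_()

branch = nn.Sequential(nn.Linear(256, 256), nn.Linear(256, 256))
for m in branch:
    identity_init_(m)

x = torch.randn(4, 256)
assert torch.allclose(branch(x), x, atol=1e-5)  # branch acts as the identity at init
```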
no code implementations • 18 Jan 2025 • Jialun Cao, Yuk-Kit Chan, Zixuan Ling, Wenxuan Wang, Shuqing Li, Mingwei Liu, Ruixi Qiao, Yuting Han, Chaozheng Wang, Boxi Yu, Pinjia He, Shuai Wang, Zibin Zheng, Michael R. Lyu, Shing-Chi Cheung
We propose How2Bench, a 55-criterion checklist that provides a comprehensive set of guidelines for developing code-related benchmarks.
no code implementations • 2 Jan 2025 • Shuzheng Gao, Chaozheng Wang, Cuiyun Gao, Xiaoqian Jiao, Chun Yong Chong, Shan Gao, Michael Lyu
Test cases are essential for validating the reliability and quality of software applications.
no code implementations • 31 Aug 2024 • Wenxuan Wang, Juluan Shi, Zixuan Ling, Yuk-Kit Chan, Chaozheng Wang, Cheryl Lee, Youliang Yuan, Jen-tse Huang, Wenxiang Jiao, Michael R. Lyu
Equipped with the capability to call functions, modern large language models (LLMs) can leverage external tools for addressing a range of tasks unattainable through language skills alone.
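A hypothetical sketch of the function-calling loop described here follows: the model is shown a tool schema, emits a structured call, and the application dispatches it. The tool name, schema layout, and dispatch logic are illustrative assumptions, not the benchmark or API studied in the paper.

```python
# Hypothetical function-calling dispatch: the LLM returns JSON naming a tool and
# its arguments, and the application executes the matching function.
import json

TOOLS = {
    "get_weather": {  # illustrative tool, not from the paper
        "description": "Return the current weather for a city.",
        "parameters": {"city": "string"},
        "fn": lambda city: {"city": city, "temp_c": 21, "condition": "cloudy"},
    }
}

def dispatch(model_output: str):
    # Expected model output, e.g.: {"tool": "get_weather", "arguments": {"city": "Paris"}}
    call = json.loads(model_output)
    tool = TOOLS[call["tool"]]
    return tool["fn"](**call["arguments"])

print(dispatch('{"tool": "get_weather", "arguments": {"city": "Paris"}}'))
```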
no code implementations • 24 Jun 2024 • Yuxuan Wan, Chaozheng Wang, Yi Dong, Wenxuan Wang, Shuqing Li, Yintong Huo, Michael R. Lyu
We further reveal that a focus on smaller visual segments can help multimodal large language models (MLLMs) mitigate these failures in the generation process.
1 code implementation • 27 Feb 2024 • Yuanhang Yang, Shiyi Qi, Wenchao Gu, Chaozheng Wang, Cuiyun Gao, Zenglin Xu
To address this issue, we present a novel MoE framework designed to enhance both the efficacy and efficiency of sparse MoE models.
no code implementations • 7 Dec 2023 • Zongjie Li, Chaozheng Wang, Chaowei Liu, Pingchuan Ma, Daoyuan Wu, Shuai Wang, Cuiyun Gao
With recent advancements in Large Multimodal Models (LMMs) across various domains, a novel prompting method called visual referring prompting has emerged, showing significant potential in enhancing human-computer interaction within multimodal systems.
no code implementations • 29 Sep 2023 • Zongjie Li, Chaozheng Wang, Pingchuan Ma, Daoyuan Wu, Shuai Wang, Cuiyun Gao, Yang Liu
Specifically, PORTIA splits the answers into multiple segments, aligns similar content across candidate answers, and then merges them back into a single prompt for evaluation by LLMs.
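Below is a simplified sketch of that split-align-merge idea: each candidate answer is cut into segments, segments are paired up, and the pairs are interleaved into one comparison prompt. The real PORTIA aligns segments by content similarity rather than position; this positional version and its function names are assumptions for illustration.

```python
# Simplified split-align-merge for pairwise LLM judging (positional alignment only).
def split_segments(answer: str, k: int = 3) -> list[str]:
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    size = max(1, len(sentences) // k)
    chunks = [". ".join(sentences[i:i + size]) for i in range(0, len(sentences), size)]
    return chunks[:k] + [""] * (k - len(chunks[:k]))  # pad so both answers have k segments

def merge_for_judge(answer_a: str, answer_b: str, k: int = 3) -> str:
    seg_a, seg_b = split_segments(answer_a, k), split_segments(answer_b, k)
    parts = ["Compare the two answers segment by segment:"]
    for i, (a, b) in enumerate(zip(seg_a, seg_b), 1):
        parts.append(f"[Segment {i}] Answer A: {a}\n[Segment {i}] Answer B: {b}")
    return "\n".join(parts)

print(merge_for_judge("First point. Second point. Third point.",
                      "Alt first. Alt second. Alt third."))
```

Interleaving aligned segments is what counteracts position bias: the judge sees corresponding content side by side instead of one full answer before the other.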
1 code implementation • 24 Jul 2022 • Chaozheng Wang, Yuanhang Yang, Cuiyun Gao, Yun Peng, Hongyu Zhang, Michael R. Lyu
Moreover, fine-tuning performance strongly depends on the amount of downstream data, while in practice data-scarce scenarios are common.
no code implementations • 9 Nov 2021 • Chaozheng Wang, Shuzheng Gao, Cuiyun Gao, Pengyun Wang, Wenjie Pei, Lujia Pan, Zenglin Xu
Real-world data often exhibit long-tailed distributions.
2 code implementations • ICML 2020 • Xiao Li, Chenghua Lin, Ruizhe Li, Chaozheng Wang, Frank Guerin
We demonstrate the utility of our method for attribute manipulation in autoencoders trained across varied domains, using both human evaluation and automated methods.
Ranked #7 on Image Generation on CelebA 256x256 (FID metric)