Search Results for author: Chaozheng Wang

Found 13 papers, 3 papers with code

UMoE: Unifying Attention and FFN with Shared Experts

no code implementations • 12 May 2025 • Yuanhang Yang, Chaozheng Wang, Jing Li

Sparse Mixture of Experts (MoE) architectures have emerged as a promising approach for scaling Transformer models.

Mixture-of-Experts
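For readers unfamiliar with the sparse MoE setting this abstract refers to, below is a minimal, generic top-k expert-routing layer in PyTorch. It is an illustrative sketch only, not UMoE's shared-expert design; the layer dimensions, expert count, and `top_k` value are arbitrary assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Generic top-k sparse Mixture-of-Experts FFN layer (illustrative only)."""

    def __init__(self, d_model=512, d_hidden=2048, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])
        self.router = nn.Linear(d_model, n_experts)  # learned gating network
        self.top_k = top_k

    def forward(self, x):                       # x: (tokens, d_model)
        gate_logits = self.router(x)            # (tokens, n_experts)
        weights, idx = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # renormalise over selected experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):             # send each token to its k-th choice
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out
```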

CodeCrash: Stress Testing LLM Reasoning under Structural and Semantic Perturbations

no code implementations • 19 Apr 2025 • Man Ho Lam, Chaozheng Wang, Jen-tse Huang, Michael R. Lyu

Large Language Models (LLMs) have recently demonstrated strong capabilities in code-related tasks, yet their robustness in code comprehension and reasoning remains insufficiently explored.

Benchmarking

IDInit: A Universal and Stable Initialization Method for Neural Network Training

no code implementations • 6 Mar 2025 • Yu Pan, Chaozheng Wang, Zekai Wu, Qifan Wang, Min Zhang, Zenglin Xu

Addressing this concern, we introduce fully identical initialization (IDInit), a novel method that preserves identity in both the main and sub-stem layers of residual networks.

Inductive Bias
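To make the identity-preservation goal concrete, here is a small PyTorch sketch of a residual block initialised so that it starts as the identity map. This is a generic illustration (identity weights in the first layer, zeros in the second), not the actual IDInit scheme described in the paper.

```python
import torch
import torch.nn as nn

def identity_init_(linear: nn.Linear):
    """Initialise a square Linear layer to the identity map (illustrative)."""
    assert linear.in_features == linear.out_features
    with torch.no_grad():
        linear.weight.copy_(torch.eye(linear.in_features))
        if linear.bias is not None:
            linear.bias.zero_()

class ResidualBlock(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.fc1 = nn.Linear(dim, dim)
        self.fc2 = nn.Linear(dim, dim)
        identity_init_(self.fc1)         # first layer starts as identity
        nn.init.zeros_(self.fc2.weight)  # residual branch starts at zero,
        nn.init.zeros_(self.fc2.bias)    # so the whole block initially maps x -> x

    def forward(self, x):
        return x + self.fc2(torch.relu(self.fc1(x)))
```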

How Should We Build A Benchmark? Revisiting 274 Code-Related Benchmarks For LLMs

no code implementations • 18 Jan 2025 • Jialun Cao, Yuk-Kit Chan, Zixuan Ling, Wenxuan Wang, Shuqing Li, Mingwei Liu, Ruixi Qiao, Yuting Han, Chaozheng Wang, Boxi Yu, Pinjia He, Shuai Wang, Zibin Zheng, Michael R. Lyu, Shing-Chi Cheung

We propose How2Bench, which comprises a 55-criteria checklist serving as a set of guidelines to govern the development of code-related benchmarks comprehensively.

Learning to Ask: When LLM Agents Meet Unclear Instruction

no code implementations • 31 Aug 2024 • Wenxuan Wang, Juluan Shi, Zixuan Ling, Yuk-Kit Chan, Chaozheng Wang, Cheryl Lee, Youliang Yuan, Jen-tse Huang, Wenxiang Jiao, Michael R. Lyu

Equipped with the capability to call functions, modern large language models (LLMs) can leverage external tools for addressing a range of tasks unattainable through language skills alone.
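As background for the function-calling setting mentioned above, the following is a minimal, framework-agnostic sketch of dispatching an LLM-produced tool call to an external function. The tool registry, JSON call format, and the `call_llm` stub are assumptions made purely for illustration; they are not the paper's agent or benchmark.

```python
import json

def get_weather(city: str) -> str:
    return f"Sunny in {city}"          # stand-in for a real external API

TOOLS = {"get_weather": get_weather}   # registry of callable tools

def call_llm(messages):
    # Stub: a real agent would query an LLM here. We hardcode a tool call
    # to keep the sketch self-contained and runnable.
    return json.dumps({"tool": "get_weather", "arguments": {"city": "Hong Kong"}})

def run_agent(user_instruction: str) -> str:
    reply = call_llm([{"role": "user", "content": user_instruction}])
    call = json.loads(reply)
    if call.get("tool") in TOOLS:                          # model chose to use a tool
        return TOOLS[call["tool"]](**call["arguments"])    # execute the external function
    return reply                                           # otherwise treat as plain text

print(run_agent("What's the weather like?"))
```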

Automatically Generating UI Code from Screenshot: A Divide-and-Conquer-Based Approach

no code implementations • 24 Jun 2024 • Yuxuan Wan, Chaozheng Wang, Yi Dong, Wenxuan Wang, Shuqing Li, Yintong Huo, Michael R. Lyu

We further reveal that a focus on smaller visual segments can help multimodal large language models (MLLMs) mitigate these failures in the generation process.

Layout Design

XMoE: Sparse Models with Fine-grained and Adaptive Expert Selection

1 code implementation • 27 Feb 2024 • Yuanhang Yang, shiyi qi, Wenchao Gu, Chaozheng Wang, Cuiyun Gao, Zenglin Xu

To address this issue, we present XMoE, a novel MoE designed to enhance both the efficacy and efficiency of sparse MoE models.

Language Modeling • Language Modelling +2

VRPTEST: Evaluating Visual Referring Prompting in Large Multimodal Models

no code implementations • 7 Dec 2023 • Zongjie Li, Chaozheng Wang, Chaowei Liu, Pingchuan Ma, Daoyuan Wu, Shuai Wang, Cuiyun Gao

With recent advancements in Large Multimodal Models (LMMs) across various domains, a novel prompting method called visual referring prompting has emerged, showing significant potential in enhancing human-computer interaction within multimodal systems.

Split and Merge: Aligning Position Biases in LLM-based Evaluators

no code implementations • 29 Sep 2023 • Zongjie Li, Chaozheng Wang, Pingchuan Ma, Daoyuan Wu, Shuai Wang, Cuiyun Gao, Yang Liu

Specifically, PORTIA splits the answers into multiple segments, aligns similar content across candidate answers, and then merges them back into a single prompt for evaluation by LLMs.

Language Modelling • Large Language Model +1
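A simplified sketch of the split-and-merge idea described in the abstract: each candidate answer is cut into segments and the segments are interleaved into a single evaluation prompt. This is a toy version under assumptions (sentence-count splitting, no semantic alignment), not PORTIA's actual alignment procedure.

```python
def split_segments(answer: str, n_parts: int = 3):
    """Split an answer into roughly equal segments by sentence (simplified)."""
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    size = max(1, len(sentences) // n_parts)
    return [". ".join(sentences[i:i + size]) for i in range(0, len(sentences), size)]

def merge_for_evaluation(answer_a: str, answer_b: str, n_parts: int = 3) -> str:
    """Interleave corresponding segments of two candidate answers into one prompt."""
    segs_a = split_segments(answer_a, n_parts)
    segs_b = split_segments(answer_b, n_parts)
    lines = []
    for i, (sa, sb) in enumerate(zip(segs_a, segs_b), start=1):
        lines.append(f"Part {i} of Answer A: {sa}")
        lines.append(f"Part {i} of Answer B: {sb}")
    lines.append("Which answer is better overall, A or B?")
    return "\n".join(lines)
```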

No More Fine-Tuning? An Experimental Evaluation of Prompt Tuning in Code Intelligence

1 code implementation • 24 Jul 2022 • Chaozheng Wang, Yuanhang Yang, Cuiyun Gao, Yun Peng, Hongyu Zhang, Michael R. Lyu

Moreover, the performance of fine-tuning strongly depends on the amount of downstream data, while in practice, scenarios with scarce data are common.

Code Summarization • Code Translation

Latent Space Factorisation and Manipulation via Matrix Subspace Projection

2 code implementations • ICML 2020 • Xiao Li, Chenghua Lin, Ruizhe Li, Chaozheng Wang, Frank Guerin

We demonstrate the utility of our method for attribute manipulation in autoencoders trained across varied domains, using both human evaluation and automated methods.

Ranked #7 on Image Generation on CelebA 256x256 (FID metric)

Attribute • Face Generation +1
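The following NumPy sketch illustrates the general idea of manipulating an attribute-related subspace of a latent code via a projection matrix. The matrix `M` here is a random semi-orthogonal stand-in and the edit rule is a simplified paraphrase; it is not the paper's learned matrix subspace projection or its training objective.

```python
import numpy as np

rng = np.random.default_rng(0)
d_latent, d_attr = 64, 4

# Stand-in for a learned projection from latents to attributes. In practice such a
# matrix is trained jointly with an autoencoder; here it is random, for illustration.
Q, _ = np.linalg.qr(rng.normal(size=(d_latent, d_attr)))
M = Q.T                                     # (d_attr, d_latent), orthonormal rows

z = rng.normal(size=d_latent)               # latent code from an encoder
y_target = np.array([1.0, 0.0, 1.0, 0.0])   # desired attribute vector

attr_part = M.T @ (M @ z)                   # component of z in the attribute subspace
residual = z - attr_part                    # attribute-free component
z_edited = residual + M.T @ y_target        # swap in the target attributes

print(np.round(M @ z_edited, 3))            # ≈ y_target: attributes now match the target
```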
