1 code implementation • 1 Apr 2025 • Jianhao Chen, Zishuo Xun, Bocheng Zhou, Han Qi, Hangfan Zhang, Qiaosheng Zhang, Yang Chen, Wei Hu, Yuzhong Qu, Wanli Ouyang, Shuyue Hu
This paper presents a simple, effective, and cost-efficient strategy to improve LLM performance by scaling test-time compute.
1 code implementation • 10 Mar 2025 • Fanqing Meng, Lingxiao Du, Zongkai Liu, Zhixiang Zhou, Quanfeng Lu, Daocheng Fu, Botian Shi, Wenhai Wang, Junjun He, Kaipeng Zhang, Ping Luo, Yu Qiao, Qiaosheng Zhang, Wenqi Shao
We present MM-Eureka, a multimodal reasoning model that successfully extends large-scale rule-based reinforcement learning (RL) to multimodal reasoning.
no code implementations • 12 Feb 2025 • Hangfan Zhang, Zhiyao Cui, Xinrun Wang, Qiaosheng Zhang, Zhen Wang, Dinghao Wu, Shuyue Hu
Multi-agent debate (MAD) has emerged as a promising approach to enhance the factual accuracy and reasoning quality of large language models (LLMs) by engaging multiple agents in iterative discussions during inference.
1 code implementation • 24 Jan 2025 • Han Qi, Fei Guo, Li Zhu, Qiaosheng Zhang, Xuelong Li
In this paper, we study the stochastic multi-armed bandit problem with graph feedback.
1 code implementation • 22 Jan 2025 • Chenjia Bai, Yang Zhang, Shuang Qiu, Qiaosheng Zhang, Kang Xu, Xuelong Li
Then, we reformulate our objective to direct preference optimization with an exploration term, where the UCB-term can be converted to a count-based exploration bonus.
no code implementations • 19 Jan 2025 • Dian Jin, Yuqian Zhang, Qiaosheng Zhang
The integration of network information and node attribute information has recently gained significant attention in the community detection literature.
no code implementations • 20 Dec 2024 • Zhongtian Ma, Qiaosheng Zhang, Bocheng Zhou, Yexin Zhang, Shuyue Hu, Zhen Wang
Specifically, by appropriately defining structure noise and feature noise in graphs, we show that graph attention mechanisms can enhance classification performance when structure noise exceeds feature noise.
no code implementations • 3 Jun 2024 • Chenjie Mao, Qiaosheng Zhang
This paper proposes the first generic fast convergence result in general function approximation for offline decision making problems, which include offline reinforcement learning (RL) and off-policy evaluation (OPE) as special cases.
no code implementations • 12 May 2024 • Changhong Wang, Xudong Yu, Chenjia Bai, Qiaosheng Zhang, Zhen Wang
To address this problem, our work builds upon the investigation of successor representations for task generalization in online RL and extends the framework to incorporate offline-to-online learning.
no code implementations • 30 Apr 2024 • Qiaosheng Zhang, Chenjia Bai, Shuyue Hu, Zhen Wang, Xuelong Li
Finally, we extend Reg-MAIDS to multi-player general-sum MGs and prove that it can learn either the Nash equilibrium or coarse correlated equilibrium in a sample efficient manner.
no code implementations • 17 Jan 2024 • Yexin Zhang, Zhongtian Ma, Qiaosheng Zhang, Zhen Wang, Xuelong Li
This paper considers the problem of community detection on multiple potentially correlated graphs from an information-theoretical perspective.
no code implementations • 16 Jan 2024 • Zhongtian Ma, Qiaosheng Zhang, Zhen Wang
Theoretical analyses show that our algorithm succeeds with high probability as long as the sample probability exceeds the aforementioned threshold, and this theoretical result is further validated by synthetic experiments.
no code implementations • 29 Dec 2023 • Zheng Zhou, Hongbo Zhao, Ju Liu, Qiaosheng Zhang, Liwei Geng, Shuchang Lyu, Wenquan Feng
The regularization terms further enhance the practicality of the generated APs in the physical domain.
no code implementations • 23 Nov 2022 • Jirong Yi, Qiaosheng Zhang, Zhen Chen, Qiao Liu, Wei Shao, Yusen He, Yaohua Wang
We first argue that the MSE minimization approach is equivalent to a conditional entropy learning problem, and then propose a mutual information learning formulation for solving regression problems by using a reparameterization technique.
no code implementations • 3 Oct 2022 • Jirong Yi, Qiaosheng Zhang, Zhen Chen, Qiao Liu, Wei Shao
Deep learning systems have been reported to achieve state-of-the-art performance in many applications, and one key to this is the existence of well-trained classifiers on benchmark datasets, which can be used as backbone feature extractors in downstream tasks.
no code implementations • 21 Sep 2022 • Jirong Yi, Qiaosheng Zhang, Zhen Chen, Qiao Liu, Wei Shao
Deep learning systems have been reported to achieve state-of-the-art performance in many applications, and one key to this is the existence of well-trained classifiers on benchmark datasets.
no code implementations • 11 May 2021 • Qiaosheng Zhang, Vincent Y. F. Tan
This paper investigates fundamental limits of exact recovery in the general d-uniform hypergraph stochastic block model (d-HSBM), wherein n nodes are partitioned into k disjoint communities with relative sizes (p_1, ..., p_k).
no code implementations • 8 Jun 2020 • Qiaosheng Zhang, Geewon Suh, Changho Suh, Vincent Y. F. Tan
In this paper, we design and analyze MC2G (Matrix Completion with 2 Graphs), an algorithm that performs matrix completion in the presence of social and item similarity graphs.
no code implementations • 13 Mar 2020 • Haiyun He, Qiaosheng Zhang, Vincent Y. F. Tan
This paper investigates a novel offline change-point detection problem from an information-theoretic perspective.
no code implementations • 6 Dec 2019 • Qiaosheng Zhang, Vincent Y. F. Tan, Changho Suh
We consider the problem of recovering a binary rating matrix as well as clusters of users and items based on a partially observed matrix together with side-information in the form of social and item similarity graphs.
no code implementations • 25 Aug 2016 • Haozhe Xie, Jie Li, Qiaosheng Zhang, Yadong Wang
Feature selection (FS) followed by random projection (RP) outperforms other combination methods in classification accuracy on most of the datasets.