1 code implementation • 30 Jan 2025 • Haoxiong Liu, Jiacheng Sun, Zhenguo Li, Andrew C Yao
The superiority of our method is validated on the miniF2F-test benchmark using the open-source deepseek-math-7b-base model and the Isabelle proof assistant.
Ranked #1 on Automated Theorem Proving on miniF2F-test
1 code implementation • 12 Nov 2024 • Jiacheng Sun, Hsiang-Wei Huang, Cheng-Yen Yang, Zhongyu Jiang, Jenq-Neng Hwang
The proposed method achieved new state-of-the-art performance on the SportsMOT dataset with a HOTA score of 81.04%.
Ranked #1 on Multiple Object Tracking on SportsMOT (using extra training data)
no code implementations • 17 Oct 2024 • Guhao Feng, Kai Yang, Yuntian Gu, Xinyue Ai, Shengjie Luo, Jiacheng Sun, Di He, Zhenguo Li, LiWei Wang
Despite the remarkable success of Transformer-based Large Language Models (LLMs) across various domains, understanding and enhancing their mathematical capabilities remains a significant challenge.
1 code implementation • 6 Jun 2024 • Jingyang Ou, Shen Nie, Kaiwen Xue, Fengqi Zhu, Jiacheng Sun, Zhenguo Li, Chongxuan Li
In this paper, we reveal that the concrete score in absorbing diffusion can be expressed as conditional probabilities of clean data, multiplied by a time-dependent scalar in an analytic form.
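As a hedged sketch of the shape of this factorization (the cumulative noise $\bar{\sigma}(t)$ and the unmasked-context notation $x^{\mathrm{UM}}$ are assumptions here, not taken verbatim from the paper):

```latex
% Sketch: for a masked position i, the concrete score splits into an
% analytic time-dependent scalar times a conditional on the clean data.
\[
  \frac{p_t(\hat{x}^i, x^{\setminus i})}{p_t(x)}
  = \underbrace{\frac{e^{-\bar{\sigma}(t)}}{1 - e^{-\bar{\sigma}(t)}}}_{\text{analytic in } t}
    \cdot
    \underbrace{p_0\!\left(\hat{x}^i \mid x^{\mathrm{UM}}\right)}_{\text{clean-data conditional}}
\]
```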
no code implementations • 24 May 2024 • Zhiwei Wang, Yunji Wang, Zhongwang Zhang, Zhangchen Zhou, Hui Jin, Tianyang Hu, Jiacheng Sun, Zhenguo Li, Yaoyu Zhang, Zhi-Qin John Xu
Large language models have consistently struggled with complex reasoning tasks, such as mathematical problem-solving.
1 code implementation • 17 Oct 2023 • Jiajun Ma, Tianyang Hu, Wenjia Wang, Jiacheng Sun
Guidance in conditional diffusion generation is of great importance for sample quality and controllability.
Ranked #1 on Conditional Image Generation on ImageNet 128x128
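For background, the standard classifier-free guidance combination that work on diffusion guidance builds on; this is the generic formula, not this paper's specific design:

```python
import torch

def cfg_noise(eps_cond: torch.Tensor, eps_uncond: torch.Tensor, w: float) -> torch.Tensor:
    """Classifier-free guidance: extrapolate the conditional noise
    prediction away from the unconditional one by guidance scale w."""
    return eps_uncond + w * (eps_cond - eps_uncond)
```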
1 code implementation • NeurIPS 2023 • Shuchen Xue, Mingyang Yi, Weijian Luo, Shifeng Zhang, Jiacheng Sun, Zhenguo Li, Zhi-Ming Ma
Based on our analysis, we propose SA-Solver, an improved and efficient stochastic Adams method for solving diffusion SDEs to generate high-quality data.
Ranked #32 on Image Generation on ImageNet 512x512
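For context, a single Euler-Maruyama step for a diffusion SDE, the kind of one-step stochastic scheme that a multistep stochastic Adams method improves on; the `drift`/`diffusion` callables are assumptions, not the paper's API:

```python
import torch

def euler_maruyama_step(x, t, dt, drift, diffusion):
    """One baseline stochastic step: deterministic drift plus Gaussian
    noise scaled by sqrt(|dt|)."""
    z = torch.randn_like(x)
    return x + drift(x, t) * dt + diffusion(t) * (abs(dt) ** 0.5) * z
```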
no code implementations • 4 Jul 2023 • Weijian Luo, Hao Jiang, Tianyang Hu, Jiacheng Sun, Zhenguo Li, Zhihua Zhang
In image generation experiments, the proposed DCD is capable of training an energy-based model on CelebA $32\times 32$ that generates samples comparable to existing EBMs.
1 code implementation • 22 Jun 2023 • Hsiang-Wei Huang, Cheng-Yen Yang, Jiacheng Sun, Pyong-Kun Kim, Kwang-Ju Kim, Kyoungoh Lee, Chung-I Huang, Jenq-Neng Hwang
Additionally, recent tracking algorithms' reliance on the Kalman filter falls short when object motion defies its linear-motion assumption.
Ranked #5 on Multiple Object Tracking on SportsMOT (using extra training data)
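To make the linear assumption concrete, here is the standard constant-velocity Kalman prediction step; this is illustrative background, not the paper's tracker:

```python
import numpy as np

# 1-D state [position, velocity]; dt = 1 frame.
F = np.array([[1.0, 1.0],   # position += velocity
              [0.0, 1.0]])  # velocity assumed constant -- the linear
                            # motion model that abrupt sports motion breaks

def kalman_predict(x: np.ndarray, P: np.ndarray, Q: np.ndarray):
    """Propagate the state mean x and covariance P one frame ahead."""
    return F @ x, F @ P @ F.T + Q
```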
2 code implementations • NeurIPS 2023 • Weijian Luo, Tianyang Hu, Shifeng Zhang, Jiacheng Sun, Zhenguo Li, Zhihua Zhang
To demonstrate the effectiveness and universality of Diff-Instruct, we consider two scenarios: distilling pre-trained diffusion models and refining existing GAN models.
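A heavily hedged, schematic view of score-based distillation in this spirit; every name and the noising schedule here are assumptions, not Diff-Instruct's exact formulation:

```python
import torch

def distillation_step(generator, teacher_score, student_score, z, alpha, sigma, t):
    """Sketch: diffuse a generator sample, then use the gap between two
    score estimates at the noisy point as the gradient signal for the
    generator (a common surrogate-loss trick, shown schematically)."""
    x0 = generator(z)                      # one-shot fake sample
    xt = alpha * x0 + sigma * torch.randn_like(x0)
    with torch.no_grad():
        gap = student_score(xt, t) - teacher_score(xt, t)
    # Surrogate whose gradient w.r.t. generator params is gap . dxt/dtheta.
    return (gap * xt).sum()
```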
no code implementations • 24 May 2023 • Mingyang Yi, Jiacheng Sun, Zhenguo Li
To understand this contradiction, we empirically verify the difference between the sufficiently trained diffusion model and the empirical optima.
3 code implementations • NeurIPS 2023 • Zebin You, Yong Zhong, Fan Bao, Jiacheng Sun, Chongxuan Li, Jun Zhu
In an effort to further advance semi-supervised generative and classification tasks, we propose a simple yet effective training strategy called dual pseudo training (DPT), built upon strong semi-supervised learners and diffusion models.
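A schematic of the three stages described above; all names are illustrative placeholders, not the released code:

```python
def dual_pseudo_training(x, y_partial, train_classifier, train_diffusion, sample):
    """Sketch of DPT's pipeline: (1) fit a semi-supervised classifier,
    (2) pseudo-label everything and fit a conditional diffusion model,
    (3) retrain the classifier on real plus generated data."""
    clf = train_classifier(x, y_partial)            # stage 1
    diff = train_diffusion(x, clf.predict(x))       # stage 2: pseudo labels
    x_gen, y_gen = sample(diff)                     # stage 3: augmentation
    return train_classifier(x + x_gen, y_partial + y_gen)
```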
no code implementations • 1 Dec 2022 • Fan Bao, Chongxuan Li, Jiacheng Sun, Jun Zhu
Extensive empirical evidence demonstrates that conditional generative models are easier to train and perform better than unconditional ones because they exploit data labels.
1 code implementation • 15 Jun 2022 • Fan Bao, Chongxuan Li, Jiacheng Sun, Jun Zhu, Bo Zhang
Thus, generation performance on a subset of timesteps is crucial, and it is greatly influenced by the covariance design in DPMs.
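For background, the two hand-crafted reverse covariances from standard DDPMs that learned or optimized designs aim to improve on:

```latex
% Standard DDPM choices (Ho et al.-style notation), shown for context:
\[
  \Sigma_t = \beta_t I
  \qquad\text{or}\qquad
  \Sigma_t = \tilde{\beta}_t I,
  \qquad
  \tilde{\beta}_t = \frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}\,\beta_t .
\]
```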
no code implementations • 10 Dec 2021 • Qi Sun, Hexin Dong, Zewei Chen, Jiacheng Sun, Zhenguo Li, Bin Dong
Gradient-based methods for the distributed training of residual networks (ResNets) typically require a forward pass of the input data, followed by back-propagating the error gradient to update model parameters, which becomes time-consuming as the network goes deeper.
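The sequential pattern being described, written out as the usual training step; it is shown only to make the forward/backward locking explicit:

```python
import torch

def standard_step(model, x, y, loss_fn, opt):
    """Full forward pass, then backpropagation through every layer
    before any parameter can update -- the locking that distributed
    training schemes try to break."""
    opt.zero_grad()
    loss = loss_fn(model(x), y)   # forward over the whole network
    loss.backward()               # gradients flow back layer by layer
    opt.step()                    # parameters update only at the end
    return loss.item()
```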
no code implementations • ICLR 2022 • Xiaojiang Yang, Yi Wang, Jiacheng Sun, Xing Zhang, Shifeng Zhang, Zhenguo Li, Junchi Yan
Nonlinear ICA is a fundamental problem in machine learning, aiming to identify the underlying independent components (sources) from data which is assumed to be a nonlinear function (mixing function) of these sources.
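The setup in symbols (the standard nonlinear ICA model):

```latex
% Observed data x is a nonlinear mixture f of mutually independent sources s:
\[
  x = f(s), \qquad p(s) = \prod_{i=1}^{n} p_i(s_i),
\]
% and the goal is to recover the sources s from samples of x alone.
```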
no code implementations • ICLR 2022 • Yao Zhu, Jiacheng Sun, Zhenguo Li
Adversarial transferability enables attackers to generate adversarial examples from the source model to attack the target model, which has raised security concerns about the deployment of DNNs in practice.
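A minimal illustration of a transfer attack using one-step FGSM; this is generic background, not the paper's method:

```python
import torch

def fgsm_transfer(source, target, x, y, eps, loss_fn):
    """Craft the perturbation on the white-box source model, then test
    whether it also fools the unseen target model."""
    x = x.clone().requires_grad_(True)
    loss_fn(source(x), y).backward()
    x_adv = (x + eps * x.grad.sign()).detach()   # one-step attack on source
    return target(x_adv).argmax(dim=-1) != y     # True where target is fooled
```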
no code implementations • NeurIPS Workshop DLDE 2021 • Qi Sun, Hexin Dong, Zewei Chen, Weizhen Dian, Jiacheng Sun, Yitong Sun, Zhenguo Li, Bin Dong
The backpropagation algorithm is indispensable for training modern residual networks (ResNets) but tends to be time-consuming due to its inherent algorithmic locking.
no code implementations • ICCV 2021 • Yao Zhu, Jiacheng Ma, Jiacheng Sun, Zewei Chen, Rongxin Jiang, Zhenguo Li
We find that adversarial training contributes to obtaining an energy function that is flat and has low energy around the real data, which is the key to its generative capability.
no code implementations • 24 May 2021 • Mingyang Yi, Lu Hou, Jiacheng Sun, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma
In this paper, after defining OOD generalization via Wasserstein distance, we theoretically show that a model robust to input perturbation generalizes well on OOD data.
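The standard definition being invoked:

```latex
% p-Wasserstein distance between distributions P and Q over a metric space (X, d):
\[
  W_p(P, Q) = \Bigl( \inf_{\gamma \in \Pi(P, Q)}
      \mathbb{E}_{(x, y) \sim \gamma}\, d(x, y)^p \Bigr)^{1/p},
\]
% where \Pi(P, Q) is the set of couplings with marginals P and Q.
```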
no code implementations • 1 Jan 2021 • Yao Zhu, Jiacheng Sun, Zewei Chen, Zhenguo Li
We justify the algorithm with a linear model, showing that the added saliency maps pull data away from its closest decision boundary.
no code implementations • 4 Dec 2020 • Xiao-Yun Zhou, Jiacheng Sun, Nanyang Ye, Xu Lan, Qijun Luo, Bo-Lin Lai, Pedro Esperanca, Guang-Zhong Yang, Zhenguo Li
Among previous normalization methods, Batch Normalization (BN) performs well at medium and large batch sizes and generalizes well to multiple vision tasks, but its performance degrades significantly at small batch sizes.
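A small numerical illustration of why BN's batch statistics degrade at small batch sizes (illustration only):

```python
import torch

for n in (2, 64):                       # tiny vs. moderate batch
    x = torch.randn(n, 8)               # features are truly zero-mean
    mu = x.mean(dim=0)
    var = x.var(dim=0, unbiased=False)
    x_hat = (x - mu) / torch.sqrt(var + 1e-5)   # the BN normalization
    print(n, mu.abs().mean().item())    # statistics error shrinks with n
```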
no code implementations • 3 Sep 2020 • Qi Sun, Hexin Dong, Zewei Chen, Weizhen Dian, Jiacheng Sun, Yitong Sun, Zhenguo Li, Bin Dong
Gradient-based algorithms for training ResNets typically require a forward pass of the input data, followed by back-propagating the objective gradient to update parameters, a process that becomes time-consuming for deep ResNets.
no code implementations • 16 Jun 2020 • Jiacheng Sun, Xiangyong Cao, Hanwen Liang, Weiran Huang, Zewei Chen, Zhenguo Li
In recent years, a variety of normalization methods have been proposed to help train neural networks, such as batch normalization (BN), layer normalization (LN), weight normalization (WN), group normalization (GN), etc.
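The key difference between these methods is which axes the statistics are computed over; a minimal sketch with assumed tensor shapes:

```python
import torch

x = torch.randn(8, 16, 32, 32)          # (batch, channels, H, W)

bn_mu = x.mean(dim=(0, 2, 3))           # BN: per channel, across the batch
ln_mu = x.mean(dim=(1, 2, 3))           # LN: per sample, across all features
gn_mu = x.view(8, 4, 4, 32, 32).mean(dim=(2, 3, 4))  # GN: per sample, per group of 4 channels
```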
no code implementations • 13 Sep 2019 • Hanwen Liang, Shifeng Zhang, Jiacheng Sun, Xingqiu He, Weiran Huang, Kechen Zhuang, Zhenguo Li
Therefore, we propose a simple and effective algorithm, named "DARTS+", to avoid the collapse and improve the original DARTS by "early stopping" the search procedure when a certain criterion is met.
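One concrete instance of such a stopping rule, sketched under the assumption that the criterion tracks skip-connection dominance in the normal cell (the paper defines the exact rule):

```python
def should_early_stop(normal_cell_ops: list, max_skips: int = 2) -> bool:
    """Stop the search once skip-connections start to dominate the
    normal cell, a known precursor of DARTS collapse."""
    return normal_cell_ops.count("skip_connect") >= max_skips
```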