no code implementations • 12 Mar 2025 • Haoxuan Wang, Jinlong Peng, Qingdong He, Hao Yang, Ying Jin, Jiafu Wu, Xiaobin Hu, Yanjie Pan, Zhenye Gan, Mingmin Chi, Bo Peng, Yabiao Wang
With the rapid development of diffusion models in image generation, the demand for more powerful and flexible controllable frameworks is increasing.
no code implementations • 9 Mar 2025 • Yanjie Pan, Qingdong He, Zhengkai Jiang, Pengcheng Xu, Chaoyi Wang, Jinlong Peng, Haoxuan Wang, Yun Cao, Zhenye Gan, Mingmin Chi, Bo Peng, Yabiao Wang
Recent advances in diffusion-based text-to-image generation have demonstrated promising results through visual condition control.
no code implementations • 22 Feb 2025 • Haoxuan Wang
In this study, we investigate the integration of a large language model (LLM) with an automatic speech recognition (ASR) system, specifically focusing on enhancing rare word recognition performance.
Automatic Speech Recognition (ASR)
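The abstract does not say how the LLM and ASR system are coupled; a common pattern is rescoring the ASR n-best list with the LLM's token likelihoods, so rare words the LLM knows get promoted. A minimal sketch under that assumption (the gpt2 checkpoint and the weight lam are placeholders, not the paper's choices):

    # Sketch: rescore ASR n-best hypotheses with an LLM's log-likelihood.
    # Assumption: the integration is n-best rescoring; gpt2 is a stand-in.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    def llm_logprob(text: str) -> float:
        # Sum of token log-likelihoods under the LLM.
        ids = tok(text, return_tensors="pt").input_ids
        with torch.no_grad():
            out = lm(ids, labels=ids)  # out.loss is mean cross-entropy
        return -out.loss.item() * (ids.shape[1] - 1)

    def rescore(nbest, lam=0.5):
        # nbest: list of (hypothesis_text, asr_score); lam weights the LLM.
        return max(nbest, key=lambda h: h[1] + lam * llm_logprob(h[0]))

    best = rescore([("the knee arthroplasty was successful", -12.3),
                    ("the me are throw plastic was successful", -11.9)])
    print(best[0])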
no code implementations • 13 Oct 2024 • Zhiguang Zhou, Haoxuan Wang, Zhengqing Zhao, Fengling Zheng, Yongheng Wang, Wei Chen, Yong Wang
We present four cases to illustrate how our knowledge-graph-based representation can model the detailed visual elements and semantic relations in charts, and further demonstrate how our approach can benefit downstream applications such as semantic-aware chart retrieval and chart question answering.
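As a rough illustration of such a knowledge-graph representation, visual elements can become typed nodes and semantic relations typed edges; the schema below is invented for illustration and is not the paper's:

    import networkx as nx

    # Illustrative chart knowledge graph: visual elements as nodes,
    # semantic relations as typed edges (schema invented here).
    g = nx.MultiDiGraph()
    g.add_node("bar_1", kind="bar", color="steelblue", height=42)
    g.add_node("axis_x", kind="axis", title="Year")
    g.add_node("label_2020", kind="tick_label", text="2020")
    g.add_edge("bar_1", "axis_x", relation="anchored_on")
    g.add_edge("label_2020", "bar_1", relation="annotates")

    # Semantic-aware retrieval then reduces to graph querying:
    bars = [n for n, d in g.nodes(data=True) if d["kind"] == "bar"]
    print(bars)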
1 code implementation • 13 Sep 2024 • Haoxuan Wang, Qingdong He, Jinlong Peng, Hao Yang, Mingmin Chi, Yabiao Wang
However, its performance is hindered by its neck feature fusion mechanism, which incurs quadratic complexity and limits the guided receptive fields.
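For context on the quadratic-complexity claim: attention-style fusion builds a similarity matrix over all token pairs, so cost and memory grow quadratically with the number of feature-map tokens. A toy illustration (shapes are arbitrary):

    import torch

    # Why attention-based neck fusion is quadratic: the similarity matrix
    # over N visual tokens has N*N entries.
    N, d = 4096, 256                                  # e.g. a 64x64 feature map
    x = torch.randn(N, d)
    attn = torch.softmax(x @ x.T / d**0.5, dim=-1)    # N x N: O(N^2) memory
    fused = attn @ x
    print(attn.shape)                                 # torch.Size([4096, 4096])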
1 code implementation • 24 Aug 2024 • Zhenghao Zhao, Haoxuan Wang, Yuzhang Shang, Kai Wang, Yan Yan
It reduces the distance between the student and the biased expert trajectories and prevents the tail-class bias from being distilled into the synthetic dataset.
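The phrasing suggests matching-training-trajectories (MTT) style distillation; a loose sketch of such a trajectory-matching loss follows (names are illustrative, and the debiasing of the expert itself is not shown):

    import torch

    def trajectory_matching_loss(student_params, expert_start, expert_end):
        # Distance between the student's parameters (after training on the
        # synthetic set) and the expert's endpoint, normalized by how far
        # the expert moved over the same span, as in MTT-style methods.
        num = sum((s - e).pow(2).sum() for s, e in zip(student_params, expert_end))
        den = sum((a - e).pow(2).sum() for a, e in zip(expert_start, expert_end))
        return num / (den + 1e-12)

    p = [torch.randn(3, 3) for _ in range(2)]
    print(trajectory_matching_loss(p, [t + 0.1 for t in p], [t + 0.2 for t in p]))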
1 code implementation • 25 May 2024 • Junyi Wu, Haoxuan Wang, Yuzhang Shang, Mubarak Shah, Yan Yan
SSC extends this approach by dynamically adjusting the balanced salience to capture the temporal variations in activation.
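Reading "dynamically adjusting the balanced salience" loosely as per-channel salience tracked across diffusion timesteps, here is an interpretive sketch, not the paper's algorithm:

    import torch

    def channel_salience(acts: torch.Tensor) -> torch.Tensor:
        # acts: (batch, channels, ...) activations at one timestep.
        # Max absolute value per channel as a simple salience proxy.
        return acts.abs().flatten(2).amax(dim=(0, 2))

    def temporal_scales(per_step_acts, momentum=0.9):
        # Smooth salience across timesteps so quantization scales can
        # follow temporal variation in the activations.
        scales, running = [], None
        for acts in per_step_acts:
            s = channel_salience(acts)
            running = s if running is None else momentum * running + (1 - momentum) * s
            scales.append(running / running.mean())
        return scales

    steps = [torch.randn(2, 8, 16, 16) for _ in range(4)]
    print(temporal_scales(steps)[0].shape)   # one scale per channel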
no code implementations • 15 Feb 2024 • Yuxuan Gu, Yi Jin, Ben Wang, Zhixiang Wei, Xiaoxiao Ma, Pengyang Ling, Haoxuan Wang, Huaian Chen, Enhong Chen
In this work, we observe that generators pre-trained on massive natural images inherently hold promising potential for superior low-light image enhancement across varying scenarios. Specifically, we embed a pre-trained generator into a Retinex model to produce reflectance maps with enhanced detail and vividness, thereby recovering features degraded by low-light conditions. Going one step further, we introduce a novel optimization strategy that backpropagates the gradients to the input seeds rather than to the parameters of the low-light enhancement model, thus keeping the generative knowledge learned from natural images intact and achieving faster convergence.
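A minimal sketch of the seed-optimization strategy described: freeze the generator, and backpropagate the enhancement loss to the input latent only. The generator, loss, and target here are stand-ins:

    import torch

    gen = torch.nn.Sequential(                        # placeholder for the
        torch.nn.Linear(64, 256), torch.nn.ReLU(),    # frozen, pre-trained
        torch.nn.Linear(256, 3 * 32 * 32))            # generator
    for p in gen.parameters():
        p.requires_grad_(False)            # generative prior kept intact

    seed = torch.randn(1, 64, requires_grad=True)     # only the seed is learned
    opt = torch.optim.Adam([seed], lr=1e-2)
    target = torch.rand(1, 3 * 32 * 32)               # stand-in enhancement target

    for _ in range(100):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(gen(seed), target)
        loss.backward()                    # gradients flow to the seed only
        opt.step()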
1 code implementation • 6 Feb 2024 • Haoxuan Wang, Yuzhang Shang, Zhihang Yuan, Junyi Wu, Junchi Yan, Yan Yan
We empirically verify that our approach modifies the activation distribution and provides meaningful temporal information, facilitating easier and more accurate quantization.
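One concrete way to exploit temporal information in diffusion quantization is to calibrate activation ranges per timestep instead of globally; a hedged sketch of that idea (not necessarily the paper's procedure):

    import torch

    def per_step_ranges(acts_by_step):
        # acts_by_step: {timestep: activation tensor}. Min/max per step
        # give timestep-aware quantization ranges instead of one global
        # range that blurs temporal variation.
        return {t: (a.min().item(), a.max().item())
                for t, a in acts_by_step.items()}

    acts = {t: torch.randn(16, 128) * (1 + t / 10) for t in range(5)}
    print(per_step_ranges(acts))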
no code implementations • 18 Oct 2023 • Tianyang Xue, Mingdong Wu, Lin Lu, Haoxuan Wang, Hao Dong, Baoquan Chen
In this work, we delve deeper into a novel machine learning-based approach that formulates the packing problem as conditional generative modeling.
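Schematically, the conditional-generative formulation learns p(placement | container state, item) and samples placements from it; the architecture and shapes below are purely illustrative:

    import torch
    import torch.nn as nn

    class PlacementModel(nn.Module):
        # Toy conditional model: encodes container occupancy plus the next
        # item, outputs mean/log-std of a placement (x, y, rotation).
        def __init__(self):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(64 + 8, 128), nn.ReLU(),
                                     nn.Linear(128, 6))

        def forward(self, container, item):
            mu, log_std = self.enc(torch.cat([container, item], -1)).chunk(2, -1)
            return torch.distributions.Normal(mu, log_std.exp())

    m = PlacementModel()
    dist = m(torch.rand(1, 64), torch.rand(1, 8))
    placement = dist.sample()          # a sampled (x, y, theta) placement
    print(placement.shape)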
no code implementations • 3 Dec 2022 • Haoxuan Wang, Junchi Yan
Deep neural networks still struggle on long-tailed image datasets, and one reason is that imbalanced training data across categories leads to imbalanced trained model parameters.
Ranked #21 on Long-tail Learning on CIFAR-10-LT (ρ=10)
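The claimed parameter imbalance is directly measurable: in classifiers trained on long-tailed data, the per-class rows of the final linear layer tend to have larger norms for head classes than for tail classes. A sketch of the measurement (random weights here; in practice one would load a trained checkpoint):

    import torch

    # Stand-in classifier head; with a checkpoint trained on e.g.
    # CIFAR-10-LT, head-class rows typically show larger norms.
    fc = torch.nn.Linear(512, 10)
    per_class_norms = fc.weight.norm(dim=1)   # one norm per class
    print(per_class_norms)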
no code implementations • 8 Oct 2020 • Haoxuan Wang, Zhiding Yu, Yisong Yue, Anima Anandkumar, Anqi Liu, Junchi Yan
We propose a framework for learning calibrated uncertainties under domain shifts, where the source (training) distribution differs from the target (test) distribution.
no code implementations • 28 Sep 2020 • Haoxuan Wang, Anqi Liu, Zhiding Yu, Yisong Yue, Anima Anandkumar
This formulation motivates the use of two jointly trained neural networks: a discriminative network between the source and target domains for density-ratio estimation, in addition to the standard classification network.
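A minimal sketch of the density-ratio estimation idea: if a discriminator d(x) approximates p(source | x), the importance weight p_target(x)/p_source(x) equals (1 - d(x))/d(x) under equal domain priors. The networks below are placeholders:

    import torch
    import torch.nn as nn

    disc = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                         nn.Linear(64, 1), nn.Sigmoid())   # d(x) = p(source | x)

    def importance_weights(x):
        # w(x) = p_target(x) / p_source(x) = (1 - d(x)) / d(x)
        # (equal-prior assumption; clamp for numerical stability)
        d = disc(x).clamp(1e-4, 1 - 1e-4)
        return ((1 - d) / d).squeeze(-1)

    # The weights can reweight the classification loss on source data:
    x_s, y_s = torch.randn(8, 32), torch.randint(0, 3, (8,))
    clf = nn.Linear(32, 3)
    loss = (importance_weights(x_s).detach()
            * nn.functional.cross_entropy(clf(x_s), y_s, reduction="none")).mean()
    print(loss.item())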