no code implementations • 14 Jun 2025 • Tianze Wang, Yifei Liu, Chen Chen, Pengfei Zuo, Jiawei Zhang, Qizhen Weng, Yin Chen, Zhenhua Han, Jieru Zhao, Quan Chen, Minyi Guo
Modern AI clusters, which host diverse workloads like data pre-processing, training and inference, often store large volumes of data in cloud storage and employ caching frameworks to facilitate remote data access.
no code implementations • 21 May 2025 • Tianqi Du, Zeming Wei, Quan Chen, Chenheng Zhang, Yisen Wang
The rapid advancement of large language models (LLMs) has demonstrated milestone success in a variety of tasks, yet their potential for generating harmful content has raised significant safety concerns.
no code implementations • 20 Apr 2025 • Lifeng Lin, Rongfeng Lu, Quan Chen, Haofan Ren, Ming Lu, Yaoqi Sun, Chenggang Yan, Anke Xue
Recently, many methods based on the 3D Gaussian Splatting (3DGS) framework have been proposed to address sparse-view 3D reconstruction.
no code implementations • 19 Apr 2025 • Hang Zhang, Jiuchen Shi, Yixiao Wang, Quan Chen, Yizhou Shan, Minyi Guo
FASTLIBRA comprises a dependency-aware cache manager and a performance-driven cache swapper.
no code implementations • 19 Mar 2025 • Yi Luo, Hamed Hooshangnejad, Xue Feng, Gaofeng Huang, Xiaojian Chen, Rui Zhang, Quan Chen, Wil Ngwa, Kai Ding
Conclusions: OCC represents a significant advance in oncology care, particularly through the use of the latest LVMs to improve contouring results by (1) streamlining oncology treatment workflows by optimizing tumor delineation and reducing manual processes; (2) offering a scalable and intuitive framework to reduce false positives in radiotherapy planning using LVMs; (3) introducing novel medical language vision prompt techniques to minimize LVM hallucinations, supported by an ablation study; and (4) conducting a comparative analysis of LVMs, highlighting their potential in addressing medical language vision challenges.
4 code implementations • 27 Feb 2025 • Shulai Zhang, Ningxin Zheng, Haibin Lin, Ziheng Jiang, Wenlei Bao, Chengquan Jiang, Qi Hou, Weihao Cui, Size Zheng, Li-Wen Chang, Quan Chen, Xin Liu
The inter-device communication of a MoE layer can occupy 47% of the entire model execution time with popular models and frameworks.
no code implementations • 18 Feb 2025 • Jian Jia, Jingtong Gao, Ben Xue, Junhao Wang, Qingpeng Cai, Quan Chen, Xiangyu Zhao, Peng Jiang, Kun Gai
Discrete tokenizers have emerged as indispensable components in modern machine learning systems, particularly within the context of autoregressive modeling and large language models (LLMs).
no code implementations • 16 Dec 2024 • Quan Chen, Tingyu Wang, Rongfeng Lu, Bolun Zheng, Zhedong Zheng, Chenggang Yan
Specifically, we propose a distance-guided dynamic partition learning strategy (DGDPL), consisting of a square partition strategy and a distance-guided adjustment strategy.
no code implementations • 12 Dec 2024 • Zhihui Yin, Ye Ma, Xipeng Cao, Bo Wang, Quan Chen, Peng Jiang
The proliferation of online short video platforms has driven a surge in user demand for short video editing.
no code implementations • 11 Dec 2024 • Zhentao Tan, Ben Xue, Jian Jia, Junhao Wang, Wencai Ye, Shaoyun Shi, MingJie Sun, Wenjin Wu, Quan Chen, Peng Jiang
SweetTokenizer achieves comparable video reconstruction fidelity with only 25% of the tokens used in previous state-of-the-art video tokenizers, and boosts video generation results by 32.9% w.r.t. gFVD.
no code implementations • 28 Nov 2024 • Siqi Kou, Jiachun Jin, Zhihong Liu, Chang Liu, Ye Ma, Jian Jia, Quan Chen, Peng Jiang, Zhijie Deng
We introduce Orthus, an autoregressive (AR) transformer that excels in generating images given textual prompts, answering questions based on visual inputs, and even crafting lengthy image-text interleaved contents.
no code implementations • 23 Nov 2024 • Te Yang, Jian Jia, Xiangyu Zhu, Weisong Zhao, Bo Wang, Yanhua Cheng, Yan Li, Shengyuan Liu, Quan Chen, Peng Jiang, Kun Gai, Zhen Lei
In this paper, we propose Visual-Modality Token Compression (VMTC) and Cross-Modality Attention Inhibition (CMAI) strategies to alleviate this gap between MLLMs and LLMs by inhibiting the influence of irrelevant visual tokens during content generation, increasing the instruction-following ability of the MLLMs while retaining their multimodal understanding capacity.
no code implementations • 18 Nov 2024 • Songyu Sun, Xiao Dong, Yanliang Sha, Quan Chen, Cheng Zhuo
High-speed serial links are fundamental to energy-efficient and high-performance computing systems such as artificial intelligence, 5G mobile and automotive, enabling low-latency and high-bandwidth communication.
no code implementations • 25 Sep 2024 • Xin Yuan, Ning Li, Quan Chen, Wenchao Xu, Zhaoxin Zhang, Song Guo
Thus, model split inference is proposed to improve the performance of edge intelligence, in which the AI model is divided into sub-models and the resource-intensive sub-model is offloaded wirelessly to an edge server to reduce resource requirements and inference latency.
1 code implementation • 6 Sep 2024 • Shen Zhao, Junyu Wang, Xitong Wang, Sizhuo Liu, Quan Chen, Kevin Kai Li, Yoo Jin Lee, Michael Salerno
(5-point Likert Scale) Conclusion: The theoretical derivation and experimental results validate SMILE's improved performance at high acceleration and MB compared with the existing 2D CAIPI SMS acquisition and reconstruction techniques for first-pass myocardial perfusion imaging.
no code implementations • 23 Aug 2024 • Jingyu Liu, Minquan Wang, Ye Ma, Bo Wang, Aozhu Chen, Quan Chen, Peng Jiang, Xirong Li
Previous studies on adding SFX to videos perform video-to-SFX matching at a holistic level, lacking the ability to add SFX to a specific moment.
no code implementations • 6 Aug 2024 • Ruixiang Zhao, Jian Jia, Yan Li, Xuehan Bai, Quan Chen, Han Li, Peng Jiang, Xirong Li
While Automatic Speech Recognition (ASR) text derived from short or live-stream videos is readily accessible, how to de-noise the excessively noisy text for multimodal representation learning remains largely unexplored.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
1 code implementation • 23 Jul 2024 • Xiaowan Hu, Yiyi Chen, Yan Li, Minquan Wang, Haoqian Wang, Quan Chen, Han Li, Peng Jiang
The LPR task encompasses three primary dilemmas in real-world scenarios: 1) the recognition of intended products from distractor products present in the background; 2) video-image heterogeneity, in that the appearance of products showcased in live streams often deviates substantially from standardized product images in stores; 3) the presence of numerous confusing products with subtle visual nuances in the shop.
no code implementations • 11 May 2024 • Shengyuan Liu, Bo Wang, Ye Ma, Te Yang, Xipeng Cao, Quan Chen, Han Li, Di Dong, Peng Jiang
Furthermore, we propose a novel metric GroundingScore to evaluate subject alignment thoroughly.
1 code implementation • 7 May 2024 • Jian Jia, Yipei Wang, Yan Li, Honggang Chen, Xuehan Bai, Zhaocheng Liu, Jian Liang, Quan Chen, Han Li, Peng Jiang, Kun Gai
Contemporary recommendation systems predominantly rely on ID embedding to capture latent associations among users and items.
no code implementations • 24 Mar 2024 • Chunyu Xue, Weihao Cui, Han Zhao, Quan Chen, Shulai Zhang, Pengyu Yang, Jing Yang, Shaobo Li, Minyi Guo
The exponentially enlarged scheduling space and ever-changing optimal parallelism plan from adaptive parallelism together result in the contradiction between low-overhead and accurate performance data acquisition for efficient cluster scheduling.
no code implementations • 18 Mar 2024 • Qianyu Zhang, Bolun Zheng, Xinying Chen, Quan Chen, Zhunjie Zhu, Canjin Wang, Zongpeng Li, Chengang Yan
The goal of video quality enhancement is to reduce compression artifacts and reconstruct a visually pleasant result.
no code implementations • 15 Mar 2024 • Dongze Hao, Jian Jia, Longteng Guo, Qunbo Wang, Te Yang, Yan Li, Yanhua Cheng, Bo Wang, Quan Chen, Han Li, Jing Liu
We condense the retrieved knowledge passages from two perspectives.
1 code implementation • 7 Mar 2024 • Quan Chen, Tingyu Wang, Zihao Yang, Haoran Li, Rongfeng Lu, Yaoqi Sun, Bolun Zheng, Chenggang Yan
We propose a dense partition strategy (DPS), dividing the image into multiple parts to explore contextual information while explicitly maintaining the global structure.
no code implementations • 21 Feb 2024 • Hamed Hooshangnejad, Xue Feng, Gaofeng Huang, Rui Zhang, Katelyn Kelly, Quan Chen, Kai Ding
Lung cancer is a devastating disease with the highest mortality rate among cancer types.
1 code implementation • 1 Jan 2024 • Kaibin Tian, Yanhua Cheng, Yi Liu, Xinglin Hou, Quan Chen, Han Li
To address this issue, we adopt multi-granularity visual feature learning, ensuring the model's comprehensiveness in capturing visual content features spanning from abstract to detailed levels during the training phase.
Ranked #7 on Video Retrieval on MSR-VTT-1kA
no code implementations • 27 Dec 2023 • Xin Yuan, Ning Li, Kang Wei, Wenchao Xu, Quan Chen, Hao Chen, Song Guo
Model segmentation without user mobility has been investigated in depth by previous work.
no code implementations • 27 Sep 2023 • Jiawen Wang, Quan Chen, Deze Zeng, Zhuo Song, Chen Chen, Minyi Guo
With the collaborative serving mechanism, only part of the node representations is updated during the update phase, and the final representations are calculated in the inference phase.
1 code implementation • ICCV 2023 • Xuehan Bai, Yan Li, Yanhua Cheng, Wenjie Yang, Quan Chen, Han Li
It is the first dataset to cover product pages, short videos, and live streams simultaneously, providing the basis for establishing a unified product representation across different media domains.
1 code implementation • ICCV 2023 • Wenjie Yang, Yiyi Chen, Yan Li, Yanhua Cheng, Xudong Liu, Quan Chen, Han Li
Moreover, a cRoss-vIew semantiC alignmEnt (RICE) model is proposed to learn discriminative instance features from the image and video views of the products.
no code implementations • 5 Aug 2023 • Duolan Huang, Quan Chen, Zhun Wei, Rui Chen
Subsequently, the reconstruction is achieved by optimizing a directional albedo model with SS regularization using the fast iterative shrinkage-thresholding algorithm.
no code implementations • 23 Jul 2023 • Guan Shen, Jieru Zhao, Zeke Wang, Zhe Lin, Wenchao Ding, Chentao Wu, Quan Chen, Minyi Guo
Along with the fast evolution of deep neural networks, the hardware system is also developing rapidly.
no code implementations • 27 May 2023 • Yangjie Zhou, Yaoxu Song, Jingwen Leng, Zihan Liu, Weihao Cui, Zhendong Zhang, Cong Guo, Quan Chen, Li Li, Minyi Guo
Graph neural networks (GNNs) are powerful tools for exploring and learning from graph structures and features.
1 code implementation • 11 Jul 2022 • Ben Xue, Shenghui Ran, Quan Chen, Rongfei Jia, Binqiang Zhao, Xing Tang
Image color harmonization algorithms aim to automatically match the color distribution of foreground and background images captured in different conditions.
Ranked #9 on Image Harmonization on iHarmony4
no code implementations • 29 Jun 2022 • Guan Shen, Jieru Zhao, Quan Chen, Jingwen Leng, Chao Li, Minyi Guo
However, the quadratic complexity of self-attention w.r.t. the sequence length incurs heavy computational and memory burdens, especially for tasks with long sequences.
no code implementations • 17 May 2022 • Tianshu Hou, Peining Zhen, Ngai Wong, Quan Chen, Guoyong Shi, Shuqi Wang, Hai-Bao Chen
Electromigration (EM) is one of the major concerns in the reliability analysis of very large scale integration (VLSI) systems due to the continuous technology scaling.
no code implementations • 29 Mar 2022 • Tianshu Hou, Ngai Wong, Quan Chen, Zhigang Ji, Hai-Bao Chen
The electromigration (EM)-induced reliability issues in very large scale integration (VLSI) circuits have attracted increased attention due to the continuous technology scaling.
no code implementations • 18 Oct 2021 • Ye Ma, Jin Ma, Min Zhou, Quan Chen, Tiezheng Ge, Yuning Jiang, Tong Lin
Secondly, another GAN model is trained to synthesize real images based on the extended semantic layouts.
no code implementations • 8 Sep 2021 • Shulai Zhang, Zirui Li, Quan Chen, Wenli Zheng, Jingwen Leng, Minyi Guo
Federated learning (FL) is a distributed machine learning paradigm that allows clients to collaboratively train a model over their own local data.
1 code implementation • 15 Feb 2021 • Chaofan Tao, Rui Lin, Quan Chen, Zhaoyang Zhang, Ping Luo, Ngai Wong
Prior art often discretizes the network weights by carefully tuning quantization hyper-parameters (e.g., non-uniform step size and layer-wise bitwidths), which is complicated and sub-optimal because the full-precision and low-precision models have a large discrepancy.
4 code implementations • 3 Feb 2021 • Qinbin Li, Yiqun Diao, Quan Chen, Bingsheng He
We find that non-IID does bring significant challenges in learning accuracy of FL algorithms, and none of the existing state-of-the-art FL algorithms outperforms others in all cases.
no code implementations • 10 Dec 2020 • Quan Chen, Hai Zhu, Lei Yang, Xiaoqian Chen, Sofie Pollin, Evgenii Vinogradov
By proposing a framework of Edge Computing Assisted Autonomous Flight (ECAAF), we illustrate that vision and communications can interact with and assist each other with the aid of edge computing and offloading, and further speed up the UAV mission completion.
Edge-computing
Trajectory Planning
Networking and Internet Architecture
Robotics
Systems and Control
no code implementations • COLING 2020 • Yue Guan, Jingwen Leng, Chao Li, Quan Chen, Minyi Guo
Recent research on the multi-head attention mechanism, especially that in pre-trained models such as BERT, has shown us heuristics and clues in analyzing various aspects of the mechanism.
no code implementations • 2 Nov 2020 • Yue Guan, Jingwen Leng, Chao Li, Quan Chen, Minyi Guo
Recent research on the multi-head attention mechanism, especially that in pre-trained models such as BERT, has shown us heuristics and clues in analyzing various aspects of the mechanism.
no code implementations • 18 Feb 2020 • Cong Guo, Yangjie Zhou, Jingwen Leng, Yuhao Zhu, Zidong Du, Quan Chen, Chao Li, Bin Yao, Minyi Guo
We propose Simultaneous Multi-mode Architecture (SMA), a novel architecture design and execution model that offers general-purpose programmability on DNN accelerators in order to accelerate end-to-end applications.
1 code implementation • 14 Nov 2019 • Bo Wang, Quan Chen, Min Zhou, Zhiqiang Zhang, Xiaogang Jin, Kun Gai
Feature matters for salient object detection.
no code implementations • CVPR 2019 • Yuxian Qiu, Jingwen Leng, Cong Guo, Quan Chen, Chao Li, Minyi Guo, Yuhao Zhu
Recently, researchers have started decomposing deep neural network models according to their semantics or functions.
no code implementations • 6 Mar 2019 • Huitong Pan, Yushan Feng, Quan Chen, Craig Meyer, Xue Feng
Using PROMISE-12 data, we demonstrated the robustness of the two-stage model and showed high correlation of the proposed variable-input based uncertainty measures with GT-based performance.
no code implementations • 27 Sep 2018 • Yuxian Qiu, Jingwen Leng, Yuhao Zhu, Quan Chen, Chao Li, Minyi Guo
Despite their enormous success, there is still no solid understanding of the working mechanisms of deep neural networks.
2 code implementations • 5 Sep 2018 • Quan Chen, Tiezheng Ge, Yanyu Xu, Zhiqiang Zhang, Xinxin Yang, Kun Gai
SHM is the first algorithm that learns to jointly fit both semantic information and high quality details with deep networks.
Ranked #5 on Image Matting on AIM-500
1 code implementation • 28 Aug 2018 • Jianmin Guo, Yu Jiang, Yue Zhao, Quan Chen, Jiaguang Sun
Deep learning (DL) systems are increasingly applied to safety-critical domains such as autonomous driving cars.
Software Engineering
no code implementations • 21 Jan 2018 • Liang Shen, Zihan Yue, Quan Chen, Fan Feng, Jie Ma
On the other hand, the accumulation of rain streaks from a long distance makes the rain image look as if it were covered by a haze veil.
no code implementations • 7 Nov 2017 • Liang Shen, Zihan Yue, Fan Feng, Quan Chen, Shihao Liu, Jie Ma
In this paper, a low-light image enhancement model based on convolutional neural network and Retinex theory is proposed.