no code implementations • ECCV 2020 • Xu Yan, Weibing Zhao, Kun Yuan, Ruimao Zhang, Zhen Li, Shuguang Cui
Recovering realistic textures from a largely down-sampled low resolution (LR) image with complicated patterns is a challenging problem in image super-resolution.
no code implementations • 26 Jun 2025 • Qizhi Xie, Kun Yuan, Yunpeng Qu, Jiachao Gong, Mingda Wu, Ming Sun, Chao Zhou, Jihong Zhu
The automated pipeline eliminates the reliance on expert-written quality descriptions and proprietary systems, ensuring data scalability and generation efficiency.
no code implementations • 25 Jun 2025 • Kun Yuan, Tingxuan Chen, Shi Li, Joel L. Lavanchy, Christian Heiliger, Ege Özsoy, Yiming Huang, Long Bai, Nassir Navab, Vinkle Srivastav, Hongliang Ren, Nicolas Padoy
SPA is a lightweight adaptation framework, allowing hospitals to rapidly customize phase recognition models by defining phases in natural language text, annotating a few images with the phase labels, and providing a task graph defining phase transitions.
no code implementations • 13 Jun 2025 • Jie Hu, Shengnan Wang, Yutong He, Ping Gong, Jiawei Yi, Juncheng Zhang, Youhui Bai, Renhai Chen, Gong Zhang, Cheng Li, Kun Yuan
To enable efficient clustering, we divide the sequence into chunks and propose Chunked Soft Matching, which employs an alternating partition strategy within each chunk and identifies clusters based on similarity.
1 code implementation • 3 Jun 2025 • Ping Gong, Jiawei Yi, Shengnan Wang, Juncheng Zhang, Zewen Jin, Ouxiang Zhou, Ruibo Liu, Guanbin Xu, Youhui Bai, Bowen Ye, Kun Yuan, Tong Yang, Gong Zhang, Renhai Chen, Feng Wu, Cheng Li
Large Language Models (LLMs) have emerged as a pivotal research area, yet the attention module remains a critical bottleneck in LLM inference, even with techniques like KVCache to mitigate redundant computations.
no code implementations • 21 May 2025 • Chi Kit Ng, Long Bai, Guankun Wang, Yupeng Wang, Huxin Gao, Kun Yuan, Chenhan Jin, Tieyong Zeng, Hongliang Ren
In endoscopic procedures, autonomous tracking of abnormal regions and following circumferential cutting markers can significantly reduce the cognitive burden on endoscopists.
no code implementations • 19 May 2025 • Ege Özsoy, Chantal Pellegrini, David Bani-Harouni, Kun Yuan, Matthias Keicher, Nassir Navab
We show the strong performance of ORQA on our proposed benchmark, and its zero-shot generalization, paving the way for scalable, unified OR modeling and significantly advancing multimodal surgical intelligence.
no code implementations • 16 May 2025 • Elsa Rizk, Kun Yuan, Ali H. Sayed
Diffusion learning is a framework that endows edge devices with advanced intelligence.
no code implementations • 21 Apr 2025 • Xin Li, Xijun Wang, Bingchen Li, Kun Yuan, Yizhen Shao, Suhang Yao, Ming Sun, Chao Zhou, Radu Timofte, Zhibo Chen
In this work, we build the first benchmark dataset for short-form UGC Image Super-resolution in the wild, termed KwaiSR, intending to advance the research on developing image super-resolution algorithms for short-form UGC platforms.
1 code implementation • 17 Apr 2025 • Xin Li, Kun Yuan, Bingchen Li, Fengbin Guan, Yizhen Shao, Zihao Yu, Xijun Wang, Yiting Lu, Wei Luo, Suhang Yao, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Yabin Zhang, Ao-Xiang Zhang, Tianwu Zhi, Jianzhao Liu, Yang Li, Jingwen Xu, Yiting Liao, Yushen Zuo, Mingyang Wu, Renjie Li, Shengyun Zhong, Zhengzhong Tu, Yufan Liu, Xiangguang Chen, Zuowei Cao, Minhao Tang, Shan Liu, Kexin Zhang, Jingfen Xie, Yan Wang, Kai Chen, Shijie Zhao, Yunchen Zhang, Xiangkai Xu, Hong Gao, Ji Shi, Yiming Bao, Xiugang Dong, Xiangsheng Zhou, Yaofeng Tu, Ying Liang, Yiwen Wang, Xinning Chai, Yuxuan Zhang, Zhengxue Cheng, Yingsheng Qin, Yucai Yang, Rong Xie, Li Song, Wei Sun, Kang Fu, Linhan Cao, Dandan Zhu, Kaiwei Zhang, Yucheng Zhu, ZiCheng Zhang, Menghan Hu, Xiongkuo Min, Guangtao Zhai, Zhi Jin, Jiawei Wu, Wei Wang, Wenjian Zhang, Yuhai Lan, Gaoxiong Yi, Hengyuan Na, Wang Luo, Di wu, MingYin Bai, Jiawang Du, Zilong Lu, Zhenyu Jiang, Hui Zeng, Ziguan Cui, Zongliang Gan, Guijin Tang, Xinglin Xie, Kehuan Song, Xiaoqiang Lu, Licheng Jiao, Fang Liu, Xu Liu, Puhua Chen, Ha Thu Nguyen, Katrien De Moor, Seyed Ali Amirshahi, Mohamed-Chaker Larabi, Qi Tang, Linfeng He, Zhiyong Gao, Zixuan Gao, Guohua Zhang, Zhiye Huang, Yi Deng, Qingmiao Jiang, Lu Chen, Yi Yang, Xi Liao, Nourine Mohammed Nadir, YuXuan Jiang, Qiang Zhu, Siyue Teng, Fan Zhang, Shuyuan Zhu, Bing Zeng, David Bull, Meiqin Liu, Chao Yao, Yao Zhao
This paper presents a review for the NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement.
no code implementations • 29 Mar 2025 • Boyi Ma, Yanguang Zhao, Jie Wang, Guankun Wang, Kun Yuan, Tong Chen, Long Bai, Hongliang Ren
In this study, we investigate the dialogue capabilities of the DeepSeek model in robotic surgery scenarios, focusing on tasks such as Single Phrase QA, Visual QA, and Detailed Description.
no code implementations • 26 Mar 2025 • Lisha Chen, Quan Xiao, Ellen Hidemi Fukuda, Xinyi Chen, Kun Yuan, Tianyi Chen
To solve this problem, we convert the multi-objective constraints to a single-objective constraint through a merit function with an easy-to-evaluate gradient, and then, we use a penalty-based reformulation of the bilevel optimization problem.
no code implementations • 20 Mar 2025 • Qiankun Shi, Jie Peng, Kun Yuan, Xiao Wang, Qing Ling
We establish the lower bounds on the Byzantine error and on the minimum number of queries to a stochastic gradient oracle required to achieve an arbitrarily small optimization error.
1 code implementation • CVPR 2025 • Yunpeng Qu, Kun Yuan, Qizhi Xie, Ming Sun, Chao Zhou, Jian Wang
Inspired by the Human Visual System (HVS) that links global quality to the local texture of different regions and their visual saliency, we propose a Kaleidoscope Video Quality Assessment (KVQ) framework, which aims to effectively assess both saliency and local texture, thereby facilitating the assessment of global quality.
1 code implementation • CVPR 2025 • Ege Özsoy, Chantal Pellegrini, Tobias Czempiel, Felix Tristram, Kun Yuan, David Bani-Harouni, Ulrich Eck, Benjamin Busam, Matthias Keicher, Nassir Navab
Operating rooms (ORs) are complex, high-stakes environments requiring precise understanding of interactions among medical staff, tools, and equipment for enhancing surgical assistance, situational awareness, and patient safety.
Ranked #1 on Video Panoptic Segmentation on 4D-OR (using extra training data)
no code implementations • 11 Feb 2025 • Yiming Chen, Yuan Zhang, Yin Liu, Kun Yuan, Zaiwen Wen
In this work, we introduce a Randomized Subspace Optimization framework for pre-training and fine-tuning LLMs.
no code implementations • 4 Feb 2025 • Yaling Shen, Zhixiong Zhuang, Kun Yuan, Maria-Irina Nicolae, Nassir Navab, Nicolas Padoy, Mario Fritz
Experiments on the IU X-RAY and MIMIC-CXR radiology datasets demonstrate that Adversarial Domain Alignment enables attackers to steal the medical MLLM without any access to medical data.
1 code implementation • 31 Jan 2025 • Yunpeng Qu, Kun Yuan, Jinhua Hao, Kai Zhao, Qizhi Xie, Ming Sun, Chao Zhou
Image Super-Resolution (ISR) has seen significant progress with the introduction of remarkable generative models.
1 code implementation • 20 Jan 2025 • Guankun Wang, Long Bai, Junyi Wang, Kun Yuan, Zhen Li, Tianxu Jiang, Xiting He, Jinlin Wu, Zhen Chen, Zhen Lei, Hongbin Liu, Jiazheng Wang, Fan Zhang, Nicolas Padoy, Nassir Navab, Hongliang Ren
Recently, Multimodal Large Language Models (MLLMs) have demonstrated their immense potential in computer-aided diagnosis and decision-making.
1 code implementation • 16 Jan 2025 • Tingxuan Chen, Kun Yuan, Vinkle Srivastav, Nassir Navab, Nicolas Padoy
Conclusion: We propose a text-driven adaptation approach that mitigates the modality gap and handles multiple downstream tasks in surgical workflow analysis, with minimal reliance on large annotated datasets.
1 code implementation • 13 Jan 2025 • Ziqing Wen, Ping Luo, Jiahuan Wang, Xiaoge Deng, Jinping Zou, Kun Yuan, Tao Sun, Dongsheng Li
Large language models (LLMs) have shown impressive performance across a range of natural language processing tasks.
no code implementations • 23 Nov 2024 • Ming Hu, Kun Yuan, Yaling Shen, Feilong Tang, Xiaohao Xu, Lin Zhou, Wei Li, Ying Chen, Zhongxing Xu, Zelin Peng, Siyuan Yan, Vinkle Srivastav, Diping Song, Tianbin Li, Danli Shi, Jin Ye, Nicolas Padoy, Nassir Navab, Junjun He, ZongYuan Ge
Surgical practice involves complex visual interpretation, procedural skills, and advanced medical knowledge, making surgical vision-language pretraining (VLP) particularly challenging due to this complexity and the limited availability of annotated data.
no code implementations • 21 Nov 2024 • Shuchen Zhu, Boao Kong, Songtao Lu, Xinmeng Huang, Kun Yuan
To address these limitations, this paper proposes SPARKLE, a unified Single-loop Primal-dual AlgoRithm frameworK for decentraLized bilEvel optimization.
no code implementations • 21 Oct 2024 • Tao Sun, Xinwang Liu, Kun Yuan
This paper investigates the roles of gradient normalization and clipping in ensuring the convergence of Stochastic Gradient Descent (SGD) under heavy-tailed noise.
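For orientation, the two operations named in this abstract can be sketched in their common textbook form. This is a minimal illustration, not necessarily the exact variants analyzed in the paper; the function names (`clip_gradient`, `normalize_gradient`, `sgd_step`) and the `eps` safeguard are assumptions introduced here.

```python
import math


def l2_norm(g):
    """Euclidean norm of a gradient given as a list of floats."""
    return math.sqrt(sum(x * x for x in g))


def clip_gradient(g, threshold):
    """Rescale g so its norm is at most `threshold`; small gradients pass unchanged."""
    norm = l2_norm(g)
    if norm <= threshold:
        return list(g)
    scale = threshold / norm
    return [x * scale for x in g]


def normalize_gradient(g, eps=1e-12):
    """Rescale g to (approximately) unit norm; eps is an assumed guard against division by zero."""
    norm = l2_norm(g)
    return [x / (norm + eps) for x in g]


def sgd_step(params, g, lr, transform):
    """One SGD update using a transformed stochastic gradient."""
    direction = transform(g)
    return [p - lr * d for p, d in zip(params, direction)]


# A heavy-tailed noise sample can produce a very large gradient;
# both transforms bound the length of the resulting step.
params = [1.0, -2.0]
outlier = [300.0, 400.0]
clipped = sgd_step(params, outlier, 0.1, lambda g: clip_gradient(g, 1.0))
normalized = sgd_step(params, outlier, 0.1, normalize_gradient)
```

Clipping leaves small gradients untouched and only shrinks large ones, whereas normalization discards the magnitude entirely; both cap the step length at `lr` times a constant regardless of how large the noise sample is.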
1 code implementation • 15 Oct 2024 • Yutong He, Pengrui Li, Yipeng Hu, Chuyan Chen, Kun Yuan
Subspace optimization algorithms, such as GaLore (Zhao et al., 2024), have gained attention for pre-training and fine-tuning large language models (LLMs) due to their memory efficiency.
1 code implementation • 10 Oct 2024 • Yiming Chen, Yuan Zhang, Liyuan Cao, Kun Yuan, Zaiwen Wen
However, traditional first-order (FO) fine-tuning algorithms incur substantial memory overhead due to the need to store activation values for back-propagation during gradient computation, particularly in long-context fine-tuning tasks.
2 code implementations • 30 Sep 2024 • Kun Yuan, Vinkle Srivastav, Nassir Navab, Nicolas Padoy
Surgical video-language pretraining (VLP) faces unique challenges due to the knowledge domain gap and the scarcity of multi-modal data.
1 code implementation • 16 Aug 2024 • Xue Wang, Tian Zhou, Jianqing Zhu, Jialin Liu, Kun Yuan, Tao Yao, Wotao Yin, Rong Jin, HanQin Cai
Attention-based models have achieved many remarkable breakthroughs in numerous applications.
1 code implementation • 23 Jul 2024 • Qizhi Xie, Kun Yuan, Yunpeng Qu, Mingda Wu, Ming Sun, Chao Zhou, Jihong Zhu
To this end, we propose Quality- and aesthetics-aware pretraining (QPT V2), the first pretraining framework based on MIM that offers a unified solution to quality and aesthetics assessment.
no code implementations • 28 Jun 2024 • Ying Cao, Zhaoxian Wu, Kun Yuan, Ali H. Sayed
This paper proposes a theoretical framework to evaluate and compare the performance of gradient-descent algorithms for distributed learning in relation to their behavior around local minima in nonconvex environments.
no code implementations • CVPR 2024 • Kun Yuan, Hongbo Liu, Mading Li, Muyi Sun, Ming Sun, Jiachao Gong, Jinhua Hao, Chao Zhou, Yansong Tang
In this paper, we propose a VQA method named PTM-VQA, which leverages PreTrained Models to transfer knowledge from models pretrained on various pre-tasks, enabling benefits for VQA from different aspects.
2 code implementations • 16 May 2024 • Kun Yuan, Vinkle Srivastav, Nassir Navab, Nicolas Padoy
By disentangling embedding spaces of different hierarchical levels, the learned multi-modal representations encode short-term and long-term surgical concepts in the same model.
1 code implementation • 17 Apr 2024 • Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, HaoNing Wu, ZiCheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei LI, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, Haiqiang Wang, Xiangguang Chen, Wenhui Meng, Xiang Pan, Huiying Shi, Han Zhu, Xiaozhong Xu, Lei Sun, Zhenzhong Chen, Shan Liu, Fangyuan Kong, Haotian Fan, Yifang Xu, Haoran Xu, Mengduo Yang, Jie zhou, Jiaze Li, Shijie Wen, Mai Xu, Da Li, Shunyu Yao, Jiazhi Du, WangMeng Zuo, Zhibo Li, Shuai He, Anlong Ming, Huiyuan Fu, Huadong Ma, Yong Wu, Fie Xue, Guozhi Zhao, Lina Du, Jie Guo, Yu Zhang, huimin zheng, JunHao Chen, Yue Liu, Dulan Zhou, Kele Xu, Qisheng Xu, Tao Sun, Zhixiang Ding, Yuhang Hu
This paper reviews the NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment (S-UGC VQA), in which various solutions were submitted and evaluated on the KVQ dataset, collected from a popular short-form video platform, i.e., the Kuaishou/Kwai platform.
1 code implementation • 15 Apr 2024 • Zheng Chen, Zongwei Wu, Eduard Zamfir, Kai Zhang, Yulun Zhang, Radu Timofte, Xiaokang Yang, Hongyuan Yu, Cheng Wan, Yuxin Hong, Zhijuan Huang, Yajun Zou, Yuan Huang, Jiamin Lin, Bingnan Han, Xianyu Guan, Yongsheng Yu, Daoan Zhang, Xuanwu Yin, Kunlong Zuo, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou, Hongyu An, Xinfeng Zhang, Zhiyuan Song, Ziyue Dong, Qing Zhao, Xiaogang Xu, Pengxu Wei, Zhi-chao Dou, Gui-ling Wang, Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou, Cansu Korkmaz, A. Murat Tekalp, Yubin Wei, Xiaole Yan, Binren Li, Haonan Chen, Siqi Zhang, Sihan Chen, Amogh Joshi, Nikhil Akalwadi, Sampada Malagi, Palani Yashaswini, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi, Anjali Sarvaiya, Pooja Choksy, Jagrit Joshi, Shubh Kawa, Kishor Upla, Sushrut Patwardhan, Raghavendra Ramachandra, Sadat Hossain, Geongi Park, S. M. Nadim Uddin, Hao Xu, Yanhui Guo, Aman Urumbekov, Xingzhuo Yan, Wei Hao, Minghan Fu, Isaac Orais, Samuel Smith, Ying Liu, Wangwang Jia, Qisheng Xu, Kele Xu, Weijun Yuan, Zhan Li, Wenqin Kuang, Ruijin Guan, Ruting Deng, Zhao Zhang, Bo wang, Suiyi Zhao, Yan Luo, Yanyan Wei, Asif Hussain Khan, Christian Micheloni, Niki Martinel
This paper reviews the NTIRE 2024 challenge on image super-resolution ($\times$4), highlighting the solutions proposed and the outcomes obtained.
1 code implementation • 28 Mar 2024 • Huanpeng Chu, Wei Wu, Chengjie Zang, Kun Yuan
Diffusion models have revolutionized image synthesis, setting new benchmarks in quality and creativity.
no code implementations • 20 Mar 2024 • Diwei Wang, Kun Yuan, Candice Muller, Frédéric Blanc, Nicolas Padoy, Hyewon Seo
Based on a large-scale pre-trained Vision Language Model (VLM), our model learns and improves visual, textual, and numerical representations of patient gait videos, through a collective learning across three distinct modalities: gait videos, class-specific descriptions, and numerical gait parameters.
no code implementations • 18 Mar 2024 • Haolan Chen, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou, Wei Hu
In particular, we develop a cascaded controllable diffusion model that aims to optimize the extraction of information from low-resolution images.
1 code implementation • 8 Mar 2024 • Yunpeng Qu, Kun Yuan, Kai Zhao, Qizhi Xie, Jinhua Hao, Ming Sun, Chao Zhou
Diffusion-based methods, endowed with a formidable generative prior, have received increasing attention in Image Super-Resolution (ISR) recently.
1 code implementation • CVPR 2024 • Yiting Lu, Xin Li, Yajing Pei, Kun Yuan, Qizhi Xie, Yunpeng Qu, Ming Sun, Chao Zhou, Zhibo Chen
Short-form UGC video platforms, like Kwai and TikTok, have become an emerging and irreplaceable mainstream media form, thriving on user-friendly engagement, kaleidoscopic creation, and so on.
no code implementations • 8 Feb 2024 • Elsa Rizk, Kun Yuan, Ali H. Sayed
In this work, we examine a network of agents operating asynchronously, aiming to discover an ideal global model that suits individual local datasets.
no code implementations • 5 Feb 2024 • Boao Kong, Shuchen Zhu, Songtao Lu, Xinmeng Huang, Kun Yuan
This provides the first theoretical understanding of how network topology, data heterogeneity, and nested bilevel structures influence decentralized SBO.
2 code implementations • 15 Dec 2023 • Kun Yuan, Manasi Kattel, Joel L. Lavanchy, Nassir Navab, Vinkle Srivastav, Nicolas Padoy
We highlight that the primary limitation in the current surgical VQA systems is the lack of scene knowledge to answer complex queries.
no code implementations • 28 Nov 2023 • Yifan Zhang, Xue Wang, Tian Zhou, Kun Yuan, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan
We demonstrate the effectiveness of our method through comprehensive experiments on multiple OOD detection benchmarks; extensive empirical studies show that it significantly improves the performance of OOD detection over state-of-the-art methods.
no code implementations • 12 Oct 2023 • Luyao Guo, Sulaiman A. Alghunaim, Kun Yuan, Laurent Condat, Jinde Cao
We demonstrate that the leading communication complexity of ProxSkip is $\mathcal{O}(\frac{p\sigma^2}{n\epsilon^2})$ for non-convex and convex settings, and $\mathcal{O}(\frac{p\sigma^2}{n\epsilon})$ for the strongly convex setting, where $n$ represents the number of nodes, $p$ denotes the probability of communication, $\sigma^2$ signifies the level of stochastic noise, and $\epsilon$ denotes the desired accuracy level.
no code implementations • 28 Sep 2023 • Lei Yang, Tao Tang, Jun Li, Peng Chen, Kun Yuan, Li Wang, Yi Huang, Xinyu Zhang, Kaicheng Yu
In essence, we regress the height to the ground to achieve a distance-agnostic formulation to ease the optimization process of camera-only perception methods.
no code implementations • 1 Aug 2023 • Hongbo Liu, Mingda Wu, Kun Yuan, Ming Sun, Yansong Tang, Chuanchuan Zheng, Xing Wen, Xiu Li
Video quality assessment (VQA) has attracted growing attention in recent years.
no code implementations • 31 Jul 2023 • Kun Yuan, Zishang Kong, Chuanchuan Zheng, Ming Sun, Xing Wen
Second, the perceptual quality of a video exhibits a multi-distortion distribution, due to the differences in the duration and probability of occurrence for various distortions.
2 code implementations • 27 Jul 2023 • Kun Yuan, Vinkle Srivastav, Tong Yu, Joel L. Lavanchy, Jacques Marescaux, Pietro Mascagni, Nassir Navab, Nicolas Padoy
We then present a novel method, SurgVLP - Surgical Vision Language Pre-training, for multi-modal representation learning.
no code implementations • 19 Jul 2023 • Xiaohong Liu, Xiongkuo Min, Wei Sun, Yulun Zhang, Kai Zhang, Radu Timofte, Guangtao Zhai, Yixuan Gao, Yuqin Cao, Tengchuan Kou, Yunlong Dong, Ziheng Jia, Yilin Li, Wei Wu, Shuming Hu, Sibin Deng, Pengxiang Xiao, Ying Chen, Kai Li, Kai Zhao, Kun Yuan, Ming Sun, Heng Cong, Hao Wang, Lingzhi Fu, Yusheng Zhang, Rongyu Zhang, Hang Shi, Qihang Xu, Longan Xiao, Zhiliang Ma, Mirko Agarla, Luigi Celona, Claudio Rota, Raimondo Schettini, Zhiwei Huang, Yanan Li, Xiaotao Wang, Lei Lei, Hongye Liu, Wei Hong, Ironhead Chuang, Allen Lin, Drake Guan, Iris Chen, Kae Lou, Willy Huang, Yachun Tasi, Yvonne Kao, Haotian Fan, Fangyuan Kong, Shiqi Zhou, Hao liu, Yu Lai, Shanshan Chen, Wenqi Wang, HaoNing Wu, Chaofeng Chen, Chunzheng Zhu, Zekun Guo, Shiling Zhao, Haibing Yin, Hongkui Wang, Hanene Brachemi Meftah, Sid Ahmed Fezza, Wassim Hamidouche, Olivier Déforges, Tengfei Shi, Azadeh Mansouri, Hossein Motamednia, Amir Hossein Bakhtiari, Ahmad Mahmoudi Aznaveh
61 participating teams submitted their prediction results during the development phase, with a total of 3168 submissions.
no code implementations • 28 Jun 2023 • Ziheng Cheng, Xinmeng Huang, Pengfei Wu, Kun Yuan
When all clients participate in the training process, we demonstrate that incorporating momentum allows FedAvg to converge without relying on the assumption of bounded data heterogeneity even using a constant local learning rate.
1 code implementation • 1 Jun 2023 • Lisang Ding, Kexin Jin, Bicheng Ying, Kun Yuan, Wotao Yin
Their communication, governed by the communication topology and gossip weight matrices, facilitates the exchange of model updates.
no code implementations • NeurIPS 2023 • Yutong He, Xinmeng Huang, Kun Yuan
Our results reveal that using independent unbiased compression can reduce the total communication cost by a factor of up to $\Theta(\sqrt{\min\{n, \kappa\}})$ when all local smoothness constants are constrained by a common upper bound, where $n$ is the number of workers and $\kappa$ is the condition number of the functions being minimized.
no code implementations • 12 May 2023 • Yutong He, Xinmeng Huang, Yiming Chen, Wotao Yin, Kun Yuan
In this paper, we investigate the performance limit of distributed stochastic optimization algorithms employing communication compression.
2 code implementations • 25 Apr 2023 • Yi-Fan Zhang, Xue Wang, Kexin Jin, Kun Yuan, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan
In particular, when the adaptation target is a series of domains, the adaptation accuracy of AdaNPC is 50% higher than advanced TTA methods.
1 code implementation • 13 Apr 2023 • Kai Zhao, Kun Yuan, Ming Sun, Xing Wen
Video quality assessment (VQA) aims to simulate the human perception of video quality, which is influenced by factors ranging from low-level color and texture details to high-level semantic content.
1 code implementation • CVPR 2023 • Lei Yang, Kaicheng Yu, Tao Tang, Jun Li, Kun Yuan, Li Wang, Xinyu Zhang, Peng Chen
In essence, instead of predicting the pixel-wise depth, we regress the height to the ground to achieve a distance-agnostic formulation to ease the optimization process of camera-only perception methods.
Ranked #3 on 3D Object Detection on Rope3D
no code implementations • CVPR 2023 • Kai Zhao, Kun Yuan, Ming Sun, Mading Li, Xing Wen
Blind image quality assessment (BIQA) aims to automatically evaluate the perceived quality of a single image, whose performance has been improved by deep learning-based methods in recent years.
2 code implementations • 13 Feb 2023 • Chinedu Innocent Nwoye, Tong Yu, Saurav Sharma, Aditya Murali, Deepak Alapatt, Armine Vardazaryan, Kun Yuan, Jonas Hajek, Wolfgang Reiter, Amine Yamlahi, Finn-Henri Smidt, Xiaoyang Zou, Guoyan Zheng, Bruno Oliveira, Helena R. Torres, Satoshi Kondo, Satoshi Kasai, Felix Holm, Ege Özsoy, Shuangchun Gui, Han Li, Sista Raviteja, Rachana Sathish, Pranav Poudel, Binod Bhattarai, Ziheng Wang, Guo Rui, Melanie Schellenberg, João L. Vilaça, Tobias Czempiel, Zhenkun Wang, Debdoot Sheet, Shrawan Kumar Thapa, Max Berniker, Patrick Godau, Pedro Morais, Sudarshan Regmi, Thuy Nuong Tran, Jaime Fonseca, Jan-Hinrich Nölke, Estevão Lima, Eduard Vazquez, Lena Maier-Hein, Nassir Navab, Pietro Mascagni, Barbara Seeliger, Cristians Gonzalez, Didier Mutter, Nicolas Padoy
This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection.
Ranked #1 on Action Triplet Detection on CholecT50 (Challenge)
no code implementations • 1 Nov 2022 • Xinmeng Huang, Kun Yuan
The main difficulties lie in how to gauge the effectiveness when transmitting messages between two nodes via time-varying communications, and how to establish the lower bound when the network size is fixed (which is a prerequisite in stochastic optimization).
no code implementations • 14 Oct 2022 • Kun Yuan, Xinmeng Huang, Yiming Chen, Xiaohan Zhang, Yingya Zhang, Pan Pan
While (Lu and Sa, 2021) have recently provided an optimal rate for non-convex stochastic decentralized optimization with weight matrices defined over linear graphs, the optimal rate with general weight matrices remains unclear.
1 code implementation • 14 Oct 2022 • Zhuoqing Song, Weijian Li, Kexin Jin, Lei Shi, Ming Yan, Wotao Yin, Kun Yuan
In the proposed family, EquiStatic has a degree of $\Theta(\ln(n))$, where $n$ is the network size, and a series of time-dependent one-peer topologies, EquiDyn, has a constant degree of 1.
no code implementations • 10 Oct 2022 • Edward Duc Hien Nguyen, Sulaiman A. Alghunaim, Kun Yuan, César A. Uribe
We study the decentralized optimization problem where a network of $n$ agents seeks to minimize the average of a set of heterogeneous non-convex cost functions distributedly.
no code implementations • 8 Jun 2022 • Xinmeng Huang, Yiming Chen, Wotao Yin, Kun Yuan
We establish a convergence lower bound for algorithms that use either unbiased or contractive compressors, in unidirectional or bidirectional communication.
no code implementations • 13 May 2022 • Mert Gurbuzbalaban, Yuanhan Hu, Umut Simsekli, Kun Yuan, Lingjiong Zhu
To have a more explicit control on the tail exponent, we then consider the case where the loss at each node is a quadratic, and show that the tail-index can be estimated as a function of the step-size, batch-size, and the topological properties of the network of the computational nodes.
no code implementations • 6 Apr 2022 • Zhuojie Wu, Xingqun Qi, Zijian Wang, Wanting Zhou, Kun Yuan, Muyi Sun, Zhenan Sun
Furthermore, to better improve the inter-coordination between the corrupted and non-corrupted regions and enhance the intra-coordination in corrupted regions, we design InCo2 Loss, a pair of similarity based losses to constrain the feature consistency.
1 code implementation • CVPR 2022 • Zejiang Hou, Minghai Qin, Fei Sun, Xiaolong Ma, Kun Yuan, Yi Xu, Yen-Kuang Chen, Rong Jin, Yuan Xie, Sun-Yuan Kung
However, conventional pruning methods have two limitations: they are restricted to the pruning process only, and they require a fully pre-trained large model.
no code implementations • NeurIPS 2021 • Xinmeng Huang, Kun Yuan, Xianghui Mao, Wotao Yin
In this paper, we improve the convergence analysis and rates of variance reduction under without-replacement sampling orders for composite finite-sum minimization. Our results are twofold.
no code implementations • CVPR 2022 • Zijian Wang, Xingqun Qi, Kun Yuan, Muyi Sun
However, such methods fail to exploit the spatial correlation between the disentangled features.
2 code implementations • 8 Nov 2021 • Bicheng Ying, Kun Yuan, Hanbin Hu, Yiming Chen, Wotao Yin
On mainstream DNN training tasks, BlueFog reaches a much higher throughput and achieves an overall $1.2\times \sim 1.8\times$ speedup over Horovod, a state-of-the-art distributed deep learning package based on Ring-Allreduce.
2 code implementations • NeurIPS 2021 • Bicheng Ying, Kun Yuan, Yiming Chen, Hanbin Hu, Pan Pan, Wotao Yin
Experimental results on a variety of tasks and models demonstrate that decentralized (momentum) SGD over exponential graphs promises both fast and high-quality training.
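As a rough illustration of the topology mentioned here, the sketch below builds a mixing matrix for a static exponential graph in which node `i` is linked to nodes `(i + 2^k) mod n`, with uniform weights. The uniform weighting, the link direction, and the names `static_exponential_weights` and `gossip_average` are assumptions made for this example, not details taken from the abstract.

```python
def static_exponential_weights(n):
    """Mixing matrix for an n-node static exponential graph (assumed uniform weights)."""
    if n < 2:
        raise ValueError("need at least two nodes")
    tau = (n - 1).bit_length()            # equals ceil(log2(n)) for n >= 2
    offsets = [2 ** k for k in range(tau)]  # hop distances 1, 2, 4, ...
    w = 1.0 / (tau + 1)                   # self-weight plus one weight per neighbor
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        W[i][i] = w
        for off in offsets:
            W[i][(i + off) % n] = w
    return W


def gossip_average(W, values):
    """One round of neighbor averaging: each node mixes the values it can see."""
    n = len(values)
    return [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]


W = static_exponential_weights(8)
x = [float(i) for i in range(8)]
x_next = gossip_average(W, x)
```

Because the matrix is circulant with equal weights, every row and every column sums to one, so a gossip round preserves the network-wide average while each node touches only about `log2(n)` neighbors.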
no code implementations • 29 Sep 2021 • Bicheng Ying, Kun Yuan, Yiming Chen, Hanbin Hu, Yingya Zhang, Pan Pan, Wotao Yin
Decentralized adaptive gradient methods, in which each node averages only with its neighbors, are critical to save communication and wall-clock training time in deep learning tasks.
no code implementations • 10 Aug 2021 • Yao Li, Xiaorui Liu, Jiliang Tang, Ming Yan, Kun Yuan
Decentralized optimization and communication compression have exhibited their great potential in accelerating distributed machine learning by mitigating the communication bottleneck in practice.
1 code implementation • ICLR 2022 • Xiaolong Ma, Minghai Qin, Fei Sun, Zejiang Hou, Kun Yuan, Yi Xu, Yanzhi Wang, Yen-Kuang Chen, Rong Jin, Yuan Xie
It addresses the shortcomings of the previous works by repeatedly growing a subset of layers to dense and then pruning them back to sparse after some training.
no code implementations • 19 May 2021 • Yiming Chen, Kun Yuan, Yingya Zhang, Pan Pan, Yinghui Xu, Wotao Yin
Communication overhead hinders the scalability of large-scale distributed training.
no code implementations • 17 May 2021 • Kun Yuan, Sulaiman A. Alghunaim, Xinmeng Huang
For smooth objective functions, the transient stage (which measures the number of iterations the algorithm has to experience before achieving the linear speedup stage) of D-SGD is on the order of ${\Omega}(n/(1-\beta)^2)$ and $\Omega(n^3/(1-\beta)^4)$ for strongly and generally convex cost functions, respectively, where $1-\beta \in (0, 1)$ is a topology-dependent quantity that approaches $0$ for a large and sparse network.
no code implementations • 25 Apr 2021 • Xinmeng Huang, Kun Yuan, Xianghui Mao, Wotao Yin
In the highly data-heterogeneous scenario, Prox-DFinito with optimal cyclic sampling can attain a sample-size-independent convergence rate, which, to our knowledge, is the first result that can match with uniform-iid-sampling with variance reduction.
1 code implementation • ICCV 2021 • Kun Yuan, Yiming Chen, Xinmeng Huang, Yingya Zhang, Pan Pan, Yinghui Xu, Wotao Yin
Experimental results on a variety of computer vision tasks and models demonstrate that DecentLaM promises both efficient and high-quality training.
no code implementations • 30 Mar 2021 • Shaopeng Guo, Yujie Wang, Kun Yuan, Quanquan Li
In this paper we propose a novel network adaption method called Differentiable Network Adaption (DNA), which can adapt an existing network to a specific computation budget by adjusting the width and depth in a differentiable manner.
3 code implementations • ICCV 2021 • Kun Yuan, Shaopeng Guo, Ziwei Liu, Aojun Zhou, Fengwei Yu, Wei Wu
Motivated by the success of Transformers in natural language processing (NLP) tasks, some attempts (e.g., ViT and DeiT) have emerged to apply Transformers to the vision domain.
Ranked #1 on Image Classification on Oxford-IIIT Pets
4 code implementations • ICLR 2021 • Aojun Zhou, Yukun Ma, Junnan Zhu, Jianbo Liu, Zhijie Zhang, Kun Yuan, Wenxiu Sun, Hongsheng Li
In this paper, we are the first to study training from scratch an N:M fine-grained structured sparse network, which can maintain the advantages of both unstructured fine-grained sparsity and structured coarse-grained sparsity simultaneously on specifically designed GPUs.
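To make the N:M pattern concrete: in each group of M consecutive weights, at most N are kept nonzero (2:4 is a common instance). The sketch below builds such a mask by magnitude. It only illustrates the sparsity pattern, not the training-from-scratch procedure the paper studies; the helper names `nm_sparse_mask` and `apply_mask` and the magnitude-based selection are assumptions for this example.

```python
def nm_sparse_mask(weights, n=2, m=4):
    """Return a 0/1 mask keeping the n largest-magnitude entries in each group of m."""
    if len(weights) % m != 0:
        raise ValueError("length of weights must be a multiple of m")
    mask = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Rank positions by descending |w|; ties are broken by position.
        keep = sorted(range(m), key=lambda i: (-abs(group[i]), i))[:n]
        mask.extend(1 if i in keep else 0 for i in range(m))
    return mask


def apply_mask(weights, mask):
    """Zero out the pruned positions."""
    return [w * k for w, k in zip(weights, mask)]


w = [0.1, -0.9, 0.4, 0.05, 2.0, -0.3, 0.2, 1.5]
mask = nm_sparse_mask(w, n=2, m=4)
sparse_w = apply_mask(w, mask)
```

The pattern is fine-grained because any N positions inside a group may survive, yet structured because every group has the same nonzero budget, which is what lets suitable hardware exploit it.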
no code implementations • ICCV 2021 • Kun Yuan, Quanquan Li, Shaopeng Guo, Dapeng Chen, Aojun Zhou, Fengwei Yu, Ziwei Liu
A standard practice of deploying deep neural networks is to apply the same architecture to all the input instances.
no code implementations • 2 Oct 2020 • Kun Yuan, Quanquan Li, Dapeng Chen, Aojun Zhou, Junjie Yan
To facilitate the training, we represent the network connectivity of each sample in an adjacency matrix.
no code implementations • ECCV 2020 • Kun Yuan, Quanquan Li, Jing Shao, Junjie Yan
In this paper, we attempt to optimize the connectivity in neural networks.
no code implementations • 28 Oct 2019 • Dongdong Yu, Zehuan Yuan, Jinlai Liu, Kun Yuan, Changhu Wang
Instance Segmentation is an interesting yet challenging task in computer vision.
no code implementations • 25 Sep 2019 • Ernest K. Ryu, Kun Yuan, Wotao Yin
Despite remarkable empirical success, the training dynamics of generative adversarial networks (GAN), which involves solving a minimax game using stochastic gradients, is still poorly understood.
no code implementations • 25 Sep 2019 • Kun Yuan, Quanquan Li, Yucong Zhou, Jing Shao, Junjie Yan
Seeking effective networks has become one of the most crucial and practical areas in deep learning.
no code implementations • 26 May 2019 • Ernest K. Ryu, Kun Yuan, Wotao Yin
Despite remarkable empirical success, the training dynamics of generative adversarial networks (GAN), which involves solving a minimax game using stochastic gradients, is still poorly understood.
no code implementations • 26 Mar 2019 • Kun Yuan, Sulaiman A. Alghunaim, Bicheng Ying, Ali H. Sayed
It is still unknown whether, when, and why these bias-correction methods can outperform their traditional counterparts (such as consensus and diffusion) with noisy gradient and constant step-sizes.
no code implementations • 17 Oct 2018 • Lucas Cassano, Kun Yuan, Ali H. Sayed
In this scenario, agents collaborate to estimate the value function of a target team policy.
no code implementations • 29 May 2018 • Bicheng Ying, Kun Yuan, Ali H. Sayed
This work studies the problem of learning under both large datasets and large-dimensional feature space scenarios.
no code implementations • 21 Mar 2018 • Bicheng Ying, Kun Yuan, Stefan Vlaski, Ali H. Sayed
In empirical risk optimization, it has been observed that stochastic gradient implementations that rely on random reshuffling of the data achieve better performance than implementations that rely on sampling the data uniformly.
no code implementations • 4 Aug 2017 • Bicheng Ying, Kun Yuan, Ali H. Sayed
First, it resolves this open issue and provides the first theoretical guarantee of linear convergence under random reshuffling for SAGA; the argument is also adaptable to other variance-reduced algorithms.
no code implementations • 4 Aug 2017 • Kun Yuan, Bicheng Ying, Jiageng Liu, Ali H. Sayed
For such situations, the balanced gradient computation property of AVRG becomes a real advantage in reducing idle time caused by unbalanced local data storage requirements, which is characteristic of other reduced-variance gradient algorithms.
no code implementations • 14 Mar 2016 • Kun Yuan, Bicheng Ying, Ali H. Sayed
The article examines in some detail the convergence rate and mean-square-error performance of momentum stochastic gradient methods in the constant step-size and slow adaptation regime.
no code implementations • 24 Feb 2016 • Bicheng Ying, Kun Yuan, Ali H. Sayed
The stochastic dual coordinate-ascent (S-DCA) technique is a useful alternative to the traditional stochastic gradient-descent algorithm for solving large-scale optimization problems due to its scalability to large data sets and strong theoretical guarantees.