no code implementations • 19 Jan 2025 • Daochang Liu, Junyu Zhang, Anh-Dung Dinh, Eunbyung Park, Shichao Zhang, Ajmal Mian, Mubarak Shah, Chang Xu
Therefore, the field of physics-aware generation in computer vision is rapidly growing, calling for a comprehensive survey to provide a structured analysis of current efforts.
no code implementations • CVPR 2025 • Chen Chen, Daochang Liu, Mubarak Shah, Chang Xu
Text-to-image diffusion models have demonstrated remarkable capabilities in creating images highly aligned with user prompts, yet their proclivity for memorizing training set images has sparked concerns about the originality of the generated images and privacy issues, potentially leading to legal complications for both model owners and users, particularly when the memorized images contain proprietary content.
no code implementations • 29 Oct 2024 • Chen Chen, Enhuai Liu, Daochang Liu, Mubarak Shah, Chang Xu
Diffusion models, widely used for image and video generation, face a significant limitation: the risk of memorizing and reproducing training data during inference, potentially generating unauthorized copyrighted content.
no code implementations • 29 Oct 2024 • Chen Chen, Daochang Liu, Mubarak Shah, Chang Xu
Furthermore, driven by our observation that local memorization significantly underperforms in existing tasks of measuring, detecting, and mitigating memorization in diffusion models compared to global memorization, we propose a simple yet effective method to integrate BE and the results of the new localization task into these existing frameworks.
no code implementations • 20 Aug 2024 • Anh-Dung Dinh, Daochang Liu, Chang Xu
We found that enforcing guidance throughout the sampling process is often counterproductive due to the model-fitting issue, where samples are 'tuned' to match the classifier's parameters rather than generalizing the expected condition.
no code implementations • 19 Jun 2024 • Daochang Liu, Axel Hu, Mubarak Shah, Chang Xu
In this paper, we propose DiffTriplet, a new generative framework for surgical triplet recognition employing the diffusion model, which predicts surgical triplets via iterative denoising.
Ranked #1 on Action Triplet Recognition on CholecT45 (cross-val)
no code implementations • CVPR 2024 • Chen Chen, Daochang Liu, Chang Xu
Pretrained diffusion models and their outputs are widely accessible due to their exceptional capacity for synthesizing high-quality images and their open-source nature.
no code implementations • 18 Mar 2024 • Siyu Xu, Yunke Wang, Daochang Liu, Chang Xu
Based on the observation that the accuracy of GPT-4V's image recognition varies significantly with the order of images within the collage prompt, our method further learns to optimize the arrangement of images for maximum recognition accuracy.
no code implementations • CVPR 2024 • Junyu Zhang, Daochang Liu, Eunbyung Park, Shichao Zhang, Chang Xu
This gap results in a residual in the generated images adversely impacting the image quality.
no code implementations • 23 Aug 2023 • Xiyu Wang, Baijiong Lin, Daochang Liu, Chang Xu
Diffusion Probabilistic Models (DPMs) have demonstrated substantial promise in image generation tasks but heavily rely on the availability of large amounts of training data.
no code implementations • 23 Aug 2023 • Xiyu Wang, Anh-Dung Dinh, Daochang Liu, Chang Xu
Our proposed sampler can be readily applied to a pre-trained diffusion model, utilizing momentum mechanisms and adaptive updating to smooth the reverse sampling process and ensure stable generation, resulting in outputs of enhanced quality.
1 code implementation • ICCV 2023 • Daochang Liu, Qiyue Li, AnhDung Dinh, Tingting Jiang, Mubarak Shah, Chang Xu
Temporal action segmentation is crucial for understanding long-form videos.
Ranked #3 on Action Segmentation on GTEA
no code implementations • 21 Feb 2023 • Chuyang Zhou, Jiajun Huang, Daochang Liu, Chengbin Du, Siqi Ma, Surya Nepal, Chang Xu
More specifically, knowledge distillation on both the spatial and frequency branches performs worse than distillation on the spatial branch alone.
1 code implementation • 13 Feb 2023 • Linwei Tao, Minjing Dong, Daochang Liu, Changming Sun, Chang Xu
However, early stopping, as a well-known technique to mitigate overfitting, fails to calibrate networks.
1 code implementation • NeurIPS 2023 • Zunzhi You, Daochang Liu, Bohyung Han, Chang Xu
Experimental results demonstrate that, in terms of adversarial robustness, NIM is superior to MIM thanks to its effective denoising capability.
no code implementations • CVPR 2023 • Chen Chen, Daochang Liu, Siqi Ma, Surya Nepal, Chang Xu
However, apart from this standard utility, we identify the "reversed utility" as another crucial aspect, which computes the accuracy on generated data of a classifier trained using real data, dubbed as real2gen accuracy (r2g%).
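The real2gen accuracy described above can be illustrated with a minimal sketch: fit a classifier on real data only, then score it on generated samples labeled by the class each was conditioned on. The nearest-centroid classifier, the function names, and the toy 2-D features are all assumptions chosen for illustration, not the authors' setup.

```python
from collections import defaultdict


def fit_nearest_centroid(features, labels):
    """Fit a toy classifier on REAL data: one mean vector per class."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def predict(x):
        # Return the class whose centroid is closest in squared distance.
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def real2gen_accuracy(predict, gen_features, gen_labels):
    """Accuracy (in %) of a real-data-trained classifier on generated data."""
    correct = sum(predict(x) == y for x, y in zip(gen_features, gen_labels))
    return 100.0 * correct / len(gen_labels)


# Hypothetical toy data: real samples train the classifier,
# generated samples (with their conditioning labels) are only evaluated.
real_x = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
real_y = [0, 0, 1, 1]
gen_x = [[0.1, 0.0], [1.0, 0.9], [0.8, 0.9], [0.9, 0.6]]
gen_y = [0, 1, 1, 0]

clf = fit_nearest_centroid(real_x, real_y)
r2g = real2gen_accuracy(clf, gen_x, gen_y)
print(f"r2g% = {r2g:.1f}")
```

In this toy run the last generated sample falls nearer the class-1 centroid despite its class-0 label, so it counts against r2g%.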
1 code implementation • ICCV 2023 • Shuyi Jiang, Daochang Liu, Dingquan Li, Chang Xu
Approximately 350 million people, a proportion of 8%, suffer from color vision deficiency (CVD).
1 code implementation • 27 Dec 2022 • Zhongwei Qiu, Huan Yang, Jianlong Fu, Daochang Liu, Chang Xu, Dongmei Fu
Video Super-Resolution (VSR) aims to restore high-resolution (HR) videos from low-resolution (LR) videos.
no code implementations • CVPR 2021 • Daochang Liu, Qiyue Li, Tingting Jiang, Yizhou Wang, Rulin Miao, Fei Shan, Ziyu Li
In this paper, a unified multi-path framework for automatic surgical skill assessment is proposed, which accounts for multiple constituent aspects of surgical skill, including surgical tool usage, intraoperative event patterns, and other skill proxies.
1 code implementation • 27 Aug 2020 • Daochang Liu, Yuhui Wei, Tingting Jiang, Yizhou Wang, Rulin Miao, Fei Shan, Ziyu Li
In experiments on the binary instrument segmentation task of the 2017 MICCAI EndoVis Robotic Instrument Segmentation Challenge dataset, the proposed method achieves 0.71 IoU and 0.81 Dice score without using a single manual annotation, showing the potential of unsupervised learning for surgical tool segmentation.
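For readers unfamiliar with the two reported metrics, the following is a minimal sketch of IoU and Dice for a single pair of binary masks. The function name, the flattened 0/1 mask representation, and the convention of returning 1.0 when both masks are empty are assumptions; the challenge's official evaluation protocol (for example, how scores are averaged over frames) may differ.

```python
def iou_and_dice(pred, target):
    """Compute IoU and Dice for two flattened binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    iou = intersection / union
    dice = 2 * intersection / (pred_area + target_area)
    return iou, dice


# Hypothetical 6-pixel masks: 2 pixels overlap, 4 pixels are in the union.
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 1, 1, 0]
iou, dice = iou_and_dice(pred_mask, true_mask)
print(f"IoU = {iou:.2f}, Dice = {dice:.2f}")
```

For a single mask pair the two metrics are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU.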
no code implementations • 27 Aug 2020 • Daochang Liu, Tingting Jiang, Yizhou Wang, Rulin Miao, Fei Shan, Ziyu Li
Then an objective and automated framework based on a neural network is proposed to predict surgical skills through the proxy of COF.
1 code implementation • CVPR 2019 • Daochang Liu, Tingting Jiang, Yizhou Wang
In this work, we first identify two underexplored problems posed by the weak supervision for temporal action localization, namely action completeness modeling and action-context separation.
Ranked #14 on Weakly Supervised Action Localization on ActivityNet-1.3
1 code implementation • 21 Jun 2018 • Daochang Liu, Tingting Jiang
Recognition of surgical gesture is crucial for surgical skill assessment and efficient surgery training.
Ranked #3 on Action Segmentation on JIGSAWS