1 code implementation • 20 Aug 2024 • HaoNing Wu, Shaocheng Shen, Qiang Hu, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang
Diffusion models have emerged as frontrunners in text-to-image generation; however, their fixed image resolution during training often leads to challenges in high-resolution image generation, such as semantic deviations and object replication.
no code implementations • 12 Jul 2024 • Zihan Zheng, Houqiang Zhong, Qiang Hu, Xiaoyun Zhang, Li Song, Ya Zhang, Yanfeng Wang
Volumetric video based on Neural Radiance Field (NeRF) holds vast potential for various 3D applications, but its substantial data volume poses significant challenges for compression and transmission.
no code implementations • 24 Jun 2024 • Zhengyue Zhao, Xiaoyun Zhang, Kaidi Xu, Xing Hu, Rui Zhang, Zidong Du, Qi Guo, Yunji Chen
With the widespread application of Large Language Models (LLMs), it has become a significant concern to ensure their safety and prevent harmful responses.
no code implementations • 23 May 2024 • Zihan Zheng, Houqiang Zhong, Qiang Hu, Xiaoyun Zhang, Li Song, Ya Zhang, Yanfeng Wang
Neural Radiance Field (NeRF) excels at photo-realistic rendering of static scenes, inspiring numerous efforts to extend it to volumetric videos.
no code implementations • 23 Apr 2024 • Haozhe Cheng, Cheng Ju, Haicheng Wang, Jinxiang Liu, Mengting Chen, Qiang Hu, Xiaoyun Zhang, Yanfeng Wang
The denoised text classes help OVAR models classify visual samples more accurately; in return, the classified visual samples help improve denoising.
no code implementations • 14 Feb 2024 • Shiqi Peng, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya Zhang, Yan-Feng Wang, Hui Zhao
In this paper, we propose a Weakly supervised Iterative Spinal Segmentation (WISS) method that leverages only four corner landmark weak labels on a single sagittal slice to achieve automatic volumetric segmentation of vertebral bodies (VBs) from CT images.
no code implementations • 14 Feb 2024 • Shiqi Peng, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya Zhang, Yan-Feng Wang, Hui Zhao
In this paper, we explore a learning-based automatic bone quality classification method for spinal metastasis based on CT images.
1 code implementation • CVPR 2024 • Chang Liu, HaoNing Wu, Yujie Zhong, Xiaoyun Zhang, Yanfeng Wang, Weidi Xie
Generative models have recently exhibited exceptional capabilities in text-to-image generation but still struggle to generate image sequences coherently.
2 code implementations • 16 Aug 2023 • Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang
AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks.
1 code implementation • 24 Jun 2023 • HaoNing Wu, Xiaoyun Zhang, Weidi Xie, Ya Zhang, Yanfeng Wang
Video frame interpolation (VFI) is a challenging task that aims to generate intermediate frames between two consecutive frames in a video.
no code implementations • 5 Jun 2023 • Xiaoyun Zhang, Xieyi Ping, Jianwei Zhang
Previous work optimizes traditional active learning (AL) processes with incremental neural network architecture search (Active-iNAS) based on data complexity change, which improves the accuracy and learning efficiency.
1 code implementation • 1 Jun 2023 • Chang Liu, HaoNing Wu, Yujie Zhong, Xiaoyun Zhang, Yanfeng Wang, Weidi Xie
Generative models have recently exhibited exceptional capabilities in text-to-image generation, but still struggle to generate image sequences coherently.
no code implementations • 19 Mar 2023 • Chaofan Ma, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Yanfeng Wang, Ya Zhang
Interactive segmentation has recently been explored to effectively and efficiently harvest high-quality segmentation masks by iteratively incorporating user hints.
1 code implementation • CVPR 2023 • Zhixin Wang, Xiaoyun Zhang, Ziying Zhang, Huangjie Zheng, Mingyuan Zhou, Ya Zhang, Yanfeng Wang
However, it is expensive and infeasible to include every type of degradation to cover real-world cases in the training data.
1 code implementation • ICCV 2023 • Ziyi Li, Qinye Zhou, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
The goal of this paper is to extract the visual-language correspondence from a pre-trained text-to-image diffusion model in the form of a segmentation map, i.e., simultaneously generating images and segmentation masks for the corresponding visual entities described in the text prompt.
no code implementations • 7 Oct 2022 • Qinye Zhou, Ziyi Li, Weidi Xie, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang
Existing super-resolution models are often specialized for a single scale, fundamentally limiting their use in practical scenarios.
no code implementations • CVPR 2022 • Yixuan Huang, Xiaoyun Zhang, Yu Fu, Siheng Chen, Ya Zhang, Yan-Feng Wang, Dazhi He
Those methods perform super-resolution of the input low-resolution (LR) image and texture transfer from the reference image together in one module, which easily introduces interference between LR and reference features.
1 code implementation • CVPR 2022 • Baisong Guo, Xiaoyun Zhang, HaoNing Wu, Yu Wang, Ya Zhang, Yan-Feng Wang
Previous super-resolution (SR) approaches often formulate SR as a regression problem with pixel-wise restoration, which leads to blurry and unrealistic SR output.
1 code implementation • NeurIPS 2021 • Xingyue Pu, Tianyue Cao, Xiaoyun Zhang, Xiaowen Dong, Siheng Chen
The model is trained in an end-to-end fashion with pairs of node data and graph samples.
no code implementations • ICCV 2021 • Tianyue Cao, Lianyu Du, Xiaoyun Zhang, Siheng Chen, Ya Zhang, Yan-Feng Wang
To handle overlapping category transfer, we propose a double-supervision mean teacher to gather common category information and bridge the domain gap between two datasets.
no code implementations • 6 Apr 2021 • Chen Ju, Peisen Zhao, Siheng Chen, Ya Zhang, Xiaoyun Zhang, Qi Tian
To solve this issue, we introduce an adaptive mutual supervision framework (AMS) with two branches, where the base branch adopts CAS to localize the most discriminative action regions, while the supplementary branch localizes the less discriminative action regions through a novel adaptive sampler.
Ranked #7 on Weakly Supervised Action Localization on THUMOS14
no code implementations • 13 Oct 2020 • Shixiang Feng, Beibei Liu, Ya Zhang, Xiaoyun Zhang, Yuehua Li
In this paper, we explore modeling the diagnosis of VCFs as a three-class classification problem, i.e., normal vertebrae, benign VCFs, and malignant VCFs.
no code implementations • ECCV 2020 • Guo Lu, Chunlei Cai, Xiaoyun Zhang, Li Chen, Wanli Ouyang, Dong Xu, Zhiyong Gao
Therefore, the encoder is adaptive to different video contents and achieves better compression performance by reducing the domain gap between the training and testing datasets.
no code implementations • CVPR 2020 • Xuan Liao, Wenhao Li, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Ya Zhang, Yan-Feng Wang
We here propose to model the dynamic process of iterative interactive image segmentation as a Markov decision process (MDP) and solve it with reinforcement learning (RL).
5 code implementations • CVPR 2019 • Wenbo Bao, Wei-Sheng Lai, Chao Ma, Xiaoyun Zhang, Zhiyong Gao, Ming-Hsuan Yang
The proposed model then warps the input frames, depth maps, and contextual features based on the optical flow and local interpolation kernels for synthesizing the output frame.
Ranked #5 on Video Frame Interpolation on Middlebury
4 code implementations • CVPR 2019 • Guo Lu, Wanli Ouyang, Dong Xu, Xiaoyun Zhang, Chunlei Cai, Zhiyong Gao
Conventional video compression approaches use the predictive coding architecture and encode the corresponding motion information and residual information.
1 code implementation • 20 Oct 2018 • Wenbo Bao, Wei-Sheng Lai, Xiaoyun Zhang, Zhiyong Gao, Ming-Hsuan Yang
Recently, a number of data-driven frame interpolation methods based on convolutional neural networks have been proposed.
Ranked #21 on Video Frame Interpolation on Vimeo90K
1 code implementation • arXiv 2018 • Wenbo Bao, Wei-Sheng Lai, Xiaoyun Zhang, Zhiyong Gao, Ming-Hsuan Yang
In this work, we propose a motion estimation and motion compensation driven neural network for video frame interpolation.
Ranked #6 on Video Frame Interpolation on Middlebury
1 code implementation • ECCV 2018 • Guo Lu, Wanli Ouyang, Dong Xu, Xiaoyun Zhang, Zhiyong Gao, Ming-Ting Sun
In this paper, we model the video artifact reduction task as a Kalman filtering procedure and restore decoded frames through a deep Kalman filtering network.
no code implementations • 10 May 2018 • Xiaoyi He, Qiang Hu, Xintong Han, Xiaoyun Zhang, Chongyang Zhang, Weiyao Lin
In this paper, we propose a partition-masked Convolutional Neural Network (CNN) to achieve compressed-video enhancement for the state-of-the-art coding standard, High Efficiency Video Coding (HEVC).