Search Results for author: Shilong Zhang

Found 18 papers, 17 papers with code

NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results

1 code implementation17 Apr 2025 Xin Li, Yeying Jin, Xin Jin, Zongwei Wu, Bingchen Li, YuFei Wang, Wenhan Yang, Yu Li, Zhibo Chen, Bihan Wen, Robby T. Tan, Radu Timofte, Qiyu Rong, Hongyuan Jing, Mengmeng Zhang, Jinglong Li, Xiangyu Lu, Yi Ren, YuTing Liu, Meng Zhang, Xiang Chen, Qiyuan Guan, Jiangxin Dong, Jinshan Pan, Conglin Gou, Qirui Yang, Fangpu Zhang, Yunlong Lin, Sixiang Chen, Guoxi Huang, Ruirui Lin, Yan Zhang, Jingyu Yang, Huanjing Yue, Jiyuan Chen, Qiaosi Yi, Hongjun Wang, Chenxi Xie, Shuai Li, Yuhui Wu, Kaiyi Ma, Jiakui Hu, Juncheng Li, Liwen Pan, Guangwei Gao, Wenjie Li, Zhenyu Jin, Heng Guo, Zhanyu Ma, YuBo Wang, Jinghua Wang, Wangzhi Xing, Anjusree Karnavar, Diqi Chen, Mohammad Aminul Islam, Hao Yang, Ruikun Zhang, Liyuan Pan, Qianhao Luo, XinCao, Han Zhou, Yan Min, Wei Dong, Jun Chen, Taoyi Wu, Weijia Dou, Yu Wang, Shengjie Zhao, Yongcheng Huang, Xingyu Han, Anyan Huang, Hongtao Wu, Hong Wang, Yefeng Zheng, Abhijeet Kumar, Aman Kumar, Marcos V. Conde, Paula Garrido, Daniel Feijoo, Juan C. Benito, Guanglu Dong, Xin Lin, Siyuan Liu, Tianheng Zheng, Jiayu Zhong, Shouyi Wang, Xiangtai Li, Lanqing Guo, Lu Qi, Chao Ren, Shuaibo Wang, Shilong Zhang, Wanyu Zhou, Yunze Wu, Qinzhong Tan, Jieyuan Pei, Zhuoxuan Li, Jiayu Wang, Haoyu Bian, Haoran Sun, Subhajit Paul, Ni Tang, Junhao Huang, Zihan Cheng, Hongyun Zhu, Yuehan Wu, Kaixin Deng, Hang Ouyang, Tianxin Xiao, Fan Yang, Zhizun Luo, Zeyu Xiao, Zhuoyuan Li, Nguyen Pham Hoang Le, An Dinh Thien, Son T. Luu, Kiet Van Nguyen, Ronghua Xu, Xianmin Tian, Weijian Zhou, Jiacheng Zhang, Yuqian Chen, Yihang Duan, Yujie Wu, Suresh Raikwar, Arsh Garg, Kritika, Jianhua Zheng, Xiaoshan Ma, Ruolin Zhao, Yongyu Yang, Yongsheng Liang, Guiming Huang, Qiang Li, Hongbin Zhang, Xiangyu Zheng, A. N. Rajagopalan

This paper reviews the NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images.

Raindrop Removal Rain Removal +1

PixelFlow: Pixel-Space Generative Models with Flow

1 code implementation10 Apr 2025 Shoufa Chen, Chongjian Ge, Shilong Zhang, Peize Sun, Ping Luo

We present PixelFlow, a family of image generation models that operate directly in the raw pixel space, in contrast to the predominant latent-space models.

Conditional Image Generation

Goku: Flow Based Video Generative Foundation Models

no code implementations7 Feb 2025 Shoufa Chen, Chongjian Ge, Yuqi Zhang, Yida Zhang, Fengda Zhu, Hao Yang, Hongxiang Hao, Hui Wu, Zhichao Lai, Yifei Hu, Ting-Che Lin, Shilong Zhang, Fu Li, Chuan Li, Xing Wang, Yanghua Peng, Peize Sun, Ping Luo, Yi Jiang, Zehuan Yuan, Bingyue Peng, Xiaobing Liu

This paper introduces Goku, a state-of-the-art family of joint image-and-video generation models leveraging rectified flow Transformers to achieve industry-leading performance.

Text-to-Image Generation Video Generation

Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM

1 code implementation19 Dec 2024 Yatai Ji, Jiacheng Zhang, Jie Wu, Shilong Zhang, Shoufa Chen, Chongjian Ge, Peize Sun, Weifeng Chen, Wenqi Shao, Xuefeng Xiao, Weilin Huang, Ping Luo

Text-to-video models have made remarkable advancements through optimization on high-quality text-video pairs, where the textual prompts play a pivotal role in determining quality of output videos.

Video Generation

Zero-shot Image Editing with Reference Imitation

1 code implementation11 Jun 2024 Xi Chen, Yutong Feng, Mengting Chen, Yiyang Wang, Shilong Zhang, Yu Liu, Yujun Shen, Hengshuang Zhao

Image editing serves as a practical yet challenging task considering the diverse demands from users, where one of the hardest parts is to precisely describe how the edited image should look like.

Semantic correspondence

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

2 code implementations10 Jun 2024 Peize Sun, Yi Jiang, Shoufa Chen, Shilong Zhang, Bingyue Peng, Ping Luo, Zehuan Yuan

(3) A text-conditional image generation model with 775M parameters, from two-stage training on LAION-COCO and high aesthetics quality images, demonstrating competitive performance of visual quality and text alignment.

Conditional Image Generation

FlashFace: Human Image Personalization with High-fidelity Identity Preservation

1 code implementation25 Mar 2024 Shilong Zhang, Lianghua Huang, Xi Chen, Yifei Zhang, Zhi-Fan Wu, Yutong Feng, Wei Wang, Yujun Shen, Yu Liu, Ping Luo

This work presents FlashFace, a practical tool with which users can easily personalize their own photos on the fly by providing one or a few reference face images and a text prompt.

Face Swapping Instruction Following +1

MultiModal-GPT: A Vision and Language Model for Dialogue with Humans

1 code implementation8 May 2023 Tao Gong, Chengqi Lyu, Shilong Zhang, Yudong Wang, Miao Zheng, Qian Zhao, Kuikun Liu, Wenwei Zhang, Ping Luo, Kai Chen

To further enhance the ability to chat with humans of the MultiModal-GPT, we utilize language-only instruction-following data to train the MultiModal-GPT jointly.

Instruction Following Language Modeling +1

RTMDet: An Empirical Study of Designing Real-Time Object Detectors

12 code implementations14 Dec 2022 Chengqi Lyu, Wenwei Zhang, Haian Huang, Yue Zhou, Yudong Wang, Yanyi Liu, Shilong Zhang, Kai Chen

In this paper, we aim to design an efficient real-time object detector that exceeds the YOLO series and is easily extensible for many object recognition tasks such as instance segmentation and rotated object detection.

Object object-detection +7

What Are Expected Queries in End-to-End Object Detection?

1 code implementation2 Jun 2022 Shilong Zhang, Xinjiang Wang, Jiaqi Wang, Jiangmiao Pang, Kai Chen

As both sparse and dense queries are imperfect, then \emph{what are expected queries in end-to-end object detection}?

Instance Segmentation object-detection +2

Group R-CNN for Weakly Semi-supervised Object Detection with Points

1 code implementation CVPR 2022 Shilong Zhang, Zhuoran Yu, Liyang Liu, Xinjiang Wang, Aojun Zhou, Kai Chen

The core of this task is to train a point-to-box regressor on well-labeled images that can be used to predict credible bounding boxes for each point annotation.

Object Detection Representation Learning +1

Scale-Equalizing Pyramid Convolution for Object Detection

2 code implementations CVPR 2020 Xinjiang Wang, Shilong Zhang, Zhuoran Yu, Litong Feng, Wayne Zhang

Inspired by this, a convolution across the pyramid level is proposed in this study, which is termed pyramid convolution and is a modified 3-D convolution.

Object object-detection +1

Cannot find the paper you are looking for? You can Submit a new open access paper.