Search Results for author: Daoan Zhang

Found 16 papers, 6 papers with code

FINEMATCH: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction

no code implementations • 23 Apr 2024 • Hang Hua, Jing Shi, Kushal Kafle, Simon Jenni, Daoan Zhang, John Collomosse, Scott Cohen, Jiebo Luo

To address this, we propose FineMatch, a new aspect-based fine-grained text and image matching benchmark, focusing on text and image mismatch detection and correction.

Hallucination In-Context Learning +2

NTIRE 2024 Challenge on Image Super-Resolution ($\times$4): Methods and Results

1 code implementation • 15 Apr 2024 • Zheng Chen, Zongwei Wu, Eduard Zamfir, Kai Zhang, Yulun Zhang, Radu Timofte, Xiaokang Yang, Hongyuan Yu, Cheng Wan, Yuxin Hong, Zhijuan Huang, Yajun Zou, Yuan Huang, Jiamin Lin, Bingnan Han, Xianyu Guan, Yongsheng Yu, Daoan Zhang, Xuanwu Yin, Kunlong Zuo, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou, Hongyu An, Xinfeng Zhang, Zhiyuan Song, Ziyue Dong, Qing Zhao, Xiaogang Xu, Pengxu Wei, Zhi-chao Dou, Gui-ling Wang, Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou, Cansu Korkmaz, A. Murat Tekalp, Yubin Wei, Xiaole Yan, Binren Li, Haonan Chen, Siqi Zhang, Sihan Chen, Amogh Joshi, Nikhil Akalwadi, Sampada Malagi, Palani Yashaswini, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi, Anjali Sarvaiya, Pooja Choksy, Jagrit Joshi, Shubh Kawa, Kishor Upla, Sushrut Patwardhan, Raghavendra Ramachandra, Sadat Hossain, Geongi Park, S. M. Nadim Uddin, Hao Xu, Yanhui Guo, Aman Urumbekov, Xingzhuo Yan, Wei Hao, Minghan Fu, Isaac Orais, Samuel Smith, Ying Liu, Wangwang Jia, Qisheng Xu, Kele Xu, Weijun Yuan, Zhan Li, Wenqin Kuang, Ruijin Guan, Ruting Deng, Zhao Zhang, Bo wang, Suiyi Zhao, Yan Luo, Yanyan Wei, Asif Hussain Khan, Christian Micheloni, Niki Martinel

This paper reviews the NTIRE 2024 challenge on image super-resolution ($\times$4), highlighting the solutions proposed and the outcomes obtained.

Image Super-Resolution valid

Emo-Avatar: Efficient Monocular Video Style Avatar through Texture Rendering

no code implementations • 1 Feb 2024 • Pinxin Liu, Luchuan Song, Daoan Zhang, Hang Hua, Yunlong Tang, Huaijin Tu, Jiebo Luo, Chenliang Xu

To address the above problems, we propose the Efficient Monocular Video Style Avatar (Emo-Avatar) through deferred neural rendering that enhances StyleGAN's capacity for producing dynamic, drivable portrait videos.

Contrastive Learning Neural Rendering

CoCoT: Contrastive Chain-of-Thought Prompting for Large Multimodal Models with Multiple Image Inputs

no code implementations • 5 Jan 2024 • Daoan Zhang, Junming Yang, Hanjia Lyu, Zijian Jin, Yuan YAO, Mingkai Chen, Jiebo Luo

When exploring the development of Artificial General Intelligence (AGI), a critical task for these models involves interpreting and processing information from multiple image inputs.

Ranked #3 on Visual Reasoning on Winoground

Image Comprehension Text Matching +1

Video Understanding with Large Language Models: A Survey

1 code implementation • 29 Dec 2023 • Yunlong Tang, Jing Bi, Siting Xu, Luchuan Song, Susan Liang, Teng Wang, Daoan Zhang, Jie An, Jingyang Lin, Rongyi Zhu, Ali Vosoughi, Chao Huang, Zeliang Zhang, Feng Zheng, JianGuo Zhang, Ping Luo, Jiebo Luo, Chenliang Xu

With the burgeoning growth of online video platforms and the escalating volume of video content, the demand for proficient video understanding tools has intensified markedly.

Video Understanding

Semi-supervised Semantic Segmentation via Boosting Uncertainty on Unlabeled Data

no code implementations • 30 Nov 2023 • Daoan Zhang, Yunhao Luo, JianGuo Zhang

We first observe that the distribution gap between labeled and unlabeled datasets cannot be ignored, even though the two datasets are sampled from the same distribution.

Segmentation Semi-Supervised Semantic Segmentation

GPT-4V(ision) as A Social Media Analysis Engine

1 code implementation • 13 Nov 2023 • Hanjia Lyu, Jinfa Huang, Daoan Zhang, Yongsheng Yu, Xinyi Mou, Jinsheng Pan, Zhengyuan Yang, Zhongyu Wei, Jiebo Luo

Our investigation begins with a preliminary quantitative analysis for each task using existing benchmark datasets, followed by a careful review of the results and a selection of qualitative samples that illustrate GPT-4V's potential in understanding multimodal social media content.

Hallucination Hate Speech Detection +1

Cross Contrasting Feature Perturbation for Domain Generalization

2 code implementations • ICCV 2023 • Chenming Li, Daoan Zhang, Wenjian Huang, JianGuo Zhang

Domain generalization (DG) aims to learn a robust model from source domains that generalizes well to unseen target domains.

Domain Generalization

DNAGPT: A Generalized Pre-trained Tool for Versatile DNA Sequence Analysis Tasks

no code implementations • 11 Jul 2023 • Daoan Zhang, Weitong Zhang, Yu Zhao, JianGuo Zhang, Bing He, Chenchen Qin, Jianhua Yao

Pre-trained large language models demonstrate potential in extracting information from DNA sequences, yet adapting to a variety of tasks and data modalities remains a challenge.

Binary Classification DNA analysis +1

Black-box Source-free Domain Adaptation via Two-stage Knowledge Distillation

no code implementations • 13 May 2023 • Shuai Wang, Daoan Zhang, Zipei Yan, Shitong Shao, Rui Li

In Stage I, we train the target model from scratch with soft pseudo-labels generated by the source model in a knowledge distillation manner.

Knowledge Distillation Source-Free Domain Adaptation +1

Towards Generalizable Medical Image Segmentation with Pixel-wise Uncertainty Estimation

no code implementations • 13 May 2023 • Shuai Wang, Zipei Yan, Daoan Zhang, Zhongsen Li, Sirui Wu, Wenxuan Chen, Rui Li

In contrast, the IID hypothesis is not universally guaranteed in numerous real-world applications, especially in medical image analysis.

Image Segmentation Medical Image Segmentation +1

Feature Alignment and Uniformity for Test Time Adaptation

1 code implementation • CVPR 2023 • Shuai Wang, Daoan Zhang, Zipei Yan, JianGuo Zhang, Rui Li

Test-time adaptation (TTA) aims to adapt deep neural networks when receiving out-of-distribution test domain samples.

Domain Generalization Image Segmentation +3

Prototype Knowledge Distillation for Medical Segmentation with Missing Modality

1 code implementation • 17 Mar 2023 • Shuai Wang, Zipei Yan, Daoan Zhang, Haining Wei, Zhongsen Li, Rui Li

Specifically, our ProtoKD not only distills the pixel-wise knowledge of multi-modality data to single-modality data but also transfers intra-class and inter-class feature variations, so that the student model can learn more robust feature representations from the teacher model and perform inference with only a single modality.

Image Segmentation Knowledge Distillation +3

Bootstrap The Original Latent: Learning a Private Model from a Black-box Model

no code implementations • 7 Mar 2023 • Shuai Wang, Daoan Zhang, JianGuo Zhang, Weiwei Zhang, Rui Li

In this paper, considering the balance of data/model privacy of model owners and user needs, we propose a new setting called Back-Propagated Black-Box Adaptation (BPBA) for users to better train their private models via the guidance of the back-propagated results of a Black-box foundation/source model.

Aggregation of Disentanglement: Reconsidering Domain Variations in Domain Generalization

no code implementations • 5 Feb 2023 • Daoan Zhang, Mingkai Chen, Chenming Li, Lingyun Huang, JianGuo Zhang

Rather than learning domain-invariant features from source domains, we decouple the input images into Domain Expert Features and noise.

Contrastive Learning Disentanglement +1

Rethinking Alignment and Uniformity in Unsupervised Image Semantic Segmentation

no code implementations • 26 Nov 2022 • Daoan Zhang, Chenming Li, Haoquan Li, Wenjian Huang, Lingyun Huang, JianGuo Zhang

Experimental results on multiple semantic segmentation benchmarks show that our unsupervised segmentation framework specializes in capturing semantic representations and outperforms all the unpretrained and even several pretrained methods.

Ranked #1 on Unsupervised Semantic Segmentation on COCO-Stuff-3

Representation Learning Segmentation +2
