no code implementations • 16 Dec 2024 • Hao Li, Shamit Lal, Zhiheng Li, Yusheng Xie, Ying Wang, Yang Zou, Orchid Majumder, R. Manmatha, Zhuowen Tu, Stefano Ermon, Stefano Soatto, Ashwin Swaminathan
We empirically study the scaling properties of various Diffusion Transformers (DiTs) for text-to-image generation by performing extensive and rigorous ablations, including training scaled DiTs ranging from 0.3B up to 8B parameters on datasets up to 600M images.
no code implementations • 6 Dec 2024 • Songcheng Du, Yang Zou, Zixu Wang, Xingyuan Li, Ying Li, Qiang Shen
Hyperspectral Image Fusion (HIF) aims to fuse low-resolution hyperspectral images (LR-HSIs) and high-resolution multispectral images (HR-MSIs) to reconstruct high spatial and high spectral resolution images.
1 code implementation • 19 Nov 2024 • Yang Zou, Zhixin Chen, Zhipeng Zhang, Xingyuan Li, Long Ma, JinYuan Liu, Peng Wang, Yanning Zhang
In this work, we emphasize the infrared spectral distribution fidelity and propose a Contourlet refinement gate framework to restore infrared modal-specific features while preserving spectral distribution fidelity.
no code implementations • 20 Jun 2024 • Xiangli Yang, Xijie Deng, Hanwei Zhang, Yang Zou, Jianxi Yang
Structural health monitoring (SHM) is critical to safeguarding the safety and reliability of aerospace, civil, and mechanical infrastructure.
no code implementations • 12 Jun 2024 • Benjamin Biggs, Arjun Seshadri, Yang Zou, Achin Jain, Aditya Golatkar, Yusheng Xie, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto
We present Diffusion Soup, a compartmentalization method for Text-to-Image Generation that averages the weights of diffusion models trained on sharded data.
no code implementations • CVPR 2024 • Hao Li, Yang Zou, Ying Wang, Orchid Majumder, Yusheng Xie, R. Manmatha, Ashwin Swaminathan, Zhuowen Tu, Stefano Ermon, Stefano Soatto
On the data scaling side, we show that the quality and diversity of the training set matter more than dataset size alone.
no code implementations • CVPR 2024 • Robik Shrestha, Yang Zou, Qiuyu Chen, Zhiheng Li, Yusheng Xie, Siqi Deng
In this work, we introduce Fair Retrieval Augmented Generation (FairRAG), a novel framework that conditions pre-trained generative models on reference images retrieved from an external image database to improve fairness in human generation.
no code implementations • 31 Dec 2023 • Xingyuan Li, Yang Zou, JinYuan Liu, Zhiying Jiang, Long Ma, Xin Fan, Risheng Liu
With the rapid progression of deep learning technologies, multi-modality image fusion has become increasingly prevalent in object detection tasks.
4 code implementations • CVPR 2023 • Jongheon Jeong, Yang Zou, Taewan Kim, Dongqing Zhang, Avinash Ravichandran, Onkar Dabeer
Visual anomaly classification and segmentation are vital for automating industrial quality inspection.
Ranked #1 on zero-shot anomaly detection on MVTec AD
1 code implementation • 28 Jul 2022 • Yang Zou, Jongheon Jeong, Latha Pemula, Dongqing Zhang, Onkar Dabeer
Visual anomaly detection is commonly used in industrial quality inspection.
Ranked #27 on Anomaly Detection on VisA
no code implementations • 10 Aug 2021 • Ashild Kummen, Guanlin Li, Ali Hassan, Teodora Ganeva, Qianying Lu, Robert Shaw, Chenuka Ratwatte, Yang Zou, Lu Han, Emil Almazov, Sheena Visram, Andrew Taylor, Neil J Sebire, Lee Stott, Yvonne Rogers, Graham Roberts, Dean Mohamedally
We also introduce a series of bespoke gesture recognition classifications as DirectInput triggers, including gestures for idle states, auto calibration, depth capture from a 2D RGB webcam stream and tracking of facial motions such as mouth motions, winking, and head direction with rotation.
1 code implementation • NeurIPS 2020 • Zeyi Huang, Yang Zou, Vijayakumar Bhagavatula, Dong Huang
Moreover, the image-level category labels do not enforce consistent object detection across different transformations of the same images.
Ranked #1 on Weakly Supervised Object Detection on MSCOCO
no code implementations • 10 Sep 2020 • Yang Zou, Zhikun Zhang, Michael Backes, Yang Zhang
One major privacy attack in this domain is membership inference, where an adversary aims to determine whether a target data sample is part of the training set of a target ML model.
1 code implementation • 8 Aug 2020 • Yunlong Zhang, Changxing Jing, Huangxing Lin, Chaoqi Chen, Yue Huang, Xinghao Ding, Yang Zou
Second, we further consider that the predictions of target samples belonging to the hard class are vulnerable to perturbations.
Semi-supervised Domain Adaptation
Unsupervised Domain Adaptation
1 code implementation • ECCV 2020 • Yang Zou, Xiaodong Yang, Zhiding Yu, B. V. K. Vijaya Kumar, Jan Kautz
To this end, we propose a joint learning framework that disentangles id-related/unrelated features and enforces adaptation to work on the id-related feature space exclusively.
Ranked #7 on Unsupervised Domain Adaptation on Market to MSMT
no code implementations • ICCV 2019 • Xiaofeng Liu, Yang Zou, Tong Che, Peng Ding, Ping Jia, Jane You, Kumar B. V. K
We propose to incorporate inter-class correlations in a Wasserstein training framework by pre-defining (i.e., using arc length of a circle) or adaptively learning the ground metric.
no code implementations • 23 Oct 2019 • Azeez Oluwafemi, Yang Zou, B. V. K. Vijaya Kumar
Monocular depth estimation is usually treated as a supervised regression problem, although it is very similar to the semantic segmentation task, since both are fundamentally pixel-level classification tasks.
2 code implementations • ICCV 2019 • Yang Zou, Zhiding Yu, Xiaofeng Liu, B. V. K. Vijaya Kumar, Jinsong Wang
Recent advances in domain adaptation show that deep self-training presents a powerful means for unsupervised domain adaptation.
Ranked #18 on Image-to-Image Translation on SYNTHIA-to-Cityscapes
1 code implementation • 25 Jul 2019 • Ligong Han, Yang Zou, Ruijiang Gao, Lezi Wang, Dimitris Metaxas
Unsupervised domain adaptation (UDA) aims at inferring class labels for an unlabeled target domain given a related labeled source dataset.
1 code implementation • 18 Oct 2018 • Yang Zou, Zhiding Yu, B. V. K. Vijaya Kumar, Jinsong Wang
In this paper, we propose a novel UDA framework based on an iterative self-training procedure, where the problem is formulated as latent variable loss minimization and can be solved by alternately generating pseudo labels on target data and re-training the model with these labels.
1 code implementation • ECCV 2018 • Yang Zou, Zhiding Yu, B. V. K. Vijaya Kumar, Jinsong Wang
Recent deep networks achieved state-of-the-art performance on a variety of semantic segmentation tasks.
3 code implementations • ECCV 2018 • Zhiding Yu, Weiyang Liu, Yang Zou, Chen Feng, Srikumar Ramalingam, B. V. K. Vijaya Kumar, Jan Kautz
Edge detection is among the most fundamental vision problems for its role in perceptual grouping and its wide applications.
no code implementations • 20 Feb 2018 • Yang Gao, Shouyan Guo, Kaimin Huang, Jiaxin Chen, Qian Gong, Yang Zou, Tong Bai, Gary Overett
By selecting better scales in the region proposal input and by combining feature maps through careful design of the convolutional neural network, we improve performance on smaller objects.
no code implementations • CVPR 2016 • Soheil Kolouri, Yang Zou, Gustavo K. Rohde
Optimal transport distances, otherwise known as Wasserstein distances, have recently drawn ample attention in computer vision and machine learning as a powerful discrepancy measure for probability distributions.