Search Results for author: Yang Zou

Found 24 papers, 11 papers with code

Efficient Scaling of Diffusion Transformers for Text-to-Image Generation

no code implementations 16 Dec 2024 Hao Li, Shamit Lal, Zhiheng Li, Yusheng Xie, Ying Wang, Yang Zou, Orchid Majumder, R. Manmatha, Zhuowen Tu, Stefano Ermon, Stefano Soatto, Ashwin Swaminathan

We empirically study the scaling properties of various Diffusion Transformers (DiTs) for text-to-image generation by performing extensive and rigorous ablations, including training scaled DiTs ranging from 0.3B up to 8B parameters on datasets up to 600M images.

Text-to-Image Generation

Modality Decoupling is All You Need: A Simple Solution for Unsupervised Hyperspectral Image Fusion

no code implementations 6 Dec 2024 Songcheng Du, Yang Zou, Zixu Wang, Xingyuan Li, Ying Li, Qiang Shen

Hyperspectral Image Fusion (HIF) aims to fuse low-resolution hyperspectral images (LR-HSIs) and high-resolution multispectral images (HR-MSIs) to reconstruct high spatial and high spectral resolution images.

valid

Contourlet Refinement Gate Framework for Thermal Spectrum Distribution Regularized Infrared Image Super-Resolution

1 code implementation 19 Nov 2024 Yang Zou, Zhixin Chen, Zhipeng Zhang, Xingyuan Li, Long Ma, JinYuan Liu, Peng Wang, Yanning Zhang

In this work, we emphasize the infrared spectral distribution fidelity and propose a Contourlet refinement gate framework to restore infrared modal-specific features while preserving spectral distribution fidelity.

Image Enhancement Image Super-Resolution +1

Enhancing robustness of data-driven SHM models: adversarial training with circle loss

no code implementations 20 Jun 2024 Xiangli Yang, Xijie Deng, Hanwei Zhang, Yang Zou, Jianxi Yang

Structural health monitoring (SHM) is critical to safeguarding the safety and reliability of aerospace, civil, and mechanical infrastructure.

Structural Health Monitoring

Diffusion Soup: Model Merging for Text-to-Image Diffusion Models

no code implementations 12 Jun 2024 Benjamin Biggs, Arjun Seshadri, Yang Zou, Achin Jain, Aditya Golatkar, Yusheng Xie, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto

We present Diffusion Soup, a compartmentalization method for Text-to-Image Generation that averages the weights of diffusion models trained on sharded data.

Continual Learning Memorization +1

FairRAG: Fair Human Generation via Fair Retrieval Augmentation

no code implementations CVPR 2024 Robik Shrestha, Yang Zou, Qiuyu Chen, Zhiheng Li, Yusheng Xie, Siqi Deng

In this work, we introduce Fair Retrieval Augmented Generation (FairRAG), a novel framework that conditions pre-trained generative models on reference images retrieved from an external image database to improve fairness in human generation.

Diversity Fairness +2

From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion

no code implementations 31 Dec 2023 Xingyuan Li, Yang Zou, JinYuan Liu, Zhiying Jiang, Long Ma, Xin Fan, Risheng Liu

With the rapid progression of deep learning technologies, multi-modality image fusion has become increasingly prevalent in object detection tasks.

Bilevel Optimization Infrared And Visible Image Fusion +2

MotionInput v2.0 supporting DirectX: A modular library of open-source gesture-based machine learning and computer vision methods for interacting and controlling existing software with a webcam

no code implementations 10 Aug 2021 Ashild Kummen, Guanlin Li, Ali Hassan, Teodora Ganeva, Qianying Lu, Robert Shaw, Chenuka Ratwatte, Yang Zou, Lu Han, Emil Almazov, Sheena Visram, Andrew Taylor, Neil J Sebire, Lee Stott, Yvonne Rogers, Graham Roberts, Dean Mohamedally

We also introduce a series of bespoke gesture recognition classifications as DirectInput triggers, including gestures for idle states, auto calibration, depth capture from a 2D RGB webcam stream and tracking of facial motions such as mouth motions, winking, and head direction with rotation.

Gesture Recognition

Privacy Analysis of Deep Learning in the Wild: Membership Inference Attacks against Transfer Learning

no code implementations 10 Sep 2020 Yang Zou, Zhikun Zhang, Michael Backes, Yang Zhang

One major privacy attack in this domain is membership inference, where an adversary aims to determine whether a target data sample is part of the training set of a target ML model.

BIG-bench Machine Learning Transfer Learning

Hard Class Rectification for Domain Adaptation

1 code implementation 8 Aug 2020 Yunlong Zhang, Changxing Jing, Huangxing Lin, Chaoqi Chen, Yue Huang, Xinghao Ding, Yang Zou

Second, we further consider that the predictions of target samples belonging to the hard class are vulnerable to perturbations.

Semi-supervised Domain Adaptation Unsupervised Domain Adaptation

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification

1 code implementation ECCV 2020 Yang Zou, Xiaodong Yang, Zhiding Yu, B. V. K. Vijaya Kumar, Jan Kautz

To this end, we propose a joint learning framework that disentangles id-related/unrelated features and enforces adaptation to work on the id-related feature space exclusively.

Person Re-Identification Unsupervised Domain Adaptation

Conservative Wasserstein Training for Pose Estimation

no code implementations ICCV 2019 Xiaofeng Liu, Yang Zou, Tong Che, Peng Ding, Ping Jia, Jane You, Kumar B. V. K

We propose to incorporate inter-class correlations in a Wasserstein training framework by pre-defining (i.e., using arc length of a circle) or adaptively learning the ground metric.

ARC Pose Estimation

Deep Classification Network for Monocular Depth Estimation

no code implementations 23 Oct 2019 Azeez Oluwafemi, Yang Zou, B. V. K. Vijaya Kumar

Monocular Depth Estimation is usually treated as a supervised regression problem, although it is very similar to the semantic segmentation task, since both are fundamentally pixel-level classification tasks.

Classification General Classification +3

Unsupervised Domain Adaptation via Calibrating Uncertainties

1 code implementation 25 Jul 2019 Ligong Han, Yang Zou, Ruijiang Gao, Lezi Wang, Dimitris Metaxas

Unsupervised domain adaptation (UDA) aims at inferring class labels for an unlabeled target domain given a related labeled source dataset.

Unsupervised Domain Adaptation

Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training

1 code implementation 18 Oct 2018 Yang Zou, Zhiding Yu, B. V. K. Vijaya Kumar, Jinsong Wang

In this paper, we propose a novel UDA framework based on an iterative self-training procedure, where the problem is formulated as latent variable loss minimization and can be solved by alternately generating pseudo labels on target data and re-training the model with these labels.

Pseudo Label Semantic Segmentation +2

Simultaneous Edge Alignment and Learning

3 code implementations ECCV 2018 Zhiding Yu, Weiyang Liu, Yang Zou, Chen Feng, Srikumar Ramalingam, B. V. K. Vijaya Kumar, Jan Kautz

Edge detection is among the most fundamental vision problems for its role in perceptual grouping and its wide applications.

Edge Detection Representation Learning

Scale Optimization for Full-Image-CNN Vehicle Detection

no code implementations 20 Feb 2018 Yang Gao, Shouyan Guo, Kaimin Huang, Jiaxin Chen, Qian Gong, Yang Zou, Tong Bai, Gary Overett

By selecting better scales in the region proposal input and by combining feature maps through careful design of the convolutional neural network, we improve performance on smaller objects.

Object object-detection +2

Sliced Wasserstein Kernels for Probability Distributions

no code implementations CVPR 2016 Soheil Kolouri, Yang Zou, Gustavo K. Rohde

Optimal transport distances, otherwise known as Wasserstein distances, have recently drawn ample attention in computer vision and machine learning as a powerful discrepancy measure for probability distributions.

BIG-bench Machine Learning
