Search Results for author: Siyu Zhou

Found 18 papers, 5 papers with code

Mask$^2$DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

no code implementations • 25 Mar 2025 • Tianhao Qi, Jianlong Yuan, Wanquan Feng, Shancheng Fang, Jiawei Liu, Siyu Zhou, Qian He, Hongtao Xie, Yongdong Zhang

Both qualitative and quantitative experiments confirm that Mask$^2$DiT excels in maintaining visual consistency across segments while ensuring semantic alignment between each segment and its corresponding text description.

text annotation, Video Generation

MAO: Efficient Model-Agnostic Optimization of Prompt Tuning for Vision-Language Models

1 code implementation • 23 Mar 2025 • Haoyang Li, Siyu Zhou, Liang Wang, Guodong Long

Though CLIP-based prompt tuning significantly enhances pre-trained Vision-Language Models, existing research focuses on reconstructing the model architecture, e.g., additional loss calculation and meta-networks.

AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion

1 code implementation • 10 Mar 2025 • Mingzhen Sun, Weining Wang, Gen Li, Jiawei Liu, Jiahui Sun, Wanquan Feng, Shanshan Lao, Siyu Zhou, Qian He, Jing Liu

To address these issues, we introduce Auto-Regressive Diffusion (AR-Diffusion), a novel model that combines the strengths of auto-regressive and diffusion models for flexible, asynchronous video generation.

Video Generation

I2VControl: Disentangled and Unified Video Motion Synthesis Control

no code implementations • 26 Nov 2024 • Wanquan Feng, Tianhao Qi, Jiawei Liu, Mingzhen Sun, Pengqi Tu, Tianxiang Ma, Fei Dai, Songtao Zhao, Siyu Zhou, Qian He

Video synthesis techniques are undergoing rapid progress, with controllability being a significant aspect of practical usability for end-users.

Motion Synthesis

I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength

no code implementations • 10 Nov 2024 • Wanquan Feng, Jiawei Liu, Pengqi Tu, Tianhao Qi, Mingzhen Sun, Tianxiang Ma, Songtao Zhao, Siyu Zhou, Qian He

To accurately control and adjust the strength of subject motion, we explicitly model the higher-order components of the video trajectory expansion, not merely the linear terms, and design an operator that effectively represents the motion strength.

Video Generation

Global Censored Quantile Random Forest

no code implementations • 16 Oct 2024 • Siyu Zhou, Limin Peng

We demonstrate the superior predictive accuracy of the proposed method over a number of existing alternatives and illustrate the use of the proposed importance ranking measures on both simulated and real data.

Feature Importance, quantile regression, +1

Body Fat Estimation from Surface Meshes using Graph Neural Networks

no code implementations • 13 Jul 2023 • Tamara T. Mueller, Siyu Zhou, Sophie Starck, Friederike Jungmann, Alexander Ziller, Orhun Aksoy, Danylo Movchan, Rickmer Braren, Georgios Kaissis, Daniel Rueckert

Body fat volume and distribution can be a strong indication for a person's overall health and the risk for developing diseases like type 2 diabetes and cardiovascular diseases.

Autonomic Architecture for Big Data Performance Optimization

no code implementations • 17 Mar 2023 • Mikhail Genkin, Frank Dehne, Anousheh Shahmirza, Pablo Navarro, Siyu Zhou

This paper presents KERMIT - the autonomic architecture for big data capable of automatically tuning Apache Spark and Hadoop on-line, and achieving performance results 30% faster than rule-of-thumb tuning by a human administrator and up to 92% as fast as the fastest possible tuning established by performing an exhaustive search of the tuning parameter space.

Local Repair of Neural Networks Using Optimization

no code implementations • 28 Sep 2021 • Keyvan Majd, Siyu Zhou, Heni Ben Amor, Georgios Fainekos, Sriram Sankaranarayanan

In this paper, we propose a framework to repair a pre-trained feed-forward neural network (NN) to satisfy a set of properties.

Trees, Forests, Chickens, and Eggs: When and Why to Prune Trees in a Random Forest

1 code implementation • 30 Mar 2021 • Siyu Zhou, Lucas Mentch

Due to their long-standing reputation as excellent off-the-shelf predictors, random forests continue to remain a go-to model of choice for applied statisticians and data scientists.

Getting Better from Worse: Augmented Bagging and a Cautionary Tale of Variable Importance

no code implementations • 7 Mar 2020 • Lucas Mentch, Siyu Zhou

As the size, complexity, and availability of data continue to grow, scientists are increasingly relying upon black-box learning algorithms that can often provide accurate predictions with minimal a priori model specifications.

Clone Swarms: Learning to Predict and Control Multi-Robot Systems by Imitation

no code implementations • 5 Dec 2019 • Siyu Zhou, Mariano Phielipp, Jorge A. Sefair, Sara I. Walker, Heni Ben Amor

In this paper, we propose SwarmNet -- a neural network architecture that can learn to predict and imitate the behavior of an observed swarm of agents in a centralized manner.

Randomization as Regularization: A Degrees of Freedom Explanation for Random Forest Success

1 code implementation • 1 Nov 2019 • Lucas Mentch, Siyu Zhou

Random forests remain among the most popular off-the-shelf supervised machine learning tools with a well-established track record of predictive accuracy in both regression and classification settings.

regression

Unrestricted Permutation forces Extrapolation: Variable Importance Requires at least One More Model, or There Is No Free Variable Importance

1 code implementation • 1 May 2019 • Giles Hooker, Lucas Mentch, Siyu Zhou

This paper reviews and advocates against the use of permute-and-predict (PaP) methods for interpreting black box functions.

Personalized and Occupational-aware Age Progression by Generative Adversarial Networks

no code implementations • 26 Nov 2017 • Siyu Zhou, Weiqiang Zhao, Jiashi Feng, Hanjiang Lai, Yan Pan, Jian Yin, Shuicheng Yan

Second, we propose a new occupational-aware adversarial face aging network, which learns the human aging process under different occupations.

Human Aging

HashGAN: Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval

no code implementations • 26 Nov 2017 • Xi Zhang, Siyu Zhou, Jiashi Feng, Hanjiang Lai, Bo Li, Yan Pan, Jian Yin, Shuicheng Yan

The proposed new adversarial network, HashGAN, consists of three building blocks: 1) the feature learning module to obtain feature representations, 2) the generative attention module to generate an attention mask, which is used to obtain the attended (foreground) and the unattended (background) feature representations, 3) the discriminative hash coding module to learn hash functions that preserve the similarities between different modalities.

Cross-Modal Retrieval, Retrieval
