no code implementations • 4 Dec 2024 • Shuai Tan, Biao Gong, Yutong Feng, Kecheng Zheng, Dandan Zheng, Shuwei Shi, Yujun Shen, Jingdong Chen, Ming Yang
Text serves as the key control signal in video generation due to its narrative nature.
1 code implementation • 31 Oct 2024 • Lianghua Huang, Wei Wang, Zhi-Fan Wu, Yupeng Shi, Huanzhang Dou, Chen Liang, Yutong Feng, Yu Liu, Jingren Zhou
While task-specific in terms of tuning data, our framework remains task-agnostic in architecture and pipeline, offering a powerful tool for the community and providing valuable insights for further research on product-level task-agnostic generation systems.
no code implementations • 19 Oct 2024 • Lianghua Huang, Wei Wang, Zhi-Fan Wu, Huanzhang Dou, Yupeng Shi, Yutong Feng, Chen Liang, Yu Liu, Jingren Zhou
In this work, we introduce Group Diffusion Transformers (GDTs), a novel framework that unifies diverse visual generation tasks by redefining them as a group generation problem.
no code implementations • 17 Oct 2024 • Yujie Wei, Shiwei Zhang, Hangjie Yuan, Xiang Wang, Haonan Qiu, Rui Zhao, Yutong Feng, Feng Liu, Zhizhong Huang, Jiaxin Ye, Yingya Zhang, Hongming Shan
In this paper, we present DreamVideo-2, a zero-shot video customization framework capable of generating videos with a specific subject and motion trajectory, guided by a single image and a bounding box sequence, respectively, and without the need for test-time fine-tuning.
no code implementations • 11 Jun 2024 • Xi Chen, Yutong Feng, Mengting Chen, Yiyang Wang, Shilong Zhang, Yu Liu, Yujun Shen, Hengshuang Zhao
Image editing serves as a practical yet challenging task considering the diverse demands from users, where one of the hardest parts is to precisely describe how the edited image should look like.
no code implementations • 25 Mar 2024 • Shilong Zhang, Lianghua Huang, Xi Chen, Yifei Zhang, Zhi-Fan Wu, Yutong Feng, Wei Wang, Yujun Shen, Yu Liu, Ping Luo
This work presents FlashFace, a practical tool with which users can easily personalize their own photos on the fly by providing one or a few reference face images and a text prompt.
no code implementations • 2 Mar 2024 • Yutong Feng, Qiongyan Wang, Yutong Xia, Junlin Huang, Siru Zhong, Yuxuan Liang
The air quality inference problem aims to utilize historical data from a limited number of observation sites to infer the air quality index at an unknown location.
no code implementations • CVPR 2024 • Biao Gong, Siteng Huang, Yutong Feng, Shiwei Zhang, Yuyuan Li, Yu Liu
To align the generated image with layout instructions we present a training-free layout calibration system SimM that intervenes in the generative process on the fly during inference time.
no code implementations • 5 Dec 2023 • Xi Chen, Zhiheng Liu, Mengting Chen, Yutong Feng, Yu Liu, Yujun Shen, Hengshuang Zhao
In particular, considering the facts that (1) text can only describe motions roughly (e. g., regardless of the moving speed) and (2) text may include both content and motion descriptions, we introduce a motion intensity estimation module as well as a text re-weighting module to reduce the ambiguity of text-to-motion mapping.
1 code implementation • CVPR 2024 • Yutong Feng, Biao Gong, Di Chen, Yujun Shen, Yu Liu, Jingren Zhou
Existing text-to-image (T2I) diffusion models usually struggle in interpreting complex prompts, especially those with quantity, object-attribute binding, and multi-subject descriptions.
no code implementations • 27 Nov 2023 • Biao Gong, Siteng Huang, Yutong Feng, Shiwei Zhang, Yuyuan Li, Yu Liu
To align the generated image with layout instructions, we present a training-free layout calibration system SimM that intervenes in the generative process on the fly during inference time.
no code implementations • CVPR 2024 • Siteng Huang, Biao Gong, Yutong Feng, Xi Chen, Yuqian Fu, Yu Liu, Donglin Wang
Experimental results show that existing subject-driven customization methods fail to learn the representative characteristics of actions and struggle in decoupling actions from context features, including appearance.
no code implementations • 17 Apr 2023 • Bing Luo, Yutong Feng, Shiqiang Wang, Jianwei Huang, Leandros Tassiulas
Incentive mechanism is crucial for federated learning (FL) when rational clients do not have the same interests in the global model as the server.
1 code implementation • CVPR 2024 • Siteng Huang, Biao Gong, Yutong Feng, Min Zhang, Yiliang Lv, Donglin Wang
Recent compositional zero-shot learning (CZSL) methods adapt pre-trained vision-language models (VLMs) by constructing trainable prompts only for composed state-object pairs.
no code implementations • ICCV 2023 • Yutong Feng, Biao Gong, Jianwen Jiang, Yiliang Lv, Yujun Shen, Deli Zhao, Jingren Zhou
ViM consists of a zoo of lightweight plug-in modules, each of which is independently learned on a midstream dataset with a shared frozen backbone.
no code implementations • 14 Feb 2023 • Biao Gong, Shuai Tan, Yutong Feng, Xiaoying Xie, Yuyuan Li, Chaochao Chen, Kecheng Zheng, Yujun Shen, Deli Zhao
This work presents a unified knowledge protocol, called UKnow, which facilitates knowledge-based studies from the perspective of data.
no code implementations • 9 Oct 2022 • Xinwei Zhang, Jianwen Jiang, Yutong Feng, Zhi-Fan Wu, Xibin Zhao, Hai Wan, Mingqian Tang, Rong Jin, Yue Gao
Although a number of studies are devoted to novel category discovery, most of them assume a static setting where both labeled and unlabeled data are given at once for finding new categories.
1 code implementation • 9 Jun 2022 • Si Shen, Jiangfeng Liu, Litao Lin, Ying Huang, Lin Zhang, Chang Liu, Yutong Feng, Dongbo Wang
The academic literature of social sciences records human civilization and studies human social problems.
no code implementations • ICLR 2022 • Yutong Feng, Jianwen Jiang, Mingqian Tang, Rong Jin, Yue Gao
Though for most cases, the pre-training stage is conducted based on supervised methods, recent works on self-supervised pre-training have shown powerful transferability and even outperform supervised pre-training on multiple downstream tasks.
no code implementations • 24 Jun 2021 • Zhiwu Qing, Xiang Wang, Ziyuan Huang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Changxin Gao, Nong Sang
Temporal action localization aims to localize starting and ending time with action category.
no code implementations • 20 Jun 2021 • Xiang Wang, Zhiwu Qing, Ziyuan Huang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Yuanjie Shao, Nong Sang
Then our proposed Local-Global Background Modeling Network (LGBM-Net) is trained to localize instances by using only video-level labels based on Multi-Instance Learning (MIL).
Weakly-supervised Learning Weakly-supervised Temporal Action Localization +1
1 code implementation • 20 Jun 2021 • Xiang Wang, Zhiwu Qing, Ziyuan Huang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Changxin Gao, Nong Sang
We calculate the detection results by assigning the proposals with corresponding classification results.
Ranked #3 on Temporal Action Localization on ActivityNet-1.3 (using extra training data)
no code implementations • 15 Jun 2021 • Yutong Feng, Jianwen Jiang, Ziyuan Huang, Zhiwu Qing, Xiang Wang, Shiwei Zhang, Mingqian Tang, Yue Gao
This paper presents our solution to the AVA-Kinetics Crossover Challenge of ActivityNet workshop at CVPR 2021.
Ranked #4 on Spatio-Temporal Action Localization on AVA-Kinetics (using extra training data)
1 code implementation • 13 Jun 2021 • Zhiwu Qing, Ziyuan Huang, Xiang Wang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Changxin Gao, Marcelo H. Ang Jr, Nong Sang
This technical report analyzes an egocentric video action detection method we used in the 2021 EPIC-KITCHENS-100 competition hosted in CVPR2021 Workshop.
1 code implementation • 9 Jun 2021 • Ziyuan Huang, Zhiwu Qing, Xiang Wang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Zhurong Xia, Mingqian Tang, Nong Sang, Marcelo H. Ang Jr
In this paper, we present empirical results for training a stronger video vision transformer on the EPIC-KITCHENS-100 Action Recognition dataset.
no code implementations • CVPR 2021 • Xuancheng Zhang, Yutong Feng, Siqi Li, Changqing Zou, Hai Wan, Xibin Zhao, Yandong Guo, Yue Gao
This paper presents a view-guided solution for the task of point cloud completion.
Ranked #3 on Point Cloud Completion on ShapeNet-ViPC
no code implementations • 1 Jan 2021 • Yutong Feng, Jianwen Jiang, Yue Gao
To tackle this problem, we introduce incremental graph learning (IGL), a general framework to formulate the learning on growing graphs in an incremental manner, where traditional graph learning method could be deployed as a basic model.
no code implementations • ICCV 2021 • Siqi Li, Yutong Feng, Yipeng Li, Yu Jiang, Changqing Zou, Yue Gao
Therefore, it is imperative to explore the algorithm of event stream super-resolution, which is a non-trivial task due to the sparsity and strong spatio-temporal correlation of the events from an event camera.
no code implementations • 21 Dec 2019 • Boyue Fang, Yutong Feng
Based on iterative optimization and activation function in deep learning, we proposed a new analytical framework of high-frequency trading information, that reduced structural loss in the assembly of Volume-synchronized probability of Informed Trading ($VPIN$), Generalized Autoregressive Conditional Heteroscedasticity (GARCH) and Support Vector Machine (SVM) to make full use of the order book information.
Trading and Market Microstructure
2 code implementations • 28 Nov 2018 • Yutong Feng, Yifan Feng, Haoxuan You, Xibin Zhao, Yue Gao
However, there is little effort on using mesh data in recent years, due to the complexity and irregularity of mesh data.