no code implementations • 15 Jan 2025 • Jingyuan Chen, Fuchen Long, Jie An, Zhaofan Qiu, Ting Yao, Jiebo Luo, Tao Mei
Extensive experiments of long video generation on the VBench benchmark demonstrate the superiority of our Ouroboros-Diffusion, particularly in terms of subject consistency, motion smoothness, and temporal consistency.
no code implementations • 8 Dec 2024 • Zhenghong Zhou, Jie An, Jiebo Luo
Precise camera pose control is crucial for video generation with diffusion models.
1 code implementation • 28 Oct 2024 • Jie An, De Wang, Pengsheng Guo, Jiebo Luo, Alexander Schwing
Furthermore, we empirically find that both the placement and the effective attention size of these local attention windows are crucial factors.
no code implementations • 13 Aug 2024 • Sota Sato, Jie An, Zhenya Zhang, Ichiro Hasuo
We present a bounded model checking algorithm for signal temporal logic (STL) that exploits mixed-integer linear programming (MILP).
no code implementations • 4 Jan 2024 • Jie An, Zhengyuan Yang, JianFeng Wang, Linjie Li, Zicheng Liu, Lijuan Wang, Jiebo Luo
The first module, similar to a standard DDPM, learns to predict the added noise and is unaffected by the metric function.
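The noise-prediction objective that this first module shares with a standard DDPM can be sketched in a few lines. This is the generic DDPM loss only, not the paper's two-module design; `predict_noise` is a hypothetical stand-in for the denoising network.

```python
import numpy as np

def ddpm_training_step(x0, t, alphas_cumprod, predict_noise, rng):
    """One step of the standard DDPM objective: predict the added noise."""
    eps = rng.standard_normal(x0.shape)            # noise to be predicted
    a_bar = alphas_cumprod[t]                      # cumulative noise schedule at step t
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps  # noised sample
    eps_hat = predict_noise(x_t, t)                # model's noise estimate
    return np.mean((eps_hat - eps) ** 2)           # simple MSE between true and predicted noise
```

A trained network minimizes this MSE; here any callable with the same shape-preserving signature can be plugged in for illustration.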
1 code implementation • 29 Dec 2023 • Yunlong Tang, Jing Bi, Siting Xu, Luchuan Song, Susan Liang, Teng Wang, Daoan Zhang, Jie An, Jingyang Lin, Rongyi Zhu, Ali Vosoughi, Chao Huang, Zeliang Zhang, Pinxin Liu, Mingqian Feng, Feng Zheng, JianGuo Zhang, Ping Luo, Jiebo Luo, Chenliang Xu
With the burgeoning growth of online video platforms and the escalating volume of video content, the demand for proficient video understanding tools has intensified markedly.
no code implementations • 11 Oct 2023 • Jie An, Zhengyuan Yang, Linjie Li, JianFeng Wang, Kevin Lin, Zicheng Liu, Lijuan Wang, Jiebo Luo
We hope our proposed framework, benchmark, and LMM evaluation could help establish the intriguing interleaved image-text generation task.
1 code implementation • 14 Aug 2023 • Alexander Martin, Haitian Zheng, Jie An, Jiebo Luo
In this work, we use text-guided latent diffusion models for zero-shot image-to-image translation (I2I) across large domain gaps (longI2I), where large amounts of new visual features and new geometry need to be generated to enter the target domain.
1 code implementation • 26 Jun 2023 • Siyu Huang, Jie An, Donglai Wei, Zudi Lin, Jiebo Luo, Hanspeter Pfister
However, given a UNIT model trained on certain domains, it is difficult for current methods to incorporate new domains because they often need to train the full model on both existing and new domains.
1 code implementation • 28 May 2023 • Zhenya Zhang, Jie An, Paolo Arcaini, Ichiro Hasuo
The classic STL monitoring is performed by computing a robustness interval that specifies, at each instant, how far the monitored signals are from violating and satisfying the specification.
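The classic single-valued robustness that such an interval generalizes reduces to min/max folds over the trace. A minimal sketch for the atomic predicate x > c with the temporal operators G (always) and F (eventually) over a finite discrete-time signal; this illustrates standard STL robustness, not the paper's interval algorithm:

```python
def robustness_pred(signal, c):
    """Robustness of the predicate x > c at each instant: rho(t) = x(t) - c."""
    return [x - c for x in signal]

def robustness_always(signal, c):
    """Robustness of G(x > c): the worst (minimum) margin over the trace."""
    return min(robustness_pred(signal, c))

def robustness_eventually(signal, c):
    """Robustness of F(x > c): the best (maximum) margin over the trace."""
    return max(robustness_pred(signal, c))
```

A positive value means the specification is satisfied with that margin; a negative value quantifies how badly it is violated.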
no code implementations • 8 May 2023 • Junyu Chen, Jie An, Hanjia Lyu, Christopher Kanan, Jiebo Luo
Assessing the artness of AI-generated images continues to be a challenge within the realm of image generation.
no code implementations • 17 Apr 2023 • Jie An, Songyang Zhang, Harry Yang, Sonal Gupta, Jia-Bin Huang, Jiebo Luo, Xi Yin
In contrast, we propose a parameter-free temporal shift module that can leverage the spatial U-Net as is for video generation.
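A parameter-free temporal shift can be written in a few lines. This numpy sketch follows the generic temporal-shift idea (move a fraction of channels one frame forward or backward in time so a purely spatial network can mix temporal information); the fraction and placement are illustrative assumptions, not necessarily the paper's exact configuration.

```python
import numpy as np

def temporal_shift(x, fold_div=4):
    """Parameter-free temporal shift for a video tensor of shape
    (batch, time, channels, height, width). One fold of channels is
    shifted forward in time, one fold backward, the rest left unchanged."""
    b, t, c, h, w = x.shape
    fold = c // fold_div
    out = np.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                   # shift forward in time
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]   # shift backward in time
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]              # remaining channels untouched
    return out
```

Because the operation only reindexes the tensor, it adds no learnable parameters and leaves the spatial network itself unmodified.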
1 code implementation • CVPR 2023 • Siyu Huang, Jie An, Donglai Wei, Jiebo Luo, Hanspeter Pfister
Existing style transfer algorithms work by minimizing a hybrid loss function that pushes the generated image toward high similarity to both the content and the style.
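Such a hybrid loss can be illustrated with the common Gatys-style formulation: feature MSE for content plus a Gram-matrix MSE for style. The style weight and the use of a single feature map are illustrative assumptions.

```python
import numpy as np

def gram(feat):
    """Gram matrix of a (channels, h*w) feature map: second-order
    statistics commonly used to represent style."""
    c, n = feat.shape
    return feat @ feat.T / n

def hybrid_loss(gen_feat, content_feat, style_feat, style_weight=10.0):
    """Content term (feature MSE) plus weighted style term (Gram MSE)."""
    content_loss = np.mean((gen_feat - content_feat) ** 2)
    style_loss = np.mean((gram(gen_feat) - gram(style_feat)) ** 2)
    return content_loss + style_weight * style_loss
```

Minimizing this objective over the generated image trades off content fidelity against style similarity via `style_weight`.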
1 code implementation • 23 Nov 2022 • Junyu Chen, Jie An, Hanjia Lyu, Christopher Kanan, Jiebo Luo
Visual-textual sentiment analysis aims to predict sentiment from an image-text pair, which poses the challenge of learning effective features from diverse input images.
2 code implementations • 29 Sep 2022 • Uriel Singer, Adam Polyak, Thomas Hayes, Xi Yin, Jie An, Songyang Zhang, Qiyuan Hu, Harry Yang, Oron Ashual, Oran Gafni, Devi Parikh, Sonal Gupta, Yaniv Taigman
We propose Make-A-Video -- an approach for directly translating the tremendous recent progress in Text-to-Image (T2I) generation to Text-to-Video (T2V).
Ranked #3 on Text-to-Video Generation on MSR-VTT (CLIP-FID metric)
no code implementations • 9 Feb 2022 • Jie Chen, Chang Liu, Jiawu Xie, Jie An, Nan Huang
In particular, this method overcomes the limitations of existing methods: it not only achieves good results in multivariate separation but also effectively separates signals mixed with 40 dB Gaussian noise.
1 code implementation • CVPR 2021 • Jie An, Siyu Huang, Yibing Song, Dejing Dou, Wei Liu, Jiebo Luo
The forward inference projects input images into deep features, while the backward inference remaps deep features back to input images in a lossless and unbiased way.
no code implementations • 22 Jun 2020 • Jie An, Tianlang Chen, Songyang Zhang, Jiebo Luo
This work proposes a novel framework consisting of a reference image retrieval step and a global sentiment transfer step to transfer sentiments of images according to a given sentiment tag.
no code implementations • 16 Jun 2020 • Jie An, Tao Li, Hao-Zhi Huang, Li Shen, Xuan Wang, Yongyi Tang, Jinwen Ma, Wei Liu, Jiebo Luo
Extracting effective deep features to represent content and style information is the key to universal style transfer.
no code implementations • 5 Dec 2019 • Jie An, Haoyi Xiong, Jun Huan, Jiebo Luo
Our method consists of a construction step (C-step) to build a photorealistic stylization network and a pruning step (P-step) for acceleration.
1 code implementation • 23 Oct 2019 • Jie An, Mingshuai Chen, Bohua Zhan, Naijun Zhan, Miaomiao Zhang
We present an algorithm for active learning of deterministic timed automata with a single clock.
Formal Languages and Automata Theory
no code implementations • 6 Jul 2019 • Jie An, Haoyi Xiong, Jiebo Luo, Jun Huan, Jinwen Ma
Given a pair of images as the source of content and the reference of style, existing solutions usually first train an auto-encoder (AE) to reconstruct the image from deep features, and then embed pre-defined style transfer modules into the AE reconstruction procedure to transfer the style by modifying the deep features.
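One widely used pre-defined module of this kind is the whitening-coloring transform (WCT), which re-colors whitened content features with the style covariance. A minimal numpy sketch over flattened deep features of shape (channels, h*w); the eps regularization is an implementation assumption.

```python
import numpy as np

def wct(content_feat, style_feat, eps=1e-5):
    """Whitening-coloring transform on (channels, h*w) feature maps."""
    def center(f):
        mean = f.mean(axis=1, keepdims=True)
        return f - mean, mean

    fc, _ = center(content_feat)
    fs, ms = center(style_feat)

    # Whiten content features: remove their covariance structure.
    cov_c = fc @ fc.T / (fc.shape[1] - 1) + eps * np.eye(fc.shape[0])
    ec, vc = np.linalg.eigh(cov_c)
    f_white = vc @ np.diag(ec ** -0.5) @ vc.T @ fc

    # Color with the style covariance, then restore the style mean.
    cov_s = fs @ fs.T / (fs.shape[1] - 1) + eps * np.eye(fs.shape[0])
    es, vs = np.linalg.eigh(cov_s)
    return vs @ np.diag(es ** 0.5) @ vs.T @ f_white + ms
```

The transformed features carry the style's first- and second-order statistics while retaining the content's spatial arrangement; decoding them with the AE yields the stylized image.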
no code implementations • 6 Jun 2019 • Jie An, Haoyi Xiong, Jinwen Ma, Jiebo Luo, Jun Huan
Finally, compared to existing universal style transfer networks for photorealistic rendering, such as PhotoWCT, which stacks multiple well-trained auto-encoders and WCT transforms in a non-end-to-end manner, the architectures designed by StyleNAS produce better style-transferred images with preserved details, use far fewer operators/parameters, and achieve around a 500x inference-time speed-up.
no code implementations • 28 May 2019 • Mingshuai Chen, Jian Wang, Jie An, Bohua Zhan, Deepak Kapur, Naijun Zhan
Nonlinear interpolants have been shown to be useful for the verification of programs and hybrid systems in the contexts of theorem proving, model checking, and abstract interpretation.
no code implementations • 25 May 2018 • Hanchao Li, Pengfei Xiong, Jie An, Lingxue Wang
A Pyramid Attention Network (PAN) is proposed to exploit global contextual information for semantic segmentation.