Search Results for author: Tsai-Shien Chen

Found 8 papers, 1 paper with code

Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

no code implementations • 29 Feb 2024 • Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Ekaterina Deyneka, Hsiang-wei Chao, Byung Eun Jeon, Yuwei Fang, Hsin-Ying Lee, Jian Ren, Ming-Hsuan Yang, Sergey Tulyakov

Next, we finetune a retrieval model on a small subset in which the best caption of each video is manually selected, and then apply the model to the whole dataset to select the best caption as the annotation.

Tasks: Retrieval, Text Retrieval (+3 more)
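
Below is a minimal sketch of the caption-selection step described in the excerpt above: a finetuned video-text retrieval model scores every candidate caption for a video, and the highest-scoring one is kept as the annotation. The embedding shapes and the `select_best_caption` helper are illustrative assumptions, not the authors' code.

```python
# Hedged sketch: pick the caption whose embedding best matches the video's,
# standing in for the paper's finetuned retrieval model.
import torch
import torch.nn.functional as F

def select_best_caption(video_emb: torch.Tensor,
                        caption_embs: torch.Tensor,
                        captions: list[str]) -> str:
    """video_emb: (D,) from the model's video tower;
    caption_embs: (N, D) embeddings of the N candidate captions."""
    video_emb = F.normalize(video_emb, dim=-1)
    caption_embs = F.normalize(caption_embs, dim=-1)
    scores = caption_embs @ video_emb          # cosine similarities, (N,)
    return captions[scores.argmax().item()]

# Toy usage with random embeddings standing in for real model outputs.
captions = ["a dog runs on grass", "a cat sleeps", "people dance"]
best = select_best_caption(torch.randn(512), torch.randn(3, 512), captions)
print(best)
```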

Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis

no code implementations • 22 Feb 2024 • Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov, Ekaterina Deyneka, Tsai-Shien Chen, Anil Kag, Yuwei Fang, Aleksei Stoliar, Elisa Ricci, Jian Ren, Sergey Tulyakov

Since video content is highly redundant, we argue that naively bringing advances from image models to the video generation domain reduces motion fidelity and visual quality, and impairs scalability.

Tasks: Image Generation, Text-to-Video Generation (+1 more)

Motion-Conditioned Diffusion Model for Controllable Video Synthesis

no code implementations • 27 Apr 2023 • Tsai-Shien Chen, Chieh Hubert Lin, Hung-Yu Tseng, Tsung-Yi Lin, Ming-Hsuan Yang

In response to this gap, we introduce MCDiff, a conditional diffusion model that generates a video from a starting image frame and a set of strokes, which allow users to specify the intended content and dynamics for synthesis.

Tasks: Motion Synthesis
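
As a rough illustration of the conditioning scheme the excerpt describes (a starting frame plus user strokes guiding synthesis), the sketch below concatenates the first frame and a flow-like stroke map as extra input channels to a toy denoiser. The tiny network and tensor shapes are assumptions for illustration only, not MCDiff's actual architecture.

```python
# Minimal sketch of frame-and-stroke conditioning for a video denoiser.
import torch
import torch.nn as nn

class StrokeConditionedDenoiser(nn.Module):
    def __init__(self, channels: int = 3):
        super().__init__()
        # Input: noisy frame (3) + first frame (3) + stroke map (2: dx, dy).
        self.net = nn.Sequential(
            nn.Conv2d(channels * 2 + 2, 64, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, channels, 3, padding=1),
        )

    def forward(self, noisy, first_frame, strokes):
        # noisy/first_frame: (B, 3, H, W); strokes: (B, 2, H, W) flow-like map.
        return self.net(torch.cat([noisy, first_frame, strokes], dim=1))

model = StrokeConditionedDenoiser()
pred = model(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64),
             torch.zeros(1, 2, 64, 64))
print(pred.shape)  # torch.Size([1, 3, 64, 64])
```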

Adaptive Region Pooling for Fine-Grained Representation Learning

no code implementations • 29 Sep 2021 • Tsai-Shien Chen, Chih-Ting Liu, Shao-Yi Chien

Fine-grained recognition aims to discriminate among the sub-categories of images within one general category.

Tasks: Image Classification, Image Retrieval (+2 more)

Incremental False Negative Detection for Contrastive Learning

no code implementations • ICLR 2022 • Tsai-Shien Chen, Wei-Chih Hung, Hung-Yu Tseng, Shao-Yi Chien, Ming-Hsuan Yang

Self-supervised learning has recently shown great potential in vision tasks through contrastive learning, which aims to discriminate each image, or instance, in the dataset.

Tasks: Contrastive Learning, Self-Supervised Learning
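
The excerpt refers to the standard instance-discrimination objective of contrastive learning; a minimal InfoNCE-style sketch of that generic objective follows. Note this is the baseline loss the paper builds on, not its incremental false negative detection mechanism.

```python
# Generic InfoNCE instance-discrimination loss: each image's two augmented
# views form the positive pair; all other images in the batch are negatives.
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1):
    """z1, z2: (B, D) embeddings of two augmented views of the same batch."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                 # (B, B) similarity matrix
    targets = torch.arange(z1.size(0))         # positives on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
print(loss.item())
```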

Viewpoint-Aware Channel-Wise Attentive Network for Vehicle Re-Identification

no code implementations • 12 Oct 2020 • Tsai-Shien Chen, Man-Yu Lee, Chih-Ting Liu, Shao-Yi Chien

Our VCAM enables the feature learning framework to reweigh the importance of each feature map channel-wise according to the "viewpoint" of the input vehicle.

Tasks: Vehicle Re-Identification
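
A hedged sketch of channel-wise reweighting conditioned on a viewpoint descriptor, in the spirit of the VCAM description above: a small gating MLP maps the viewpoint code to per-channel weights. The gating layer, the viewpoint encoding, and all sizes are illustrative assumptions rather than the paper's architecture.

```python
# Viewpoint-conditioned channel attention: scale each feature channel by
# a weight predicted from a viewpoint descriptor.
import torch
import torch.nn as nn

class ViewpointChannelAttention(nn.Module):
    def __init__(self, channels: int, viewpoint_dim: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(viewpoint_dim, channels),
            nn.Sigmoid(),  # per-channel weights in (0, 1)
        )

    def forward(self, feats: torch.Tensor, viewpoint: torch.Tensor):
        # feats: (B, C, H, W); viewpoint: (B, viewpoint_dim), e.g. a
        # soft code over {front, rear, side, ...} (an assumed encoding).
        w = self.gate(viewpoint).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1)
        return feats * w

vcam = ViewpointChannelAttention(channels=256)
out = vcam(torch.randn(2, 256, 16, 16), torch.rand(2, 4))
print(out.shape)  # torch.Size([2, 256, 16, 16])
```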
