no code implementations • 13 Dec 2024 • Yushu Wu, Zhixing Zhang, Yanyu Li, Yanwu Xu, Anil Kag, Yang Sui, Huseyin Coskun, Ke Ma, Aleksei Lebedev, Ju Hu, Dimitris Metaxas, Yanzhi Wang, Sergey Tulyakov, Jian Ren
We have witnessed the unprecedented success of diffusion-based video generation over the past year.
no code implementations • 12 Dec 2024 • Dongting Hu, Jierun Chen, Xijie Huang, Huseyin Coskun, Arpit Sahni, Aarush Gupta, Anujraaj Goyal, Dishani Lahiri, Rajesh Singh, Yerlan Idelbayev, Junli Cao, Yanyu Li, Kwang-Ting Cheng, S.-H. Gary Chan, Mingming Gong, Sergey Tulyakov, Anil Kag, Yanwu Xu, Jian Ren
For the first time, our model, SnapGen, demonstrates the generation of 1024x1024 px images on a mobile device in around 1.4 seconds.
Ranked #11 on Text-to-Image Generation on GenEval
1 code implementation • 6 Dec 2024 • Yitian Zhang, Huseyin Coskun, Xu Ma, Huan Wang, Ke Ma, Xi Chen, Derek Hao Hu, Yun Fu
Thus, we propose a general framework, named Scala, to enable a single network to represent multiple smaller ViTs with flexible inference capability, which aligns with the inherent design of ViT to vary from widths.
no code implementations • 7 Nov 2024 • Anil Kag, Huseyin Coskun, Jierun Chen, Junli Cao, Willi Menapace, Aliaksandr Siarohin, Sergey Tulyakov, Jian Ren
Neural network architecture design requires making many crucial decisions.
no code implementations • 23 Oct 2024 • Shyamgopal Karthik, Huseyin Coskun, Zeynep Akata, Sergey Tulyakov, Jian Ren, Anil Kag
In this work, we investigate a scalable approach for collecting large-scale and fully synthetic datasets for DPO training.
1 code implementation • 20 Jul 2022 • Huseyin Coskun, Alireza Zareian, Joshua L. Moore, Federico Tombari, Chen Wang
Specifically, we outperform the state of the art by 7% on UCF and 4% on HMDB for video retrieval, and 5% on UCF and 6% on HMDB for video classification.
no code implementations • 14 Jan 2022 • John Ridley, Huseyin Coskun, David Joseph Tan, Nassir Navab, Federico Tombari
The video action segmentation task is regularly explored under weaker forms of supervision, such as transcript supervision, where a list of actions is easier to obtain than dense frame-wise labels.
no code implementations • CVPR 2022 • Weizhe Liu, Bugra Tekin, Huseyin Coskun, Vibhav Vineet, Pascal Fua, Marc Pollefeys
To this end, we propose an approach to enforce temporal priors on the optimal transport matrix, which leverages temporal consistency, while allowing for variations in the order of actions.
no code implementations • CVPR 2021 • Sanjay Haresh, Sateesh Kumar, Huseyin Coskun, Shahram Najam Syed, Andrey Konin, Muhammad Zeeshan Zia, Quoc-Huy Tran
To overcome this problem, we propose a temporal regularization term (i.e., Contrastive-IDM) which encourages different frames to be mapped to different points in the embedding space.
no code implementations • 23 Nov 2019 • Husna Betul Coskun, Huseyin Coskun
The indirect transactions between sectors of an economic system have been a long-standing open problem.
1 code implementation • ICCV 2019 • Janis Postels, Francesco Ferroni, Huseyin Coskun, Nassir Navab, Federico Tombari
We present a sampling-free approach for computing the epistemic uncertainty of a neural network.
no code implementations • 22 Jul 2019 • Huseyin Coskun, Zeeshan Zia, Bugra Tekin, Federica Bogo, Nassir Navab, Federico Tombari, Harpreet Sawhney
The lack of large-scale real datasets with annotations makes transfer learning a necessity for video activity understanding.
2 code implementations • ECCV 2018 • Huseyin Coskun, David Joseph Tan, Sailesh Conjeti, Nassir Navab, Federico Tombari
Nevertheless, we believe that traditional approaches such as L2 distance or Dynamic Time Warping based on hand-crafted local pose metrics fail to appropriately capture the semantic relationship across motions and, as such, are not suitable for being employed as metrics within these tasks.
no code implementations • ICCV 2017 • Huseyin Coskun, Felix Achilles, Robert DiPietro, Nassir Navab, Federico Tombari
One-shot pose estimation for tasks such as body joint localization, camera pose estimation, and object tracking is generally noisy, and temporal filters have been extensively used for regularization.
no code implementations • 6 Aug 2017 • Huseyin Coskun, Felix Achilles, Robert DiPietro, Nassir Navab, Federico Tombari
One-shot pose estimation for tasks such as body joint localization, camera pose estimation, and object tracking is generally noisy, and temporal filters have been extensively used for regularization.