no code implementations • 23 Jan 2024 • Omer Bar-Tal, Hila Chefer, Omer Tov, Charles Herrmann, Roni Paiss, Shiran Zada, Ariel Ephrat, Junhwa Hur, Guanghui Liu, Amit Raj, Yuanzhen Li, Michael Rubinstein, Tomer Michaeli, Oliver Wang, Deqing Sun, Tali Dekel, Inbar Mosseri
We introduce Lumiere -- a text-to-video diffusion model designed for synthesizing videos that portray realistic, diverse and coherent motion -- a pivotal challenge in video synthesis.
Ranked #6 on Text-to-Video Generation on UCF-101
1 code implementation • ICCV 2023 • Roni Paiss, Ariel Ephrat, Omer Tov, Shiran Zada, Inbar Mosseri, Michal Irani, Tali Dekel
Our counting loss is deployed over automatically-created counterfactual examples, each consisting of an image and a caption containing an incorrect object count.
1 code implementation • CVPR 2020 • Sagie Benaim, Ariel Ephrat, Oran Lang, Inbar Mosseri, William T. Freeman, Michael Rubinstein, Michal Irani, Tali Dekel
We demonstrate how those learned features can boost the performance of self-supervised action recognition, and can be used for video retrieval.
1 code implementation • ICLR 2019 • Tavi Halperin, Ariel Ephrat, Yedid Hoshen
In this work, we introduce a new method, Neural Egg Separation, to tackle the scenario of extracting a signal from an unobserved distribution additively mixed with a signal from an observed distribution.
1 code implementation • ICML'19 2018 • Tavi Halperin, Ariel Ephrat, Yedid Hoshen
In this work, we introduce a new method, Neural Egg Separation, to tackle the scenario of extracting a signal from an unobserved distribution additively mixed with a signal from an observed distribution.
1 code implementation • 19 Aug 2018 • Tavi Halperin, Ariel Ephrat, Shmuel Peleg
This alignment is based on deep audio-visual features, mapping the lips video and the speech signal to a shared representation.
5 code implementations • 10 Apr 2018 • Ariel Ephrat, Inbar Mosseri, Oran Lang, Tali Dekel, Kevin Wilson, Avinatan Hassidim, William T. Freeman, Michael Rubinstein
Solving this task using only audio as input is extremely challenging and does not provide an association of the separated speech signals with speakers in the video.
no code implementations • 22 Aug 2017 • Aviv Gabbay, Ariel Ephrat, Tavi Halperin, Shmuel Peleg
Isolating the voice of a specific person while filtering out other voices or background noises is challenging when video is shot in noisy environments.
no code implementations • 1 Aug 2017 • Ariel Ephrat, Tavi Halperin, Shmuel Peleg
Speechreading is the task of inferring phonetic information from visually observed articulatory facial movements, and is a notoriously difficult task for humans to perform.
no code implementations • 2 Jan 2017 • Ariel Ephrat, Shmuel Peleg
Speechreading is a notoriously difficult task for humans to perform.
no code implementations • 28 Apr 2015 • Yair Poleg, Ariel Ephrat, Shmuel Peleg, Chetan Arora
Furthermore, our CNN is able to recognize whether a video is egocentric or not with 99.2% accuracy, up by 24% from current state-of-the-art.