2 code implementations • 24 Sep 2024 • Yael Segal-Feldman, Aviv Shamsian, Aviv Navon, Gill Hetz, Joseph Keshet
Large transformer-based models have significant potential for speech transcription and translation.
2 code implementations • 12 Sep 2024 • Gil Ayache, Menachem Pirchi, Aviv Navon, Aviv Shamsian, Gill Hetz, Joseph Keshet
In this paper, we introduce WhisperNER, a novel model that allows joint speech transcription and entity recognition.
Tasks: Automatic Speech Recognition (ASR), +5 more
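The entry above describes joint transcription and entity recognition. For reference, below is a minimal sketch of the naive cascaded baseline (an off-the-shelf ASR model followed by a separate text NER tagger); the model names and audio file are illustrative placeholders, and the joint WhisperNER model is designed to improve over this kind of pipeline.

```python
# Naive cascaded baseline: transcribe with an off-the-shelf ASR model, then run a
# separate NER tagger on the transcript. WhisperNER performs these steps jointly;
# this cascade only illustrates the task. Model names are examples, not the paper's.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
ner = pipeline("token-classification", model="dslim/bert-base-NER",
               aggregation_strategy="simple")

def transcribe_with_entities(audio_path: str):
    transcript = asr(audio_path)["text"]                    # speech -> text
    entities = ner(transcript)                              # text -> entity spans
    return transcript, [(e["word"], e["entity_group"]) for e in entities]

# "sample.wav" is a placeholder audio file.
text, ents = transcribe_with_entities("sample.wav")
print(text)
print(ents)  # e.g. [("John", "PER"), ("Boston", "LOC")]
```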
no code implementations • 4 Jun 2024 • Aviv Shamsian, Aviv Navon, Neta Glazer, Gill Hetz, Joseph Keshet
Automatic Speech Recognition (ASR) technology has made significant progress in recent years, providing accurate transcription across various domains.
Tasks: Automatic Speech Recognition (ASR), +3 more
1 code implementation • CVPR 2024 • Rishub Tamirisa, Chulin Xie, Wenxuan Bao, Andy Zhou, Ron Arel, Aviv Shamsian
Recent methods have addressed the issue of client data heterogeneity via personalized federated learning (PFL), a class of FL algorithms that aim to personalize the learned global knowledge to better suit each client's local data distribution.
no code implementations • 17 Feb 2024 • Neta Glazer, Aviv Navon, Aviv Shamsian, Ethan Fetaya
One of the challenges in applying reinforcement learning in a complex real-world environment lies in providing the agent with a sufficiently detailed reward function.
1 code implementation • 6 Feb 2024 • Aviv Shamsian, Aviv Navon, David W. Zhang, Yan Zhang, Ethan Fetaya, Gal Chechik, Haggai Maron
Learning in deep weight spaces (DWS), where neural networks process the weights of other neural networks, is an emerging research direction, with applications to 2D and 3D neural fields (INRs, NeRFs), as well as making inferences about other types of neural networks.
no code implementations • 15 Nov 2023 • Aviv Shamsian, David W. Zhang, Aviv Navon, Yan Zhang, Miltiadis Kofinas, Idan Achituve, Riccardo Valperga, Gertjan J. Burghouts, Efstratios Gavves, Cees G. M. Snoek, Ethan Fetaya, Gal Chechik, Haggai Maron
Learning in weight spaces, where neural networks process the weights of other deep neural networks, has emerged as a promising research direction with applications in various fields, from analyzing and editing neural fields and implicit neural representations, to network pruning and quantization.
no code implementations • 30 Oct 2023 • Daniel Eitan, Menachem Pirchi, Neta Glazer, Shai Meital, Gil Ayach, Gidon Krendel, Aviv Shamsian, Aviv Navon, Gil Hetz, Joseph Keshet
In this work, we introduce a novel approach that integrates a domain-specific or secondary LM into a general-purpose LM.
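The snippet above does not specify the integration mechanism. Below is a minimal sketch of shallow fusion, one common way to mix a secondary domain LM into decoding with a general-purpose LM; it assumes two Hugging Face–style causal LMs that share a vocabulary, and the paper's actual method may differ.

```python
# Minimal sketch of shallow fusion: at each decoding step, interpolate the
# general-purpose LM's log-probabilities with those of a domain-specific LM.
# Assumes both LMs share the same vocabulary; this is a common integration
# strategy, not necessarily the paper's mechanism.
import torch

def fused_next_token_logprobs(general_lm, domain_lm, input_ids, lam=0.3):
    """Return next-token log-probs mixing the two LMs.

    general_lm / domain_lm: HF-style causal LMs returning .logits of shape
                            (batch, seq, vocab)
    lam: weight on the domain LM (0 = general only, 1 = domain only)
    """
    with torch.no_grad():
        logp_gen = torch.log_softmax(general_lm(input_ids).logits[:, -1], dim=-1)
        logp_dom = torch.log_softmax(domain_lm(input_ids).logits[:, -1], dim=-1)
    return (1 - lam) * logp_gen + lam * logp_dom  # greedy/beam search consumes this
```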
1 code implementation • 20 Oct 2023 • Aviv Navon, Aviv Shamsian, Ethan Fetaya, Gal Chechik, Nadav Dym, Haggai Maron
To accelerate the alignment process and improve its quality, we propose a novel framework aimed at learning to solve the weight alignment problem, which we name Deep-Align.
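For context, the sketch below shows the classical, non-learned version of the weight alignment problem for a toy two-layer MLP: finding the hidden-unit permutation that best matches one network to another with the Hungarian algorithm. Deep-Align learns to predict such alignments rather than solving them from scratch; this code only illustrates the underlying problem.

```python
# Classical (non-learned) weight alignment for two 2-layer MLPs: find the hidden-unit
# permutation that best matches network B to network A using the Hungarian algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_two_layer_mlp(W1_a, W2_a, W1_b, W2_b):
    """W1_*: (hidden, in), W2_*: (out, hidden). Returns permuted copies of B's weights."""
    # Similarity between hidden units of A and B, combining both weight matrices.
    cost = -(W1_a @ W1_b.T + W2_a.T @ W2_b)       # (hidden, hidden), negated similarity
    row, col = linear_sum_assignment(cost)        # optimal one-to-one matching
    perm = col                                    # unit i of A <-> unit perm[i] of B
    return W1_b[perm], W2_b[:, perm]

rng = np.random.default_rng(0)
W1_a, W2_a = rng.normal(size=(16, 8)), rng.normal(size=(4, 16))
true_perm = rng.permutation(16)
W1_b, W2_b = W1_a[true_perm], W2_a[:, true_perm]   # B is A with permuted hidden units
W1_al, W2_al = align_two_layer_mlp(W1_a, W2_a, W1_b, W2_b)
print(np.allclose(W1_al, W1_a), np.allclose(W2_al, W2_a))  # expected: True True
```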
no code implementations • 13 Sep 2023 • Aviv Navon, Aviv Shamsian, Neta Glazer, Gill Hetz, Joseph Keshet
Open vocabulary keyword spotting is a crucial and challenging task in automatic speech recognition (ASR) that focuses on detecting user-defined keywords within a spoken utterance.
Tasks: Automatic Speech Recognition (ASR), +2 more
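A toy sketch of one common recipe for open-vocabulary keyword spotting follows: embed the user-defined keyword text and the audio into a shared space and detect the keyword with a similarity threshold. The encoders below are untrained placeholders; the paper's actual architecture and training objective are not reproduced.

```python
# Toy embedding-based open-vocabulary keyword spotting: a text encoder embeds the
# user-defined keyword, an audio encoder embeds the utterance, and detection is a
# cosine-similarity threshold. All components are untrained placeholders.
import torch
import torch.nn as nn

class TextEncoder(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
    def forward(self, token_ids):                      # (T,) -> (dim,)
        return self.emb(token_ids).mean(dim=0)

class AudioEncoder(nn.Module):
    def __init__(self, n_mels=80, dim=64):
        super().__init__()
        self.rnn = nn.GRU(n_mels, dim, batch_first=True)
    def forward(self, mels):                           # (1, frames, n_mels) -> (dim,)
        _, h = self.rnn(mels)
        return h[-1, 0]

text_enc, audio_enc = TextEncoder(), AudioEncoder()
keyword = torch.tensor([12, 57, 3])                    # toy token ids of the keyword
mels = torch.randn(1, 200, 80)                         # toy log-mel features of an utterance

score = torch.cosine_similarity(text_enc(keyword), audio_enc(mels), dim=0)
print("keyword detected" if score > 0.5 else "keyword not detected", float(score))
```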
1 code implementation • 5 Jun 2023 • Yochai Yemini, Aviv Shamsian, Lior Bracha, Sharon Gannot, Ethan Fetaya
We then condition a diffusion model on the video and incorporate the extracted text through a classifier-guidance mechanism in which a pre-trained ASR model serves as the classifier.
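The sketch below illustrates the classifier-guidance idea described above: during reverse diffusion, the gradient of the ASR log-likelihood of the target transcript with respect to the current noisy sample nudges each denoising step toward speech consistent with the text. The denoiser and ASR interfaces are placeholders, and the video conditioning and noise schedule are omitted.

```python
# Sketch of classifier guidance with an ASR model as the classifier. Interfaces are
# placeholders; the real models, schedules, and video conditioning are not shown.
import torch

def guided_denoise_step(x_t, t, denoiser, asr_log_likelihood, transcript, scale=1.0):
    """One simplified, guided reverse-diffusion step.

    denoiser(x_t, t)            -> predicted mean of x_{t-1} (placeholder interface)
    asr_log_likelihood(x, text) -> scalar log p(text | x) from a pre-trained ASR model
    """
    x_t = x_t.detach().requires_grad_(True)
    log_p = asr_log_likelihood(x_t, transcript)
    grad = torch.autograd.grad(log_p, x_t)[0]          # direction increasing ASR likelihood
    with torch.no_grad():
        mean = denoiser(x_t, t)                        # unguided prediction
        return mean + scale * grad                     # classifier-guided update

# Toy placeholders just to exercise the function.
x = torch.randn(1, 16000)
step = guided_denoise_step(
    x, t=10,
    denoiser=lambda x, t: 0.9 * x,                     # placeholder denoiser
    asr_log_likelihood=lambda x, text: -(x ** 2).mean(),  # placeholder log-likelihood
    transcript="hello world",
)
print(step.shape)
```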
no code implementations • 30 May 2023 • Lior Bracha, Eitan Shaar, Aviv Shamsian, Ethan Fetaya, Gal Chechik
Our results highlight the potential of using pre-trained visual-semantic models for generating high-quality contextual descriptions.
1 code implementation • 31 Jan 2023 • Aviv Shamsian, Aviv Navon, Neta Glazer, Kenji Kawaguchi, Gal Chechik, Ethan Fetaya
Auxiliary learning is an effective method for enhancing the generalization capabilities of trained models, particularly when dealing with small datasets.
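As background for the entry above, a minimal auxiliary-learning training step is sketched below: the model is optimized on its main loss plus a weighted auxiliary loss so the auxiliary task shapes the shared representation. The fixed weight here is an assumption made for illustration; the paper studies how such trade-offs should be handled.

```python
# Basic auxiliary-learning step: main loss plus a weighted auxiliary loss on a shared
# backbone. The auxiliary weight is a fixed hyperparameter here (an assumption).
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
main_head, aux_head = nn.Linear(64, 10), nn.Linear(64, 5)
params = list(backbone.parameters()) + list(main_head.parameters()) + list(aux_head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
aux_weight = 0.3                                        # fixed auxiliary-task weight (assumption)

x = torch.randn(16, 32)                                 # toy batch
y_main = torch.randint(0, 10, (16,))
y_aux = torch.randint(0, 5, (16,))

feats = backbone(x)
loss = nn.functional.cross_entropy(main_head(feats), y_main) \
     + aux_weight * nn.functional.cross_entropy(aux_head(feats), y_aux)
opt.zero_grad()
loss.backward()
opt.step()
```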
2 code implementations • 30 Jan 2023 • Aviv Navon, Aviv Shamsian, Idan Achituve, Ethan Fetaya, Gal Chechik, Haggai Maron
Designing machine learning architectures for processing neural networks in their raw weight matrix form is a newly introduced research direction.
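The central difficulty in this direction is the symmetry of raw weights: permuting the hidden units of an MLP (rows of one weight matrix and columns of the next) does not change its function, so an architecture that consumes those weights should respect this. The toy sketch below shows a DeepSets-style pooling over hidden units that is invariant for a single hidden layer; the paper develops equivariant layers that handle the full symmetry structure of deeper networks, including biases.

```python
# Toy illustration of weight-space symmetry: permuting the hidden units of a 2-layer MLP
# (rows of W1, columns of W2) leaves its function unchanged, so a readout over (W1, W2)
# should be invariant to it. A DeepSets-style pooling over hidden units achieves this
# for the single-hidden-layer case.
import torch
import torch.nn as nn

class WeightSpaceReadout(nn.Module):
    def __init__(self, in_dim=8, out_dim=4, hidden_feat=32):
        super().__init__()
        self.per_unit = nn.Sequential(nn.Linear(in_dim + out_dim, hidden_feat), nn.ReLU())
        self.head = nn.Linear(hidden_feat, 1)
    def forward(self, W1, W2):                  # W1: (h, in), W2: (out, h)
        per_unit_feats = self.per_unit(torch.cat([W1, W2.T], dim=1))  # one row per hidden unit
        return self.head(per_unit_feats.mean(dim=0))                  # pooling -> invariance

model = WeightSpaceReadout()
W1, W2 = torch.randn(16, 8), torch.randn(4, 16)
perm = torch.randperm(16)
out1, out2 = model(W1, W2), model(W1[perm], W2[:, perm])
print(torch.allclose(out1, out2, atol=1e-6))    # True: invariant to hidden-unit permutation
```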
4 code implementations • 2 Feb 2022 • Aviv Navon, Aviv Shamsian, Idan Achituve, Haggai Maron, Kenji Kawaguchi, Gal Chechik, Ethan Fetaya
In this paper, we propose viewing the gradients combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update.
Ranked #2 on Multi-Task Learning on Cityscapes test
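A rough sketch of this bargaining view follows: collect per-task gradients as the rows of G, find positive weights alpha satisfying a condition of the form (G Gᵀ) alpha = 1/alpha (elementwise), and update along alphaᵀG. The generic least-squares solver below is for illustration only and is not the paper's optimization procedure.

```python
# Sketch of bargaining-style multi-task gradient combination: solve for positive task
# weights alpha with (G G^T) alpha = 1/alpha (rows of G are per-task gradients), then
# step along the weighted combination. Generic solver, for illustration only.
import numpy as np
from scipy.optimize import least_squares

def bargained_direction(task_grads):
    """task_grads: (num_tasks, num_params) array of per-task gradients."""
    G = np.asarray(task_grads, dtype=float)
    gram = G @ G.T                                           # pairwise gradient inner products
    n = G.shape[0]
    res = least_squares(lambda a: gram @ a - 1.0 / a,        # first-order bargaining condition
                        x0=np.full(n, 1.0 / np.sqrt(np.diag(gram).mean())),
                        bounds=(1e-8, np.inf))
    return res.x @ G                                         # weighted joint update direction

grads = np.random.randn(3, 10)                               # toy: 3 tasks, 10 parameters
print(bargained_direction(grads).shape)                      # (10,)
```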
1 code implementation • 11 Dec 2021 • Honglu Zhou, Asim Kadav, Aviv Shamsian, Shijie Geng, Farley Lai, Long Zhao, Ting Liu, Mubbasir Kapadia, Hans Peter Graf
Group Activity Recognition detects the activity collectively performed by a group of actors, which requires compositional reasoning over actors and objects.
Ranked #2 on Group Activity Recognition on Collective Activity
1 code implementation • NeurIPS 2021 • Idan Achituve, Aviv Shamsian, Aviv Navon, Gal Chechik, Ethan Fetaya
A key challenge in this setting is to learn effectively across clients even though each client has unique data that is often limited in size.
Ranked #1 on Personalized Federated Learning on CIFAR-100
2 code implementations • 8 Mar 2021 • Aviv Shamsian, Aviv Navon, Ethan Fetaya, Gal Chechik
In this approach, a central hypernetwork model is trained to generate a set of models, one model for each client.
Ranked #1 on Personalized Federated Learning on CIFAR-10
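A minimal sketch of the hypernetwork approach above: a central hypernetwork maps a learnable per-client embedding to the full weight vector of that client's personal model, so gradients from a client's local data update the shared hypernetwork through the generated weights. Layer sizes and the federated training protocol below are simplified placeholders.

```python
# Minimal hypernetwork-per-client sketch: a shared hypernetwork generates the weights of
# a small linear model for each client from a learnable client embedding. Sizes and the
# federated training loop are placeholders.
import torch
import torch.nn as nn

class ClientHyperNetwork(nn.Module):
    def __init__(self, num_clients, embed_dim=16, in_dim=32, out_dim=10):
        super().__init__()
        self.client_embed = nn.Embedding(num_clients, embed_dim)
        n_weights = out_dim * in_dim + out_dim            # linear client model: W and b
        self.generator = nn.Sequential(nn.Linear(embed_dim, 64), nn.ReLU(),
                                       nn.Linear(64, n_weights))
        self.in_dim, self.out_dim = in_dim, out_dim

    def forward(self, client_id, x):
        theta = self.generator(self.client_embed(client_id))        # generated client weights
        W = theta[: self.out_dim * self.in_dim].view(self.out_dim, self.in_dim)
        b = theta[self.out_dim * self.in_dim:]
        return x @ W.T + b                                           # personalized prediction

hn = ClientHyperNetwork(num_clients=50)
x = torch.randn(8, 32)                                               # toy batch from client 3
logits = hn(torch.tensor(3), x)                                      # personalized model for client 3
print(logits.shape)                                                  # torch.Size([8, 10])
```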
1 code implementation • ICLR 2021 • Aviv Navon, Aviv Shamsian, Gal Chechik, Ethan Fetaya
Here, we tackle the problem of learning the entire Pareto front, with the capability of selecting a desired operating point on the front after training.
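A simplified sketch of this idea: condition a hypernetwork on a preference vector over the objectives and train it with the corresponding preference-weighted loss, so any trade-off on the front can be selected after training by choosing a preference. The linear scalarization and toy regression objectives below are assumptions made for brevity.

```python
# Preference-conditioned hypernetwork sketch: map a preference vector over two objectives
# to model weights and train with the preference-weighted loss. Toy objectives and linear
# scalarization are used for brevity.
import torch
import torch.nn as nn

class PreferenceHyperNetwork(nn.Module):
    def __init__(self, n_objectives=2, in_dim=16, out_dim=1):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_objectives, 64), nn.ReLU(),
                                 nn.Linear(64, out_dim * in_dim))
        self.in_dim, self.out_dim = in_dim, out_dim

    def forward(self, preference, x):
        W = self.net(preference).view(self.out_dim, self.in_dim)    # weights for this trade-off
        return x @ W.T

hn = PreferenceHyperNetwork()
opt = torch.optim.Adam(hn.parameters(), lr=1e-3)
x = torch.randn(32, 16)
y1, y2 = torch.randn(32, 1), torch.randn(32, 1)                     # two toy objectives' targets

for _ in range(100):                                                 # toy training loop
    pref = torch.distributions.Dirichlet(torch.ones(2)).sample()     # random preference each step
    pred = hn(pref, x)
    loss = pref[0] * nn.functional.mse_loss(pred, y1) + pref[1] * nn.functional.mse_loss(pred, y2)
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, pick any operating point on the front by choosing a preference vector.
pred_balanced = hn(torch.tensor([0.5, 0.5]), x)
```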
1 code implementation • ECCV 2020 • Aviv Shamsian, Ofri Kleinfeld, Amir Globerson, Gal Chechik
The fourth subtask, where a target object is carried by a containing object, is particularly challenging because it requires a system to reason about a moving location of an invisible object.
Ranked #3 on Video Object Tracking on CATER