Search Results for author: Amir Bar

Found 20 papers, 9 papers with code

Sequential Modeling Enables Scalable Learning for Large Vision Models

1 code implementation • 1 Dec 2023 • Yutong Bai, Xinyang Geng, Karttikeya Mangalam, Amir Bar, Alan Yuille, Trevor Darrell, Jitendra Malik, Alexei A Efros

We introduce a novel sequential modeling approach which enables learning a Large Vision Model (LVM) without making use of any linguistic data.

1,592

Learning Individual Styles of Conversational Gesture

2 code implementations • CVPR 2019 • Shiry Ginosar, Amir Bar, Gefen Kohavi, Caroline Chan, Andrew Owens, Jitendra Malik

Specifically, we perform cross-modal translation from "in-the-wild" monologue speech of a single speaker to their hand and arm motion.

Ranked #4 on Gesture Generation on BEAT

Gesture Generation Speech-to-Gesture Translation +1

356

DETReg: Unsupervised Pretraining with Region Priors for Object Detection

1 code implementation • CVPR 2022 • Amir Bar, Xin Wang, Vadim Kantorov, Colorado J Reed, Roei Herzig, Gal Chechik, Anna Rohrbach, Trevor Darrell, Amir Globerson

Recent self-supervised pretraining methods for object detection largely focus on pretraining the backbone of the object detector, neglecting key parts of detection architecture.

Ranked #1 on Few-Shot Object Detection on COCO 2017

Few-Shot Learning Few-Shot Object Detection +6

332

Visual Prompting via Image Inpainting

1 code implementation • 1 Sep 2022 • Amir Bar, Yossi Gandelsman, Trevor Darrell, Amir Globerson, Alexei A. Efros

How does one adapt a pre-trained visual model to novel downstream tasks without task-specific finetuning or any model modification?

Ranked #5 on Personalized Segmentation on PerSeg

Colorization Edge Detection +6

274

Language Generation with Recurrent Generative Adversarial Networks without Pre-training

3 code implementations • 5 Jun 2017 • Ofir Press, Amir Bar, Ben Bogin, Jonathan Berant, Lior Wolf

Generative Adversarial Networks (GANs) have shown great promise recently in image generation.

Text Generation

252

Object-Region Video Transformers

1 code implementation • CVPR 2022 • Roei Herzig, Elad Ben-Avraham, Karttikeya Mangalam, Amir Bar, Gal Chechik, Anna Rohrbach, Trevor Darrell, Amir Globerson

In this work, we present Object-Region Video Transformers (ORViT), an object-centric approach that extends video transformer layers with a block that directly incorporates object representations.

Ranked #6 on Action Recognition on Diving-48

Action Detection Few-Shot action recognition +3

Learning Canonical Representations for Scene Graph to Image Generation

2 code implementations • ECCV 2020 • Roei Herzig, Amir Bar, Huijuan Xu, Gal Chechik, Trevor Darrell, Amir Globerson

Generating realistic images of complex visual scenes becomes challenging when one wishes to control the structure of the generated images.

Ranked #3 on Layout-to-Image Generation on Visual Genome 256x256

Layout-to-Image Generation Scene Generation

Compositional Video Synthesis with Action Graphs

1 code implementation • 27 Jun 2020 • Amir Bar, Roei Herzig, Xiaolong Wang, Anna Rohrbach, Gal Chechik, Trevor Darrell, Amir Globerson

Our generative model for this task (AG2Vid) disentangles motion and appearance features, and by incorporating a scheduling mechanism for actions facilitates a timely and coordinated video generation.

Scheduling Video Generation +2

Finding Visual Task Vectors

1 code implementation • 8 Apr 2024 • Alberto Hojel, Yutong Bai, Trevor Darrell, Amir Globerson, Amir Bar

In this work, we analyze the activations of MAE-VQGAN, a recent Visual Prompting model, and find task vectors, activations that encode task-specific information.

Visual Prompting

Compression Fractures Detection on CT

no code implementations • 6 Jun 2017 • Amir Bar, Lior Wolf, Orna Bergman Amitai, Eyal Toledano, Eldad Elnekave

Finally a Recurrent Neural Network (RNN) is utilized to predict whether a vertebral fracture is present in the series of patches.

Computed Tomography (CT)

PHT-bot: Deep-Learning based system for automatic risk stratification of COPD patients based upon signs of Pulmonary Hypertension

no code implementations • 28 May 2019 • David Chettrit, Orna Bregman Amitai, Itamar Tamir, Amir Bar, Eldad Elnekave

Secondary pulmonary hypertension is a manifestation of advanced COPD, which can be reliably diagnosed by the main Pulmonary Artery (PA) to Ascending Aorta (Ao) ratio.

Computed Tomography (CT)

Improved ICH classification using task-dependent learning

no code implementations • 29 Jun 2019 • Amir Bar, Michal Mauda, Yoni Turner, Michal Safadi, Eldad Elnekave

Head CT is one of the most commonly performed imaging studies in the Emergency Department setting, and intracranial hemorrhage (ICH) is among the most critical and time-sensitive findings to be detected on Head CT. We present BloodNet, a deep learning architecture designed for optimal triaging of Head CTs, with the goal of decreasing the time from CT acquisition to accurate ICH detection.

Classification General Classification

3D Convolutional Sequence to Sequence Model for Vertebral Compression Fractures Identification in CT

no code implementations • 8 Oct 2020 • David Chettrit, Tomer Meir, Hila Lebel, Mila Orlovsky, Ronen Gordon, Ayelet Akselrod-Ballin, Amir Bar

An osteoporosis-related fracture occurs every three seconds worldwide, affecting one in three women and one in five men aged over 50.

3D Architecture Management

Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens

no code implementations • 13 Jun 2022 • Elad Ben-Avraham, Roei Herzig, Karttikeya Mangalam, Amir Bar, Anna Rohrbach, Leonid Karlinsky, Trevor Darrell, Amir Globerson

We explore a particular instantiation of scene structure, namely a Hand-Object Graph, consisting of hands and objects with their locations as nodes, and physical relations of contact/no-contact as edges.

Action Recognition Video Understanding

Structured Video Tokens @ Ego4D PNR Temporal Localization Challenge 2022

no code implementations • 15 Jun 2022 • Elad Ben-Avraham, Roei Herzig, Karttikeya Mangalam, Amir Bar, Anna Rohrbach, Leonid Karlinsky, Trevor Darrell, Amir Globerson

First, as both images and videos contain structured information, we enrich a transformer model with a set of object tokens that can be used across images and videos.

Point-of-no-return (PNR) temporal localization Temporal Localization

A Cookbook of Self-Supervised Learning

no code implementations • 24 Apr 2023 • Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari Morcos, Shashank Shekhar, Tom Goldstein, Florian Bordes, Adrien Bardes, Gregoire Mialon, Yuandong Tian, Avi Schwarzschild, Andrew Gordon Wilson, Jonas Geiping, Quentin Garrido, Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann LeCun, Micah Goldblum

Self-supervised learning, dubbed the dark matter of intelligence, is a promising path to advance machine learning.

Navigate Self-Supervised Learning

Stochastic positional embeddings improve masked image modeling

no code implementations • 31 Jul 2023 • Amir Bar, Florian Bordes, Assaf Shocher, Mahmoud Assran, Pascal Vincent, Nicolas Ballas, Trevor Darrell, Amir Globerson, Yann LeCun

Masked Image Modeling (MIM) is a promising self-supervised learning approach that enables learning from unlabeled images.

Language Modelling Masked Language Modeling +3

Object-based (yet Class-agnostic) Video Domain Adaptation

no code implementations • 29 Nov 2023 • Dantong Niu, Amir Bar, Roei Herzig, Trevor Darrell, Anna Rohrbach

Existing video-based action recognition systems typically require dense annotation and struggle in environments where there is significant distribution shift relative to the training data.

Action Recognition Domain Adaptation +1

IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks

no code implementations • 4 Dec 2023 • Jiarui Xu, Yossi Gandelsman, Amir Bar, Jianwei Yang, Jianfeng Gao, Trevor Darrell, Xiaolong Wang

Given a textual description of a visual task (e.g., "Left: input image, Right: foreground segmentation"), a few input-output visual examples, or both, the model in-context learns to solve it for a new test input.

Colorization Foreground Segmentation +3

EgoPet: Egomotion and Interaction Data from an Animal's Perspective

no code implementations • 15 Apr 2024 • Amir Bar, Arya Bakhtiar, Danny Tran, Antonio Loquercio, Jathushan Rajasegaran, Yann LeCun, Amir Globerson, Trevor Darrell

Animals perceive the world to plan their actions and interact with other agents to accomplish complex tasks, demonstrating capabilities that are still unmatched by AI systems.
