Search Results for author: Orr Zohar

Found 4 papers, 2 papers with code

VideoAgent: Long-form Video Understanding with Large Language Model as Agent

no code implementations • 15 Mar 2024 • Xiaohan Wang, Yuhui Zhang, Orr Zohar, Serena Yeung-Levy

Long-form video understanding represents a significant challenge within computer vision, demanding a model capable of reasoning over long multi-modal sequences.

Ranked #1 on Zero-Shot Video Question Answer on NExT-QA

Language Modelling, Large Language Model, +2

Open World Object Detection in the Era of Foundation Models

no code implementations • 10 Dec 2023 • Orr Zohar, Alejandro Lozano, Shelly Goel, Serena Yeung, Kuan-Chieh Wang

We exploit the inherent connection between classes in application-driven datasets and introduce a novel method, Foundation Object detection Model for the Open world, or FOMO, which identifies unknown objects based on their shared attributes with the base known objects.

Object, object-detection, +1

LOVM: Language-Only Vision Model Selection

1 code implementation • NeurIPS 2023 • Orr Zohar, Shih-Cheng Huang, Kuan-Chieh Wang, Serena Yeung

As the number of open-source VLM variants increases, there is a need for an efficient model selection strategy that does not require access to a curated evaluation dataset.

Model Selection

PROB: Probabilistic Objectness for Open World Object Detection

1 code implementation • CVPR 2023 • Orr Zohar, Kuan-Chieh Wang, Serena Yeung

The resulting Probabilistic Objectness transformer-based open-world detector, PROB, integrates our framework into traditional object detection models, adapting them for the open-world setting.

Object, object-detection, +1
