no code implementations • 15 Mar 2022 • Kiana Ehsani, Ali Farhadi, Aniruddha Kembhavi, Roozbeh Mottaghi
Object manipulation is a critical skill required for Embodied AI agents interacting with the world around them.
2 code implementations • 10 Mar 2022 • Mitchell Wortsman, Gabriel Ilharco, Samir Yitzhak Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, Ari S. Morcos, Hongseok Namkoong, Ali Farhadi, Yair Carmon, Simon Kornblith, Ludwig Schmidt
In this paper, we revisit the second step of this procedure in the context of fine-tuning large pre-trained models, where fine-tuned models often appear to lie in a single low error basin.
Ranked #1 on Image Classification on ImageNet V2 (using extra training data)
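The low-error-basin observation above is what makes simple weight averaging of fine-tuned checkpoints viable; a minimal sketch, assuming models are given as dicts of NumPy arrays (the toy parameter names are hypothetical stand-ins for real checkpoints):

```python
import numpy as np

def average_weights(state_dicts):
    """Uniformly average a list of models given as {name: array} dicts."""
    keys = state_dicts[0].keys()
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}

# toy example: three "fine-tuned" variants of the same 2-parameter model
models = [{"w": np.array([1.0, 2.0]), "b": np.array([0.5])},
          {"w": np.array([3.0, 2.0]), "b": np.array([1.5])},
          {"w": np.array([2.0, 2.0]), "b": np.array([1.0])}]
soup = average_weights(models)
print(soup["w"])  # [2. 2.]
```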
no code implementations • 7 Jan 2022 • Rowan Zellers, Jiasen Lu, Ximing Lu, Youngjae Yu, Yanpeng Zhao, Mohammadreza Salehi, Aditya Kusupati, Jack Hessel, Ali Farhadi, Yejin Choi
Given a video, we replace snippets of text and audio with a MASK token; the model learns by choosing the correct masked-out snippet.
Ranked #1 on Action Classification on Kinetics-600 (using extra training data)
1 code implementation • 2 Jan 2022 • Sarah Pratt, Luca Weihs, Ali Farhadi
While traditional embodied agents manipulate an environment to best achieve a goal, we argue for an introspective agent, which considers its own abilities in the context of its environment.
no code implementations • 6 Dec 2021 • Vivek Ramanujan, Pavan Kumar Anasosalu Vasu, Ali Farhadi, Oncel Tuzel, Hadi Pouransari
To avoid the cost of backfilling, BCT modifies training of the new model to make its representations compatible with those of the old model.
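One way to realize such a compatibility constraint is to add a term that asks the new model's embeddings to classify correctly under the frozen old classifier; a hedged sketch, with all shapes and the `lam` weighting illustrative rather than the paper's exact formulation:

```python
import numpy as np

def softmax_xent(logits, label):
    """Numerically stable softmax cross-entropy for one example."""
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label])

def bct_loss(new_emb, label, new_cls, old_cls, lam=1.0):
    """Task loss under the new classifier plus a compatibility term that
    asks the new embedding to also score correctly under the frozen old
    classifier (classifier weights: [num_classes, dim])."""
    return (softmax_xent(new_cls @ new_emb, label)
            + lam * softmax_xent(old_cls @ new_emb, label))

rng = np.random.default_rng(0)
emb = rng.normal(size=4)
new_cls = rng.normal(size=(3, 4))
old_cls = rng.normal(size=(3, 4))
loss = bct_loss(emb, 1, new_cls, old_cls)
print(loss >= 0)  # True
```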
1 code implementation • EMNLP 2021 • Christopher Clark, Jordi Salvador, Dustin Schwenk, Derrick Bonafilia, Mark Yatskar, Eric Kolve, Alvaro Herrasti, Jonghyun Choi, Sachin Mehta, Sam Skjonsberg, Carissa Schoenick, Aaron Sarnat, Hannaneh Hajishirzi, Aniruddha Kembhavi, Oren Etzioni, Ali Farhadi
We investigate these challenges in the context of Iconary, a collaborative game of drawing and guessing based on Pictionary that poses a novel challenge for the research community.
1 code implementation • 8 Oct 2021 • Elvis Nunez, Maxwell Horton, Anish Prabhu, Anurag Ranjan, Ali Farhadi, Mohammad Rastegari
Our models require no retraining, thus our subspace of models can be deployed entirely on-device to allow adaptive network compression at inference time.
1 code implementation • 4 Sep 2021 • Mitchell Wortsman, Gabriel Ilharco, Jong Wook Kim, Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gontijo Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, Ludwig Schmidt
Compared to standard fine-tuning, WiSE-FT provides large accuracy improvements under distribution shift, while preserving high accuracy on the target distribution.
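WiSE-FT's weight-space ensembling amounts to linearly interpolating the zero-shot and fine-tuned parameters; a minimal sketch with dict-of-arrays stand-ins for real checkpoints:

```python
import numpy as np

def wise_ft(theta_zeroshot, theta_finetuned, alpha=0.5):
    """Weight-space ensembling: interpolate zero-shot and fine-tuned
    weights with mixing coefficient alpha in [0, 1]."""
    return {k: (1 - alpha) * theta_zeroshot[k] + alpha * theta_finetuned[k]
            for k in theta_zeroshot}

zs = {"w": np.array([0.0, 0.0])}
ft = {"w": np.array([1.0, 2.0])}
mix = wise_ft(zs, ft, alpha=0.5)
print(mix["w"])  # [0.5 1. ]
```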
no code implementations • 7 Jul 2021 • Junha Roh, Karthik Desingh, Ali Farhadi, Dieter Fox
Specifically, given a reconstructed 3D scene in the form of point clouds with 3D bounding boxes of potential object candidates, and a language utterance referring to a target object in the scene, our model successfully identifies the target object from a set of potential candidates.
1 code implementation • NeurIPS 2021 • Rowan Zellers, Ximing Lu, Jack Hessel, Youngjae Yu, Jae Sung Park, Jize Cao, Ali Farhadi, Yejin Choi
As humans, we understand events in the visual world contextually, performing multimodal reasoning across time to make inferences about the past, present, and future.
1 code implementation • NeurIPS 2021 • Aditya Kusupati, Matthew Wallingford, Vivek Ramanujan, Raghav Somani, Jae Sung Park, Krishna Pillutla, Prateek Jain, Sham Kakade, Ali Farhadi
We further quantitatively measure the quality of our codes by applying them to efficient image retrieval and out-of-distribution (OOD) detection problems.
no code implementations • ACL 2021 • Rowan Zellers, Ari Holtzman, Matthew Peters, Roozbeh Mottaghi, Aniruddha Kembhavi, Ali Farhadi, Yejin Choi
We propose PIGLeT: a model that learns physical commonsense knowledge through interaction, and then uses this knowledge to ground language.
1 code implementation • CVPR 2021 • Kuo-Hao Zeng, Luca Weihs, Ali Farhadi, Roozbeh Mottaghi
In this paper, we study the problem of interactive navigation where agents learn to change the environment to navigate more efficiently to their goals.
1 code implementation • 20 Feb 2021 • Mitchell Wortsman, Maxwell Horton, Carlos Guestrin, Ali Farhadi, Mohammad Rastegari
Recent observations have advanced our understanding of the neural network optimization landscape, revealing the existence of (1) paths of high accuracy containing diverse solutions and (2) wider minima offering improved performance.
no code implementations • ICLR 2021 • Kiana Ehsani, Daniel Gordon, Thomas Hai Dang Nguyen, Roozbeh Mottaghi, Ali Farhadi
Learning effective representations of visual data that generalize to a variety of downstream tasks has been a long quest for computer vision.
no code implementations • ICLR 2021 • Luca Weihs, Aniruddha Kembhavi, Kiana Ehsani, Sarah M Pratt, Winson Han, Alvaro Herrasti, Eric Kolve, Dustin Schwenk, Roozbeh Mottaghi, Ali Farhadi
A growing body of research suggests that embodied gameplay, prevalent not just in human cultures but across a variety of animal species including turtles and ravens, is critical in developing the neural flexibility for creative problem solving, decision making and socialization.
no code implementations • 18 Nov 2020 • Maxwell Horton, Yanzi Jin, Ali Farhadi, Mohammad Rastegari
In the high-computation regime, we show how to combine our method with generative methods to improve upon state-of-the-art performance for several networks.
1 code implementation • 16 Oct 2020 • Kiana Ehsani, Daniel Gordon, Thomas Nguyen, Roozbeh Mottaghi, Ali Farhadi
Learning effective representations of visual data that generalize to a variety of downstream tasks has been a long quest for computer vision.
no code implementations • ECCV 2020 • Unnat Jain, Luca Weihs, Eric Kolve, Ali Farhadi, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing
Autonomous agents must learn to collaborate.
1 code implementation • 6 Jul 2020 • Matthew Wallingford, Aditya Kusupati, Keivan Alizadeh-Vahid, Aaron Walsman, Aniruddha Kembhavi, Ali Farhadi
To foster research towards the goal of general ML methods, we introduce a new unified evaluation framework, FLUID (Flexible Sequential Data).
1 code implementation • NeurIPS 2020 • Mitchell Wortsman, Vivek Ramanujan, Rosanne Liu, Aniruddha Kembhavi, Mohammad Rastegari, Jason Yosinski, Ali Farhadi
We present the Supermasks in Superposition (SupSup) model, capable of sequentially learning thousands of tasks without catastrophic forgetting.
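Sequential learning without forgetting can work by keeping one frozen weight tensor and learning a binary "supermask" per task; a toy sketch in which the masks are random stand-ins for learned ones:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))  # fixed, randomly initialized weights

# one binary mask per task selects a subnetwork; random stand-ins here
masks = {t: rng.random(W.shape) < 0.5 for t in range(3)}

def forward(x, task):
    """Apply only the subnetwork selected by the task's supermask."""
    return (W * masks[task]) @ x

x = rng.normal(size=8)
outs = [forward(x, t) for t in range(3)]
# different masks select different functions over the same frozen weights
print(np.allclose(outs[0], outs[1]))  # False
```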
no code implementations • NAACL 2021 • Gabriel Ilharco, Rowan Zellers, Ali Farhadi, Hannaneh Hajishirzi
The success of large-scale contextual language models has attracted great interest in probing what is encoded in their representations.
no code implementations • ECCV 2020 • Jae Sung Park, Chandra Bhagavatula, Roozbeh Mottaghi, Ali Farhadi, Yejin Choi
In addition, we provide person-grounding (i.e., co-reference links) between people appearing in the image and people mentioned in the textual commonsense descriptions, allowing for tighter integration between images and text.
1 code implementation • CVPR 2020 • Matt Deitke, Winson Han, Alvaro Herrasti, Aniruddha Kembhavi, Eric Kolve, Roozbeh Mottaghi, Jordi Salvador, Dustin Schwenk, Eli VanderBilt, Matthew Wallingford, Luca Weihs, Mark Yatskar, Ali Farhadi
We argue that interactive and embodied visual AI has reached a stage of development similar to visual recognition prior to the advent of these ecosystems.
1 code implementation • NAACL 2021 • Rowan Zellers, Ari Holtzman, Elizabeth Clark, Lianhui Qin, Ali Farhadi, Yejin Choi
We propose TuringAdvice, a new challenge task and dataset for language understanding models.
1 code implementation • CVPR 2020 • Kiana Ehsani, Shubham Tulsiani, Saurabh Gupta, Ali Farhadi, Abhinav Gupta
Our quantitative and qualitative results show that (a) we can predict meaningful forces from videos whose effects lead to accurate imitation of the motions observed, (b) by jointly optimizing for contact point and force prediction, we can improve the performance on both tasks in comparison to independent training, and (c) we can learn a representation from this model that generalizes to novel objects using few shot examples.
1 code implementation • ECCV 2020 • Sarah Pratt, Mark Yatskar, Luca Weihs, Ali Farhadi, Aniruddha Kembhavi
We introduce Grounded Situation Recognition (GSR), a task that requires producing structured semantic summaries of images describing: the primary activity, entities engaged in the activity with their roles (e.g., agent, tool), and bounding-box groundings of entities.
1 code implementation • 18 Mar 2020 • Daniel Gordon, Kiana Ehsani, Dieter Fox, Ali Farhadi
Recent single image unsupervised representation learning techniques show remarkable success on a variety of tasks.
3 code implementations • 15 Feb 2020 • Jesse Dodge, Gabriel Ilharco, Roy Schwartz, Ali Farhadi, Hannaneh Hajishirzi, Noah Smith
We publicly release all of our experimental data, including training and validation scores for 2,100 trials, to encourage further analysis of training dynamics during fine-tuning.
1 code implementation • ICML 2020 • Aditya Kusupati, Vivek Ramanujan, Raghav Somani, Mitchell Wortsman, Prateek Jain, Sham Kakade, Ali Farhadi
Sparsity in Deep Neural Networks (DNNs) is studied extensively with the focus of maximizing prediction accuracy given an overall parameter budget.
no code implementations • 17 Dec 2019 • Luca Weihs, Aniruddha Kembhavi, Kiana Ehsani, Sarah M Pratt, Winson Han, Alvaro Herrasti, Eric Kolve, Dustin Schwenk, Roozbeh Mottaghi, Ali Farhadi
A growing body of research suggests that embodied gameplay, prevalent not just in human cultures but across a variety of animal species including turtles and ravens, is critical in developing the neural flexibility for creative problem solving, decision making, and socialization.
1 code implementation • CVPR 2020 • Kuo-Hao Zeng, Roozbeh Mottaghi, Luca Weihs, Ali Farhadi
In this paper we address the problem of visual reaction: the task of interacting with dynamic environments where the changes in the environment are not necessarily caused by the agent itself.
3 code implementations • CVPR 2020 • Vivek Ramanujan, Mitchell Wortsman, Aniruddha Kembhavi, Ali Farhadi, Mohammad Rastegari
Training a neural network is synonymous with learning the values of the weights.
Ranked #547 on Image Classification on ImageNet
no code implementations • 16 Oct 2019 • Junha Roh, Chris Paxton, Andrzej Pronobis, Ali Farhadi, Dieter Fox
Widespread adoption of self-driving cars will depend not only on their safety but largely on their ability to interact with human users.
1 code implementation • ACL 2019 • Minjoon Seo, Jinhyuk Lee, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi
Existing open-domain question answering (QA) models are not suitable for real-time usage because they need to process several long documents on-demand for every input query.
1 code implementation • CVPR 2020 • Keivan Alizadeh Vahid, Anish Prabhu, Ali Farhadi, Mohammad Rastegari
By replacing pointwise convolutions with BFT, we reduce the computational complexity of these layers from O(n^2) to O(n log n) with respect to the number of channels.
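The butterfly structure gets its O(n log n) cost from log2(n) stages that each mix channel pairs with 2x2 weights; an illustrative NumPy sketch (the weight layout is an assumption, not the paper's exact parameterization):

```python
import numpy as np

def butterfly_transform(x, weights):
    """Mix n channels with log2(n) butterfly stages; each stage combines
    channel pairs with 2 weights, so total work is O(n log n) instead of
    the O(n^2) of a dense pointwise (1x1) convolution."""
    n = len(x)
    stage, stride = 0, 1
    while stride < n:
        y = x.copy()
        for i in range(n):
            j = i ^ stride              # butterfly partner of channel i
            a, b = weights[stage, i]
            y[i] = a * x[i] + b * x[j]
        x, stage, stride = y, stage + 1, stride * 2
    return x

n = 8
rng = np.random.default_rng(0)
x = rng.normal(size=n)
w = rng.normal(size=(int(np.log2(n)), n, 2))
out = butterfly_transform(x, w)
print(out.shape)  # (8,)
```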
4 code implementations • NeurIPS 2019 • Mitchell Wortsman, Ali Farhadi, Mohammad Rastegari
In this work we propose a method for discovering neural wirings.
1 code implementation • CVPR 2019 • Kenneth Marino, Mohammad Rastegari, Ali Farhadi, Roozbeh Mottaghi
In this paper, we address the task of knowledge-based visual question answering and provide a benchmark, called OK-VQA, where the image content is not sufficient to answer the questions, encouraging methods that rely on external knowledge resources.
4 code implementations • NeurIPS 2019 • Rowan Zellers, Ari Holtzman, Hannah Rashkin, Yonatan Bisk, Ali Farhadi, Franziska Roesner, Yejin Choi
We find that best current discriminators can classify neural fake news from real, human-written, news with 73% accuracy, assuming access to a moderate level of training data.
Ranked #2 on Fake News Detection on Grover-Mega
1 code implementation • ACL 2019 • Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, Yejin Choi
In this paper, we show that commonsense inference still proves difficult for even state-of-the-art models, by presenting HellaSwag, a new challenge dataset.
no code implementations • CVPR 2019 • Unnat Jain, Luca Weihs, Eric Kolve, Mohammad Rastegari, Svetlana Lazebnik, Ali Farhadi, Alexander Schwing, Aniruddha Kembhavi
Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities.
1 code implementation • CVPR 2019 • Yao-Hung Hubert Tsai, Santosh Divvala, Louis-Philippe Morency, Ruslan Salakhutdinov, Ali Farhadi
Visual relationship reasoning is a crucial yet challenging task for understanding rich interactions across visual concepts.
no code implementations • 6 Jan 2019 • Daniel Gordon, Dieter Fox, Ali Farhadi
In this work we propose Hierarchical Planning and Reinforcement Learning (HIP-RL), a method for merging the benefits and capabilities of Symbolic Planning with the learning abilities of Deep Reinforcement Learning.
1 code implementation • CVPR 2019 • Huiyu Wang, Aniruddha Kembhavi, Ali Farhadi, Alan Yuille, Mohammad Rastegari
We formulate the scaling policy as a non-linear function inside the network's structure that (a) is learned from data, (b) is instance specific, (c) does not add extra computation, and (d) can be applied on any network architecture.
2 code implementations • CVPR 2019 • Mitchell Wortsman, Kiana Ehsani, Mohammad Rastegari, Ali Farhadi, Roozbeh Mottaghi
In this paper we study the problem of learning to learn at both training and test time in the context of visual navigation.
Ranked #2 on Visual Navigation on AI2-THOR
4 code implementations • CVPR 2019 • Rowan Zellers, Yonatan Bisk, Ali Farhadi, Yejin Choi
While this task is easy for humans, it is tremendously difficult for today's vision systems, requiring higher-order cognition and commonsense reasoning about the world.
Multiple-choice, Multiple Choice Question Answering (MCQA), +1
1 code implementation • ICLR 2019 • Wei Yang, Xiaolong Wang, Ali Farhadi, Abhinav Gupta, Roozbeh Mottaghi
Do we use the semantic/functional priors we have built over years to efficiently search and navigate?
1 code implementation • 26 Sep 2018 • Keunhong Park, Konstantinos Rematas, Ali Farhadi, Steven M. Seitz
Existing online 3D shape repositories contain thousands of 3D models but lack photorealistic appearance.
4 code implementations • 7 May 2018 • Hessam Bagherinezhad, Maxwell Horton, Mohammad Rastegari, Ali Farhadi
Among the three main components (data, labels, and models) of any supervised learning system, data and models have been the main subjects of active research.
no code implementations • 25 Apr 2018 • Gunnar A. Sigurdsson, Abhinav Gupta, Cordelia Schmid, Ali Farhadi, Karteek Alahari
In this paper we describe the egocentric aspect of the dataset and present annotations for Charades-Ego with 68,536 activity instances in 68.8 hours of first and third-person video, making it one of the largest and most diverse egocentric datasets available.
1 code implementation • CVPR 2018 • Gunnar A. Sigurdsson, Abhinav Gupta, Cordelia Schmid, Ali Farhadi, Karteek Alahari
Several theories in cognitive neuroscience suggest that when people interact with the world, or simulate interactions, they do so from a first-person egocentric perspective, and seamlessly transfer knowledge between third-person (observer) and first-person (actor).
1 code implementation • EMNLP 2018 • Minjoon Seo, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi
We formalize a new modular variant of current question answering tasks by enforcing complete independence of the document encoder from the question encoder.
3 code implementations • ECCV 2018 • Tanmay Gupta, Dustin Schwenk, Ali Farhadi, Derek Hoiem, Aniruddha Kembhavi
Imagining a scene described in natural language with realistic layout and appearance of entities is the ultimate test of spatial, visual, and semantic world knowledge.
319 code implementations • 8 Apr 2018 • Joseph Redmon, Ali Farhadi
At 320x320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster.
One-stage Anchor-free Oriented Object Detection, Real-Time Object Detection
no code implementations • ECCV 2018 • Krishna Kumar Singh, Santosh Divvala, Ali Farhadi, Yong Jae Lee
We present a scalable approach for Detecting Objects by transferring Common-sense Knowledge (DOCK) from source to target categories.
1 code implementation • CVPR 2018 • Kiana Ehsani, Hessam Bagherinezhad, Joseph Redmon, Roozbeh Mottaghi, Ali Farhadi
We introduce the task of directly modeling a visually intelligent agent.
1 code implementation • 14 Dec 2017 • Eric Kolve, Roozbeh Mottaghi, Winson Han, Eli VanderBilt, Luca Weihs, Alvaro Herrasti, Daniel Gordon, Yuke Zhu, Abhinav Gupta, Ali Farhadi
We introduce The House Of inteRactions (THOR), a framework for visual AI research, available at http://ai2thor.allenai.org.
1 code implementation • CVPR 2018 • Daniel Gordon, Aniruddha Kembhavi, Mohammad Rastegari, Joseph Redmon, Dieter Fox, Ali Farhadi
Our experiments show that our proposed model outperforms popular single controller based methods on IQUAD V1.
no code implementations • CVPR 2018 • Jonghyun Choi, Jayant Krishnamurthy, Aniruddha Kembhavi, Ali Farhadi
Diagrams often depict complex phenomena and serve as a good test bed for visual and textual reasoning.
1 code implementation • ICLR 2018 • Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi
Inspired by the principles of speed reading, we introduce Skim-RNN, a recurrent neural network (RNN) that dynamically decides to update only a small fraction of the hidden state for relatively unimportant input tokens.
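The skimming idea can be sketched as a step function that either runs a full update or touches only the first few hidden dimensions; the toy "cells" below are stand-ins for learned RNN cells:

```python
import numpy as np

def skim_rnn_step(h, x, big_cell, small_cell, important, d_small):
    """One Skim-RNN step: important tokens update the full hidden state
    with the big cell; unimportant ones update only the first d_small
    dimensions with a small cell and leave the rest unchanged."""
    if important:
        return big_cell(h, x)
    h = h.copy()
    h[:d_small] = small_cell(h[:d_small], x)
    return h

# toy cells (stand-ins for learned RNN cells)
big = lambda h, x: np.tanh(h + x)
small = lambda h, x: np.tanh(h + x[:len(h)])

h = np.zeros(6)
x = np.ones(6)
h1 = skim_rnn_step(h, x, big, small, important=False, d_small=2)
print(h1[2:])  # [0. 0. 0. 0.]  -- the skipped dimensions stay untouched
```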
no code implementations • 13 Sep 2017 • Nancy Xin Ru Wang, Ali Farhadi, Rajesh Rao, Bingni Brunton
This paper describes our approach to detecting and predicting natural human arm movements before they occur, a key challenge in brain-computer interfacing that has never before been attempted.
no code implementations • CVPR 2017 • Aniruddha Kembhavi, Minjoon Seo, Dustin Schwenk, Jonghyun Choi, Ali Farhadi, Hannaneh Hajishirzi
Our analysis shows that a significant portion of questions require complex parsing of the text and the diagrams and reasoning, indicating that our dataset is more complex compared to previous machine comprehension and visual question answering datasets.
no code implementations • ICCV 2017 • Yuke Zhu, Daniel Gordon, Eric Kolve, Dieter Fox, Li Fei-Fei, Abhinav Gupta, Roozbeh Mottaghi, Ali Farhadi
A crucial capability of real-world intelligent agents is their ability to plan a sequence of actions to achieve their goals in the visual world.
10 code implementations • 17 May 2017 • Daniel Gordon, Ali Farhadi, Dieter Fox
Robust object tracking requires knowledge and understanding of the object being tracked: its appearance, its motion, and how it changes over time.
1 code implementation • CVPR 2018 • Kiana Ehsani, Roozbeh Mottaghi, Ali Farhadi
Objects often occlude each other in scenes; inferring their appearance beyond their visible parts plays an important role in scene understanding, depth estimation, object interaction and manipulation.
no code implementations • ICCV 2017 • Roozbeh Mottaghi, Connor Schenck, Dieter Fox, Ali Farhadi
Doing so requires estimating the volume of the cup, approximating the amount of water in the pitcher, and predicting the behavior of water when we tilt the pitcher.
234 code implementations • CVPR 2017 • Joseph Redmon, Ali Farhadi
On the 156 classes not in COCO, YOLO9000 gets 16.0 mAP.
Ranked #4 on Dense Object Detection on SKU-110K
2 code implementations • CVPR 2017 • Gunnar A. Sigurdsson, Santosh Divvala, Ali Farhadi, Abhinav Gupta
Actions are more than just movements and trajectories: we cook to eat and we hold a cup to drink from it.
Ranked #11 on Action Detection on Charades
2 code implementations • CVPR 2017 • Mark Yatskar, Vicente Ordonez, Luke Zettlemoyer, Ali Farhadi
Semantic sparsity is a common challenge in structured visual classification problems; when the output space is complex, the vast majority of the possible predictions are rarely, if ever, seen in the training set.
no code implementations • CVPR 2017 • Hessam Bagherinezhad, Mohammad Rastegari, Ali Farhadi
We introduce LCNN, a lookup-based convolutional neural network that encodes convolutions by few lookups to a dictionary that is trained to cover the space of weights in CNNs.
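Encoding filters as a few lookups into a shared dictionary means each filter is a sparse linear combination of dictionary rows; a minimal sketch with illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
D = rng.normal(size=(16, 9))  # dictionary of 16 shared 3x3-filter vectors

def build_filter(indices, coeffs):
    """Reconstruct one convolutional filter as a sparse combination of a
    few dictionary entries (the 'lookups'), instead of storing all of
    its weights independently."""
    return sum(c * D[i] for i, c in zip(indices, coeffs))

# a filter encoded by 3 lookups and 3 coefficients
f = build_filter([2, 7, 11], [0.5, -1.0, 0.25])
print(f.shape)  # (9,)
```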
23 code implementations • 5 Nov 2016 • Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi
Machine comprehension (MC), answering a query about a given context paragraph, requires modeling complex interactions between the context and the query.
Ranked #4 on Question Answering on CNN / Daily Mail
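Modeling context-query interactions typically begins with a similarity matrix over all (context, query) position pairs, followed by attention in each direction; a simplified sketch using a plain dot-product similarity (BiDAF itself uses a trainable trilinear function):

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.normal(size=(5, 4))  # 5 context positions, hidden dim 4
Q = rng.normal(size=(3, 4))  # 3 query positions

S = C @ Q.T                  # similarity of every (context, query) pair

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# context-to-query attention: each context position attends over the query
A = softmax(S, axis=1) @ Q
print(A.shape)  # (5, 4)
```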
2 code implementations • 16 Sep 2016 • Yuke Zhu, Roozbeh Mottaghi, Eric Kolve, Joseph J. Lim, Abhinav Gupta, Li Fei-Fei, Ali Farhadi
To address the second issue, we propose the AI2-THOR framework, which provides an environment with high-quality 3D scenes and a physics engine.
no code implementations • 25 Jul 2016 • Gunnar A. Sigurdsson, Olga Russakovsky, Ali Farhadi, Ivan Laptev, Abhinav Gupta
We conclude that the optimal strategy is to ask as many questions as possible in a HIT (up to 52 binary questions after watching a 30-second video clip in our experiments).
2 code implementations • 14 Jun 2016 • Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi
In this paper, we study the problem of question answering when reasoning over multiple facts is required.
Ranked #2 on Question Answering on bAbi
no code implementations • CVPR 2016 • Mark Yatskar, Luke Zettlemoyer, Ali Farhadi
This paper introduces situation recognition, the problem of producing a concise summary of the situation an image depicts including: (1) the main activity (e.g., clipping), (2) the participating actors, objects, substances, and locations (e.g., man, shears, sheep, wool, and field) and most importantly (3) the roles these participants play in the activity (e.g., the man is clipping, the shears are his tool, the wool is being clipped from the sheep, and the clipping is in a field).
no code implementations • CVPR 2016 • Roozbeh Mottaghi, Hessam Bagherinezhad, Mohammad Rastegari, Ali Farhadi
Direct and explicit estimation of the forces and the motion of objects from a single image is extremely challenging.
no code implementations • CVPR 2016 • Roozbeh Mottaghi, Hannaneh Hajishirzi, Ali Farhadi
With the recent progress in visual recognition, we have already started to see a surge of vision related real-world applications.
3 code implementations • 13 Apr 2016 • Junyuan Xie, Ross Girshick, Ali Farhadi
As 3D movie viewing becomes mainstream and the Virtual Reality (VR) market emerges, the demand for 3D content is growing rapidly.
no code implementations • 6 Apr 2016 • Gunnar A. Sigurdsson, Gül Varol, Xiaolong Wang, Ali Farhadi, Ivan Laptev, Abhinav Gupta
Each video is annotated by multiple free-text descriptions, action labels, action intervals and classes of interacted objects.
1 code implementation • 24 Mar 2016 • Aniruddha Kembhavi, Mike Salvato, Eric Kolve, Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi
We define syntactic parsing of diagrams as learning to infer DPGs for diagrams and study semantic interpretation and reasoning of diagrams in the context of diagram question answering.
no code implementations • 17 Mar 2016 • Roozbeh Mottaghi, Mohammad Rastegari, Abhinav Gupta, Ali Farhadi
To build a dataset of forces in scenes, we reconstructed all images in SUN RGB-D dataset in a physics simulator to estimate the physical movements of objects caused by external forces applied to them.
16 code implementations • 16 Mar 2016 • Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi
We propose two efficient approximations to standard convolutional neural networks: Binary-Weight-Networks and XNOR-Networks.
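The Binary-Weight-Network approximation replaces real-valued weights with a sign matrix and a single scale equal to the mean absolute weight, W ≈ alpha * sign(W); a small sketch:

```python
import numpy as np

def binarize(W):
    """Binary-Weight-Network approximation: W ≈ alpha * sign(W), where
    the scale alpha is the mean absolute value of the weights."""
    alpha = np.abs(W).mean()
    return alpha, np.sign(W)

W = np.array([[0.4, -0.2], [-0.6, 0.8]])
alpha, B = binarize(W)
print(alpha)      # 0.5
print(alpha * B)  # the quantized weights: +/- 0.5 in W's sign pattern
```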
no code implementations • 2 Feb 2016 • Hessam Bagherinezhad, Hannaneh Hajishirzi, Yejin Choi, Ali Farhadi
In this paper, we introduce a method to automatically infer object sizes, leveraging visual and textual information from the web.
no code implementations • 4 Dec 2015 • Babak Saleh, Ahmed Elgammal, Jacob Feldman, Ali Farhadi
In this paper we study various types of atypicalities in images in a more comprehensive way than has been done before.
1 code implementation • CVPR 2016 • Xiaolong Wang, Ali Farhadi, Abhinav Gupta
In this paper, we propose a novel representation for actions by modeling an action as a transformation which changes the state of the environment before the action happens (precondition) to the state after the action (effect).
no code implementations • ICCV 2015 • Bilge Soran, Ali Farhadi, Linda Shapiro
The overall prediction accuracy is 46.2% when only 10 frames of an action are seen (2/3 of a second).
16 code implementations • 19 Nov 2015 • Junyuan Xie, Ross Girshick, Ali Farhadi
Clustering is central to many data-driven application domains and has been studied extensively in terms of distance functions and grouping algorithms.
Ranked #4 on Image Clustering on CMU-PIE
no code implementations • 12 Nov 2015 • Roozbeh Mottaghi, Hessam Bagherinezhad, Mohammad Rastegari, Ali Farhadi
Direct and explicit estimation of the forces and the motion of objects from a single image is extremely challenging.
no code implementations • NeurIPS 2015 • Fereshteh Sadeghi, C. Lawrence Zitnick, Ali Farhadi
In this paper, we study the problem of answering visual analogy questions.
no code implementations • ICCV 2015 • Hamid Izadinia, Fereshteh Sadeghi, Santosh Kumar Divvala, Yejin Choi, Ali Farhadi
Next, we show that the association of high-quality segmentations to textual phrases aids in richer semantic understanding and reasoning of these textual phrases.
134 code implementations • CVPR 2016 • Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation.
Ranked #1 on Real-Time Object Detection on PASCAL VOC 2007
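The single-evaluation design means one output tensor per image encodes boxes, confidences, and class probabilities for every grid cell; a decoding sketch with illustrative shapes (not the exact head of the paper):

```python
import numpy as np

# YOLO-style decoding sketch: one forward pass yields, per grid cell,
# B boxes with confidences plus a class distribution -- no proposals.
S, B, C = 7, 2, 20
rng = np.random.default_rng(0)
pred = rng.random(size=(S, S, B * 5 + C))  # stand-in for network output

boxes = pred[..., :B * 5].reshape(S, S, B, 5)  # (x, y, w, h, confidence)
class_probs = pred[..., B * 5:]                # per-cell class distribution

# score for each (box, class) pair: box confidence * class probability
scores = boxes[..., 4:5] * class_probs[:, :, None, :]
print(scores.shape)  # (7, 7, 2, 20)
```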
no code implementations • CVPR 2015 • Fereshteh Sadeghi, Santosh K. Kumar Divvala, Ali Farhadi
How can we know whether a statement about our world is valid?
no code implementations • CVPR 2015 • Mohammad Rastegari, Hannaneh Hajishirzi, Ali Farhadi
In this paper we present a bottom-up method to instance-level Multiple Instance Learning (MIL) that learns to discover positive instances with globally constrained reasoning about local pairwise similarities.
no code implementations • 25 Nov 2014 • Hamid Izadinia, Ali Farhadi, Aaron Hertzmann, Matthew D. Hoffman
This paper proposes direct learning of image classification from user-supplied tags, without filtering.
no code implementations • 9 Nov 2014 • Babak Saleh, Ali Farhadi, Ahmed Elgammal
When describing images, humans tend not to talk about the obvious, but rather mention what they find interesting.
no code implementations • CVPR 2014 • Peng Zhang, Jiuling Wang, Ali Farhadi, Martial Hebert, Devi Parikh
We show that a surprisingly straightforward and general approach, which we call ALERT, can predict the likely accuracy (or failure) of a variety of computer vision systems: semantic segmentation, vanishing point and camera parameter estimation, and image memorability prediction, on individual input images.
no code implementations • CVPR 2014 • Hamid Izadinia, Fereshteh Sadeghi, Ali Farhadi
In this paper, we propose a method to learn scene structures that can encode three main interlacing components of a scene: the scene category, the context-specific appearance of objects, and their layout.
no code implementations • CVPR 2014 • Santosh K. Divvala, Ali Farhadi, Carlos Guestrin
How can we learn a model for any concept that exhaustively covers all its appearance variations, while requiring minimal or no human supervision for compiling the vocabulary of visual variance, gathering the training images and annotations, and learning the models?
no code implementations • CVPR 2013 • Jonghyun Choi, Mohammad Rastegari, Ali Farhadi, Larry S. Davis
We propose a method to expand the visual coverage of training sets that consist of a small number of labeled examples using learned attributes.
no code implementations • CVPR 2013 • Mohammad Rastegari, Ali Diba, Devi Parikh, Ali Farhadi
We exploit a discriminative binary space to compute these geometric quantities efficiently.
no code implementations • CVPR 2013 • Babak Saleh, Ali Farhadi, Ahmed Elgammal
When describing images, humans tend not to talk about the obvious, but rather mention what they find interesting.