1 code implementation • ECCV 2020 • Linxi Fan, Shyamal Buch, Guanzhi Wang, Ryan Cao, Yuke Zhu, Juan Carlos Niebles, Li Fei-Fei
We analyze the suitability of our new primitive for video action recognition and explore several novel variations of our approach to enable stronger representational flexibility while maintaining an efficient design.
1 code implementation • NeurIPS 2023 • Tony Lee, Michihiro Yasunaga, Chenlin Meng, Yifan Mai, Joon Sung Park, Agrim Gupta, Yunzhi Zhang, Deepak Narayanan, Hannah Benita Teufel, Marco Bellagente, Minguk Kang, Taesung Park, Jure Leskovec, Jun-Yan Zhu, Li Fei-Fei, Jiajun Wu, Stefano Ermon, Percy Liang
The stunning qualitative improvement of recent text-to-image models has led to their widespread attention and adoption.
no code implementations • 2 Nov 2023 • Ruohan Zhang, Sharon Lee, Minjune Hwang, Ayano Hiranaka, Chen Wang, Wensi Ai, Jin Jie Ryan Tan, Shreya Gupta, Yilun Hao, Gabrael Levine, Ruohan Gao, Anthony Norcia, Li Fei-Fei, Jiajun Wu
We present Neural Signal Operated Intelligent Robots (NOIR), a general-purpose, intelligent brain-robot interface system that enables humans to command robots to perform everyday activities through brain signals.
no code implementations • 27 Oct 2023 • Kyle Sargent, Zizhang Li, Tanmay Shah, Charles Herrmann, Hong-Xing Yu, Yunzhi Zhang, Eric Ryan Chan, Dmitry Lagun, Li Fei-Fei, Deqing Sun, Jiajun Wu
Further, we observe that Score Distillation Sampling (SDS) tends to truncate the distribution of complex backgrounds during distillation of 360-degree scenes, and propose "SDS anchoring" to improve the diversity of synthesized novel views.
1 code implementation • 3 Oct 2023 • Emily Jin, Jiaheng Hu, Zhuoyi Huang, Ruohan Zhang, Jiajun Wu, Li Fei-Fei, Roberto Martín-Martín
We present Mini-BEHAVIOR, a novel benchmark for embodied AI that challenges agents to use reasoning and decision-making skills to solve complex activities that resemble everyday human challenges.
no code implementations • 28 Sep 2023 • YiXuan Wang, Zhuoran Li, Mingtong Zhang, Katherine Driggs-Campbell, Jiajun Wu, Li Fei-Fei, Yunzhu Li
These fields capture the dynamics of the underlying 3D environment and encode both semantic features and instance masks.
no code implementations • 18 Sep 2023 • Ran Gong, Qiuyuan Huang, Xiaojian Ma, Hoi Vo, Zane Durante, Yusuke Noda, Zilong Zheng, Song-Chun Zhu, Demetri Terzopoulos, Li Fei-Fei, Jianfeng Gao
Large Language Models (LLMs) have the capacity of performing complex scheduling in a multi-agent system and can coordinate these agents into completing sophisticated tasks that require extensive collaboration.
no code implementations • 2 Sep 2023 • Yuanpei Chen, Chen Wang, Li Fei-Fei, C. Karen Liu
However, challenges arise due to the high-dimensional action space of the dexterous hand and the complex compositional dynamics of long-horizon tasks.
no code implementations • ICCV 2023 • Tiange Xiang, Adam Sun, Jiajun Wu, Ehsan Adeli, Li Fei-Fei
3D understanding and rendering of moving humans from monocular videos is a challenging task.
no code implementations • 28 Jul 2023 • Ayano Hiranaka, Minjune Hwang, Sharon Lee, Chen Wang, Li Fei-Fei, Jiajun Wu, Ruohan Zhang
By combining them, SEED reduces the human effort required in RLHF and increases safety in training robot manipulation with RL in real-world settings.
1 code implementation • 12 Jul 2023 • Wenlong Huang, Chen Wang, Ruohan Zhang, Yunzhu Li, Jiajun Wu, Li Fei-Fei
The composed value maps are then used in a model-based planning framework to zero-shot synthesize closed-loop robot trajectories with robustness to dynamic perturbations.
no code implementations • 29 Jun 2023 • YiXuan Wang, Yunzhu Li, Katherine Driggs-Campbell, Li Fei-Fei, Jiajun Wu
Prior works typically assume representation at a fixed dimension or resolution, which may be inefficient for simple tasks and ineffective for more complicated tasks.
no code implementations • 27 Jun 2023 • Zelun Luo, Yuliang Zou, Yijin Yang, Zane Durante, De-An Huang, Zhiding Yu, Chaowei Xiao, Li Fei-Fei, Animashree Anandkumar
In recent years, differential privacy has seen significant advancements in image classification; however, its application to video activity recognition remains under-explored.
no code implementations • 23 Jun 2023 • Michael Lingelbach, Chengshu Li, Minjune Hwang, Andrey Kurenkov, Alan Lou, Roberto Martín-Martín, Ruohan Zhang, Li Fei-Fei, Jiajun Wu
Embodied AI agents in large scenes often need to navigate to find objects.
1 code implementation • 2 Jun 2023 • Anirudh Sriram, Adrien Gaidon, Jiajun Wu, Juan Carlos Niebles, Li Fei-Fei, Ehsan Adeli
In this work, we propose a novel method for representation learning of multi-view videos, where we explicitly model the representation space to maintain Homography Equivariance (HomE).
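Schematically, homography equivariance asks that warping the input by a homography H transform the representation in a predictable way; in generic notation (ours, not necessarily the paper's):

```latex
f(H \cdot x) = \rho(H)\, f(x), \qquad H \in \mathrm{PGL}(3),
```

where f is the encoder and ρ(H) is a corresponding transformation acting on feature space, so features of a warped view are a known transform of the original view's features.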
no code implementations • CVPR 2023 • Ruohan Gao, Yiming Dou, Hao Li, Tanmay Agarwal, Jeannette Bohg, Yunzhu Li, Li Fei-Fei, Jiajun Wu
We introduce the ObjectFolder Benchmark, a benchmark suite of 10 tasks for multisensory object-centric learning, centered around object recognition, reconstruction, and manipulation with sight, sound, and touch.
1 code implementation • 1 Jun 2023 • Ruohan Gao, Hao Li, Gokul Dharan, Zhuzhu Wang, Chengshu Li, Fei Xia, Silvio Savarese, Li Fei-Fei, Jiajun Wu
We introduce Sonicverse, a multisensory simulation platform with integrated audio-visual simulation for training household agents that can both see and hear.
no code implementations • 27 May 2023 • Andrey Kurenkov, Michael Lingelbach, Tanmay Agarwal, Emily Jin, Chengshu Li, Ruohan Zhang, Li Fei-Fei, Jiajun Wu, Silvio Savarese, Roberto Martín-Martín
We evaluate our method in the Dynamic House Simulator, a new benchmark that creates diverse dynamic graphs following the semantic patterns typically seen in homes. We show that NEP can be trained to predict the locations of objects in a variety of environments with diverse object-movement dynamics, outperforming baselines in both new-scene adaptability and overall accuracy.
no code implementations • 7 Dec 2022 • Hao Li, Yizhi Zhang, Junzhe Zhu, Shaoxiong Wang, Michelle A Lee, Huazhe Xu, Edward Adelson, Li Fei-Fei, Ruohan Gao, Jiajun Wu
Humans use all of their senses to accomplish different tasks in everyday activities.
no code implementations • 11 Nov 2022 • Kuan Fang, Toki Migimatsu, Ajay Mandlekar, Li Fei-Fei, Jeannette Bohg
ATR selects suitable tasks, which consist of an initial environment state and manipulation goal, for learning robust skills by balancing the diversity and feasibility of the tasks.
no code implementations • 13 Oct 2022 • Matt Deitke, Dhruv Batra, Yonatan Bisk, Tommaso Campari, Angel X. Chang, Devendra Singh Chaplot, Changan Chen, Claudia Pérez D'Arpino, Kiana Ehsani, Ali Farhadi, Li Fei-Fei, Anthony Francis, Chuang Gan, Kristen Grauman, David Hall, Winson Han, Unnat Jain, Aniruddha Kembhavi, Jacob Krantz, Stefan Lee, Chengshu Li, Sagnik Majumder, Oleksandr Maksymets, Roberto Martín-Martín, Roozbeh Mottaghi, Sonia Raychaudhuri, Mike Roberts, Silvio Savarese, Manolis Savva, Mohit Shridhar, Niko Sünderhauf, Andrew Szot, Ben Talbot, Joshua B. Tenenbaum, Jesse Thomason, Alexander Toshev, Joanne Truong, Luca Weihs, Jiajun Wu
We present a retrospective on the state of Embodied AI research.
1 code implementation • 9 Oct 2022 • Zixian Ma, Rose Wang, Li Fei-Fei, Michael Bernstein, Ranjay Krishna
These results identify tasks where expectation alignment is a more useful strategy than curiosity-driven exploration for multi-agent coordination, enabling agents to do zero-shot coordination.
2 code implementations • 6 Oct 2022 • Yunfan Jiang, Agrim Gupta, Zichen Zhang, Guanzhi Wang, Yongqiang Dou, Yanjun Chen, Li Fei-Fei, Anima Anandkumar, Yuke Zhu, Linxi Fan
We show that a wide spectrum of robot manipulation tasks can be expressed with multimodal prompts, interleaving textual and visual tokens.
1 code implementation • 30 Jun 2022 • Mark Endo, Kathleen L. Poston, Edith V. Sullivan, Li Fei-Fei, Kilian M. Pohl, Ehsan Adeli
Because of this clinical data scarcity and inspired by the recent advances in self-supervised large-scale language models like GPT-3, we use human motion forecasting as an effective self-supervised pre-training task for the estimation of motor impairment severity.
no code implementations • 23 Jun 2022 • Agrim Gupta, Stephen Tian, Yunzhi Zhang, Jiajun Wu, Roberto Martín-Martín, Li Fei-Fei
This work shows that we can create good video prediction models by pre-training transformers via masked visual modeling.
no code implementations • 13 Jun 2022 • Ziang Liu, Roberto Martín-Martín, Fei Xia, Jiajun Wu, Li Fei-Fei
Robots excel at performing repetitive and precision-sensitive tasks in controlled environments such as warehouses and factories, but have not yet been extended to embodied AI agents providing assistance in household tasks.
no code implementations • 8 Jun 2022 • Carlos Hinojosa, Miguel Marquez, Henry Arguello, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles
The accelerated use of digital cameras prompts an increasing concern about privacy and security, particularly in applications such as action recognition.
1 code implementation • CVPR 2022 • Shyamal Buch, Cristóbal Eyzaguirre, Adrien Gaidon, Jiajun Wu, Li Fei-Fei, Juan Carlos Niebles
Building on recent progress in self-supervised image-language models, we revisit this question in the context of video and language tasks.
Ranked #1 on Video Question Answering on MSR-VTT-MC
1 code implementation • CVPR 2022 • Ruohan Gao, Zilin Si, Yen-Yu Chang, Samuel Clarke, Jeannette Bohg, Li Fei-Fei, Wenzhen Yuan, Jiajun Wu
We present ObjectFolder 2.0, a large-scale, multisensory dataset of common household objects in the form of implicit neural representations that significantly enhances ObjectFolder 1.0 in three aspects.
2 code implementations • ICLR 2022 • Agrim Gupta, Linxi Fan, Surya Ganguli, Li Fei-Fei
Multiple domains like vision, natural language, and audio are witnessing tremendous progress by leveraging Transformers for large-scale pre-training followed by task-specific fine-tuning.
no code implementations • 9 Dec 2021 • Josiah Wong, Albert Tung, Andrey Kurenkov, Ajay Mandlekar, Li Fei-Fei, Silvio Savarese, Roberto Martín-Martín
Doing this is challenging for two reasons: on the data side, current interfaces make collecting high-quality human demonstrations difficult, and on the learning side, policies trained on limited data can suffer from covariate shift when deployed.
no code implementations • 12 Nov 2021 • Ranjay Krishna, Mitchell Gordon, Li Fei-Fei, Michael Bernstein
Over the last decade, Computer Vision, the branch of Artificial Intelligence aimed at understanding the visual world, has evolved from simply recognizing objects in images to describing pictures, answering questions about images, helping robots maneuver through physical spaces, and even generating novel visual content.
no code implementations • 21 Sep 2021 • Bohan Wu, Suraj Nair, Li Fei-Fei, Chelsea Finn
In this paper, we study the problem of learning a repertoire of low-level skills from raw images that can be sequenced to complete long-horizon visuomotor tasks.
Tasks: Model-based Reinforcement Learning, reinforcement-learning, +1
no code implementations • 16 Sep 2021 • Ruohan Gao, Yen-Yu Chang, Shivani Mall, Li Fei-Fei, Jiajun Wu
Multisensory object-centric perception, reasoning, and interaction have been a key research topic in recent years.
3 code implementations • 16 Aug 2021 • Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, aditi raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.
1 code implementation • 13 Aug 2021 • Chen Wang, Claudia Pérez-D'Arpino, Danfei Xu, Li Fei-Fei, C. Karen Liu, Silvio Savarese
Our method co-optimizes a human policy and a robot policy in an interactive learning process: the human policy learns to generate diverse and plausible collaborative behaviors from demonstrations while the robot policy learns to assist by estimating the unobserved latent strategy of its human collaborator.
no code implementations • 6 Aug 2021 • Sanjana Srivastava, Chengshu Li, Michael Lingelbach, Roberto Martín-Martín, Fei Xia, Kent Vainio, Zheng Lian, Cem Gokmen, Shyamal Buch, C. Karen Liu, Silvio Savarese, Hyowon Gweon, Jiajun Wu, Li Fei-Fei
We introduce BEHAVIOR, a benchmark for embodied AI with 100 activities in simulation, spanning a range of everyday household chores such as cleaning, maintenance, and food preparation.
1 code implementation • 6 Aug 2021 • Chengshu Li, Fei Xia, Roberto Martín-Martín, Michael Lingelbach, Sanjana Srivastava, Bokui Shen, Kent Vainio, Cem Gokmen, Gokul Dharan, Tanish Jain, Andrey Kurenkov, C. Karen Liu, Hyowon Gweon, Jiajun Wu, Li Fei-Fei, Silvio Savarese
We evaluate the new capabilities of iGibson 2.0 to enable robot learning of novel tasks, in the hope of demonstrating the potential of this new simulator to support new research in embodied AI.
1 code implementation • 6 Aug 2021 • Ajay Mandlekar, Danfei Xu, Josiah Wong, Soroush Nasiriany, Chen Wang, Rohun Kulkarni, Li Fei-Fei, Silvio Savarese, Yuke Zhu, Roberto Martín-Martín
Based on the study, we derive a series of lessons including the sensitivity to different algorithmic design choices, the dependence on the quality of the demonstrations, and the variability based on the stopping criteria due to the different objectives in training and evaluation.
no code implementations • 20 Jul 2021 • Kaylee Burns, Christopher D. Manning, Li Fei-Fei
Although virtual agents are increasingly situated in environments where natural language is the most effective mode of interaction with humans, these exchanges are rarely used as an opportunity for learning.
1 code implementation • ACL 2021 • Siddharth Karamcheti, Ranjay Krishna, Li Fei-Fei, Christopher D. Manning
Active learning promises to alleviate the massive data needs of supervised machine learning: it has successfully improved sample efficiency by an order of magnitude on traditional tasks like topic classification and object recognition.
no code implementations • 26 Jun 2021 • Kuan Fang, Yuke Zhu, Silvio Savarese, Li Fei-Fei
To encourage generalizable skills to emerge, our method trains each skill to specialize in the paired task and maximizes the diversity of the generated tasks.
Tasks: Hierarchical Reinforcement Learning, reinforcement-learning, +1
no code implementations • CVPR 2021 • Zelun Luo, Daniel J. Wu, Ehsan Adeli, Li Fei-Fei
We propose a novel method for privacy-preserving training of deep neural networks leveraging public, out-domain data.
1 code implementation • 17 Jun 2021 • Linxi Fan, Guanzhi Wang, De-An Huang, Zhiding Yu, Li Fei-Fei, Yuke Zhu, Anima Anandkumar
A student network then learns to mimic the expert policy by supervised learning with strong augmentations, making its representation more robust against visual variations compared to the expert.
3 code implementations • 15 Jun 2021 • Daniel M. Bear, Elias Wang, Damian Mrowca, Felix J. Binder, Hsiao-Yu Fish Tung, R. T. Pramod, Cameron Holdaway, Sirui Tao, Kevin Smith, Fan-Yun Sun, Li Fei-Fei, Nancy Kanwisher, Joshua B. Tenenbaum, Daniel L. K. Yamins, Judith E. Fan
While current vision algorithms excel at many challenging tasks, it is unclear how well they understand the physical dynamics of real-world environments.
1 code implementation • CVPR 2022 • Liangqiong Qu, Yuyin Zhou, Paul Pu Liang, Yingda Xia, Feifei Wang, Ehsan Adeli, Li Fei-Fei, Daniel Rubin
Federated learning is an emerging research paradigm enabling collaborative training of machine learning models among different organizations while keeping data private at each institution.
1 code implementation • CVPR 2021 • Mandy Lu, Qingyu Zhao, Jiequan Zhang, Kilian M. Pohl, Li Fei-Fei, Juan Carlos Niebles, Ehsan Adeli
Batch Normalization (BN) and its variants have delivered tremendous success in combating the covariate shift induced by the training step of deep learning methods.
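As a reminder of the mechanism the entry refers to, batch normalization standardizes each feature over the batch and then applies a learned scale and shift; a minimal NumPy sketch (our simplification, covering only the training-time forward pass):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Standard batch normalization over the batch axis: normalize each
    feature to zero mean / unit variance, then scale and shift."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
y = batch_norm(x)
# Each feature column is now approximately zero-mean and unit-variance.
assert np.allclose(y.mean(axis=0), 0.0, atol=1e-6)
assert np.allclose(y.std(axis=0), 1.0, atol=1e-2)
```

At test time the batch statistics are replaced by running averages collected during training, which this sketch omits.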
1 code implementation • 10 Mar 2021 • Kaiyu Yang, Jacqueline Yau, Li Fei-Fei, Jia Deng, Olga Russakovsky
In this paper, we explore the effects of face obfuscation on the popular ImageNet challenge visual recognition benchmark.
no code implementations • CVPR 2021 • Bohan Wu, Suraj Nair, Roberto Martin-Martin, Li Fei-Fei, Chelsea Finn
Our key insight is that greedy and modular optimization of hierarchical autoencoders can simultaneously address both the memory constraints and the optimization challenges of large-scale video prediction.
Ranked #1 on Video Prediction on Cityscapes 128x128
no code implementations • 28 Feb 2021 • Chen Wang, Rui Wang, Ajay Mandlekar, Li Fei-Fei, Silvio Savarese, Danfei Xu
Key to such capability is hand-eye coordination, a cognitive ability that enables humans to adaptively direct their movements at task-relevant objects and be invariant to the objects' absolute spatial location.
1 code implementation • 3 Feb 2021 • Agrim Gupta, Silvio Savarese, Surya Ganguli, Li Fei-Fei
However, the principles governing relations between environmental complexity, evolved morphology, and the learnability of intelligent control remain elusive, partially due to the substantial challenge of performing large-scale in silico experiments on evolution and learning.
no code implementations • 12 Dec 2020 • Ajay Mandlekar, Danfei Xu, Roberto Martín-Martín, Yuke Zhu, Li Fei-Fei, Silvio Savarese
We develop a simple and effective algorithm to train the policy iteratively on new data collected by the system that encourages the policy to learn how to traverse bottlenecks through the interventions.
no code implementations • 12 Dec 2020 • Albert Tung, Josiah Wong, Ajay Mandlekar, Roberto Martín-Martín, Yuke Zhu, Li Fei-Fei, Silvio Savarese
To address these challenges, we present Multi-Arm RoboTurk (MART), a multi-user data collection platform that allows multiple remote users to simultaneously teleoperate a set of robotic arms and collect demonstrations for multi-arm tasks.
2 code implementations • 5 Dec 2020 • Bokui Shen, Fei Xia, Chengshu Li, Roberto Martín-Martín, Linxi Fan, Guanzhi Wang, Claudia Pérez-D'Arpino, Shyamal Buch, Sanjana Srivastava, Lyne P. Tchapmi, Micael E. Tchapmi, Kent Vainio, Josiah Wong, Li Fei-Fei, Silvio Savarese
We present iGibson 1.0, a novel simulation environment to develop robotic solutions for interactive tasks in large-scale realistic scenes.
no code implementations • 5 Aug 2020 • Pranav Khadpe, Ranjay Krishna, Li Fei-Fei, Jeffrey Hancock, Michael Bernstein
In a third study, we assess effects of metaphor choices on potential users' desire to try out the system and find that users are drawn to systems that project higher competence and warmth.
no code implementations • 17 Jul 2020 • Mandy Lu, Kathleen Poston, Adolf Pfefferbaum, Edith V. Sullivan, Li Fei-Fei, Kilian M. Pohl, Juan Carlos Niebles, Ehsan Adeli
This is the first benchmark for classifying PD patients based on MDS-UPDRS gait severity and could be an objective biomarker for disease severity.
no code implementations • ICLR 2021 • Kuan Fang, Yuke Zhu, Silvio Savarese, Li Fei-Fei
To enable curriculum learning in the absence of a direct indicator of learning progress, we propose to train the task generator by balancing the agent's performance in the generated tasks and the similarity to the target tasks.
1 code implementation • NeurIPS 2020 • Daniel M. Bear, Chaofei Fan, Damian Mrowca, Yunzhu Li, Seth Alter, Aran Nayebi, Jeremy Schwartz, Li Fei-Fei, Jiajun Wu, Joshua B. Tenenbaum, Daniel L. K. Yamins
To overcome these limitations, we introduce the idea of Physical Scene Graphs (PSGs), which represent scenes as hierarchical graphs, with nodes in the hierarchy corresponding intuitively to object parts at different scales, and edges to physical connections between parts.
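The hierarchical structure described above can be sketched as a small data structure; the class and field names below are illustrative assumptions, not the paper's implementation:

```python
# Minimal sketch of a hierarchical scene graph: nodes are object parts at
# different scales, within-level edges mark physical connections, and each
# node may point to a coarser-scale parent.
from dataclasses import dataclass, field

@dataclass
class PSGNode:
    name: str
    scale: int                       # 0 = finest parts, higher = coarser
    parent: "PSGNode | None" = None  # coarser node this part belongs to
    neighbors: list = field(default_factory=list)  # physically connected parts

def connect(a: PSGNode, b: PSGNode):
    """Record a symmetric physical connection between two parts."""
    a.neighbors.append(b)
    b.neighbors.append(a)

table = PSGNode("table", scale=1)
top = PSGNode("tabletop", scale=0, parent=table)
leg = PSGNode("leg", scale=0, parent=table)
connect(top, leg)  # tabletop and leg are physically attached

assert leg.parent is table and top in leg.neighbors
```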
no code implementations • 13 Mar 2020 • Ajay Mandlekar, Danfei Xu, Roberto Martín-Martín, Silvio Savarese, Li Fei-Fei
In the second stage of GTI, we collect a small set of rollouts from the unconditioned stochastic policy of the first stage, and train a goal-directed agent to generalize to novel start and goal configurations.
no code implementations • 16 Dec 2019 • Kaiyu Yang, Klint Qinami, Li Fei-Fei, Jia Deng, Olga Russakovsky
Computer vision technology is being used by many but remains representative of only a few.
1 code implementation • 15 Dec 2019 • Jingwei Ji, Ranjay Krishna, Li Fei-Fei, Juan Carlos Niebles
Next, by decomposing and learning the temporal changes in visual relationships that result in an action, we demonstrate the utility of a hierarchical event decomposition by enabling few-shot action recognition, achieving 42.7% mAP using as few as 10 examples.
no code implementations • 2 Dec 2019 • Khaled Jedoui, Ranjay Krishna, Michael Bernstein, Li Fei-Fei
The assumption that these tasks always have exactly one correct answer has resulted in the creation of numerous uncertainty-based measurements, such as entropy and least confidence, which operate over a model's outputs.
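Both measurements named above are simple functions of a model's softmax output; a minimal sketch (the helper names are ours, not from the paper):

```python
import numpy as np

def entropy(probs):
    """Predictive entropy of a softmax output; higher = more uncertain."""
    p = np.clip(probs, 1e-12, 1.0)
    return -np.sum(p * np.log(p))

def least_confidence(probs):
    """One minus the top class probability; higher = more uncertain."""
    return 1.0 - np.max(probs)

# A confident prediction scores lower on both measures than a uniform one.
confident = np.array([0.97, 0.01, 0.01, 0.01])
uniform = np.array([0.25, 0.25, 0.25, 0.25])
assert entropy(confident) < entropy(uniform)
assert least_confidence(confident) < least_confidence(uniform)
```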
no code implementations • 13 Nov 2019 • De-An Huang, Yu-Wei Chao, Chris Paxton, Xinke Deng, Li Fei-Fei, Juan Carlos Niebles, Animesh Garg, Dieter Fox
We further show that by using the automatically inferred goal from the video demonstration, our robot is able to reproduce the same task in a real kitchen environment.
no code implementations • 13 Nov 2019 • Ajay Mandlekar, Fabio Ramos, Byron Boots, Silvio Savarese, Li Fei-Fei, Animesh Garg, Dieter Fox
For simple short-horizon manipulation tasks with modest variation in task instances, offline learning from a small set of demonstrations can produce controllers that successfully solve the task.
no code implementations • 11 Nov 2019 • Ajay Mandlekar, Jonathan Booher, Max Spero, Albert Tung, Anchit Gupta, Yuke Zhu, Animesh Garg, Silvio Savarese, Li Fei-Fei
We evaluate the quality of our platform, the diversity of demonstrations in our dataset, and the utility of our dataset via quantitative and qualitative analysis.
1 code implementation • 30 Oct 2019 • Fei Xia, William B. Shen, Chengshu Li, Priya Kasimbeg, Micael Tchapmi, Alexander Toshev, Li Fei-Fei, Roberto Martín-Martín, Silvio Savarese
We present Interactive Gibson Benchmark, the first comprehensive benchmark for training and evaluating Interactive Navigation: robot navigation strategies where physical interaction with objects is allowed and even encouraged to accomplish a task.
no code implementations • 29 Oct 2019 • Kuan Fang, Yuke Zhu, Animesh Garg, Silvio Savarese, Li Fei-Fei
The fundamental challenge of planning for multi-step manipulation is to find effective and plausible action sequences that lead to the task goal.
no code implementations • 26 Oct 2019 • Zengyi Qin, Kuan Fang, Yuke Zhu, Li Fei-Fei, Silvio Savarese
For this purpose, we present KETO, a framework of learning keypoint representations of tool-based manipulation.
Category: Robotics
2 code implementations • 23 Oct 2019 • Chen Wang, Roberto Martín-Martín, Danfei Xu, Jun Lv, Cewu Lu, Li Fei-Fei, Silvio Savarese, Yuke Zhu
We present 6-PACK, a deep learning approach to category-level 6D object pose tracking on RGB-D data.
Ranked #1 on 6D Pose Estimation using RGBD on REAL275 (Rerr metric)
2 code implementations • 8 Oct 2019 • Ehsan Adeli, Qingyu Zhao, Adolf Pfefferbaum, Edith V. Sullivan, Li Fei-Fei, Juan Carlos Niebles, Kilian M. Pohl
Presence of bias (in datasets or tasks) is inarguably one of the most critical challenges in machine learning applications and has led to pivotal debates in recent years.
2 code implementations • 3 Oct 2019 • Suraj Nair, Yuke Zhu, Silvio Savarese, Li Fei-Fei
Causal reasoning has been an indispensable capability for humans and other intelligent animals to interact with the physical world.
1 code implementation • NeurIPS 2019 • Danfei Xu, Roberto Martín-Martín, De-An Huang, Yuke Zhu, Silvio Savarese, Li Fei-Fei
Recent learning-to-plan methods have shown promising results on planning directly from observation space.
1 code implementation • 28 Sep 2019 • Yunbo Wang, Bo Liu, Jiajun Wu, Yuke Zhu, Simon S. Du, Li Fei-Fei, Joshua B. Tenenbaum
A major difficulty of solving continuous POMDPs is to infer the multi-modal distribution of the unobserved true states and to make the planning algorithm dependent on the perceived uncertainty.
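A standard way to represent such a multi-modal belief over unobserved states is a particle filter; the sketch below illustrates one belief update on a toy 1-D system, where the names and the toy dynamics are our assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, transition, likelihood, obs):
    """One generic belief update: propagate particles through the dynamics,
    reweight by observation likelihood, then resample. A multi-modal belief
    is represented simply by where the particles cluster."""
    particles = transition(particles)
    weights = weights * likelihood(obs, particles)
    weights = weights / weights.sum()
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

# Toy example: the observation only reveals |x|, so observing |x| ~ 2
# keeps two belief modes alive, near +2 and -2.
particles = rng.uniform(-5, 5, size=2000)
weights = np.full(2000, 1.0 / 2000)
transition = lambda p: p + rng.normal(0, 0.1, size=p.shape)
likelihood = lambda obs, p: np.exp(-0.5 * (np.abs(p) - obs) ** 2 / 0.25)
particles, weights = particle_filter_step(particles, weights, transition, likelihood, 2.0)
assert (particles > 1).any() and (particles < -1).any()  # multi-modal belief
```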
no code implementations • 27 Sep 2019 • Linxi Fan, Yuke Zhu, Jiren Zhu, Zihua Liu, Orien Zeng, Anchit Gupta, Joan Creus-Costa, Silvio Savarese, Li Fei-Fei
We present an overview of SURREAL-System, a reproducible, flexible, and scalable framework for distributed reinforcement learning (RL).
no code implementations • 25 Sep 2019 • Piotr Tatarczyk, Damian Mrowca, Li Fei-Fei, Daniel L. K. Yamins, Nils Thuerey
Recently, neural-network based forward dynamics models have been proposed that attempt to learn the dynamics of physical systems in a deterministic way.
no code implementations • ICCV 2019 • Bokui Shen, Danfei Xu, Yuke Zhu, Leonidas J. Guibas, Li Fei-Fei, Silvio Savarese
A complex visual navigation task puts an agent in different situations which call for a diverse range of visual perception abilities.
no code implementations • 16 Aug 2019 • De-An Huang, Danfei Xu, Yuke Zhu, Animesh Garg, Silvio Savarese, Li Fei-Fei, Juan Carlos Niebles
The key technical challenge is that the symbol grounding is prone to error with limited training data and leads to subsequent symbolic planning failures.
1 code implementation • 28 Jul 2019 • Michelle A. Lee, Yuke Zhu, Peter Zachares, Matthew Tan, Krishnan Srinivasan, Silvio Savarese, Li Fei-Fei, Animesh Garg, Jeannette Bohg
Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback.
no code implementations • ECCV 2020 • Chien-Yi Chang, De-An Huang, Danfei Xu, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles
In this paper, we study the problem of procedure planning in instructional videos, which can be seen as a step towards enabling autonomous agents to plan for complex tasks in everyday settings such as cooking.
no code implementations • 12 Jun 2019 • Apoorva Dornadula, Austin Narcomey, Ranjay Krishna, Michael Bernstein, Li Fei-Fei
We introduce the first scene graph prediction model that supports few-shot learning of predicates.
3 code implementations • ICLR 2019 • Yunbo Wang, Lu Jiang, Ming-Hsuan Yang, Li-Jia Li, Mingsheng Long, Li Fei-Fei
We first evaluate the E3D-LSTM network on widely-used future video prediction datasets and achieve the state-of-the-art performance.
Ranked #1 on Video Prediction on KTH (Cond metric)
1 code implementation • ICCV 2019 • Vincent S. Chen, Paroma Varma, Ranjay Krishna, Michael Bernstein, Christopher Re, Li Fei-Fei
All scene graph models to date are limited to training on a small set of visual relationships that have thousands of training labels each.
Ranked #1 on Scene Graph Detection on VRD
no code implementations • NeurIPS 2019 • Sharon Zhou, Mitchell L. Gordon, Ranjay Krishna, Austin Narcomey, Li Fei-Fei, Michael S. Bernstein
We construct Human eYe Perceptual Evaluation (HYPE) a human benchmark that is (1) grounded in psychophysics research in perception, (2) reliable across different sets of randomly sampled outputs from a model, (3) able to produce separable model performances, and (4) efficient in cost and time.
no code implementations • CVPR 2019 • Ranjay Krishna, Michael Bernstein, Li Fei-Fei
We build a model that maximizes mutual information between the image, the expected answer and the generated question.
no code implementations • CVPR 2019 • Kuan Fang, Alexander Toshev, Li Fei-Fei, Silvio Savarese
Many robotic applications require the agent to perform long-horizon tasks in partially observable environments.
1 code implementation • 20 Feb 2019 • Albert Haque, Michelle Guo, Prateek Verma, Li Fei-Fei
We propose spoken sentence embeddings which capture both acoustic and linguistic content.
2 code implementations • CVPR 2019 • Junwei Liang, Lu Jiang, Juan Carlos Niebles, Alexander Hauptmann, Li Fei-Fei
To facilitate the training, the network is learned with an auxiliary task of predicting future location in which the activity will happen.
Ranked #1 on Activity Prediction on ActEV
8 code implementations • CVPR 2019 • Chen Wang, Danfei Xu, Yuke Zhu, Roberto Martín-Martín, Cewu Lu, Li Fei-Fei, Silvio Savarese
A key technical challenge in performing 6D object pose estimation from an RGB-D image is to fully leverage the two complementary data sources.
Ranked #4 on 6D Pose Estimation on LineMOD
11 code implementations • CVPR 2019 • Chenxi Liu, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Wei Hua, Alan Yuille, Li Fei-Fei
Therefore, we propose to search the network level structure in addition to the cell level structure, which forms a hierarchical architecture search space.
Ranked #7 on Semantic Segmentation on PASCAL VOC 2012 val
no code implementations • CVPR 2019 • Chien-Yi Chang, De-An Huang, Yanan Sui, Li Fei-Fei, Juan Carlos Niebles
The key technical challenge for discriminative modeling with weak supervision is that the loss function of the ordering supervision is usually formulated using dynamic programming and is thus not differentiable.
4 code implementations • CVPR 2019 • Nam Vo, Lu Jiang, Chen Sun, Kevin Murphy, Li-Jia Li, Li Fei-Fei, James Hays
In this paper, we study the task of image retrieval, where the input query is specified in the form of an image plus some text that describes desired modifications to the input image.
Ranked #2 on Image Retrieval with Multi-Modal Query on MIT-States
no code implementations • 1 Dec 2018 • David Xue, Anin Sayana, Evan Darke, Kelly Shen, Jun-Ting Hsieh, Zelun Luo, Li-Jia Li, N. Lance Downing, Arnold Milstein, Li Fei-Fei
As the senior population rapidly increases, it is challenging yet crucial to provide effective long-term care for seniors who live at home or in senior care facilities.
no code implementations • 25 Nov 2018 • Edward Chou, Matthew Tan, Cherry Zou, Michelle Guo, Albert Haque, Arnold Milstein, Li Fei-Fei
Computer-vision hospital systems can greatly assist healthcare workers and improve medical facility treatment, but often face patient resistance due to the perceived intrusiveness and violation of privacy associated with visual surveillance.
no code implementations • 25 Nov 2018 • Edward Chou, Josh Beal, Daniel Levy, Serena Yeung, Albert Haque, Li Fei-Fei
Homomorphic encryption enables arbitrary computation over data while it remains encrypted.
no code implementations • 21 Nov 2018 • Albert Haque, Michelle Guo, Adam S. Miner, Li Fei-Fei
This technology could be deployed to cell phones worldwide and facilitate low-cost universal access to mental health care.
no code implementations • 7 Nov 2018 • Ajay Mandlekar, Yuke Zhu, Animesh Garg, Jonathan Booher, Max Spero, Albert Tung, Julian Gao, John Emmons, Anchit Gupta, Emre Orbay, Silvio Savarese, Li Fei-Fei
Imitation Learning has empowered recent advances in learning robotic manipulation tasks by addressing shortcomings of Reinforcement Learning such as exploration and reward specification.
2 code implementations • 24 Oct 2018 • Michelle A. Lee, Yuke Zhu, Krishnan Srinivasan, Parth Shah, Silvio Savarese, Li Fei-Fei, Animesh Garg, Jeannette Bohg
Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback.
no code implementations • ECCV 2018 • Bingbin Liu, Serena Yeung, Edward Chou, De-An Huang, Li Fei-Fei, Juan Carlos Niebles
A major challenge in computer vision is scaling activity understanding to the long tail of complex activities without requiring collecting large quantities of data for new actions.
no code implementations • ECCV 2018 • Michelle Guo, Edward Chou, De-An Huang, Shuran Song, Serena Yeung, Li Fei-Fei
We propose Neural Graph Matching (NGM) Networks, a novel framework that can learn to recognize a previously unseen 3D action class with only a few examples.
Ranked #1 on Skeleton Based Action Recognition on CAD-120
no code implementations • ECCV 2018 • Michelle Guo, Albert Haque, De-An Huang, Serena Yeung, Li Fei-Fei
We propose dynamic task prioritization for multitask learning.
6 code implementations • ECCV 2018 • Jiren Zhu, Russell Kaplan, Justin Johnson, Li Fei-Fei
We show that these encodings are competitive with existing data hiding algorithms, and further that they can be made robust to noise: our models learn to reconstruct hidden information in an encoded image despite the presence of Gaussian blurring, pixel-wise dropout, cropping, and JPEG compression.
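The encode-then-corrupt-then-decode pipeline described in that abstract can be illustrated with a toy sketch. The real model trains CNN encoder/decoder networks end to end; here, purely for illustration, each message bit is spread redundantly over a block of pixels so that pixel-wise dropout (one of the noise layers the paper names) can be survived. The block sizes and perturbation strength are hypothetical, and unlike the learned blind decoder, this sketch assumes access to the cover image.

```python
import random

BLOCK = 32          # pixels carrying each bit (redundancy; hypothetical value)
STRENGTH = 4        # perturbation amplitude (hypothetical value)

def encode(cover, bits):
    """Add a +/-STRENGTH perturbation to each pixel of a bit's block."""
    stego = list(cover)
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        for j in range(i * BLOCK, (i + 1) * BLOCK):
            stego[j] += sign * STRENGTH
    return stego

def dropout(stego, cover, p, rng):
    """Pixel-wise dropout noise: revert a fraction p of pixels to the cover."""
    return [c if rng.random() < p else s for s, c in zip(stego, cover)]

def decode(noisy, cover, n_bits):
    """Recover each bit from the majority sign of its block's residual.
    (Simplification: the paper's decoder is learned and never sees the cover.)"""
    bits = []
    for i in range(n_bits):
        residual = sum(noisy[j] - cover[j] for j in range(i * BLOCK, (i + 1) * BLOCK))
        bits.append(1 if residual > 0 else 0)
    return bits

rng = random.Random(0)
cover = [rng.randrange(256) for _ in range(4 * BLOCK)]
message = [1, 0, 1, 1]
noisy = dropout(encode(cover, message), cover, p=0.5, rng=rng)
assert decode(noisy, cover, len(message)) == message
```

Because each bit is redundantly embedded across 32 pixels, dropping half of them still leaves the residual's sign intact, which is the intuition behind robustness to the paper's noise layers.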
no code implementations • CVPR 2019 • De-An Huang, Suraj Nair, Danfei Xu, Yuke Zhu, Animesh Garg, Li Fei-Fei, Silvio Savarese, Juan Carlos Niebles
We hypothesize that to successfully generalize to unseen complex tasks from a single video demonstration, it is necessary to explicitly incorporate the compositional structure of the tasks into the model.
no code implementations • ICML 2018 • Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Peter Glynn, Yinyu Ye, Li-Jia Li, Li Fei-Fei
One of the most widely used optimization methods for large-scale machine learning problems is distributed asynchronous stochastic gradient descent (DASGD).
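The distinguishing feature of DASGD is that workers apply gradients computed at stale iterates. A minimal simulation (not the paper's algorithm or analysis, just the delayed-update mechanic on a 1-D quadratic with hand-picked step size and delay) shows the iterates can still converge despite staleness:

```python
def grad(x):
    # Gradient of the toy objective f(x) = x^2.
    return 2 * x

def delayed_sgd(x0, steps, delay, lr):
    """SGD where each update uses the gradient of an iterate `delay` steps old."""
    history = [x0]
    for t in range(steps):
        stale = history[max(0, t - delay)]   # stale iterate, as in async updates
        history.append(history[-1] - lr * grad(stale))
    return history[-1]

# With a small enough step size relative to the delay, iterates still converge.
assert abs(delayed_sgd(10.0, steps=200, delay=3, lr=0.05)) < 1e-3
```

Intuitively, a smaller step size buys tolerance to larger delays, which is the kind of trade-off the convergence analysis of asynchronous methods makes precise.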
no code implementations • 25 Jun 2018 • Kuan Fang, Yuke Zhu, Animesh Garg, Andrey Kurenkov, Viraj Mehta, Li Fei-Fei, Silvio Savarese
We perform both simulated and real-world experiments on two tool-based manipulation tasks: sweeping and hammering.
no code implementations • NeurIPS 2018 • Damian Mrowca, Chengxu Zhuang, Elias Wang, Nick Haber, Li Fei-Fei, Joshua B. Tenenbaum, Daniel L. K. Yamins
Humans have a remarkable capacity to understand the physical dynamics of objects in their environment, flexibly capturing complex structures and interactions at multiple levels of detail.
1 code implementation • NeurIPS 2018 • Jun-Ting Hsieh, Bingbin Liu, De-An Huang, Li Fei-Fei, Juan Carlos Niebles
Our goal is to predict future video frames given a sequence of input frames.
no code implementations • CVPR 2018 • De-An Huang, Shyamal Buch, Lucio Dery, Animesh Garg, Li Fei-Fei, Juan Carlos Niebles
In this work, we propose to tackle this new task with a weakly-supervised framework for reference-aware visual grounding in instructional videos, where only the temporal alignment between the transcription and the video segment is available for supervision.
no code implementations • CVPR 2018 • De-An Huang, Vignesh Ramanathan, Dhruv Mahajan, Lorenzo Torresani, Manohar Paluri, Li Fei-Fei, Juan Carlos Niebles
The ability to capture temporal information has been critical to the development of video understanding models.
4 code implementations • CVPR 2018 • Justin Johnson, Agrim Gupta, Li Fei-Fei
To overcome this limitation we propose a method for generating images from scene graphs, enabling explicit reasoning about objects and their relationships.
Ranked #4 on Layout-to-Image Generation on Visual Genome 64x64
7 code implementations • CVPR 2018 • Agrim Gupta, Justin Johnson, Li Fei-Fei, Silvio Savarese, Alexandre Alahi
Understanding human motion behavior is critical for autonomous moving platforms (like self-driving cars and social robots) if they are to navigate human-centric environments.
Ranked #4 on Trajectory Prediction on ETH
no code implementations • CVPR 2018 • Xinlei Chen, Li-Jia Li, Li Fei-Fei, Abhinav Gupta
The framework consists of two core modules: a local module that uses spatial memory to store previous beliefs with parallel updates; and a global graph-reasoning module.
2 code implementations • CVPR 2018 • Ranjay Krishna, Ines Chami, Michael Bernstein, Li Fei-Fei
We formulate the cyclic condition between the entities in a relationship by modelling predicates that connect the entities as shifts in attention from one entity to another.
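The idea of a predicate acting as a shift in attention can be sketched in miniature. The model in the paper learns these shifts over 2-D feature maps; the version below, with a hypothetical 1-D "image" and hand-coded offsets, only illustrates the mechanic of moving attention from one entity toward the other:

```python
# Hypothetical learned offsets for two predicates on a 1-D grid.
SHIFT = {"left of": -1, "right of": +1}

def shift_attention(attn, predicate):
    """Move attention mass by the predicate's offset (clipped at the borders)."""
    out = [0.0] * len(attn)
    off = SHIFT[predicate]
    for i, a in enumerate(attn):
        j = i + off
        if 0 <= j < len(attn):
            out[j] += a
    return out

subject_attn = [0.0, 1.0, 0.0, 0.0]   # attention concentrated on the subject
# "subject right of X": shifting predicts where the object should be attended.
assert shift_attention(subject_attn, "right of") == [0.0, 0.0, 1.0, 0.0]
```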
no code implementations • 24 Feb 2018 • Amy Jin, Serena Yeung, Jeffrey Jopling, Jonathan Krause, Dan Azagury, Arnold Milstein, Li Fei-Fei
We show that our method both effectively detects the spatial bounds of tools and significantly outperforms existing methods on tool presence detection.
no code implementations • 21 Feb 2018 • Nick Haber, Damian Mrowca, Li Fei-Fei, Daniel L. K. Yamins
We demonstrate that this policy causes the agent to explore novel and informative interactions with its environment, leading to the generation of a spectrum of complex behaviors, including ego-motion prediction, object attention, and object gathering.
no code implementations • 21 Feb 2018 • Nick Haber, Damian Mrowca, Li Fei-Fei, Daniel L. K. Yamins
Moreover, the world model that the agent learns supports improved performance on object dynamics prediction and localization tasks.
1 code implementation • ICML 2018 • Lu Jiang, Zhengyuan Zhou, Thomas Leung, Li-Jia Li, Li Fei-Fei
Recent deep networks are capable of memorizing the entire data even when the labels are completely random.
Ranked #16 on Image Classification on WebVision-1000
16 code implementations • ECCV 2018 • Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy
We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms.
Ranked #14 on Neural Architecture Search on NAS-Bench-201, ImageNet-16-120 (Accuracy (Val) metric)
1 code implementation • ECCV 2018 • Zelun Luo, Jun-Ting Hsieh, Lu Jiang, Juan Carlos Niebles, Li Fei-Fei
We propose a technique that tackles action detection in multimodal videos under a realistic and challenging condition in which only limited training data and partially observed modalities are available.
no code implementations • NeurIPS 2017 • Zelun Luo, Yuliang Zou, Judy Hoffman, Li Fei-Fei
We propose a framework that learns a representation transferable across different domains and tasks in a label efficient manner.
1 code implementation • CVPR 2018 • Zhe Li, Chong Wang, Mei Han, Yuan Xue, Wei Wei, Li-Jia Li, Li Fei-Fei
Accurate identification and localization of abnormalities from radiology images play an integral part in clinical diagnosis and treatment planning.
1 code implementation • 4 Oct 2017 • Danfei Xu, Suraj Nair, Yuke Zhu, Julian Gao, Animesh Garg, Li Fei-Fei, Silvio Savarese
In this work, we propose a novel robot learning framework called Neural Task Programming (NTP), which bridges the idea of few-shot learning from demonstration and neural program induction.
no code implementations • 7 Sep 2017 • Timnit Gebru, Jonathan Krause, Jia Deng, Li Fei-Fei
We present a crowdsourcing workflow to collect image annotations for visually similar synthetic categories without requiring experts.
no code implementations • 7 Sep 2017 • Timnit Gebru, Jonathan Krause, Yi-Lun Wang, Duyun Chen, Jia Deng, Li Fei-Fei
In this work, we leverage the ubiquity of Google Street View images and develop a computer vision pipeline to predict income, per capita carbon emission, crime rates and other city attributes from a single source of publicly available visual data.
no code implementations • ICCV 2017 • Timnit Gebru, Judy Hoffman, Li Fei-Fei
While fine-grained object recognition is an important problem in computer vision, current models are unlikely to accurately classify objects in the wild.
no code implementations • 1 Aug 2017 • Albert Haque, Michelle Guo, Alexandre Alahi, Serena Yeung, Zelun Luo, Alisha Rege, Jeffrey Jopling, Lance Downing, William Beninati, Amit Singh, Terry Platchek, Arnold Milstein, Li Fei-Fei
One in twenty-five patients admitted to a hospital will suffer from a hospital acquired infection.
no code implementations • CVPR 2017 • Katsuyuki Nakamura, Serena Yeung, Alexandre Alahi, Li Fei-Fei
Physiological signals such as heart rate can provide valuable information about an individual's state and activity.
no code implementations • CVPR 2017 • Yuke Zhu, Joseph J. Lim, Li Fei-Fei
Humans possess an extraordinary ability to learn new skills and new knowledge for problem solving.
no code implementations • 9 Jun 2017 • Serena Yeung, Anitha Kannan, Yann Dauphin, Li Fei-Fei
The so-called epitomes of this model are groups of mutually exclusive latent factors that compete to explain the data.
no code implementations • CVPR 2017 • Serena Yeung, Vignesh Ramanathan, Olga Russakovsky, Liyue Shen, Greg Mori, Li Fei-Fei
Our method uses Q-learning to learn a data labeling policy on a small labeled training dataset, and then uses this to automatically label noisy web data for new visual concepts.
no code implementations • ICCV 2017 • Yuke Zhu, Daniel Gordon, Eric Kolve, Dieter Fox, Li Fei-Fei, Abhinav Gupta, Roozbeh Mottaghi, Ali Farhadi
A crucial capability of real-world intelligent agents is their ability to plan a sequence of actions to achieve their goals in the visual world.
5 code implementations • ICCV 2017 • Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick
Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes.
Ranked #5 on Visual Question Answering (VQA) on CLEVR-Humans
no code implementations • ICCV 2017 • Agrim Gupta, Justin Johnson, Alexandre Alahi, Li Fei-Fei
Recent progress in style transfer on images has focused on improving the quality of stylized images and speed of methods.
4 code implementations • ICCV 2017 • Ranjay Krishna, Kenji Hata, Frederic Ren, Li Fei-Fei, Juan Carlos Niebles
We also introduce ActivityNet Captions, a large-scale benchmark for dense-captioning events.
no code implementations • CVPR 2017 • De-An Huang, Joseph J. Lim, Li Fei-Fei, Juan Carlos Niebles
We propose an unsupervised method for reference resolution in instructional videos, where the goal is to temporally link an entity (e.g., "dressing") to the action (e.g., "mix yogurt") that produced it.
no code implementations • 22 Feb 2017 • Timnit Gebru, Jonathan Krause, Yi-Lun Wang, Duyun Chen, Jia Deng, Erez Lieberman Aiden, Li Fei-Fei
The United States spends more than $1B each year on initiatives such as the American Community Survey (ACS), a labor-intensive door-to-door study that measures statistics relating to race, gender, education, occupation, unemployment, and other demographic factors.
3 code implementations • CVPR 2017 • Danfei Xu, Yuke Zhu, Christopher B. Choy, Li Fei-Fei
In this work, we explicitly model the objects and their relationships using scene graphs, a visually-grounded graphical structure of an image.
Ranked #7 on Panoptic Scene Graph Generation on PSG Dataset
no code implementations • CVPR 2017 • Zelun Luo, Boya Peng, De-An Huang, Alexandre Alahi, Li Fei-Fei
We present an unsupervised representation learning approach that compactly encodes the motion dependencies in videos.
5 code implementations • CVPR 2017 • Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick
When building artificial intelligence systems that can reason and answer questions about visual data, we need diagnostic tests to analyze our progress and discover shortcomings.
no code implementations • CVPR 2016 • Albert Haque, Alexandre Alahi, Li Fei-Fei
We present an attention-based model that reasons on human body shape and motion dynamics to identify individuals in the absence of RGB information, hence in the dark.
3 code implementations • CVPR 2017 • Jonathan Krause, Justin Johnson, Ranjay Krishna, Li Fei-Fei
Recent progress on image captioning has made it possible to generate novel sentences describing images in natural language, but compressing an image into a single sentence can describe visual content in only coarse detail.
no code implementations • 7 Nov 2016 • Adriana Kovashka, Olga Russakovsky, Li Fei-Fei, Kristen Grauman
Computer vision systems require large amounts of manually annotated data to properly learn challenging visual concepts.
2 code implementations • 16 Sep 2016 • Yuke Zhu, Roozbeh Mottaghi, Eric Kolve, Joseph J. Lim, Abhinav Gupta, Li Fei-Fei, Ali Farhadi
To address the second issue, we propose the AI2-THOR framework, which provides an environment with high-quality 3D scenes and a physics engine.
no code implementations • 15 Sep 2016 • Kenji Hata, Ranjay Krishna, Li Fei-Fei, Michael S. Bernstein
Microtask crowdsourcing is increasingly critical to the creation of extremely large datasets.
no code implementations • 31 Jul 2016 • Cewu Lu, Ranjay Krishna, Michael Bernstein, Li Fei-Fei
We improve on prior work by leveraging language priors from semantic word embeddings to finetune the likelihood of a predicted relationship.
Ranked #2 on Scene Graph Generation on VRD
no code implementations • 28 Jul 2016 • De-An Huang, Li Fei-Fei, Juan Carlos Niebles
We propose a weakly-supervised framework for action labeling in video, where only the order of occurring actions is required during training time.
no code implementations • 7 Jun 2016 • Marius Cătălin Iordan, Armand Joulin, Diane M. Beck, Li Fei-Fei
Our method outperforms the two most commonly used alternatives (anatomical landmark-based AFNI alignment and cortical convexity-based FreeSurfer alignment) in overlap between predicted region and functionally-defined LOC.
no code implementations • CVPR 2016 • Alexandre Alahi, Kratarth Goel, Vignesh Ramanathan, Alexandre Robicquet, Li Fei-Fei, Silvio Savarese
Different from the conventional LSTM, we share the information between multiple LSTMs through a new pooling layer.
Ranked #1 on Trajectory Prediction on Stanford Drone (ADE (8/12) @K=5 metric)
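The "sharing information between multiple LSTMs through a pooling layer" can be illustrated with a minimal sketch: before each step, an agent's input is augmented with a summary of nearby agents' hidden states. The actual social pooling layer aggregates over a spatial grid; here, as a simplification, a plain mean over neighbors within a square radius stands in:

```python
def social_pool(positions, hiddens, me, radius):
    """Mean hidden state of all other agents within `radius` of agent `me`."""
    neighbors = [
        h for i, (p, h) in enumerate(zip(positions, hiddens))
        if i != me
        and abs(p[0] - positions[me][0]) <= radius
        and abs(p[1] - positions[me][1]) <= radius
    ]
    dim = len(hiddens[me])
    if not neighbors:
        return [0.0] * dim            # no social context available
    return [sum(h[d] for h in neighbors) / len(neighbors) for d in range(dim)]

positions = [(0, 0), (1, 1), (50, 50)]
hiddens = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
# Agent 0 pools only nearby agent 1; the far-away agent 2 is excluded.
assert social_pool(positions, hiddens, me=0, radius=2) == [0.0, 1.0]
```

The pooled vector would then be concatenated with the agent's own input at each LSTM step, letting trajectories of nearby people influence each other.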
79 code implementations • 27 Mar 2016 • Justin Johnson, Alexandre Alahi, Li Fei-Fei
We consider image transformation problems, where an input image is transformed into an output image.
Ranked #4 on Nuclear Segmentation on Cell17
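The key ingredient behind this line of work is a perceptual (feature-reconstruction) loss: compare images in a feature space rather than pixel space. The paper uses VGG activations; in the sketch below a toy "feature extractor" (averaging adjacent pixel pairs) stands in for the network, purely to show why feature-space distance can be forgiving where pixel-space distance is not:

```python
def features(img):
    """Toy feature map: average adjacent pixel pairs (stand-in for VGG activations)."""
    return [(img[i] + img[i + 1]) / 2 for i in range(0, len(img) - 1, 2)]

def perceptual_loss(output, target):
    """Mean squared distance between feature maps, not raw pixels."""
    fo, ft = features(output), features(target)
    return sum((a - b) ** 2 for a, b in zip(fo, ft)) / len(fo)

target = [10, 20, 30, 40]
shifted = [20, 10, 40, 30]          # pixel-wise very different...
assert perceptual_loss(shifted, target) == 0.0   # ...identical in feature space
assert sum((a - b) ** 2 for a, b in zip(shifted, target)) > 0
```

Training a feed-forward transformation network against such a loss, rather than a per-pixel loss, is what lets these methods tolerate small structural changes that humans barely perceive.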
3 code implementations • 23 Mar 2016 • Albert Haque, Boya Peng, Zelun Luo, Alexandre Alahi, Serena Yeung, Li Fei-Fei
We propose a viewpoint invariant model for 3D human pose estimation from a single depth image.
Ranked #4 on Pose Estimation on ITOP top-view
no code implementations • 14 Feb 2016 • Ranjay Krishna, Kenji Hata, Stephanie Chen, Joshua Kravitz, David A. Shamma, Li Fei-Fei, Michael S. Bernstein
Microtask crowdsourcing has enabled dataset advances in social science and machine learning, but existing crowdsourcing schemes are too expensive to scale up with the expanding volume of data.
no code implementations • ICCV 2015 • Alexandre Alahi, Albert Haque, Li Fei-Fei
Inspired by the recent success of RGB-D cameras, we propose the enrichment of RGB data with an additional "quasi-free" modality, namely, the wireless signal (e.g., WiFi or Bluetooth) emitted by individuals' cell phones, referred to as RGB-W.
1 code implementation • CVPR 2016 • Justin Johnson, Andrej Karpathy, Li Fei-Fei
We introduce the dense captioning task, which requires a computer vision system to both localize and describe salient regions in images in natural language.
Ranked #3 on Dense Captioning on Visual Genome
1 code implementation • CVPR 2016 • Serena Yeung, Olga Russakovsky, Greg Mori, Li Fei-Fei
In this work we introduce a fully end-to-end approach for action detection in videos that learns to directly predict the temporal bounds of actions.
Ranked #9 on Temporal Action Localization on THUMOS’14 (mAP IOU@0.2 metric)
1 code implementation • 20 Nov 2015 • Jonathan Krause, Benjamin Sapp, Andrew Howard, Howard Zhou, Alexander Toshev, Tom Duerig, James Philbin, Li Fei-Fei
Current approaches for fine-grained recognition do the following: First, recruit experts to annotate a dataset of images, optionally also collecting more structured data in the form of part annotations and bounding boxes.
Ranked #4 on Fine-Grained Image Classification on CUB-200-2011 (using extra training data)
no code implementations • CVPR 2016 • Yuke Zhu, Oliver Groth, Michael Bernstein, Li Fei-Fei
It enables a new type of QA with visual answers, in addition to textual answers used in previous work.
no code implementations • CVPR 2016 • Vignesh Ramanathan, Jonathan Huang, Sami Abu-El-Haija, Alexander Gorban, Kevin Murphy, Li Fei-Fei
In this paper, we propose a model which learns to detect events in such videos while automatically "attending" to the people responsible for the event.
1 code implementation • 21 Jul 2015 • Serena Yeung, Olga Russakovsky, Ning Jin, Mykhaylo Andriluka, Greg Mori, Li Fei-Fei
Every moment counts in action recognition.
Ranked #7 on Action Detection on Multi-THUMOS