Search Results for author: Juan Carlos Niebles

Found 84 papers, 30 papers with code

RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition

1 code implementation ECCV 2020 Linxi Fan, Shyamal Buch, Guanzhi Wang, Ryan Cao, Yuke Zhu, Juan Carlos Niebles, Li Fei-Fei

We analyze the suitability of our new primitive for video action recognition and explore several novel variations of our approach to enable stronger representational flexibility while maintaining an efficient design.

Action Recognition Temporal Action Localization +1

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

2 code implementations 23 Feb 2024 JianGuo Zhang, Tian Lan, Rithesh Murthy, Zhiwei Liu, Weiran Yao, Juntao Tan, Thai Hoang, Liangwei Yang, Yihao Feng, Zuxin Liu, Tulika Awalgaonkar, Juan Carlos Niebles, Silvio Savarese, Shelby Heinecke, Huan Wang, Caiming Xiong

It meticulously standardizes and unifies these trajectories into a consistent format, streamlining the creation of a generic data loader optimized for agent training.

Causal Layering via Conditional Entropy

no code implementations 19 Jan 2024 Itai Feigenbaum, Devansh Arpit, Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Caiming Xiong, Silvio Savarese

Under appropriate assumptions and conditioning, we can separate the sources or sinks from the remainder of the nodes by comparing their conditional entropy to the unconditional entropy of their noise.

Causal Discovery

Editing Arbitrary Propositions in LLMs without Subject Labels

no code implementations 15 Jan 2024 Itai Feigenbaum, Devansh Arpit, Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Caiming Xiong, Silvio Savarese

On datasets of binary propositions derived from the CounterFact dataset, we show that our method, without access to subject labels, performs close to state-of-the-art L&E methods, which have access to subject labels.

Language Modelling Large Language Model +1

X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning

1 code implementation 30 Nov 2023 Artemis Panagopoulou, Le Xue, Ning Yu, Junnan Li, Dongxu Li, Shafiq Joty, Ran Xu, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles

Vision-language pre-training and instruction tuning have demonstrated general-purpose capabilities in 2D visual reasoning tasks by aligning visual encoders with state-of-the-art large language models (LLMs).

Visual Reasoning

Temporally Disentangled Representation Learning under Unknown Nonstationarity

1 code implementation NeurIPS 2023 Xiangchen Song, Weiran Yao, Yewen Fan, Xinshuai Dong, Guangyi Chen, Juan Carlos Niebles, Eric Xing, Kun Zhang

In unsupervised causal representation learning for sequential data with time-delayed latent causal influences, strong identifiability results for the disentanglement of causally-related latent variables have been established in stationary settings by leveraging temporal structure.

Disentanglement

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization

no code implementations 4 Aug 2023 Weiran Yao, Shelby Heinecke, Juan Carlos Niebles, Zhiwei Liu, Yihao Feng, Le Xue, Rithesh Murthy, Zeyuan Chen, JianGuo Zhang, Devansh Arpit, Ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

This demonstrates that using policy gradient optimization to improve language agents, which we believe our work is among the first to do, is promising and can be applied to optimize other models in the agent architecture to enhance agent performance over time.

Language Modelling

HomE: Homography-Equivariant Video Representation Learning

1 code implementation 2 Jun 2023 Anirudh Sriram, Adrien Gaidon, Jiajun Wu, Juan Carlos Niebles, Li Fei-Fei, Ehsan Adeli

In this work, we propose a novel method for representation learning of multi-view videos, where we explicitly model the representation space to maintain Homography Equivariance (HomE).

Action Classification Action Recognition +2

ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding

1 code implementation 14 May 2023 Le Xue, Ning Yu, Shu Zhang, Junnan Li, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, Ran Xu, Juan Carlos Niebles, Silvio Savarese

Recent advancements in multimodal pre-training methods have shown promising efficacy in 3D representation learning by aligning multimodal features across 3D shapes, their 2D counterparts, and language descriptions.

Ranked #4 on 3D Point Cloud Classification on ScanObjectNN (using extra training data)

3D Point Cloud Classification Representation Learning +1

Procedure-Aware Pretraining for Instructional Video Understanding

1 code implementation CVPR 2023 Honglu Zhou, Roberto Martín-Martín, Mubbasir Kapadia, Silvio Savarese, Juan Carlos Niebles

This graph can then be used to generate pseudo labels to train a video representation that encodes the procedural knowledge in a more accessible form to generalize to multiple procedure understanding tasks.

Video Understanding

Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations

no code implementations CVPR 2023 Vibashan VS, Ning Yu, Chen Xing, Can Qin, Mingfei Gao, Juan Carlos Niebles, Vishal M. Patel, Ran Xu

In summary, an OV method learns task-specific information using strong supervision from base annotations and novel category information using weak supervision from image-caption pairs.

Image Captioning Instance Segmentation +2

On the Unlikelihood of D-Separation

no code implementations 10 Mar 2023 Itai Feigenbaum, Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Caiming Xiong, Devansh Arpit

We then provide an analytic average case analysis of the PC Algorithm for causal discovery, as well as a variant of the SGS Algorithm we call UniformSGS.

Causal Discovery

Model-Agnostic Hierarchical Attention for 3D Object Detection

no code implementations 6 Jan 2023 Manli Shu, Le Xue, Ning Yu, Roberto Martín-Martín, Juan Carlos Niebles, Caiming Xiong, Ran Xu

By plugging our proposed modules into the state-of-the-art transformer-based 3D detector, we improve the previous best results on both benchmarks, with the largest improvement margin on small objects.

3D Object Detection Object +1

Identifying Auxiliary or Adversarial Tasks Using Necessary Condition Analysis for Adversarial Multi-task Video Understanding

no code implementations 22 Aug 2022 Stephen Su, Samuel Kwong, Qingyu Zhao, De-An Huang, Juan Carlos Niebles, Ehsan Adeli

In this work, we propose a generalized notion of multi-task learning by incorporating both auxiliary tasks that the model should perform well on and adversarial tasks that the model should not perform well on.

Action Recognition Multi-Task Learning +3

PrivHAR: Recognizing Human Actions From Privacy-preserving Lens

no code implementations 8 Jun 2022 Carlos Hinojosa, Miguel Marquez, Henry Arguello, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles

The accelerated use of digital cameras prompts an increasing concern about privacy and security, particularly in applications such as action recognition.

Action Recognition Privacy Preserving +1

Align and Prompt: Video-and-Language Pre-training with Entity Prompts

1 code implementation CVPR 2022 Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi

To achieve this, we first introduce an entity prompter module, which is trained with VTC to produce the similarity between a video crop and text prompts instantiated with entity names.

Entity Alignment Retrieval +3

MOMA: Multi-Object Multi-Actor Activity Parsing

no code implementations NeurIPS 2021 Zelun Luo, Wanze Xie, Siddharth Kapoor, Yiyun Liang, Michael Cooper, Juan Carlos Niebles, Ehsan Adeli, Fei-Fei Li

This paper introduces Activity Parsing as the overarching task of temporal segmentation and classification of activities, sub-activities, atomic actions, along with an instance-level understanding of actors, objects, and their relationships in videos.

Object

PreViTS: Contrastive Pretraining with Video Tracking Supervision

no code implementations 1 Dec 2021 Brian Chen, Ramprasaath R. Selvaraju, Shih-Fu Chang, Juan Carlos Niebles, Nikhil Naik

In this work, we propose PreViTS, an SSL framework that utilizes an unsupervised tracking signal for selecting clips containing the same object, which helps better utilize temporal transformations of objects.

Action Classification Self-Supervised Learning +1

Open Vocabulary Object Detection with Pseudo Bounding-Box Labels

1 code implementation 18 Nov 2021 Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, Ran Xu, Wenhao Liu, Caiming Xiong

To enlarge the set of base classes, we propose a method to automatically generate pseudo bounding-box annotations of diverse objects from large-scale image-caption pairs.

Object object-detection +1

On the Opportunities and Risks of Foundation Models

2 code implementations 16 Aug 2021 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

Metadata Normalization

1 code implementation CVPR 2021 Mandy Lu, Qingyu Zhao, Jiequan Zhang, Kilian M. Pohl, Li Fei-Fei, Juan Carlos Niebles, Ehsan Adeli

Batch Normalization (BN) and its variants have delivered tremendous success in combating the covariate shift induced by the training step of deep learning methods.

TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild

no code implementations ICCV 2021 Vida Adeli, Mahsa Ehsanpour, Ian Reid, Juan Carlos Niebles, Silvio Savarese, Ehsan Adeli, Hamid Rezatofighi

Joint forecasting of human trajectory and pose dynamics is a fundamental building block of various applications ranging from robotics and autonomous driving to surveillance systems.

Autonomous Driving Human-Object Interaction Detection

Vision-based Estimation of MDS-UPDRS Gait Scores for Assessing Parkinson's Disease Motor Severity

no code implementations 17 Jul 2020 Mandy Lu, Kathleen Poston, Adolf Pfefferbaum, Edith V. Sullivan, Li Fei-Fei, Kilian M. Pohl, Juan Carlos Niebles, Ehsan Adeli

This is the first benchmark for classifying PD patients based on MDS-UPDRS gait severity and could be an objective biomarker for disease severity.

Socially and Contextually Aware Human Motion and Pose Forecasting

no code implementations 14 Jul 2020 Vida Adeli, Ehsan Adeli, Ian Reid, Juan Carlos Niebles, Hamid Rezatofighi

In this paper, we propose a novel framework to tackle both tasks of human motion (or trajectory) and body skeleton pose forecasting in a unified end-to-end pipeline.

Human Dynamics Robot Navigation

Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction

1 code implementation 20 Feb 2020 Bingbin Liu, Ehsan Adeli, Zhangjie Cao, Kuan-Hui Lee, Abhijeet Shenoi, Adrien Gaidon, Juan Carlos Niebles

In addition, we introduce a new dataset designed specifically for autonomous-driving scenarios in areas with dense pedestrian populations: the Stanford-TRI Intent Prediction (STIP) dataset.

Autonomous Driving Navigate

Adversarial Cross-Domain Action Recognition with Co-Attention

no code implementations 22 Dec 2019 Boxiao Pan, Zhangjie Cao, Ehsan Adeli, Juan Carlos Niebles

Action recognition has been a widely studied topic with a heavy focus on supervised learning involving sufficient labeled videos.

Action Recognition

Action Genome: Actions as Composition of Spatio-temporal Scene Graphs

1 code implementation 15 Dec 2019 Jingwei Ji, Ranjay Krishna, Li Fei-Fei, Juan Carlos Niebles

Next, by decomposing and learning the temporal changes in visual relationships that result in an action, we demonstrate the utility of a hierarchical event decomposition by enabling few-shot action recognition, achieving 42.7% mAP using as few as 10 examples.

Few-Shot action recognition Few Shot Action Recognition +1

Motion Reasoning for Goal-Based Imitation Learning

no code implementations 13 Nov 2019 De-An Huang, Yu-Wei Chao, Chris Paxton, Xinke Deng, Li Fei-Fei, Juan Carlos Niebles, Animesh Garg, Dieter Fox

We further show that by using the automatically inferred goal from the video demonstration, our robot is able to reproduce the same task in a real kitchen environment.

Imitation Learning Motion Planning +1

Disentangling Human Dynamics for Pedestrian Locomotion Forecasting with Noisy Supervision

no code implementations 4 Nov 2019 Karttikeya Mangalam, Ehsan Adeli, Kuan-Hui Lee, Adrien Gaidon, Juan Carlos Niebles

In contrast to the previous work that aims to solve either the task of pose prediction or trajectory forecasting in isolation, we propose a framework to unify the two problems and address the practically useful task of pedestrian locomotion prediction in the wild.

Human Dynamics Pose Prediction +1

Representation Learning with Statistical Independence to Mitigate Bias

2 code implementations 8 Oct 2019 Ehsan Adeli, Qingyu Zhao, Adolf Pfefferbaum, Edith V. Sullivan, Li Fei-Fei, Juan Carlos Niebles, Kilian M. Pohl

Presence of bias (in datasets or tasks) is inarguably one of the most critical challenges in machine learning applications that has alluded to pivotal debates in recent years.

Face Recognition Gender Classification +1

Learning Temporal Action Proposals With Fewer Labels

no code implementations ICCV 2019 Jingwei Ji, Kaidi Cao, Juan Carlos Niebles

Most current methods for training action proposal modules rely on fully supervised approaches that require large amounts of annotated temporal action intervals in long video sequences.

Action Detection Semi-Supervised Action Detection

Bias-Resilient Neural Network

no code implementations 25 Sep 2019 Ehsan Adeli, Qingyu Zhao, Adolf Pfefferbaum, Edith V. Sullivan, L. Fei-Fei, Juan Carlos Niebles, Kilian M. Pohl

We apply our method to a synthetic, a medical diagnosis, and a gender classification (Gender Shades) dataset.

Face Recognition Gender Classification +1

Imitation Learning for Human Pose Prediction

no code implementations ICCV 2019 Borui Wang, Ehsan Adeli, Hsu-kuang Chiu, De-An Huang, Juan Carlos Niebles

Modeling and prediction of human motion dynamics has long been a challenging problem in computer vision, and most existing methods rely on the end-to-end supervised training of various architectures of recurrent neural networks.

Ranked #2 on Human Pose Forecasting on Human3.6M (MAR, walking, 1,000ms metric)

Human Pose Forecasting Imitation Learning +3

Continuous Relaxation of Symbolic Planner for One-Shot Imitation Learning

no code implementations 16 Aug 2019 De-An Huang, Danfei Xu, Yuke Zhu, Animesh Garg, Silvio Savarese, Li Fei-Fei, Juan Carlos Niebles

The key technical challenge is that the symbol grounding is prone to error with limited training data and leads to subsequent symbolic planning failures.

Imitation Learning

Procedure Planning in Instructional Videos

no code implementations ECCV 2020 Chien-Yi Chang, De-An Huang, Danfei Xu, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles

In this paper, we study the problem of procedure planning in instructional videos, which can be seen as a step towards enabling autonomous agents to plan for complex tasks in everyday settings such as cooking.

Segmenting the Future

no code implementations 24 Apr 2019 Hsu-kuang Chiu, Ehsan Adeli, Juan Carlos Niebles

While prior work attempts to predict future video pixels, anticipate activities or forecast future scene semantic segments from segmentation of the preceding frames, methods that predict future semantic segmentation solely from the previous frame RGB data in a single end-to-end trainable model do not exist.

Autonomous Driving Decision Making +4

D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation

no code implementations CVPR 2019 Chien-Yi Chang, De-An Huang, Yanan Sui, Li Fei-Fei, Juan Carlos Niebles

The key technical challenge for discriminative modeling with weak supervision is that the loss function of the ordering supervision is usually formulated using dynamic programming and is thus not differentiable.

Dynamic Time Warping Segmentation +1

Action-Agnostic Human Pose Forecasting

1 code implementation 23 Oct 2018 Hsu-kuang Chiu, Ehsan Adeli, Borui Wang, De-An Huang, Juan Carlos Niebles

In this paper, we propose a new action-agnostic method for short- and long-term human pose forecasting.

Ranked #5 on Human Pose Forecasting on Human3.6M (MAR, walking, 1,000ms metric)

Human Dynamics Human Pose Forecasting

Temporal Modular Networks for Retrieving Complex Compositional Activities in Videos

no code implementations ECCV 2018 Bingbin Liu, Serena Yeung, Edward Chou, De-An Huang, Li Fei-Fei, Juan Carlos Niebles

A major challenge in computer vision is scaling activity understanding to the long tail of complex activities without requiring collecting large quantities of data for new actions.

Retrieval Video Retrieval

The ActivityNet Large-Scale Activity Recognition Challenge 2018 Summary

no code implementations 11 Aug 2018 Bernard Ghanem, Juan Carlos Niebles, Cees Snoek, Fabian Caba Heilbron, Humam Alwassel, Victor Escorcia, Ranjay Krishna, Shyamal Buch, Cuong Duc Dao

The guest tasks focused on complementary aspects of the activity recognition problem at large scale and involved three challenging and recently compiled datasets: the Kinetics-600 dataset from Google DeepMind, the AVA dataset from Berkeley and Google, and the Moments in Time dataset from MIT and IBM Research.

Activity Recognition

Liquid Pouring Monitoring via Rich Sensory Inputs

no code implementations ECCV 2018 Tz-Ying Wu, Juan-Ting Lin, Tsun-Hsuang Wang, Chan-Wei Hu, Juan Carlos Niebles, Min Sun

In the closed-loop system, the ability to monitor the state of the task via rich sensory information is important but often less studied.

Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining

no code implementations 1 Aug 2018 Yundong Zhang, Juan Carlos Niebles, Alvaro Soto

A key aspect of VQA models that are interpretable is their ability to ground their answers to relevant regions in the image.

Question Answering Visual Grounding +1

Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration

no code implementations CVPR 2019 De-An Huang, Suraj Nair, Danfei Xu, Yuke Zhu, Animesh Garg, Li Fei-Fei, Silvio Savarese, Juan Carlos Niebles

We hypothesize that to successfully generalize to unseen complex tasks from a single video demonstration, it is necessary to explicitly incorporate the compositional structure of the tasks into the model.

Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos

no code implementations CVPR 2018 De-An Huang, Shyamal Buch, Lucio Dery, Animesh Garg, Li Fei-Fei, Juan Carlos Niebles

In this work, we propose to tackle this new task with a weakly-supervised framework for reference-aware visual grounding in instructional videos, where only the temporal alignment between the transcription and the video segment is available for supervision.

Multiple Instance Learning Sentence +1

Graph Distillation for Action Detection with Privileged Modalities

1 code implementation ECCV 2018 Zelun Luo, Jun-Ting Hsieh, Lu Jiang, Juan Carlos Niebles, Li Fei-Fei

We propose a technique that tackles action detection in multimodal videos under a realistic and challenging condition in which only limited training data and partially observed modalities are available.

Action Classification Action Detection +1

ActivityNet Challenge 2017 Summary

no code implementations 22 Oct 2017 Bernard Ghanem, Juan Carlos Niebles, Cees Snoek, Fabian Caba Heilbron, Humam Alwassel, Ranjay Krishna, Victor Escorcia, Kenji Hata, Shyamal Buch

The ActivityNet Large Scale Activity Recognition Challenge 2017 Summary: results and challenge participants' papers.

Activity Recognition

Visual Forecasting by Imitating Dynamics in Natural Sequences

no code implementations ICCV 2017 Kuo-Hao Zeng, William B. Shen, De-An Huang, Min Sun, Juan Carlos Niebles

This allows us to apply IRL at scale and directly imitate the dynamics in high-dimensional continuous visual sequences from the raw pixel values.

Action Anticipation

Agent-Centric Risk Assessment: Accident Anticipation and Risky Region Localization

no code implementations CVPR 2017 Kuo-Hao Zeng, Shih-Han Chou, Fu-Hsiang Chan, Juan Carlos Niebles, Min Sun

For survival, a living agent must have the ability to assess risk (1) by temporally anticipating accidents before they occur, and (2) by spatially localizing risky regions in the environment to move away from threats.

Accident Anticipation

Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos

no code implementations CVPR 2017 De-An Huang, Joseph J. Lim, Li Fei-Fei, Juan Carlos Niebles

We propose an unsupervised method for reference resolution in instructional videos, where the goal is to temporally link an entity (e.g., "dressing") to the action (e.g., "mix yogurt") that produced it.

Referring Expression

Title Generation for User Generated Videos

no code implementations 25 Aug 2016 Kuo-Hao Zeng, Tseng-Hung Chen, Juan Carlos Niebles, Min Sun

Finally, our sentence augmentation method also outperforms the baselines on the M-VAD dataset.

Sentence Video Captioning

Connectionist Temporal Modeling for Weakly Supervised Action Labeling

no code implementations 28 Jul 2016 De-An Huang, Li Fei-Fei, Juan Carlos Niebles

We propose a weakly-supervised framework for action labeling in video, where only the order of occurring actions is required during training time.

General Classification

Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos

no code implementations CVPR 2016 Fabian Caba Heilbron, Juan Carlos Niebles, Bernard Ghanem

In many large-scale video analysis scenarios, one is interested in localizing and recognizing human activities that occur in short temporal intervals within long untrimmed videos.

Action Detection Action Recognition +2

On the Relationship Between Visual Attributes and Convolutional Networks

no code implementations CVPR 2015 Victor Escorcia, Juan Carlos Niebles, Bernard Ghanem

One of the cornerstone principles of deep models is their abstraction capacity, i.e., their ability to learn abstract concepts from 'simpler' ones.

Attribute Object Recognition +2

Robust Manhattan Frame Estimation From a Single RGB-D Image

no code implementations CVPR 2015 Bernard Ghanem, Ali Thabet, Juan Carlos Niebles, Fabian Caba Heilbron

This paper proposes a new framework for estimating the Manhattan Frame (MF) of an indoor scene from a single RGB-D image.

ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding

1 code implementation CVPR 2015 Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem, Juan Carlos Niebles

In spite of many dataset efforts for human action recognition, current computer vision algorithms are still severely limited in terms of the variability and complexity of the actions that they can recognize.

Action Detection Action Recognition +4

Discriminative Hierarchical Modeling of Spatio-Temporally Composable Human Activities

no code implementations CVPR 2014 Ivan Lillo, Alvaro Soto, Juan Carlos Niebles

Our method describes human activities in a hierarchical discriminative model that operates at three semantic levels.
