Search Results for author: Juan Carlos Niebles

Found 84 papers, 30 papers with code

RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition

1 code implementation • ECCV 2020 • Linxi Fan, Shyamal Buch, Guanzhi Wang, Ryan Cao, Yuke Zhu, Juan Carlos Niebles, Li Fei-Fei

We analyze the suitability of our new primitive for video action recognition and explore several novel variations of our approach to enable stronger representational flexibility while maintaining an efficient design.

Action Recognition Temporal Action Localization +1

100

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

2 code implementations • 23 Feb 2024 • JianGuo Zhang, Tian Lan, Rithesh Murthy, Zhiwei Liu, Weiran Yao, Juntao Tan, Thai Hoang, Liangwei Yang, Yihao Feng, Zuxin Liu, Tulika Awalgaonkar, Juan Carlos Niebles, Silvio Savarese, Shelby Heinecke, Huan Wang, Caiming Xiong

It meticulously standardizes and unifies these trajectories into a consistent format, streamlining the creation of a generic data loader optimized for agent training.

Causal Layering via Conditional Entropy

no code implementations • 19 Jan 2024 • Itai Feigenbaum, Devansh Arpit, Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Caiming Xiong, Silvio Savarese

Under appropriate assumptions and conditioning, we can separate the sources or sinks from the remainder of the nodes by comparing their conditional entropy to the unconditional entropy of their noise.

Causal Discovery

Editing Arbitrary Propositions in LLMs without Subject Labels

no code implementations • 15 Jan 2024 • Itai Feigenbaum, Devansh Arpit, Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Caiming Xiong, Silvio Savarese

On datasets of binary propositions derived from the CounterFact dataset, we show that our method -- without access to subject labels -- performs close to state-of-the-art L&E methods, which have access to subject labels.

Language Modelling Large Language Model +1

X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning

1 code implementation • 30 Nov 2023 • Artemis Panagopoulou, Le Xue, Ning Yu, Junnan Li, Dongxu Li, Shafiq Joty, Ran Xu, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles

Vision-language pre-training and instruction tuning have demonstrated general-purpose capabilities in 2D visual reasoning tasks by aligning visual encoders with state-of-the-art large language models (LLMs).

Visual Reasoning

Temporally Disentangled Representation Learning under Unknown Nonstationarity

1 code implementation • NeurIPS 2023 • Xiangchen Song, Weiran Yao, Yewen Fan, Xinshuai Dong, Guangyi Chen, Juan Carlos Niebles, Eric Xing, Kun Zhang

In unsupervised causal representation learning for sequential data with time-delayed latent causal influences, strong identifiability results for the disentanglement of causally-related latent variables have been established in stationary settings by leveraging temporal structure.

Disentanglement

Artificial Intelligence Index Report 2023

no code implementations • 5 Oct 2023 • Nestor Maslej, Loredana Fattorini, Erik Brynjolfsson, John Etchemendy, Katrina Ligett, Terah Lyons, James Manyika, Helen Ngo, Juan Carlos Niebles, Vanessa Parli, Yoav Shoham, Russell Wald, Jack Clark, Raymond Perrault

Welcome to the sixth edition of the AI Index Report.

BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents

2 code implementations • 11 Aug 2023 • Zhiwei Liu, Weiran Yao, JianGuo Zhang, Le Xue, Shelby Heinecke, Rithesh Murthy, Yihao Feng, Zeyuan Chen, Juan Carlos Niebles, Devansh Arpit, Ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

The massive successes of large language models (LLMs) encourage the emerging exploration of LLM-augmented Autonomous Agents (LAAs).

Benchmarking Decision Making

267

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization

no code implementations • 4 Aug 2023 • Weiran Yao, Shelby Heinecke, Juan Carlos Niebles, Zhiwei Liu, Yihao Feng, Le Xue, Rithesh Murthy, Zeyuan Chen, JianGuo Zhang, Devansh Arpit, Ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

This demonstrates that using policy gradient optimization to improve language agents, an approach for which we believe our work is one of the first, seems promising and can be applied to optimize other models in the agent architecture to enhance agent performance over time.

Language Modelling

REX: Rapid Exploration and eXploitation for AI Agents

no code implementations • 18 Jul 2023 • Rithesh Murthy, Shelby Heinecke, Juan Carlos Niebles, Zhiwei Liu, Le Xue, Weiran Yao, Yihao Feng, Zeyuan Chen, Akash Gokul, Devansh Arpit, Ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

In this paper, we propose an enhanced approach for Rapid Exploration and eXploitation for AI Agents called REX.

Decision Making Reinforcement Learning (RL)

HomE: Homography-Equivariant Video Representation Learning

1 code implementation • 2 Jun 2023 • Anirudh Sriram, Adrien Gaidon, Jiajun Wu, Juan Carlos Niebles, Li Fei-Fei, Ehsan Adeli

In this work, we propose a novel method for representation learning of multi-view videos, where we explicitly model the representation space to maintain Homography Equivariance (HomE).

Action Classification Action Recognition +2

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild

1 code implementation • NeurIPS 2023 • Can Qin, Shu Zhang, Ning Yu, Yihao Feng, Xinyi Yang, Yingbo Zhou, Huan Wang, Juan Carlos Niebles, Caiming Xiong, Silvio Savarese, Stefano Ermon, Yun Fu, Ran Xu

Visual generative foundation models such as Stable Diffusion show promise in navigating these goals, especially when prompted with arbitrary languages.

Image Generation

577

ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding

1 code implementation • 14 May 2023 • Le Xue, Ning Yu, Shu Zhang, Junnan Li, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, Ran Xu, Juan Carlos Niebles, Silvio Savarese

Recent advancements in multimodal pre-training methods have shown promising efficacy in 3D representation learning by aligning multimodal features across 3D shapes, their 2D counterparts, and language descriptions.

Ranked #4 on 3D Point Cloud Classification on ScanObjectNN (using extra training data)

3D Point Cloud Classification Representation Learning +1

350

Procedure-Aware Pretraining for Instructional Video Understanding

1 code implementation • CVPR 2023 • Honglu Zhou, Roberto Martín-Martín, Mubbasir Kapadia, Silvio Savarese, Juan Carlos Niebles

This graph can then be used to generate pseudo labels to train a video representation that encodes the procedural knowledge in a more accessible form to generalize to multiple procedure understanding tasks.

Video Understanding

Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations

no code implementations • CVPR 2023 • Vibashan VS, Ning Yu, Chen Xing, Can Qin, Mingfei Gao, Juan Carlos Niebles, Vishal M. Patel, Ran Xu

In summary, an OV method learns task-specific information using strong supervision from base annotations and novel category information using weak supervision from image-caption pairs.

Image Captioning Instance Segmentation +2

On the Unlikelihood of D-Separation

no code implementations • 10 Mar 2023 • Itai Feigenbaum, Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Caiming Xiong, Devansh Arpit

We then provide an analytic average case analysis of the PC Algorithm for causal discovery, as well as a variant of the SGS Algorithm we call UniformSGS.

Causal Discovery

Deformer: Dynamic Fusion Transformer for Robust Hand Pose Estimation

no code implementations • ICCV 2023 • Qichen Fu, Xingyu Liu, Ran Xu, Juan Carlos Niebles, Kris M. Kitani

Accurately estimating 3D hand pose is crucial for understanding how humans interact with the world.

Hand Pose Estimation

Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data

1 code implementation • 25 Jan 2023 • Devansh Arpit, Matthew Fernandez, Itai Feigenbaum, Weiran Yao, Chenghao Liu, Wenzhuo Yang, Paul Josel, Shelby Heinecke, Eric Hu, Huan Wang, Stephen Hoi, Caiming Xiong, Kun Zhang, Juan Carlos Niebles

Finally, we provide a user interface (UI) that allows users to perform causal analysis on data without coding.

Causal Discovery Causal Inference +2

223

Model-Agnostic Hierarchical Attention for 3D Object Detection

no code implementations • 6 Jan 2023 • Manli Shu, Le Xue, Ning Yu, Roberto Martín-Martín, Juan Carlos Niebles, Caiming Xiong, Ran Xu

By plugging our proposed modules into the state-of-the-art transformer-based 3D detector, we improve the previous best results on both benchmarks, with the largest improvement margin on small objects.

3D Object Detection Object +1

LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer

1 code implementation • 19 Dec 2022 • Ning Yu, Chia-Chih Chen, Zeyuan Chen, Rui Meng, Gang Wu, Paul Josel, Juan Carlos Niebles, Caiming Xiong, Ran Xu

Graphic layout designs play an essential role in visual communication.

ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding

1 code implementation • CVPR 2023 • Le Xue, Mingfei Gao, Chen Xing, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, Ran Xu, Juan Carlos Niebles, Silvio Savarese

Then, ULIP learns a 3D representation space aligned with the common image-text space, using a small number of automatically synthesized triplets.

Ranked #3 on Training-free 3D Point Cloud Classification on ModelNet40 (using extra training data)

3D Architecture 3D Classification +5

350

Identifying Auxiliary or Adversarial Tasks Using Necessary Condition Analysis for Adversarial Multi-task Video Understanding

no code implementations • 22 Aug 2022 • Stephen Su, Samuel Kwong, Qingyu Zhao, De-An Huang, Juan Carlos Niebles, Ehsan Adeli

In this work, we propose a generalized notion of multi-task learning by incorporating both auxiliary tasks that the model should perform well on and adversarial tasks that the model should not perform well on.

Action Recognition Multi-Task Learning +3

PrivHAR: Recognizing Human Actions From Privacy-preserving Lens

no code implementations • 8 Jun 2022 • Carlos Hinojosa, Miguel Marquez, Henry Arguello, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles

The accelerated use of digital cameras prompts an increasing concern about privacy and security, particularly in applications such as action recognition.

Action Recognition Privacy Preserving +1

Revisiting the "Video" in Video-Language Understanding

1 code implementation • CVPR 2022 • Shyamal Buch, Cristóbal Eyzaguirre, Adrien Gaidon, Jiajun Wu, Li Fei-Fei, Juan Carlos Niebles

Building on recent progress in self-supervised image-language models, we revisit this question in the context of video and language tasks.

Ranked #1 on Video Question Answering on MSR-VTT-MC

Benchmarking Question Answering +4

The AI Index 2022 Annual Report

no code implementations • 2 May 2022 • Daniel Zhang, Nestor Maslej, Erik Brynjolfsson, John Etchemendy, Terah Lyons, James Manyika, Helen Ngo, Juan Carlos Niebles, Michael Sellitto, Ellie Sakhaee, Yoav Shoham, Jack Clark, Raymond Perrault

Welcome to the fifth edition of the AI Index Report!

Ethics

Align and Prompt: Video-and-Language Pre-training with Entity Prompts

1 code implementation • CVPR 2022 • Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi

To achieve this, we first introduce an entity prompter module, which is trained with VTC to produce the similarity between a video crop and text prompts instantiated with entity names.

Ranked #19 on Zero-Shot Video Retrieval on DiDeMo

Entity Alignment Retrieval +3

182

MOMA: Multi-Object Multi-Actor Activity Parsing

no code implementations • NeurIPS 2021 • Zelun Luo, Wanze Xie, Siddharth Kapoor, Yiyun Liang, Michael Cooper, Juan Carlos Niebles, Ehsan Adeli, Fei-Fei Li

This paper introduces Activity Parsing as the overarching task of temporal segmentation and classification of activities, sub-activities, and atomic actions, along with an instance-level understanding of actors, objects, and their relationships in videos.

Object

PreViTS: Contrastive Pretraining with Video Tracking Supervision

no code implementations • 1 Dec 2021 • Brian Chen, Ramprasaath R. Selvaraju, Shih-Fu Chang, Juan Carlos Niebles, Nikhil Naik

In this work, we propose PreViTS, an SSL framework that utilizes an unsupervised tracking signal for selecting clips containing the same object, which helps better utilize temporal transformations of objects.

Action Classification Self-Supervised Learning +1

Open Vocabulary Object Detection with Pseudo Bounding-Box Labels

1 code implementation • 18 Nov 2021 • Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, Ran Xu, Wenhao Liu, Caiming Xiong

To enlarge the set of base classes, we propose a method to automatically generate pseudo bounding-box annotations of diverse objects from large-scale image-caption pairs.

Object object-detection +1

On the Opportunities and Risks of Foundation Models

2 code implementations • 16 Aug 2021 • Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

847

TNT: Text-Conditioned Network with Transductive Inference for Few-Shot Video Classification

2 code implementations • 21 Jun 2021 • Andrés Villa, Juan-Manuel Perez-Rua, Vladimir Araujo, Juan Carlos Niebles, Victor Escorcia, Alvaro Soto

Recently, few-shot learning has received increasing interest.

Action Classification Classification +3

Home Action Genome: Cooperative Compositional Action Understanding

1 code implementation • CVPR 2021 • Nishant Rai, Haofeng Chen, Jingwei Ji, Rishi Desai, Kazuki Kozuka, Shun Ishizaka, Ehsan Adeli, Juan Carlos Niebles

However, there remains a lack of studies that extend action composition and leverage multiple viewpoints and multiple modalities of data for representation learning.

Ranked #1 on Video Classification on Home Action Genome

Action Understanding Few-Shot action recognition +3

CoCon: Cooperative-Contrastive Learning

1 code implementation • 30 Apr 2021 • Nishant Rai, Ehsan Adeli, Kuan-Hui Lee, Adrien Gaidon, Juan Carlos Niebles

Labeling videos at scale is impractical.

Action Recognition Contrastive Learning +1

Metadata Normalization

1 code implementation • CVPR 2021 • Mandy Lu, Qingyu Zhao, Jiequan Zhang, Kilian M. Pohl, Li Fei-Fei, Juan Carlos Niebles, Ehsan Adeli

Batch Normalization (BN) and its variants have delivered tremendous success in combating the covariate shift induced by the training step of deep learning methods.

TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild

no code implementations • ICCV 2021 • Vida Adeli, Mahsa Ehsanpour, Ian Reid, Juan Carlos Niebles, Silvio Savarese, Ehsan Adeli, Hamid Rezatofighi

Joint forecasting of human trajectory and pose dynamics is a fundamental building block of various applications ranging from robotics and autonomous driving to surveillance systems.

Autonomous Driving Human-Object Interaction Detection

The AI Index 2021 Annual Report

no code implementations • 9 Mar 2021 • Daniel Zhang, Saurabh Mishra, Erik Brynjolfsson, John Etchemendy, Deep Ganguli, Barbara Grosz, Terah Lyons, James Manyika, Juan Carlos Niebles, Michael Sellitto, Yoav Shoham, Jack Clark, Raymond Perrault

Welcome to the fourth edition of the AI Index Report.

Learning Privacy-Preserving Optics for Human Pose Estimation

no code implementations • ICCV 2021 • Carlos Hinojosa, Juan Carlos Niebles, Henry Arguello

However, we also want the camera to capture useful information to perform computer vision tasks.

Pose Estimation Privacy Preserving

Detecting Human-Object Relationships in Videos

no code implementations • ICCV 2021 • Jingwei Ji, Rishi Desai, Juan Carlos Niebles

We study a crucial problem in video analysis: human-object relationship detection.

Human-Object Relationship Detection Object +1

Vision-based Estimation of MDS-UPDRS Gait Scores for Assessing Parkinson's Disease Motor Severity

no code implementations • 17 Jul 2020 • Mandy Lu, Kathleen Poston, Adolf Pfefferbaum, Edith V. Sullivan, Li Fei-Fei, Kilian M. Pohl, Juan Carlos Niebles, Ehsan Adeli

This is the first benchmark for classifying PD patients based on MDS-UPDRS gait severity and could be an objective biomarker for disease severity.

Socially and Contextually Aware Human Motion and Pose Forecasting

no code implementations • 14 Jul 2020 • Vida Adeli, Ehsan Adeli, Ian Reid, Juan Carlos Niebles, Hamid Rezatofighi

In this paper, we propose a novel framework to tackle both tasks of human motion (or trajectory) and body skeleton pose forecasting in a unified end-to-end pipeline.

Human Dynamics Robot Navigation

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

no code implementations • CVPR 2020 • Boxiao Pan, Haoye Cai, De-An Huang, Kuan-Hui Lee, Adrien Gaidon, Ehsan Adeli, Juan Carlos Niebles

In this paper, we propose a novel spatio-temporal graph model for video captioning that exploits object interactions in space and time.

Knowledge Distillation Object +2

Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction

1 code implementation • 20 Feb 2020 • Bingbin Liu, Ehsan Adeli, Zhangjie Cao, Kuan-Hui Lee, Abhijeet Shenoi, Adrien Gaidon, Juan Carlos Niebles

In addition, we introduce a new dataset designed specifically for autonomous-driving scenarios in areas with dense pedestrian populations: the Stanford-TRI Intent Prediction (STIP) dataset.

Autonomous Driving Navigate

Adversarial Cross-Domain Action Recognition with Co-Attention

no code implementations • 22 Dec 2019 • Boxiao Pan, Zhangjie Cao, Ehsan Adeli, Juan Carlos Niebles

Action recognition has been a widely studied topic with a heavy focus on supervised learning involving sufficient labeled videos.

Action Recognition

Action Genome: Actions as Composition of Spatio-temporal Scene Graphs

1 code implementation • 15 Dec 2019 • Jingwei Ji, Ranjay Krishna, Li Fei-Fei, Juan Carlos Niebles

Next, by decomposing and learning the temporal changes in visual relationships that result in an action, we demonstrate the utility of a hierarchical event decomposition by enabling few-shot action recognition, achieving 42.7% mAP using as few as 10 examples.

Few-Shot action recognition Few Shot Action Recognition +1

Motion Reasoning for Goal-Based Imitation Learning

no code implementations • 13 Nov 2019 • De-An Huang, Yu-Wei Chao, Chris Paxton, Xinke Deng, Li Fei-Fei, Juan Carlos Niebles, Animesh Garg, Dieter Fox

We further show that by using the automatically inferred goal from the video demonstration, our robot is able to reproduce the same task in a real kitchen environment.

Imitation Learning Motion Planning +1

Disentangling Human Dynamics for Pedestrian Locomotion Forecasting with Noisy Supervision

no code implementations • 4 Nov 2019 • Karttikeya Mangalam, Ehsan Adeli, Kuan-Hui Lee, Adrien Gaidon, Juan Carlos Niebles

In contrast to the previous work that aims to solve either the task of pose prediction or trajectory forecasting in isolation, we propose a framework to unify the two problems and address the practically useful task of pedestrian locomotion prediction in the wild.

Human Dynamics Pose Prediction +1

Representation Learning with Statistical Independence to Mitigate Bias

2 code implementations • 8 Oct 2019 • Ehsan Adeli, Qingyu Zhao, Adolf Pfefferbaum, Edith V. Sullivan, Li Fei-Fei, Juan Carlos Niebles, Kilian M. Pohl

The presence of bias (in datasets or tasks) is inarguably one of the most critical challenges in machine learning applications and has given rise to pivotal debates in recent years.

Face Recognition Gender Classification +1

Learning Temporal Action Proposals With Fewer Labels

no code implementations • ICCV 2019 • Jingwei Ji, Kaidi Cao, Juan Carlos Niebles

Most current methods for training action proposal modules rely on fully supervised approaches that require large amounts of annotated temporal action intervals in long video sequences.

Ranked #3 on Semi-Supervised Action Detection on ActivityNet-1.3

Action Detection Semi-Supervised Action Detection

Bias-Resilient Neural Network

no code implementations • 25 Sep 2019 • Ehsan Adeli, Qingyu Zhao, Adolf Pfefferbaum, Edith V. Sullivan, L. Fei-Fei, Juan Carlos Niebles, Kilian M. Pohl

We apply our method to a synthetic, a medical diagnosis, and a gender classification (Gender Shades) dataset.

Face Recognition Gender Classification +1

Imitation Learning for Human Pose Prediction

no code implementations • ICCV 2019 • Borui Wang, Ehsan Adeli, Hsu-kuang Chiu, De-An Huang, Juan Carlos Niebles

Modeling and prediction of human motion dynamics has long been a challenging problem in computer vision, and most existing methods rely on the end-to-end supervised training of various architectures of recurrent neural networks.

Ranked #2 on Human Pose Forecasting on Human3.6M (MAR, walking, 1,000ms metric)

Human Pose Forecasting Imitation Learning +3

Continuous Relaxation of Symbolic Planner for One-Shot Imitation Learning

no code implementations • 16 Aug 2019 • De-An Huang, Danfei Xu, Yuke Zhu, Animesh Garg, Silvio Savarese, Li Fei-Fei, Juan Carlos Niebles

The key technical challenge is that the symbol grounding is prone to error with limited training data and leads to subsequent symbolic planning failures.

Imitation Learning

Procedure Planning in Instructional Videos

no code implementations • ECCV 2020 • Chien-Yi Chang, De-An Huang, Danfei Xu, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles

In this paper, we study the problem of procedure planning in instructional videos, which can be seen as a step towards enabling autonomous agents to plan for complex tasks in everyday settings such as cooking.

Few-Shot Video Classification via Temporal Alignment

no code implementations • CVPR 2020 • Kaidi Cao, Jingwei Ji, Zhangjie Cao, Chien-Yi Chang, Juan Carlos Niebles

In this paper, we propose the Temporal Alignment Module (TAM), a novel few-shot learning framework that can learn to classify a previously unseen video.

Ranked #5 on Few Shot Action Recognition on Kinetics-100

Classification Few Shot Action Recognition +3

Segmenting the Future

no code implementations • 24 Apr 2019 • Hsu-kuang Chiu, Ehsan Adeli, Juan Carlos Niebles

While prior work attempts to predict future video pixels, anticipate activities or forecast future scene semantic segments from segmentation of the preceding frames, methods that predict future semantic segmentation solely from the previous frame RGB data in a single end-to-end trainable model do not exist.

Autonomous Driving Decision Making +4

Peeking into the Future: Predicting Future Person Activities and Locations in Videos

2 code implementations • CVPR 2019 • Junwei Liang, Lu Jiang, Juan Carlos Niebles, Alexander Hauptmann, Li Fei-Fei

To facilitate training, the network is trained with an auxiliary task of predicting the future location in which the activity will happen.

Ranked #1 on Activity Prediction on ActEV

Future prediction Human motion prediction +4

350

D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation

no code implementations • CVPR 2019 • Chien-Yi Chang, De-An Huang, Yanan Sui, Li Fei-Fei, Juan Carlos Niebles

The key technical challenge for discriminative modeling with weak supervision is that the loss function of the ordering supervision is usually formulated using dynamic programming and is thus not differentiable.

Ranked #5 on Weakly Supervised Action Segmentation (Transcript) on Breakfast

Dynamic Time Warping Segmentation +1

Action-Agnostic Human Pose Forecasting

1 code implementation • 23 Oct 2018 • Hsu-kuang Chiu, Ehsan Adeli, Borui Wang, De-An Huang, Juan Carlos Niebles

In this paper, we propose a new action-agnostic method for short- and long-term human pose forecasting.

Ranked #5 on Human Pose Forecasting on Human3.6M (MAR, walking, 1,000ms metric)

Human Dynamics Human Pose Forecasting

Translating Navigation Instructions in Natural Language to a High-Level Plan for Behavioral Robot Navigation

no code implementations • EMNLP 2018 • Xiaoxue Zang, Ashwini Pokle, Marynel Vázquez, Kevin Chen, Juan Carlos Niebles, Alvaro Soto, Silvio Savarese

We propose an end-to-end deep learning model for translating free-form natural language instructions to a high-level plan for behavioral robot navigation.

Robot Navigation Translation

End-to-End Joint Semantic Segmentation of Actors and Actions in Video

no code implementations • ECCV 2018 • Jingwei Ji, Shyamal Buch, Alvaro Soto, Juan Carlos Niebles

Traditional video understanding tasks include human action recognition and actor/object semantic segmentation.

Action Recognition Segmentation +3

Temporal Modular Networks for Retrieving Complex Compositional Activities in Videos

no code implementations • ECCV 2018 • Bingbin Liu, Serena Yeung, Edward Chou, De-An Huang, Li Fei-Fei, Juan Carlos Niebles

A major challenge in computer vision is scaling activity understanding to the long tail of complex activities without requiring the collection of large quantities of data for new actions.

Retrieval Video Retrieval

The ActivityNet Large-Scale Activity Recognition Challenge 2018 Summary

no code implementations • 11 Aug 2018 • Bernard Ghanem, Juan Carlos Niebles, Cees Snoek, Fabian Caba Heilbron, Humam Alwassel, Victor Escorcia, Ranjay Krishna, Shyamal Buch, Cuong Duc Dao

The guest tasks focused on complementary aspects of the activity recognition problem at large scale and involved three challenging and recently compiled datasets: the Kinetics-600 dataset from Google DeepMind, the AVA dataset from Berkeley and Google, and the Moments in Time dataset from MIT and IBM Research.

Activity Recognition

Liquid Pouring Monitoring via Rich Sensory Inputs

no code implementations • ECCV 2018 • Tz-Ying Wu, Juan-Ting Lin, Tsun-Hsuang Wang, Chan-Wei Hu, Juan Carlos Niebles, Min Sun

In the closed-loop system, the ability to monitor the state of the task via rich sensory information is important but often less studied.

Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining

no code implementations • 1 Aug 2018 • Yundong Zhang, Juan Carlos Niebles, Alvaro Soto

A key aspect of VQA models that are interpretable is their ability to ground their answers to relevant regions in the image.

Question Answering Visual Grounding +1

Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration

no code implementations • CVPR 2019 • De-An Huang, Suraj Nair, Danfei Xu, Yuke Zhu, Animesh Garg, Li Fei-Fei, Silvio Savarese, Juan Carlos Niebles

We hypothesize that to successfully generalize to unseen complex tasks from a single video demonstration, it is necessary to explicitly incorporate the compositional structure of the tasks into the model.

Learning to Decompose and Disentangle Representations for Video Prediction

1 code implementation • NeurIPS 2018 • Jun-Ting Hsieh, Bingbin Liu, De-An Huang, Li Fei-Fei, Juan Carlos Niebles

Our goal is to predict future video frames given a sequence of input frames.

Disentanglement Predict Future Video Frames

134

Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos

no code implementations • CVPR 2018 • De-An Huang, Shyamal Buch, Lucio Dery, Animesh Garg, Li Fei-Fei, Juan Carlos Niebles

In this work, we propose to tackle this new task with a weakly-supervised framework for reference-aware visual grounding in instructional videos, where only the temporal alignment between the transcription and the video segment is available for supervision.

Multiple Instance Learning Sentence +1

What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets

no code implementations • CVPR 2018 • De-An Huang, Vignesh Ramanathan, Dhruv Mahajan, Lorenzo Torresani, Manohar Paluri, Li Fei-Fei, Juan Carlos Niebles

The ability to capture temporal information has been critical to the development of video understanding models.

Video Understanding

A Deep Learning Based Behavioral Approach to Indoor Autonomous Navigation

no code implementations • 12 Mar 2018 • Gabriel Sepulveda, Juan Carlos Niebles, Alvaro Soto

We present a semantically rich graph representation for indoor robotic navigation.

Autonomous Navigation

Graph Distillation for Action Detection with Privileged Modalities

1 code implementation • ECCV 2018 • Zelun Luo, Jun-Ting Hsieh, Lu Jiang, Juan Carlos Niebles, Li Fei-Fei

We propose a technique that tackles action detection in multimodal videos under a realistic and challenging condition in which only limited training data and partially observed modalities are available.

Action Classification Action Detection +1

ActivityNet Challenge 2017 Summary

no code implementations • 22 Oct 2017 • Bernard Ghanem, Juan Carlos Niebles, Cees Snoek, Fabian Caba Heilbron, Humam Alwassel, Ranjay Krishna, Victor Escorcia, Kenji Hata, Shyamal Buch

The ActivityNet Large Scale Activity Recognition Challenge 2017 Summary: results and challenge participants' papers.

Activity Recognition

Visual Forecasting by Imitating Dynamics in Natural Sequences

no code implementations • ICCV 2017 • Kuo-Hao Zeng, William B. Shen, De-An Huang, Min Sun, Juan Carlos Niebles

This allows us to apply IRL at scale and directly imitate the dynamics in high-dimensional continuous visual sequences from the raw pixel values.

Action Anticipation

SST: Single-Stream Temporal Action Proposals

1 code implementation • CVPR 2017 • Shyamal Buch, Victor Escorcia, Chuanqi Shen, Bernard Ghanem, Juan Carlos Niebles

Our paper presents a new approach for temporal detection of human actions in long, untrimmed video sequences.

Action Detection Temporal Action Proposal Generation

Agent-Centric Risk Assessment: Accident Anticipation and Risky Region Localization

no code implementations • CVPR 2017 • Kuo-Hao Zeng, Shih-Han Chou, Fu-Hsiang Chan, Juan Carlos Niebles, Min Sun

For survival, a living agent must have the ability to assess risk (1) by temporally anticipating accidents before they occur, and (2) by spatially localizing risky regions in the environment to move away from threats.

Accident Anticipation

Dense-Captioning Events in Videos

4 code implementations • ICCV 2017 • Ranjay Krishna, Kenji Hata, Frederic Ren, Li Fei-Fei, Juan Carlos Niebles

We also introduce ActivityNet Captions, a large-scale benchmark for dense-captioning events.

Dense Captioning Retrieval +1

Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos

no code implementations • CVPR 2017 • De-An Huang, Joseph J. Lim, Li Fei-Fei, Juan Carlos Niebles

We propose an unsupervised method for reference resolution in instructional videos, where the goal is to temporally link an entity (e.g., "dressing") to the action (e.g., "mix yogurt") that produced it.

Referring Expression

Leveraging Video Descriptions to Learn Video Question Answering

no code implementations • 12 Nov 2016 • Kuo-Hao Zeng, Tseng-Hung Chen, Ching-Yao Chuang, Yuan-Hong Liao, Juan Carlos Niebles, Min Sun

Then, a large number of candidate QA pairs are automatically generated from descriptions rather than manually annotated.

Question Answering Video Question Answering +1

Title Generation for User Generated Videos

no code implementations • 25 Aug 2016 • Kuo-Hao Zeng, Tseng-Hung Chen, Juan Carlos Niebles, Min Sun

Finally, our sentence augmentation method also outperforms the baselines on the M-VAD dataset.

Sentence Video Captioning

Connectionist Temporal Modeling for Weakly Supervised Action Labeling

no code implementations • 28 Jul 2016 • De-An Huang, Li Fei-Fei, Juan Carlos Niebles

We propose a weakly-supervised framework for action labeling in video, where only the order of occurring actions is required during training time.

General Classification

A Hierarchical Pose-Based Approach to Complex Action Understanding Using Dictionaries of Actionlets and Motion Poselets

no code implementations • CVPR 2016 • Ivan Lillo, Juan Carlos Niebles, Alvaro Soto

In this paper, we introduce a new hierarchical model for human action recognition using body joint locations.

Action Recognition Action Understanding +2

Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos

no code implementations • CVPR 2016 • Fabian Caba Heilbron, Juan Carlos Niebles, Bernard Ghanem

In many large-scale video analysis scenarios, one is interested in localizing and recognizing human activities that occur in short temporal intervals within long untrimmed videos.

Action Detection Action Recognition +2

On the Relationship Between Visual Attributes and Convolutional Networks

no code implementations • CVPR 2015 • Victor Escorcia, Juan Carlos Niebles, Bernard Ghanem

One of the cornerstone principles of deep models is their abstraction capacity, i.e., their ability to learn abstract concepts from 'simpler' ones.

Attribute Object Recognition +2

Robust Manhattan Frame Estimation From a Single RGB-D Image

no code implementations • CVPR 2015 • Bernard Ghanem, Ali Thabet, Juan Carlos Niebles, Fabian Caba Heilbron

This paper proposes a new framework for estimating the Manhattan Frame (MF) of an indoor scene from a single RGB-D image.

ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding

1 code implementation • CVPR 2015 • Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem, Juan Carlos Niebles

In spite of many dataset efforts for human action recognition, current computer vision algorithms are still severely limited in terms of the variability and complexity of the actions that they can recognize.

Action Detection Action Recognition +4

Discriminative Hierarchical Modeling of Spatio-Temporally Composable Human Activities

no code implementations • CVPR 2014 • Ivan Lillo, Alvaro Soto, Juan Carlos Niebles

Our method describes human activities in a hierarchical discriminative model that operates at three semantic levels.
