Search Results for author: James M. Rehg

Found 57 papers, 23 papers with code

Werewolf Among Us: A Multimodal Dataset for Modeling Persuasion Behaviors in Social Deduction Games

no code implementations 16 Dec 2022 Bolin Lai, Hongxin Zhang, Miao Liu, Aryan Pariani, Fiona Ryan, Wenqi Jia, Shirley Anugrah Hayati, James M. Rehg, Diyi Yang

We also explore the generalization ability of language models for persuasion modeling and the role of persuasion strategies in predicting social deduction game outcomes.

Persuasion Strategies

PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation

1 code implementation 14 Dec 2022 Maxwell A. Xu, Alexander Moreno, Supriya Nagesh, V. Burak Aydemir, David W. Wetter, Santosh Kumar, James M. Rehg

The promise of Mobile Health (mHealth) is the ability to use wearable sensors to monitor participant physiology at high frequencies during daily life to enable temporally-precise health interventions.


Learning Dense Object Descriptors from Multiple Views for Low-shot Category Generalization

1 code implementation 28 Nov 2022 Stefan Stojanov, Anh Thai, Zixuan Huang, James M. Rehg

A hallmark of the deep learning era for computer vision is the successful use of large-scale labeled datasets to train feature representations for tasks ranging from object recognition and semantic segmentation to optical flow estimation and novel view synthesis of 3D scenes.

Novel View Synthesis Object Recognition +3

Transformer-based Localization from Embodied Dialog with Large-scale Pre-training

no code implementations 10 Oct 2022 Meera Hahn, James M. Rehg

We address the challenging task of Localization via Embodied Dialog (LED).

In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation

no code implementations 8 Aug 2022 Bolin Lai, Miao Liu, Fiona Ryan, James M. Rehg

To this end, we design the transformer encoder to embed the global context as one additional visual token and further propose a novel Global-Local Correlation (GLC) module to explicitly model the correlation of the global token and each local token.

Gaze Estimation

Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues

no code implementations 21 Apr 2022 Zixuan Huang, Stefan Stojanov, Anh Thai, Varun Jampani, James M. Rehg

We present a novel 3D shape reconstruction method which learns to predict an implicit 3D shape representation from a single RGB image.

3D Shape Reconstruction 3D Shape Representation +1

Generative Adversarial Network for Future Hand Segmentation from Egocentric Video

1 code implementation 21 Mar 2022 Wenqi Jia, Miao Liu, James M. Rehg

We introduce the novel problem of anticipating a time series of future hand masks from egocentric video.

Hand Segmentation Image Segmentation +2

Kernel Deformed Exponential Families for Sparse Continuous Attention

no code implementations 1 Nov 2021 Alexander Moreno, Supriya Nagesh, Zhenke Wu, Walter Dempsey, James M. Rehg

Theoretically, we show new existence results for both kernel exponential and deformed exponential families, and that the deformed case has similar approximation capabilities to kernel exponential families.

Transformers for prompt-level EMA non-response prediction

no code implementations 1 Nov 2021 Supriya Nagesh, Alexander Moreno, Stephanie M. Carpenter, Jamie Yap, Soujanya Chatterjee, Steven Lloyd Lizotte, Neng Wan, Santosh Kumar, Cho Lam, David W. Wetter, Inbal Nahum-Shani, James M. Rehg

The transformer model achieves a non-response prediction AUC of 0.77 and is significantly better than classical ML and LSTM-based deep learning models.

No RL, No Simulation: Learning to Navigate without Navigating

1 code implementation NeurIPS 2021 Meera Hahn, Devendra Chaplot, Shubham Tulsiani, Mustafa Mukadam, James M. Rehg, Abhinav Gupta

Most prior methods for learning navigation policies require access to simulation environments, as they need online policy interaction and rely on ground-truth maps for rewards.


Ego4D: Around the World in 3,000 Hours of Egocentric Video

3 code implementations CVPR 2022 Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.

De-identification Ethics

Egocentric Activity Recognition and Localization on a 3D Map

no code implementations 20 May 2021 Miao Liu, Lingni Ma, Kiran Somasundaram, Yin Li, Kristen Grauman, James M. Rehg, Chao Li

Given a video captured from a first person perspective and the environment context of where the video is recorded, can we recognize what the person is doing and identify where the action occurs in the 3D space?

Action Localization Action Recognition +2

The Surprising Positive Knowledge Transfer in Continual 3D Object Shape Reconstruction

3 code implementations 18 Jan 2021 Anh Thai, Stefan Stojanov, Zixuan Huang, Isaac Rehg, James M. Rehg

Continual learning has been extensively studied for classification tasks with methods developed to primarily avoid catastrophic forgetting, a phenomenon where earlier learned concepts are forgotten at the expense of more recent samples.

3D Shape Reconstruction Continual Learning +2

4D Human Body Capture from Egocentric Video via 3D Scene Grounding

no code implementations 26 Nov 2020 Miao Liu, Dexin Yang, Yan Zhang, Zhaopeng Cui, James M. Rehg, Siyu Tang

We introduce a novel task of reconstructing a time series of second-person 3D human body meshes from monocular egocentric videos.

Time Series

Where Are You? Localization from Embodied Dialog

2 code implementations EMNLP 2020 Meera Hahn, Jacob Krantz, Dhruv Batra, Devi Parikh, James M. Rehg, Stefan Lee, Peter Anderson

In this paper, we focus on the LED task -- providing a strong baseline model with detailed ablations characterizing both dataset biases and the importance of various modeling choices.

Navigate Visual Dialog

3D Reconstruction of Novel Object Shapes from Single Images

2 code implementations 14 Jun 2020 Anh Thai, Stefan Stojanov, Vijay Upadhya, James M. Rehg

This is challenging as it requires a model to learn a representation that can infer both the visible and occluded portions of any object using a limited training set.

3D Reconstruction 3D Shape Reconstruction

In the Eye of the Beholder: Gaze and Actions in First Person Video

no code implementations 31 May 2020 Yin Li, Miao Liu, James M. Rehg

Moving beyond the dataset, we propose a novel deep model for joint gaze estimation and action recognition in FPV.

Action Recognition Gaze Estimation

Orthogonal Over-Parameterized Training

1 code implementation CVPR 2021 Weiyang Liu, Rongmei Lin, Zhen Liu, James M. Rehg, Liam Paull, Li Xiong, Le Song, Adrian Weller

The inductive bias of a neural network is largely determined by the architecture and the training algorithm.

Inductive Bias

Neural Similarity Learning

1 code implementation NeurIPS 2019 Weiyang Liu, Zhen Liu, James M. Rehg, Le Song

By generalizing inner product with a bilinear matrix, we propose the neural similarity which serves as a learnable parametric similarity measure for CNNs.

Few-Shot Learning
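
As a rough illustration of the idea in the Neural Similarity Learning abstract above (not the authors' implementation), a bilinear similarity replaces the inner product of two vectors x and w with x^T M w, where M is a learnable matrix. The names `bilinear_similarity` and `identity` and the example values are assumptions made for this sketch; with M set to the identity, the measure reduces to the ordinary inner product.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def bilinear_similarity(x: Vector, m: Matrix, w: Vector) -> float:
    """Return x^T M w for a (hypothetical) learnable matrix M."""
    if len(m) != len(x) or any(len(row) != len(w) for row in m):
        raise ValueError("M must have shape (len(x), len(w))")
    # First compute the matrix-vector product M w ...
    mw = [sum(m_ij * w_j for m_ij, w_j in zip(row, w)) for row in m]
    # ... then take the ordinary inner product of x with that result.
    return sum(x_i * v for x_i, v in zip(x, mw))


def identity(n: int) -> Matrix:
    """Identity matrix: with M = I the similarity is the plain inner product."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    x, w = [1.0, 2.0], [3.0, 4.0]
    print(bilinear_similarity(x, identity(2), w))
    # A non-identity M changes how the coordinates of x and w are paired.
    print(bilinear_similarity(x, [[0.0, 1.0], [1.0, 0.0]], w))
```

In a network, M would be a trained parameter rather than a fixed matrix; the sketch only shows the arithmetic of the similarity itself.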

Regularizing Neural Networks via Minimizing Hyperspherical Energy

1 code implementation CVPR 2020 Rongmei Lin, Weiyang Liu, Zhen Liu, Chen Feng, Zhiding Yu, James M. Rehg, Li Xiong, Le Song

Inspired by the Thomson problem in physics where the distribution of multiple mutually repelling electrons on a unit sphere can be modeled via minimizing some potential energy, hyperspherical energy minimization has demonstrated its potential in regularizing neural networks and improving their generalization power.
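
A minimal sketch of a Thomson-style energy, assuming the simplest potential: the sum of inverse pairwise Euclidean distances between vectors after projecting them onto the unit sphere. The function names and the choice of potential are assumptions for illustration and are not taken from the paper; lower values correspond to vectors that are spread further apart on the sphere.

```python
import math
from itertools import combinations
from typing import List, Sequence


def normalize(v: Sequence[float]) -> List[float]:
    """Project a vector onto the unit sphere."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [c / norm for c in v]


def hyperspherical_energy(vectors: Sequence[Sequence[float]]) -> float:
    """Sum of 1 / distance over all pairs of unit-normalized vectors."""
    unit = [normalize(v) for v in vectors]
    energy = 0.0
    for a, b in combinations(unit, 2):
        dist = math.dist(a, b)
        if dist == 0.0:
            # Two vectors pointing the same way: the energy diverges.
            return math.inf
        energy += 1.0 / dist
    return energy


if __name__ == "__main__":
    # Antipodal vectors are as far apart as possible on the sphere.
    print(hyperspherical_energy([[1.0, 0.0], [-1.0, 0.0]]))
    # Orthogonal vectors are closer together, so the energy is higher.
    print(hyperspherical_energy([[1.0, 0.0], [0.0, 1.0]]))
```

Used as a regularizer, a term like this would be added to the training loss over the weight vectors of a layer, so that minimizing it pushes the vectors apart.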

Locally Weighted Regression Pseudo-Rehearsal for Online Learning of Vehicle Dynamics

no code implementations 13 May 2019 Grady Williams, Brian Goldfain, James M. Rehg, Evangelos A. Theodorou

We consider the problem of online adaptation of a neural network designed to represent vehicle dynamics.


Tripping through time: Efficient Localization of Activities in Videos

no code implementations 22 Apr 2019 Meera Hahn, Asim Kadav, James M. Rehg, Hans Peter Graf

Localizing moments in untrimmed videos via language queries is a new and interesting task that requires the ability to accurately ground language into video.

Learning to Generate Synthetic Data via Compositing

1 code implementation CVPR 2019 Shashank Tripathi, Siddhartha Chandra, Amit Agrawal, Ambrish Tyagi, James M. Rehg, Visesh Chari

The synthesizer and target networks are trained in an adversarial manner wherein each network is updated with a goal to outdo the other.

Data Augmentation Human Detection +3

Attention Distillation for Learning Video Representations

no code implementations 5 Apr 2019 Miao Liu, Xin Chen, Yun Zhang, Yin Li, James M. Rehg

To this end, we make use of attention modules that learn to highlight regions in the video and aggregate features for recognition.

Action Recognition Video Recognition

Action2Vec: A Crossmodal Embedding Approach to Action Learning

no code implementations 2 Jan 2019 Meera Hahn, Andrew Silva, James M. Rehg

We describe a novel cross-modal embedding space for actions, named Action2Vec, which combines linguistic cues from class labels with spatio-temporal features derived from video clips.

Action Recognition General Classification +1

Taking a Deeper Look at the Inverse Compositional Algorithm

1 code implementation CVPR 2019 Zhaoyang Lv, Frank Dellaert, James M. Rehg, Andreas Geiger

In this paper, we provide a modern synthesis of the classic inverse compositional algorithm for dense image alignment.

Motion Estimation regression

Learning to Localize and Align Fine-Grained Actions to Sparse Instructions

no code implementations 22 Sep 2018 Meera Hahn, Nataniel Ruiz, Jean-Baptiste Alayrac, Ivan Laptev, James M. Rehg

Automatic generation of textual video descriptions that are time-aligned with video content is a long-standing goal in computer vision.

Object Recognition

In the Eye of Beholder: Joint Learning of Gaze and Actions in First Person Video

no code implementations ECCV 2018 Yin Li, Miao Liu, James M. Rehg

We address the task of jointly determining what a person is doing and where they are looking based on the analysis of video captured by a headworn camera.

Action Recognition Gaze Estimation

Multi-object Tracking with Neural Gating Using Bilinear LSTM

no code implementations ECCV 2018 Chanho Kim, Fuxin Li, James M. Rehg

We also propose novel data augmentation approaches to efficiently train recurrent models that score object tracks on both appearance and motion.

Data Augmentation Multi-Object Tracking +2

3D-RCNN: Instance-Level 3D Object Reconstruction via Render-and-Compare

no code implementations CVPR 2018 Abhijit Kundu, Yin Li, James M. Rehg

Our method produces a compact 3D representation of the scene, which can be readily used for applications like autonomous driving.

Ranked #3 on Vehicle Pose Estimation on KITTI Cars Hard (using extra training data)

3D Object Reconstruction Autonomous Driving +2

Decoupled Networks

1 code implementation CVPR 2018 Weiyang Liu, Zhen Liu, Zhiding Yu, Bo Dai, Rongmei Lin, Yisen Wang, James M. Rehg, Le Song

Inner product-based convolution has been a central component of convolutional neural networks (CNNs) and the key to learning visual representations.

Towards Black-box Iterative Machine Teaching

no code implementations ICML 2018 Weiyang Liu, Bo Dai, Xingguo Li, Zhen Liu, James M. Rehg, Le Song

We propose an active teacher model that can actively query the learner (i.e., make the learner take exams) for estimating the learner's status and provably guide the learner to achieve faster convergence.

Fine-Grained Head Pose Estimation Without Keypoints

9 code implementations 2 Oct 2017 Nataniel Ruiz, Eunji Chong, James M. Rehg

Estimating the head pose of a person is a crucial problem that has a large number of applications, such as aiding in gaze estimation, modeling attention, fitting 3D models to video, and performing face alignment.

Face Alignment Gaze Estimation +1

Dockerface: an Easy to Install and Use Faster R-CNN Face Detector in a Docker Container

1 code implementation 15 Aug 2017 Nataniel Ruiz, James M. Rehg

Face detection is a very important task and a necessary pre-processing step for many applications such as facial landmark detection, pose estimation, sentiment analysis and face recognition.

Face Detection Face Recognition +3

iSurvive: An Interpretable, Event-time Prediction Model for mHealth

no code implementations ICML 2017 Walter H. Dempsey, Alexander Moreno, Christy K. Scott, Michael L. Dennis, David H. Gustafson, Susan A. Murphy, James M. Rehg

We present a parameter learning method for GLM emissions and survival model fitting, and present promising results on both synthetic data and an mHealth drug use dataset.

Survival Analysis

Information Theoretic Model Predictive Control: Theory and Applications to Autonomous Driving

2 code implementations 7 Jul 2017 Grady Williams, Paul Drews, Brian Goldfain, James M. Rehg, Evangelos A. Theodorou

We present an information theoretic approach to stochastic optimal control problems that can be used to derive general sampling based optimization schemes.


Iterative Machine Teaching

2 code implementations ICML 2017 Weiyang Liu, Bo Dai, Ahmad Humayun, Charlene Tay, Chen Yu, Linda B. Smith, James M. Rehg, Le Song

Different from traditional machine teaching which views the learners as batch algorithms, we study a new paradigm where the learner uses an iterative algorithm and a teacher can feed examples sequentially and intelligently based on the current performance of the learner.

Automatic Variational ABC

no code implementations 28 Jun 2016 Alexander Moreno, Tameem Adel, Edward Meeds, James M. Rehg, Max Welling

Approximate Bayesian Computation (ABC) is a framework for performing likelihood-free posterior inference for simulation models.

Variational Inference

The Middle Child Problem: Revisiting Parametric Min-Cut and Seeds for Object Proposals

no code implementations ICCV 2015 Ahmad Humayun, Fuxin Li, James M. Rehg

We propose a new energy minimization framework incorporating geodesic distances between segments which solves this problem.

Minimizing Human Effort in Interactive Tracking by Incremental Learning of Model Parameters

no code implementations ICCV 2015 Arridhana Ciptadi, James M. Rehg

We address the problem of minimizing human effort in interactive tracking by learning sequence-specific model parameters.

Incremental Learning

Efficient Learning of Continuous-Time Hidden Markov Models for Disease Progression

no code implementations NeurIPS 2015 Yu-Ying Liu, Shuang Li, Fuxin Li, Le Song, James M. Rehg

The Continuous-Time Hidden Markov Model (CT-HMM) is an attractive approach to modeling disease progression due to its ability to describe noisy observations arriving irregularly in time.

Multiple Hypothesis Tracking Revisited

no code implementations ICCV 2015 Chanho Kim, Fuxin Li, Arridhana Ciptadi, James M. Rehg

This paper revisits the classical multiple hypotheses tracking (MHT) algorithm in a tracking-by-detection framework.

Unsupervised Learning of Edges

no code implementations CVPR 2016 Yin Li, Manohar Paluri, James M. Rehg, Piotr Dollár

In this work we present a simple yet effective approach for training edge detectors without human supervision.

Edge Detection Motion Estimation +2

Delving Into Egocentric Actions

no code implementations CVPR 2015 Yin Li, Zhefan Ye, James M. Rehg

We propose to utilize these mid-level egocentric cues for egocentric action recognition.

Action Recognition

The Secrets of Salient Object Segmentation

1 code implementation CVPR 2014 Yin Li, Xiaodi Hou, Christof Koch, James M. Rehg, Alan L. Yuille

Dataset design bias not only creates a discomforting disconnect between fixations and salient object segmentation, but also misleads algorithm design.

Semantic Segmentation

RIGOR: Reusing Inference in Graph Cuts for Generating Object Regions

no code implementations CVPR 2014 Ahmad Humayun, Fuxin Li, James M. Rehg

By precomputing a graph which can be used for parametric min-cuts over different seeds, we speed up the generation of the segment pool.

Object Recognition

Modeling Actions through State Changes

no code implementations CVPR 2013 Alireza Fathi, James M. Rehg

The key to differentiating these actions is the ability to identify how they change the state of objects and materials in the environment.

Action Recognition
