Search Results for author: Uttaran Bhattacharya

Found 19 papers, 5 papers with code

HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances

no code implementations • 4 Mar 2024 • Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen, Ishita Dasgupta, Saayan Mitra, Minh Hoai

To generate images with realistic hands, we propose a novel diffusion-based architecture called HanDiffuser that achieves realism by injecting hand embeddings in the generative process.

Text-to-Image Generation
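
The snippet above describes injecting hand embeddings as extra conditioning in the diffusion process. A minimal sketch of that idea, with all shapes, names, and the toy denoiser invented for illustration (not the paper's architecture):

```python
import numpy as np

def inject_hand_embeddings(text_tokens, hand_embedding):
    """Append a pooled hand embedding to the text-conditioning sequence.

    text_tokens: (num_tokens, dim) text-encoder outputs
    hand_embedding: (dim,) pooled hand-parameter embedding
    Returns a (num_tokens + 1, dim) conditioning sequence that a
    cross-attention layer in the denoiser could attend to.
    """
    return np.vstack([text_tokens, hand_embedding[None, :]])

def denoise_step(x_t, cond, weight):
    """Toy stand-in for one denoising step: predict noise from the
    mean conditioning vector and take a small step toward the data."""
    context = cond.mean(axis=0)   # pool the conditioning sequence
    eps_hat = weight @ context    # (dim,) predicted noise
    return x_t - 0.1 * eps_hat

rng = np.random.default_rng(0)
dim = 8
text = rng.normal(size=(4, dim))
hand = rng.normal(size=dim)
cond = inject_hand_embeddings(text, hand)   # text + hand conditioning
x0 = denoise_step(rng.normal(size=dim), cond, np.eye(dim))
```

The point of the sketch is only the conditioning path: the hand embedding rides alongside the text tokens, so the generative step sees both.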

VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding

no code implementations • 4 Dec 2023 • Yizhou Wang, Ruiyi Zhang, Haoliang Wang, Uttaran Bhattacharya, Yun Fu, Gang Wu

Recent advancements in language-model-based video understanding have been progressing at a remarkable pace, spurred by the introduction of Large Language Models (LLMs).

Language Modelling • Question Answering • +2

Show Me What I Like: Detecting User-Specific Video Highlights Using Content-Based Multi-Head Attention

no code implementations • 18 Jul 2022 • Uttaran Bhattacharya, Gang Wu, Stefano Petrangeli, Viswanathan Swaminathan, Dinesh Manocha

We propose a method to detect individualized highlights for users on given target videos based on their preferred highlight clips marked on previous videos they have watched.

Highlight Detection
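
The abstract above describes scoring frames of a target video against highlight clips the user marked earlier, via content-based multi-head attention. A rough sketch of that matching step, with every shape and weight invented for illustration:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def user_highlight_scores(target_frames, preferred_clips, num_heads=2):
    """Score each target frame by multi-head attention against the
    user's preferred-clip features (shapes are illustrative).

    target_frames: (T, d) features of the target video
    preferred_clips: (P, d) features of previously marked highlights
    Returns: (T,) per-frame highlight scores in (0, 1].
    """
    T, d = target_frames.shape
    head_dim = d // num_heads
    scores = np.zeros(T)
    for h in range(num_heads):
        sl = slice(h * head_dim, (h + 1) * head_dim)
        q = target_frames[:, sl]    # queries: target-video frames
        k = preferred_clips[:, sl]  # keys: the user's preferred clips
        attn = softmax(q @ k.T / np.sqrt(head_dim), axis=1)  # (T, P)
        # a frame that attends strongly to any preferred clip scores high
        scores += attn.max(axis=1)
    return scores / num_heads

rng = np.random.default_rng(1)
frames = rng.normal(size=(6, 8))
prefs = rng.normal(size=(3, 8))
s = user_highlight_scores(frames, prefs)
```

Frames similar to the user's past highlights attract sharp attention and score near 1; dissimilar frames fall toward the uniform-attention floor.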

HighlightMe: Detecting Highlights from Human-Centric Videos

no code implementations • ICCV 2021 • Uttaran Bhattacharya, Gang Wu, Stefano Petrangeli, Viswanathan Swaminathan, Dinesh Manocha

We train our network to map the activity- and interaction-based latent structural representations of the different modalities to per-frame highlight scores based on the representativeness of the frames.
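
The mapping described above, from per-frame multimodal latent representations to per-frame highlight scores, reduces to a fuse-then-project step. A toy sketch, assuming a simple concatenation fusion and a linear scoring head (both are my illustrative stand-ins, not the paper's network):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def per_frame_highlight_scores(modal_latents, fusion_weights):
    """Map per-frame latents from several human-centric modalities
    (e.g. pose, face) to one highlight score per frame.

    modal_latents: list of (T, d) arrays, one per modality
    fusion_weights: (num_modalities * d,) projection to a scalar
    Returns: (T,) scores in (0, 1).
    """
    fused = np.concatenate(modal_latents, axis=1)   # (T, M * d)
    return sigmoid(fused @ fusion_weights)

rng = np.random.default_rng(2)
pose = rng.normal(size=(5, 4))   # activity-based latent per frame
face = rng.normal(size=(5, 4))   # interaction-based latent per frame
w = rng.normal(size=8)
scores = per_frame_highlight_scores([pose, face], w)
```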

Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning

1 code implementation • 31 Jul 2021 • Uttaran Bhattacharya, Elizabeth Childs, Nicholas Rewkowski, Dinesh Manocha

Our network consists of two components: a generator to synthesize gestures from a joint embedding space of features encoded from the input speech and the seed poses, and a discriminator to distinguish between the synthesized pose sequences and real 3D pose sequences.

Generative Adversarial Network • Gesture Generation
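
The abstract spells out the two-component design: a generator decoding a joint speech-and-seed-pose embedding into gestures, and a discriminator judging real versus synthesized sequences. A linear toy sketch of that wiring (all projections and dimensions are invented for the sketch):

```python
import numpy as np

rng = np.random.default_rng(3)

def joint_embedding(speech_feats, seed_poses, W_s, W_p):
    """Project speech features and seed poses into a shared space and sum."""
    return speech_feats @ W_s + seed_poses @ W_p

def generator(embed, W_g):
    """Decode the joint embedding into a pose sequence (toy linear decoder)."""
    return np.tanh(embed @ W_g)

def discriminator(pose_seq, w_d):
    """Score a pose sequence as real (toward 1) or synthesized (toward 0)."""
    logit = pose_seq.mean(axis=0) @ w_d
    return 1.0 / (1.0 + np.exp(-logit))

T, d_s, d_p, d_e = 10, 6, 9, 4          # frames, speech/pose/embed dims
speech = rng.normal(size=(T, d_s))
seeds = rng.normal(size=(T, d_p))
W_s, W_p = rng.normal(size=(d_s, d_e)), rng.normal(size=(d_p, d_e))
W_g = rng.normal(size=(d_e, d_p))
w_d = rng.normal(size=d_p)

fake_poses = generator(joint_embedding(speech, seeds, W_s, W_p), W_g)
score = discriminator(fake_poses, w_d)
```

In adversarial training the generator would be updated to push `score` toward 1 on synthesized sequences while the discriminator learns to push it toward 0.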

Emotions Don't Lie: An Audio-Visual Deepfake Detection Method Using Affective Cues

no code implementations • 14 Mar 2020 • Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha

Additionally, we extract and compare affective cues corresponding to perceived emotion from the two modalities within a video to infer whether the input video is "real" or "fake".

DeepFake Detection • Face Swapping
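
The core idea above, comparing perceived-emotion cues across the visual and audio modalities, can be sketched as an agreement test between two emotion embeddings. The cosine measure and threshold here are illustrative, not the paper's learned similarity:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_likely_fake(face_emotion, speech_emotion, threshold=0.5):
    """Flag a video as likely fake when the perceived-emotion
    embeddings extracted from the two modalities disagree.
    Embeddings and threshold are invented for this sketch."""
    return cosine(face_emotion, speech_emotion) < threshold

agree = np.array([1.0, 0.2, 0.1])
conflict = np.array([-1.0, 0.3, 0.0])
real_like = is_likely_fake(agree, agree * 2.0)  # aligned cues: not flagged
fake_like = is_likely_fake(agree, conflict)     # opposing cues: flagged
```

The intuition is that face-swapped or lip-synced fakes tend to break the natural correlation between facial and vocal affect.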

The Liar's Walk: Detecting Deception with Gait and Gesture

no code implementations • 14 Dec 2019 • Tanmay Randhavane, Uttaran Bhattacharya, Kyra Kapsaskis, Kurt Gray, Aniket Bera, Dinesh Manocha

We present a data-driven deep neural algorithm for detecting deceptive walking behavior using nonverbal cues like gaits and gestures.

Action Classification

Forecasting Trajectory and Behavior of Road-Agents Using Spectral Clustering in Graph-LSTMs

no code implementations • arXiv 2019 • Rohan Chandra, Tianrui Guan, Srujan Panuganti, Trisha Mittal, Uttaran Bhattacharya, Aniket Bera, Dinesh Manocha

In practice, our approach reduces the average prediction error by more than 54% over prior algorithms and achieves a weighted average accuracy of 91.2% for behavior prediction.

Robotics

M3ER: Multiplicative Multimodal Emotion Recognition Using Facial, Textual, and Speech Cues

no code implementations • 9 Nov 2019 • Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha

Our approach combines cues from multiple co-occurring modalities (such as face, text, and speech) and also is more robust than other methods to sensor noise in any of the individual modalities.

Multimodal Emotion Recognition
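
The "multiplicative" combination named in the title can be illustrated with per-class products of modality probabilities, which naturally down-weight a noisy modality whose distribution is flat. This is a simplified stand-in for M3ER's actual fusion, with invented logits:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multiplicative_fusion(modal_logits):
    """Combine per-modality emotion logits multiplicatively: each
    class probability is the product of the modalities' probabilities,
    then renormalized. A modality with a near-uniform distribution
    (sensor noise) barely shifts the result."""
    probs = np.ones_like(modal_logits[0])
    for logits in modal_logits:
        probs = probs * softmax(logits)
    return probs / probs.sum()

face = np.array([2.0, 0.1, 0.1])    # confident first class
text = np.array([1.5, 0.2, 0.1])    # agrees with face
speech = np.array([0.0, 0.0, 0.0])  # noisy: uniform, self-discounting
fused = multiplicative_fusion([face, text, speech])
pred = int(np.argmax(fused))
```

Here the uniform speech modality leaves the face/text consensus intact, which is the robustness-to-noise property the abstract claims.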

STEP: Spatial Temporal Graph Convolutional Networks for Emotion Perception from Gaits

1 code implementation • 28 Oct 2019 • Uttaran Bhattacharya, Trisha Mittal, Rohan Chandra, Tanmay Randhavane, Aniket Bera, Dinesh Manocha

We use hundreds of annotated real-world gait videos and augment them with thousands of annotated synthetic gaits generated using a novel generative network called STEP-Gen, built on an ST-GCN based Conditional Variational Autoencoder (CVAE).

General Classification

RobustTP: End-to-End Trajectory Prediction for Heterogeneous Road-Agents in Dense Traffic with Noisy Sensor Inputs

1 code implementation • 20 Jul 2019 • Rohan Chandra, Uttaran Bhattacharya, Christian Roncal, Aniket Bera, Dinesh Manocha

RobustTP is an approach that first computes trajectories using a combination of a non-linear motion model and a deep learning-based instance segmentation algorithm.

Robotics

RoadTrack: Realtime Tracking of Road Agents in Dense and Heterogeneous Environments

1 code implementation • 25 Jun 2019 • Rohan Chandra, Uttaran Bhattacharya, Tanmay Randhavane, Aniket Bera, Dinesh Manocha

We present a realtime tracking algorithm, RoadTrack, to track heterogeneous road-agents in dense traffic videos.

Robotics

Identifying Emotions from Walking using Affective and Deep Features

no code implementations • 14 Jun 2019 • Tanmay Randhavane, Uttaran Bhattacharya, Kyra Kapsaskis, Kurt Gray, Aniket Bera, Dinesh Manocha

We also present an EWalk (Emotion Walk) dataset that consists of videos of walking individuals with gaits and labeled emotions.

Emotion Recognition

Efficient and Robust Registration on the 3D Special Euclidean Group

no code implementations • ICCV 2019 • Uttaran Bhattacharya, Venu Madhav Govindu

Our approach significantly outperforms the state-of-the-art robust 3D registration method based on a line process in terms of both speed and accuracy.

Motion Estimation

TraPHic: Trajectory Prediction in Dense and Heterogeneous Traffic Using Weighted Interactions

2 code implementations • CVPR 2019 • Rohan Chandra, Uttaran Bhattacharya, Aniket Bera, Dinesh Manocha

We evaluate the performance of our prediction algorithm, TraPHic, on the standard datasets and also introduce a new dense, heterogeneous traffic dataset corresponding to urban Asian videos and agent trajectories.

Trajectory Prediction • Robotics
