Search Results for author: Uttaran Bhattacharya

Found 19 papers, 5 papers with code

HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances

no code implementations • 4 Mar 2024 • Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen, Ishita Dasgupta, Saayan Mitra, Minh Hoai

To generate images with realistic hands, we propose a novel diffusion-based architecture called HanDiffuser that achieves realism by injecting hand embeddings in the generative process.

Text-to-Image Generation
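
The snippet above describes injecting hand embeddings as extra conditioning in the diffusion process. A minimal sketch of that idea, with all shapes, names, and the toy denoiser invented for illustration (not the paper's architecture):

```python
import numpy as np

def inject_hand_embeddings(text_tokens, hand_embedding):
    """Append a pooled hand embedding to the text-conditioning sequence.

    text_tokens: (num_tokens, dim) text-encoder outputs
    hand_embedding: (dim,) pooled hand-parameter embedding
    Returns a (num_tokens + 1, dim) conditioning sequence that a
    cross-attention layer in the denoiser could attend to.
    """
    return np.vstack([text_tokens, hand_embedding[None, :]])

def denoise_step(x_t, cond, weight):
    """Toy stand-in for one denoising step: predict noise from the
    mean conditioning vector and take a small step toward the data."""
    context = cond.mean(axis=0)   # pool the conditioning sequence
    eps_hat = weight @ context    # (dim,) predicted noise
    return x_t - 0.1 * eps_hat

rng = np.random.default_rng(0)
dim = 8
text = rng.normal(size=(4, dim))
hand = rng.normal(size=dim)
cond = inject_hand_embeddings(text, hand)   # text + hand conditioning
x0 = denoise_step(rng.normal(size=dim), cond, np.eye(dim))
```

The point of the sketch is only the conditioning path: the hand embedding rides alongside the text tokens, so the generative step sees both.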

VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding

no code implementations • 4 Dec 2023 • Yizhou Wang, Ruiyi Zhang, Haoliang Wang, Uttaran Bhattacharya, Yun Fu, Gang Wu

Recent advancements in language-model-based video understanding have been progressing at a remarkable pace, spurred by the introduction of Large Language Models (LLMs).

Language Modelling • Question Answering • +2

Show Me What I Like: Detecting User-Specific Video Highlights Using Content-Based Multi-Head Attention

no code implementations • 18 Jul 2022 • Uttaran Bhattacharya, Gang Wu, Stefano Petrangeli, Viswanathan Swaminathan, Dinesh Manocha

We propose a method to detect individualized highlights for users on given target videos based on their preferred highlight clips marked on previous videos they have watched.

Highlight Detection
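
The abstract above describes scoring frames of a target video against highlight clips the user marked earlier, via content-based multi-head attention. A rough sketch of that matching step, with every shape and weight invented for illustration:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def user_highlight_scores(target_frames, preferred_clips, num_heads=2):
    """Score each target frame by multi-head attention against the
    user's preferred-clip features (shapes are illustrative).

    target_frames: (T, d) features of the target video
    preferred_clips: (P, d) features of previously marked highlights
    Returns: (T,) per-frame highlight scores in (0, 1].
    """
    T, d = target_frames.shape
    head_dim = d // num_heads
    scores = np.zeros(T)
    for h in range(num_heads):
        sl = slice(h * head_dim, (h + 1) * head_dim)
        q = target_frames[:, sl]    # queries: target-video frames
        k = preferred_clips[:, sl]  # keys: the user's preferred clips
        attn = softmax(q @ k.T / np.sqrt(head_dim), axis=1)  # (T, P)
        # a frame that attends strongly to any preferred clip scores high
        scores += attn.max(axis=1)
    return scores / num_heads

rng = np.random.default_rng(1)
frames = rng.normal(size=(6, 8))
prefs = rng.normal(size=(3, 8))
s = user_highlight_scores(frames, prefs)
```

Frames similar to the user's past highlights attract sharp attention and score near 1; dissimilar frames fall toward the uniform-attention floor.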

HighlightMe: Detecting Highlights from Human-Centric Videos

no code implementations • ICCV 2021 • Uttaran Bhattacharya, Gang Wu, Stefano Petrangeli, Viswanathan Swaminathan, Dinesh Manocha

We train our network to map the activity- and interaction-based latent structural representations of the different modalities to per-frame highlight scores based on the representativeness of the frames.
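
The mapping described above, from per-frame multimodal latent representations to per-frame highlight scores, reduces to a fuse-then-project step. A toy sketch, assuming a simple concatenation fusion and a linear scoring head (both are my illustrative stand-ins, not the paper's network):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def per_frame_highlight_scores(modal_latents, fusion_weights):
    """Map per-frame latents from several human-centric modalities
    (e.g. pose, face) to one highlight score per frame.

    modal_latents: list of (T, d) arrays, one per modality
    fusion_weights: (num_modalities * d,) projection to a scalar
    Returns: (T,) scores in (0, 1).
    """
    fused = np.concatenate(modal_latents, axis=1)   # (T, M * d)
    return sigmoid(fused @ fusion_weights)

rng = np.random.default_rng(2)
pose = rng.normal(size=(5, 4))   # activity-based latent per frame
face = rng.normal(size=(5, 4))   # interaction-based latent per frame
w = rng.normal(size=8)
scores = per_frame_highlight_scores([pose, face], w)
```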

Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning

1 code implementation • 31 Jul 2021 • Uttaran Bhattacharya, Elizabeth Childs, Nicholas Rewkowski, Dinesh Manocha

Our network consists of two components: a generator to synthesize gestures from a joint embedding space of features encoded from the input speech and the seed poses, and a discriminator to distinguish between the synthesized pose sequences and real 3D pose sequences.

Generative Adversarial Network • Gesture Generation
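
The abstract spells out the two-component design: a generator decoding a joint speech-and-seed-pose embedding into gestures, and a discriminator judging real versus synthesized sequences. A linear toy sketch of that wiring (all projections and dimensions are invented for the sketch):

```python
import numpy as np

rng = np.random.default_rng(3)

def joint_embedding(speech_feats, seed_poses, W_s, W_p):
    """Project speech features and seed poses into a shared space and sum."""
    return speech_feats @ W_s + seed_poses @ W_p

def generator(embed, W_g):
    """Decode the joint embedding into a pose sequence (toy linear decoder)."""
    return np.tanh(embed @ W_g)

def discriminator(pose_seq, w_d):
    """Score a pose sequence as real (toward 1) or synthesized (toward 0)."""
    logit = pose_seq.mean(axis=0) @ w_d
    return 1.0 / (1.0 + np.exp(-logit))

T, d_s, d_p, d_e = 10, 6, 9, 4          # frames, speech/pose/embed dims
speech = rng.normal(size=(T, d_s))
seeds = rng.normal(size=(T, d_p))
W_s, W_p = rng.normal(size=(d_s, d_e)), rng.normal(size=(d_p, d_e))
W_g = rng.normal(size=(d_e, d_p))
w_d = rng.normal(size=d_p)

fake_poses = generator(joint_embedding(speech, seeds, W_s, W_p), W_g)
score = discriminator(fake_poses, w_d)
```

In adversarial training the generator would be updated to push `score` toward 1 on synthesized sequences while the discriminator learns to push it toward 0.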

Emotions Don't Lie: An Audio-Visual Deepfake Detection Method Using Affective Cues

no code implementations • 14 Mar 2020 • Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha

Additionally, we extract and compare affective cues corresponding to perceived emotion from the two modalities within a video to infer whether the input video is "real" or "fake".

DeepFake Detection • Face Swapping
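
The core idea above, comparing perceived-emotion cues across the visual and audio modalities, can be sketched as an agreement test between two emotion embeddings. The cosine measure and threshold here are illustrative, not the paper's learned similarity:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_likely_fake(face_emotion, speech_emotion, threshold=0.5):
    """Flag a video as likely fake when the perceived-emotion
    embeddings extracted from the two modalities disagree.
    Embeddings and threshold are invented for this sketch."""
    return cosine(face_emotion, speech_emotion) < threshold

agree = np.array([1.0, 0.2, 0.1])
conflict = np.array([-1.0, 0.3, 0.0])
real_like = is_likely_fake(agree, agree * 2.0)  # aligned cues: not flagged
fake_like = is_likely_fake(agree, conflict)     # opposing cues: flagged
```

The intuition is that face-swapped or lip-synced fakes tend to break the natural correlation between facial and vocal affect.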

The Liar's Walk: Detecting Deception with Gait and Gesture

no code implementations • 14 Dec 2019 • Tanmay Randhavane, Uttaran Bhattacharya, Kyra Kapsaskis, Kurt Gray, Aniket Bera, Dinesh Manocha

We present a data-driven deep neural algorithm for detecting deceptive walking behavior using nonverbal cues like gaits and gestures.

Action Classification

Forecasting Trajectory and Behavior of Road-Agents Using Spectral Clustering in Graph-LSTMs

no code implementations • arXiv 2019 • Rohan Chandra, Tianrui Guan, Srujan Panuganti, Trisha Mittal, Uttaran Bhattacharya, Aniket Bera, Dinesh Manocha

In practice, our approach reduces the average prediction error by more than 54% over prior algorithms and achieves a weighted average accuracy of 91.2% for behavior prediction.

Robotics

M3ER: Multiplicative Multimodal Emotion Recognition Using Facial, Textual, and Speech Cues

no code implementations • 9 Nov 2019 • Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha

Our approach combines cues from multiple co-occurring modalities (such as face, text, and speech) and also is more robust than other methods to sensor noise in any of the individual modalities.

Multimodal Emotion Recognition
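
The "multiplicative" combination named in the title can be illustrated with per-class products of modality probabilities, which naturally down-weight a noisy modality whose distribution is flat. This is a simplified stand-in for M3ER's actual fusion, with invented logits:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multiplicative_fusion(modal_logits):
    """Combine per-modality emotion logits multiplicatively: each
    class probability is the product of the modalities' probabilities,
    then renormalized. A modality with a near-uniform distribution
    (sensor noise) barely shifts the result."""
    probs = np.ones_like(modal_logits[0])
    for logits in modal_logits:
        probs = probs * softmax(logits)
    return probs / probs.sum()

face = np.array([2.0, 0.1, 0.1])    # confident first class
text = np.array([1.5, 0.2, 0.1])    # agrees with face
speech = np.array([0.0, 0.0, 0.0])  # noisy: uniform, self-discounting
fused = multiplicative_fusion([face, text, speech])
pred = int(np.argmax(fused))
```

Here the uniform speech modality leaves the face/text consensus intact, which is the robustness-to-noise property the abstract claims.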

STEP: Spatial Temporal Graph Convolutional Networks for Emotion Perception from Gaits

1 code implementation • 28 Oct 2019 • Uttaran Bhattacharya, Trisha Mittal, Rohan Chandra, Tanmay Randhavane, Aniket Bera, Dinesh Manocha

We use hundreds of annotated real-world gait videos and augment them with thousands of annotated synthetic gaits generated using a novel generative network called STEP-Gen, built on an ST-GCN based Conditional Variational Autoencoder (CVAE).

General Classification

RobustTP: End-to-End Trajectory Prediction for Heterogeneous Road-Agents in Dense Traffic with Noisy Sensor Inputs

1 code implementation • 20 Jul 2019 • Rohan Chandra, Uttaran Bhattacharya, Christian Roncal, Aniket Bera, Dinesh Manocha

RobustTP is an approach that first computes trajectories using a combination of a non-linear motion model and a deep learning-based instance segmentation algorithm.

Robotics

RoadTrack: Realtime Tracking of Road Agents in Dense and Heterogeneous Environments

1 code implementation • 25 Jun 2019 • Rohan Chandra, Uttaran Bhattacharya, Tanmay Randhavane, Aniket Bera, Dinesh Manocha

We present a realtime tracking algorithm, RoadTrack, to track heterogeneous road-agents in dense traffic videos.

Robotics

Identifying Emotions from Walking using Affective and Deep Features

no code implementations • 14 Jun 2019 • Tanmay Randhavane, Uttaran Bhattacharya, Kyra Kapsaskis, Kurt Gray, Aniket Bera, Dinesh Manocha

We also present an EWalk (Emotion Walk) dataset that consists of videos of walking individuals with gaits and labeled emotions.

Emotion Recognition

Efficient and Robust Registration on the 3D Special Euclidean Group

no code implementations • ICCV 2019 • Uttaran Bhattacharya, Venu Madhav Govindu

Our approach significantly outperforms the state-of-the-art robust 3D registration method based on a line process in terms of both speed and accuracy.

Motion Estimation

TraPHic: Trajectory Prediction in Dense and Heterogeneous Traffic Using Weighted Interactions

2 code implementations • CVPR 2019 • Rohan Chandra, Uttaran Bhattacharya, Aniket Bera, Dinesh Manocha

We evaluate the performance of our prediction algorithm, TraPHic, on the standard datasets and also introduce a new dense, heterogeneous traffic dataset corresponding to urban Asian videos and agent trajectories.

Trajectory Prediction • Robotics
