no code implementations • 23 Sep 2024 • Jin Huang, Subhadra Gopalakrishnan, Trisha Mittal, Jake Zuena, Jaclyn Pytlarz
Recent advancements in Artificial Intelligence have led to remarkable improvements in generating realistic human faces.
no code implementations • 26 Oct 2022 • Trisha Mittal, Zakaria Aldeneh, Masha Fedzechkina, Anurag Ranjan, Barry-John Theobald
Synthesizing natural head motion to accompany speech for an embodied conversational agent is necessary for providing a rich interactive experience.
1 code implementation • 26 Jul 2022 • Trisha Mittal, Ritwik Sinha, Viswanathan Swaminathan, John Collomosse, Dinesh Manocha
To this end, we present VideoSham, a dataset consisting of 826 videos (413 real and 413 manipulated).
no code implementations • CVPR 2022 • Vikram Gupta, Trisha Mittal, Puneet Mathur, Vaibhav Mishra, Mayank Maheshwari, Aniket Bera, Debdoot Mukherjee, Dinesh Manocha
We present 3MASSIV, a multilingual, multimodal and multi-aspect, expertly-annotated dataset of diverse short videos extracted from the short-video social media platform Moj.
2 code implementations • CVPR 2021 • Trisha Mittal, Puneet Mathur, Aniket Bera, Dinesh Manocha
We use an LSTM-based learning model for emotion perception.
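A minimal sketch of what an LSTM-based emotion perception head could look like, assuming per-frame feature vectors as input; the dimensions and class count below are illustrative and not taken from the paper.

```python
import torch
import torch.nn as nn

class EmotionLSTM(nn.Module):
    def __init__(self, feat_dim=128, hidden_dim=64, num_emotions=4):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_emotions)

    def forward(self, x):
        # x: (batch, time, feat_dim) sequence of per-frame features
        _, (h_n, _) = self.lstm(x)        # final hidden state summarizes the clip
        return self.classifier(h_n[-1])   # per-clip emotion logits

logits = EmotionLSTM()(torch.randn(8, 30, 128))  # e.g. 30 frames of 128-d features
```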
no code implementations • 21 Feb 2021 • Puneet Mathur, Trisha Mittal, Dinesh Manocha
We present a new approach, which we call AdaGTCN, for identifying human reader intent from Electroencephalogram (EEG) and Eye-movement (EM) data in order to help differentiate between normal reading and task-oriented reading.
no code implementations • 25 Apr 2020 • Abhishek Kumar, Trisha Mittal, Dinesh Manocha
We present MCQA, a learning-based algorithm for multimodal question answering.
no code implementations • CVPR 2020 • Trisha Mittal, Pooja Guhan, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha
We report an AP of 65.83 across 4 categories on GroupWalk, which is also an improvement over prior methods.
Ranked #2 on Emotion Recognition in Context on CAER
Emotion Recognition in Context • Multimodal Emotion Recognition
no code implementations • 14 Mar 2020 • Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha
Additionally, we extract and compare affective cues corresponding to perceived emotion from the two modalities within a video to infer whether the input video is "real" or "fake".
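As an illustration of this modality-agreement idea, the sketch below scores how well affect embeddings extracted from the face and speech tracks agree; the embeddings, encoders, and threshold are hypothetical stand-ins rather than the paper's method.

```python
import torch
import torch.nn.functional as F

def modality_agreement(face_emb: torch.Tensor, speech_emb: torch.Tensor) -> torch.Tensor:
    # Cosine similarity between perceived-affect embeddings of the two modalities.
    return F.cosine_similarity(face_emb, speech_emb, dim=-1)

def flag_as_fake(face_emb, speech_emb, threshold=0.5):
    # Low agreement between the modalities is treated as evidence of manipulation.
    return modality_agreement(face_emb, speech_emb) < threshold
```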
no code implementations • arXiv 2019 • Rohan Chandra, Tianrui Guan, Srujan Panuganti, Trisha Mittal, Uttaran Bhattacharya, Aniket Bera, Dinesh Manocha
In practice, our approach reduces the average prediction error by more than 54% over prior algorithms and achieves a weighted average accuracy of 91.2% for behavior prediction.
Ranked #1 on Trajectory Prediction on ApolloScape
Robotics
no code implementations • ECCV 2020 • Uttaran Bhattacharya, Christian Roncal, Trisha Mittal, Rohan Chandra, Kyra Kapsaskis, Kurt Gray, Aniket Bera, Dinesh Manocha
For the annotated data, we also train a classifier to map the latent embeddings to emotion labels.
no code implementations • 9 Nov 2019 • Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha
Our approach combines cues from multiple co-occurring modalities (such as face, text, and speech) and is more robust than other methods to sensor noise in any of the individual modalities.
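A rough late-fusion sketch of combining face, text, and speech cues follows; the per-modality heads and the simple logit averaging used here are illustrative assumptions, not the paper's fusion mechanism.

```python
import torch
import torch.nn as nn

class LateFusionEmotionClassifier(nn.Module):
    def __init__(self, face_dim=256, text_dim=300, speech_dim=128, num_emotions=4):
        super().__init__()
        self.heads = nn.ModuleDict({
            "face": nn.Linear(face_dim, num_emotions),
            "text": nn.Linear(text_dim, num_emotions),
            "speech": nn.Linear(speech_dim, num_emotions),
        })

    def forward(self, feats):
        # feats: dict mapping modality name -> (batch, dim) feature tensor;
        # averaging per-modality logits keeps one noisy modality from dominating.
        logits = [head(feats[name]) for name, head in self.heads.items()]
        return torch.stack(logits, dim=0).mean(dim=0)
```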
1 code implementation • 28 Oct 2019 • Uttaran Bhattacharya, Trisha Mittal, Rohan Chandra, Tanmay Randhavane, Aniket Bera, Dinesh Manocha
We use hundreds of annotated real-world gait videos and augment them with thousands of annotated synthetic gaits generated using a novel generative network called STEP-Gen, built on an ST-GCN based Conditional Variational Autoencoder (CVAE).
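For intuition, a bare-bones conditional VAE for label-conditioned data augmentation is sketched below; it uses plain MLP layers in place of the ST-GCN blocks, so it is only a schematic of the generative step, not STEP-Gen itself.

```python
import torch
import torch.nn as nn

class ConditionalVAE(nn.Module):
    def __init__(self, input_dim=96, label_dim=4, latent_dim=16, hidden_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim + label_dim, hidden_dim), nn.ReLU())
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + label_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
        )

    def forward(self, x, y):
        h = self.encoder(torch.cat([x, y], dim=-1))
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(torch.cat([z, y], dim=-1)), mu, logvar

    @torch.no_grad()
    def sample(self, y):
        # Generate a synthetic example conditioned on the (one-hot) emotion label y.
        z = torch.randn(y.size(0), self.to_mu.out_features)
        return self.decoder(torch.cat([z, y], dim=-1))
```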
1 code implementation • 29 Jan 2018 • Ravi Kiran Sarvadevabhatla, Shiv Surya, Trisha Mittal, Venkatesh Babu Radhakrishnan
Similarly, performance on multi-disciplinary tasks such as Visual Question Answering (VQA) is considered a marker for gauging progress in Computer Vision.