Search Results for author: Dinesh Manocha

Found 166 papers, 57 papers with code

DocTime: A Document-level Temporal Dependency Graph Parser

no code implementations • NAACL 2022 • Puneet Mathur, Vlad Morariu, Verena Kaynig-Fittkau, Jiuxiang Gu, Franck Dernoncourt, Quan Tran, Ani Nenkova, Dinesh Manocha, Rajiv Jain

We introduce DocTime, a novel temporal dependency graph (TDG) parser that takes a text document as input and produces a temporal dependency graph.

"Don't forget to put the milk back!" Dataset for Enabling Embodied Agents to Detect Anomalous Situations

no code implementations • 12 Apr 2024 • James F. Mullen Jr, Prasoon Goyal, Robinson Piramuthu, Michael Johnston, Dinesh Manocha, Reza Ghanadan

Our work assists in this goal by enabling robots to inform their users of dangerous or unsanitary anomalies in their home.

AGL-NET: Aerial-Ground Cross-Modal Global Localization with Varying Scales

no code implementations • 4 Apr 2024 • Tianrui Guan, Ruiqi Xian, Xijun Wang, Xiyang Wu, Mohamed Elnoor, Daeun Song, Dinesh Manocha

We present AGL-NET, a novel learning-based method for global localization using LiDAR point clouds and satellite maps.

PoCo: Point Context Cluster for RGBD Indoor Place Recognition

no code implementations • 3 Apr 2024 • Jing Liang, Zhuo Deng, Zheming Zhou, Omid Ghasemalizadeh, Dinesh Manocha, Min Sun, Cheng-Hao Kuo, Arnie Sen

We present a novel end-to-end algorithm (PoCo) for the indoor RGB-D place recognition task, aimed at identifying the most likely match for a given query frame within a reference database.

CoDa: Constrained Generation based Data Augmentation for Low-Resource NLP

no code implementations • 30 Mar 2024 • Chandra Kiran Reddy Evuru, Sreyan Ghosh, Sonal Kumar, Ramaneswaran S, Utkarsh Tyagi, Dinesh Manocha

We present CoDa (Constrained Generation based Data Augmentation), a controllable, effective, and training-free data augmentation technique for low-resource (data-scarce) NLP.

Data Augmentation Instruction Following

Do Vision-Language Models Understand Compound Nouns?

no code implementations • 30 Mar 2024 • Sonal Kumar, Sreyan Ghosh, S Sakshi, Utkarsh Tyagi, Dinesh Manocha

We curate Compun, a novel benchmark with 400 unique and commonly used CNs, to evaluate the effectiveness of VLMs in interpreting CNs.

Image Retrieval Language Modelling +2

Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis

no code implementations • 18 Mar 2024 • Vishnu Sashank Dorbala, Sanjoy Chowdhury, Dinesh Manocha

We present a novel approach to automatically synthesize "wayfinding instructions" for an embodied robot agent.

In-Context Learning Question Answering +1

Global Optimality without Mixing Time Oracles in Average-reward RL via Multi-level Actor-Critic

no code implementations • 18 Mar 2024 • Bhrij Patel, Wesley A. Suttle, Alec Koppel, Vaneet Aggarwal, Brian M. Sadler, Amrit Singh Bedi, Dinesh Manocha

In the context of average-reward reinforcement learning, the requirement for oracle knowledge of the mixing time, a measure of the duration a Markov chain under a fixed policy needs to achieve its stationary distribution, poses a significant challenge for the global convergence of policy gradient methods.

Policy Gradient Methods

Right Place, Right Time! Towards ObjectNav for Non-Stationary Goals

no code implementations • 14 Mar 2024 • Vishnu Sashank Dorbala, Bhrij Patel, Amrit Singh Bedi, Dinesh Manocha

We address this concern by inferring results on two cases for object placement: one where the objects placed follow a routine or a path, and the other where they are placed at random.

Object Visual Grounding

Beyond Joint Demonstrations: Personalized Expert Guidance for Efficient Multi-Agent Reinforcement Learning

no code implementations • 13 Mar 2024 • Peihong Yu, Manav Mishra, Alec Koppel, Carl Busart, Priya Narayan, Dinesh Manocha, Amrit Bedi, Pratap Tokekar

Multi-Agent Reinforcement Learning (MARL) algorithms face the challenge of efficient exploration due to the exponential increase in the size of the joint state-action space.

Efficient Exploration Multi-agent Reinforcement Learning +1

On the Safety Concerns of Deploying LLMs/VLMs in Robotics: Highlighting the Risks and Vulnerabilities

no code implementations • 15 Feb 2024 • Xiyang Wu, Ruiqi Xian, Tianrui Guan, Jing Liang, Souradip Chakraborty, Fuxiao Liu, Brian Sadler, Dinesh Manocha, Amrit Singh Bedi

However, such integration can introduce significant vulnerabilities, in terms of their susceptibility to adversarial attacks due to the language models, potentially leading to catastrophic consequences.

Language Modelling

MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences

no code implementations • 14 Feb 2024 • Souradip Chakraborty, Jiahao Qiu, Hui Yuan, Alec Koppel, Furong Huang, Dinesh Manocha, Amrit Singh Bedi, Mengdi Wang

Reinforcement Learning from Human Feedback (RLHF) aligns language models to human preferences by employing a singular reward model derived from preference data.

Fairness reinforcement-learning

A Closer Look at the Limitations of Instruction Tuning

no code implementations • 3 Feb 2024 • Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Ramaneswaran S, Deepali Aneja, Zeyu Jin, Ramani Duraiswami, Dinesh Manocha

Our findings reveal that responses generated solely from pre-trained knowledge consistently outperform responses by models that learn any form of new knowledge from IT on open-source datasets.

Hallucination

REBEL: A Regularization-Based Solution for Reward Overoptimization in Robotic Reinforcement Learning from Human Feedback

no code implementations • 22 Dec 2023 • Souradip Chakraborty, Anukriti Singh, Amisha Bhaskar, Pratap Tokekar, Dinesh Manocha, Amrit Singh Bedi

Current methods to mitigate this misalignment work by learning reward functions from human preferences; however, they inadvertently introduce a risk of reward overoptimization.

Bilevel Optimization Continuous Control +2

Stable Distillation: Regularizing Continued Pre-training for Low-Resource Automatic Speech Recognition

1 code implementation • 20 Dec 2023 • Ashish Seth, Sreyan Ghosh, S. Umesh, Dinesh Manocha

Specifically, first, we perform vanilla continued pre-training on an initial SSL pre-trained model on the target domain ASR dataset and call it the teacher.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

FusDom: Combining In-Domain and Out-of-Domain Knowledge for Continuous Self-Supervised Learning

1 code implementation • 20 Dec 2023 • Ashish Seth, Sreyan Ghosh, S. Umesh, Dinesh Manocha

Continued pre-training (CP) offers multiple advantages, like target domain adaptation and the potential to exploit the continuous stream of unlabeled data available online.

Domain Adaptation Self-Supervised Learning

APoLLo: Unified Adapter and Prompt Learning for Vision Language Models

no code implementations • 4 Dec 2023 • Sanjoy Chowdhury, Sayan Nag, Dinesh Manocha

Our method is designed to substantially improve the generalization capabilities of VLP models when they are fine-tuned in a few-shot setting.

AV-RIR: Audio-Visual Room Impulse Response Estimation

no code implementations • 30 Nov 2023 • Anton Ratnarajah, Sreyan Ghosh, Sonal Kumar, Purva Chiniya, Dinesh Manocha

We propose AV-RIR, a novel multi-modal multi-task learning approach to accurately estimate the RIR from a given reverberant speech signal and the visual cues of its corresponding environment.

Multi-Task Learning Room Impulse Response (RIR) +1

HawkI: Homography & Mutual Information Guidance for 3D-free Single Image to Aerial View

2 code implementations • 27 Nov 2023 • Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha

It seamlessly blends the visual features from the input image within a pretrained text-to-2D image stable diffusion model with a test-time optimization process for a careful bias-variance trade-off, which uses an Inverse Perspective Mapping (IPM) homography transformation to provide subtle cues for aerial-view synthesis.

Novel View Synthesis

UAV-Sim: NeRF-based Synthetic Data Generation for UAV-based Perception

no code implementations • 25 Oct 2023 • Christopher Maxey, Jaehoon Choi, Hyungtae Lee, Dinesh Manocha, Heesung Kwon

Using various synthetic renderers in conjunction with perception models is prevalent to create synthetic data to augment the learning in the ground-based imaging domain.

Data Augmentation Image Generation +2

DALE: Generative Data Augmentation for Low-Resource Legal NLP

1 code implementation • 24 Oct 2023 • Sreyan Ghosh, Chandra Kiran Evuru, Sonal Kumar, S Ramaneswaran, S Sakshi, Utkarsh Tyagi, Dinesh Manocha

We present DALE, a novel and effective generative Data Augmentation framework for low-resource LEgal NLP.

Data Augmentation Denoising +2

Towards Possibilities & Impossibilities of AI-generated Text Detection: A Survey

no code implementations • 23 Oct 2023 • Soumya Suvra Ghosal, Souradip Chakraborty, Jonas Geiping, Furong Huang, Dinesh Manocha, Amrit Singh Bedi

But in parallel to the development of detection frameworks, researchers have also concentrated on designing strategies to elude detection, i.e., focusing on the impossibilities of AI-generated text detection.

Misinformation Text Detection

HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models

5 code implementations • 23 Oct 2023 • Tianrui Guan, Fuxiao Liu, Xiyang Wu, Ruiqi Xian, Zongxia Li, Xiaoyu Liu, Xijun Wang, Lichang Chen, Furong Huang, Yaser Yacoob, Dinesh Manocha, Tianyi Zhou

Our comprehensive case studies within HallusionBench shed light on the challenges of hallucination and illusion in LVLMs.

Ranked #1 on Visual Question Answering (VQA) on HallusionBench

Hallucination Visual Question Answering (VQA)

Indoor Wireless Signal Modeling with Smooth Surface Diffraction Effects

no code implementations • 16 Oct 2023 • Ruichen Wang, Samuel Audia, Dinesh Manocha

We present a novel algorithm that enhances the accuracy of electromagnetic field simulations in indoor environments by incorporating the Uniform Geometrical Theory of Diffraction (UTD) for surface diffraction.

CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models

no code implementations • 12 Oct 2023 • Sreyan Ghosh, Ashish Seth, Sonal Kumar, Utkarsh Tyagi, Chandra Kiran Evuru, S. Ramaneswaran, S. Sakshi, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha

In this paper, we propose CompA, a collection of two expert-annotated benchmarks with a majority of real-world audio samples, to evaluate compositional reasoning in ALMs.

Attribute Audio Classification +1

RECAP: Retrieval-Augmented Audio Captioning

1 code implementation • 18 Sep 2023 • Sreyan Ghosh, Sonal Kumar, Chandra Kiran Reddy Evuru, Ramani Duraiswami, Dinesh Manocha

We present RECAP (REtrieval-Augmented Audio CAPtioning), a novel and effective audio captioning system that generates captions conditioned on an input audio and other captions similar to the audio retrieved from a datastore.

AudioCaps Audio captioning +2

VAPOR: Legged Robot Navigation in Outdoor Vegetation Using Offline Reinforcement Learning

1 code implementation • 14 Sep 2023 • Kasun Weerakoon, Adarsh Jagan Sathyamoorthy, Mohamed Elnoor, Dinesh Manocha

We present VAPOR, a novel method for autonomous legged robot navigation in unstructured, densely vegetated outdoor environments using offline Reinforcement Learning (RL).

Offline RL reinforcement-learning +2

AdVerb: Visually Guided Audio Dereverberation

no code implementations • ICCV 2023 • Sanjoy Chowdhury, Sreyan Ghosh, Subhrajyoti Dasgupta, Anton Ratnarajah, Utkarsh Tyagi, Dinesh Manocha

We present AdVerb, a novel audio-visual dereverberation framework that uses visual cues in addition to the reverberant sound to estimate clean audio.

Speaker Verification Speech Enhancement +2

ASPIRE: Language-Guided Augmentation for Robust Image Classification

no code implementations • 19 Aug 2023 • Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Utkarsh Tyagi, Sakshi Singh, Sanjoy Chowdhury, Dinesh Manocha

This paper presents ASPIRE (Language-guided data Augmentation for SPurIous correlation REmoval), a simple yet effective solution for expanding the training dataset with synthetic images without spurious features.

Classification Data Augmentation +2

PARL: A Unified Framework for Policy Alignment in Reinforcement Learning

no code implementations • 3 Aug 2023 • Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Dinesh Manocha, Huazheng Wang, Mengdi Wang, Furong Huang

We present a novel unified bilevel optimization-based framework, PARL, formulated to address the recently highlighted critical issue of policy alignment in reinforcement learning using utility or preference-based feedback.

Bilevel Optimization Procedure Learning +2

LoLep: Single-View View Synthesis with Locally-Learned Planes and Self-Attention Occlusion Inference

no code implementations • ICCV 2023 • Cong Wang, Yu-Ping Wang, Dinesh Manocha

We demonstrate the effectiveness of our approach and generate state-of-the-art results on different datasets.

Human Trajectory Forecasting with Explainable Behavioral Uncertainty

no code implementations • 4 Jul 2023 • Jiangbei Yue, Dinesh Manocha, He Wang

Model-free methods offer superior prediction accuracy but lack explainability, while model-based methods provide explainability but cannot predict well.

Self-Driving Cars Trajectory Forecasting

iPLAN: Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning

1 code implementation • 9 Jun 2023 • Xiyang Wu, Rohan Chandra, Tianrui Guan, Amrit Singh Bedi, Dinesh Manocha

Our approach for intent-aware planning, iPLAN, allows agents to infer nearby drivers' intents solely from their local observations.

Autonomous Vehicles Collision Avoidance +3

Ada-NAV: Adaptive Trajectory Length-Based Sample Efficient Policy Learning for Robotic Navigation

no code implementations • 9 Jun 2023 • Bhrij Patel, Kasun Weerakoon, Wesley A. Suttle, Alec Koppel, Brian M. Sadler, Tianyi Zhou, Amrit Singh Bedi, Dinesh Manocha

Trajectory length stands as a crucial hyperparameter within reinforcement learning (RL) algorithms, significantly contributing to the sample inefficiency in robotics applications.

Policy Gradient Methods reinforcement-learning +1

ACLM: A Selective-Denoising based Generative Data Augmentation Approach for Low-Resource Complex NER

1 code implementation • 1 Jun 2023 • Sreyan Ghosh, Utkarsh Tyagi, Manan Suri, Sonal Kumar, S Ramaneswaran, Dinesh Manocha

In addition, we demonstrate the application of ACLM to other domains that suffer from data scarcity (e.g., biomedical).

Data Augmentation Denoising +5

PLAR: Prompt Learning for Action Recognition

no code implementations • 21 May 2023 • Xijun Wang, Ruiqi Xian, Tianrui Guan, Dinesh Manocha

We evaluate our approach on datasets consisting of both ground camera videos and aerial videos, and scenes with single-agent and multi-agent actions.

Ranked #1 on Action Recognition on Okutama-Action

Action Recognition Optical Flow Estimation

BioAug: Conditional Generation based Data Augmentation for Low-Resource Biomedical NER

1 code implementation • 18 May 2023 • Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Dinesh Manocha

Though data augmentation has been shown to be highly effective for low-resource NER in general, existing data augmentation techniques fail to produce factual and diverse augmentations for BioNER.

Data Augmentation named-entity-recognition +2

PMI Sampler: Patch Similarity Guided Frame Selection for Aerial Action Recognition

1 code implementation • 14 Apr 2023 • Ruiqi Xian, Xijun Wang, Divya Kothandaraman, Dinesh Manocha

Our algorithm utilizes the motion bias within aerial videos, which enables the selection of motion-salient frames.

Ranked #1 on Action Recognition on UAV-Human

Action Recognition Temporal Action Localization

On the Possibilities of AI-Generated Text Detection

no code implementations • 10 Apr 2023 • Souradip Chakraborty, Amrit Singh Bedi, Sicheng Zhu, Bang An, Dinesh Manocha, Furong Huang

Our work addresses the critical issue of distinguishing text generated by Large Language Models (LLMs) from human-produced text, a task essential for numerous applications.

Text Detection

CrossLoc3D: Aerial-Ground Cross-Source 3D Place Recognition

1 code implementation • ICCV 2023 • Tianrui Guan, Aswath Muthuselvam, Montana Hoover, Xijun Wang, Jing Liang, Adarsh Jagan Sathyamoorthy, Damon Conover, Dinesh Manocha

We present CrossLoc3D, a novel 3D place recognition method that solves a large-scale point matching problem in a cross-source setting.

Ranked #1 on 3D Place Recognition on CS-Campus3D

3D Place Recognition Metric Learning

TMO: Textured Mesh Acquisition of Objects with a Mobile Device by using Differentiable Rendering

no code implementations • CVPR 2023 • Jaehoon Choi, Dongki Jung, Taejae Lee, SangWook Kim, Youngdong Jung, Dinesh Manocha, Donghwan Lee

We present a new pipeline for acquiring a textured mesh in the wild with a single smartphone which offers access to images, depth maps, and valid poses.

3D Reconstruction Surface Reconstruction +1

PACE: Data-Driven Virtual Agent Interaction in Dense and Cluttered Environments

no code implementations • 24 Mar 2023 • James Mullen, Dinesh Manocha

We compare our method with prior motion generating techniques and highlight the benefits of our method with a perceptual study and physical plausibility metrics.

Motion Synthesis

Dynamic EM Ray Tracing for Large Urban Scenes with Multiple Receivers

no code implementations • 19 Mar 2023 • Ruichen Wang, Dinesh Manocha

We present a novel ray tracing-based radio propagation algorithm that can handle large urban scenes with hundreds or thousands of dynamic objects and receivers.

Blocking

Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances

1 code implementation • 17 Mar 2023 • Arun V. Reddy, Ketul Shah, William Paul, Rohita Mocharla, Judy Hoffman, Kapil D. Katyal, Dinesh Manocha, Celso M. de Melo, Rama Chellappa

The dataset is composed of both real and synthetic videos from seven gesture classes, and is intended to support the study of synthetic-to-real domain shift for video-based action recognition.

Action Recognition Domain Adaptation +1

Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models

3 code implementations • 15 Mar 2023 • Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha

Aerial Diffusion leverages a pretrained text-image diffusion model for prior knowledge.

RE-MOVE: An Adaptive Policy Design for Robotic Navigation Tasks in Dynamic Environments via Language-Based Feedback

no code implementations • 14 Mar 2023 • Souradip Chakraborty, Kasun Weerakoon, Prithvi Poddar, Mohamed Elnoor, Priya Narayanan, Carl Busart, Pratap Tokekar, Amrit Singh Bedi, Dinesh Manocha

Reinforcement learning-based policies for continuous control robotic navigation tasks often fail to adapt to changes in the environment during real-time deployment, which may result in catastrophic failures.

Continuous Control Zero-Shot Learning

UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation

1 code implementation • 10 Mar 2023 • Ashish Seth, Sreyan Ghosh, S. Umesh, Dinesh Manocha

Unlike prior works, which directly fine-tune a self-supervised pre-trained encoder on a target dataset, we use the encoder to generate pseudo-labels for unsupervised fine-tuning before the actual fine-tuning step.

Audio Classification Self-Supervised Learning

Can an Embodied Agent Find Your "Cat-shaped Mug"? LLM-Guided Exploration for Zero-Shot Object Navigation

1 code implementation • 6 Mar 2023 • Vishnu Sashank Dorbala, James F. Mullen Jr., Dinesh Manocha

We present LGX (Language-guided Exploration), a novel algorithm for Language-Driven Zero-Shot Object Goal Navigation (L-ZSON), where an embodied agent navigates to a uniquely described target object in a previously unseen environment.

Motion Planning Object +3

MITFAS: Mutual Information based Temporal Feature Alignment and Sampling for Aerial Video Action Recognition

1 code implementation • 5 Mar 2023 • Ruiqi Xian, Xijun Wang, Dinesh Manocha

We present a novel approach for action recognition in UAV videos.

Ranked #2 on Action Recognition on UAV-Human

Action Recognition Temporal Action Localization

AZTR: Aerial Video Action Recognition with Auto Zoom and Temporal Reasoning

no code implementations • 2 Mar 2023 • Xijun Wang, Ruiqi Xian, Tianrui Guan, Celso M. de Melo, Stephen M. Nogar, Aniket Bera, Dinesh Manocha

We propose a novel approach for aerial video action recognition.

Ranked #1 on Action Recognition on RoCoG-v2

Action Recognition Temporal Action Localization

CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network

1 code implementation • 2 Mar 2023 • Sreyan Ghosh, Manan Suri, Purva Chiniya, Utkarsh Tyagi, Sonal Kumar, Dinesh Manocha

The tremendous growth of social media users interacting in online conversations has led to significant growth in hate speech, affecting people from various demographics.

Listen2Scene: Interactive material-aware binaural sound propagation for reconstructed 3D scenes

no code implementations • 2 Feb 2023 • Anton Ratnarajah, Dinesh Manocha

We propose a novel neural-network-based binaural sound propagation method to generate acoustic effects for indoor 3D models of real environments.

Generative Adversarial Network

STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning

no code implementations • 28 Jan 2023 • Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Mengdi Wang, Furong Huang, Dinesh Manocha

Directed Exploration is a crucial challenge in reinforcement learning (RL), especially when rewards are sparse.

Model-based Reinforcement Learning reinforcement-learning +1

Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic

no code implementations • 28 Jan 2023 • Wesley A. Suttle, Amrit Singh Bedi, Bhrij Patel, Brian M. Sadler, Alec Koppel, Dinesh Manocha

Many existing reinforcement learning (RL) methods employ stochastic gradient iteration on the back end, whose stability hinges upon a hypothesis that the data-generating process mixes exponentially fast with a rate parameter that appears in the step-size selection.

Reinforcement Learning (RL)

LayerDoc: Layer-wise Extraction of Spatial Hierarchical Structure in Visually-Rich Documents

no code implementations • IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023 • Puneet Mathur, Rajiv Jain, Ashutosh Mehra, Jiuxiang Gu, Franck Dernoncourt, Anandhavelu N, Quan Tran, Verena Kaynig-Fittkau, Ani Nenkova, Dinesh Manocha, Vlad I. Morariu

Experiments show that our approach outperforms competitive baselines by 10-15% on three diverse datasets of forms and mobile app screen layouts for the tasks of spatial region classification, higher-order group identification, layout hierarchy extraction, reading order detection, and word grouping.

Reading Order Detection

Synthetic Wave-Geometric Impulse Responses for Improved Speech Dereverberation

no code implementations • 10 Dec 2022 • Rohith Aralikatti, Zhenyu Tang, Dinesh Manocha

We present a novel approach to improve the performance of learning-based speech dereverberation using accurate synthetic datasets.

Speech Dereverberation

Towards Improved Room Impulse Response Estimation for Speech Recognition

no code implementations • 8 Nov 2022 • Anton Ratnarajah, Ishwarya Ananthabhotla, Vamsi Krishna Ithapu, Pablo Hoffmann, Dinesh Manocha, Paul Calamia

We propose a novel approach for blind room impulse response (RIR) estimation systems in the context of a downstream application scenario, far-field automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

SLICER: Learning universal audio representations using low-resource self-supervised pre-training

1 code implementation • 2 Nov 2022 • Ashish Seth, Sreyan Ghosh, S. Umesh, Dinesh Manocha

We present a new Self-Supervised Learning (SSL) approach to pre-train encoders on unlabeled audio data that reduces the need for large amounts of labeled data for audio and speech classification.

Audio Classification Clustering +3

MAST: Multiscale Audio Spectrogram Transformers

1 code implementation • 2 Nov 2022 • Sreyan Ghosh, Ashish Seth, S. Umesh, Dinesh Manocha

We present Multiscale Audio Spectrogram Transformer (MAST) for audio classification, which brings the concept of multiscale feature hierarchies to the Audio Spectrogram Transformer (AST).

Audio Classification Keyword Spotting +1

MSVIPER: Improved Policy Distillation for Reinforcement-Learning-Based Robot Navigation

no code implementations • 19 Sep 2022 • Aaron M. Roth, Jing Liang, Ram Sriram, Elham Tabassi, Dinesh Manocha

Moreover, we present efficient policy distillation and tree-modification techniques that take advantage of the decision tree structure to allow improvements to a policy without retraining.

Imitation Learning reinforcement-learning +2

Differentiable Frequency-based Disentanglement for Aerial Video Action Recognition

no code implementations • 15 Sep 2022 • Divya Kothandaraman, Ming Lin, Dinesh Manocha

We build a differentiable static-dynamic frequency mask prior to model the salient static and dynamic pixels in the video, crucial for the underlying task of action recognition.

Action Recognition Activity Recognition In Videos +2

Placing Human Animations into 3D Scenes by Learning Interaction- and Geometry-Driven Keyframes

no code implementations • 13 Sep 2022 • James F. Mullen Jr, Divya Kothandaraman, Aniket Bera, Dinesh Manocha

We compare our method, which we call PAAK, with prior approaches, including POSA, PROX ground truth, and a motion synthesis method, and highlight the benefits of our method with a perceptual study.

Motion Synthesis

DC-MRTA: Decentralized Multi-Robot Task Allocation and Navigation in Complex Environments

no code implementations • 7 Sep 2022 • Aakriti Agrawal, Senthil Hariharan, Amrit Singh Bedi, Dinesh Manocha

At the higher level, we solve the task allocation by formulating it in terms of Markov Decision Processes and choosing the appropriate rewards to minimize the Total Travel Delay (TTD).

Reinforcement Learning (RL)

Vision-Centric BEV Perception: A Survey

1 code implementation • 4 Aug 2022 • Yuexin Ma, Tai Wang, Xuyang Bai, Huitong Yang, Yuenan Hou, Yaming Wang, Yu Qiao, Ruigang Yang, Dinesh Manocha, Xinge Zhu

In recent years, vision-centric Bird's Eye View (BEV) perception has garnered significant interest from both industry and academia due to its inherent advantages, such as providing an intuitive representation of the world and being conducive to data fusion.

A Repulsive Force Unit for Garment Collision Handling in Neural Networks

no code implementations • 28 Jul 2022 • Qingyang Tan, Yi Zhou, Tuanfeng Wang, Duygu Ceylan, Xin Sun, Dinesh Manocha

Despite recent success, deep learning-based methods for predicting 3D garment deformation under body motion suffer from interpenetration problems between the garment and the body.

Video Manipulations Beyond Faces: A Dataset with Human-Machine Analysis

1 code implementation • 26 Jul 2022 • Trisha Mittal, Ritwik Sinha, Viswanathan Swaminathan, John Collomosse, Dinesh Manocha

To this end, we present VideoSham, a dataset consisting of 826 videos (413 real and 413 manipulated).

Face Swapping Misinformation

Human Trajectory Prediction via Neural Social Physics

1 code implementation • 21 Jul 2022 • Jiangbei Yue, Dinesh Manocha, He Wang

Our new model (Neural Social Physics or NSP) is a deep neural network within which we use an explicit physics model with learnable parameters.

Ranked #1 on Trajectory Prediction on Stanford Drone

Inductive Bias Trajectory Prediction

D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights

1 code implementation • 21 Jul 2022 • Yuzhen Zhang, Wentong Wang, Weizhi Guo, Pei Lv, Mingliang Xu, Wei Chen, Dinesh Manocha

We present a trajectory prediction approach with respect to traffic lights, D2-TPred, which uses a spatial dynamic interaction graph (SDG) and a behavior dependency graph (BDG) to handle the problem of discontinuous dependency in the spatial-temporal space.

Trajectory Prediction

Show Me What I Like: Detecting User-Specific Video Highlights Using Content-Based Multi-Head Attention

no code implementations • 18 Jul 2022 • Uttaran Bhattacharya, Gang Wu, Stefano Petrangeli, Viswanathan Swaminathan, Dinesh Manocha

We propose a method to detect individualized highlights for users on given target videos based on their preferred highlight clips marked on previous videos they have watched.

Highlight Detection

FedBC: Calibrating Global and Local Models via Federated Learning Beyond Consensus

no code implementations • 22 Jun 2022 • Amrit Singh Bedi, Chen Fan, Alec Koppel, Anit Kumar Sahu, Brian M. Sadler, Furong Huang, Dinesh Manocha

In this work, we quantitatively calibrate the performance of global and local models in federated learning through a multi-criterion optimization-based framework, which we cast as a constrained program.

Federated Learning

Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policies

no code implementations • 12 Jun 2022 • Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Pratap Tokekar, Dinesh Manocha

In this paper, we present a novel Heavy-Tailed Stochastic Policy Gradient (HT-PSG) algorithm to deal with the challenges of sparse rewards in continuous control problems.

Continuous Control OpenAI Gym

Posterior Coreset Construction with Kernelized Stein Discrepancy for Model-Based Reinforcement Learning

no code implementations • 2 Jun 2022 • Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Brian M. Sadler, Furong Huang, Pratap Tokekar, Dinesh Manocha

Model-based approaches to reinforcement learning (MBRL) exhibit favorable performance in practice, but their theoretical guarantees in large spaces are mostly restricted to the setting where the transition model is Gaussian or Lipschitz, and they demand a posterior estimate whose representational complexity grows unbounded with time.

Continuous Control Model-based Reinforcement Learning +2

SALAD: Source-free Active Label-Agnostic Domain Adaptation for Classification, Segmentation and Detection

1 code implementation • 24 May 2022 • Divya Kothandaraman, Sumit Shekhar, Abhilasha Sancheti, Manoj Ghuhan, Tripti Shukla, Dinesh Manocha

SALAD has three key benefits: (i) it is task-agnostic, and can be applied across various visual tasks such as classification, segmentation and detection; (ii) it can handle shifts in output label space from the pre-trained source network to the target domain; (iii) it does not require access to source data for adaptation.

Active Learning Domain Adaptation +2

Paper
Code

MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D Scenes

2 code implementations • 18 May 2022 • Anton Ratnarajah, Zhenyu Tang, Rohith Chandrashekar Aralikatti, Dinesh Manocha

We show that the acoustic metrics of the IRs predicted from our MESH2IR match the ground truth with less than 10% error.

2k Speech Dereverberation +1

Paper
Code

Predicting Loose-Fitting Garment Deformations Using Bone-Driven Motion Networks

1 code implementation • 3 May 2022 • Xiaoyu Pan, Jiaming Mai, Xinwei Jiang, Dongxue Tang, Jingxiang Li, Tianjia Shao, Kun Zhou, Xiaogang Jin, Dinesh Manocha

We present a learning algorithm that uses bone-driven motion networks to predict the deformation of loose-fitting garment meshes at interactive rates.

Paper
Code

STCrowd: A Multimodal Dataset for Pedestrian Perception in Crowded Scenes

1 code implementation • CVPR 2022 • Peishan Cong, Xinge Zhu, Feng Qiao, Yiming Ren, Xidong Peng, Yuenan Hou, Lan Xu, Ruigang Yang, Dinesh Manocha, Yuexin Ma

In addition, considering the property of sparse global distribution and density-varying local distribution of pedestrians, we further propose a novel method, Density-aware Hierarchical heatmap Aggregation (DHA), to enhance pedestrian perception in crowded scenes.

Pedestrian Detection Sensor Fusion

Paper
Code

M-MELD: A Multilingual Multi-Party Dataset for Emotion Recognition in Conversations

1 code implementation • 31 Mar 2022 • Sreyan Ghosh, S Ramaneswaran, Utkarsh Tyagi, Harshvardhan Srivastava, Samden Lepcha, S Sakshi, Dinesh Manocha

Expression of emotions is a crucial part of daily human communication.

Emotion Recognition

Paper
Code

MMER: Multimodal Multi-task Learning for Speech Emotion Recognition

1 code implementation • 31 Mar 2022 • Sreyan Ghosh, Utkarsh Tyagi, S Ramaneswaran, Harshvardhan Srivastava, Dinesh Manocha

In this paper, we propose MMER, a novel Multimodal Multi-task learning approach for Speech Emotion Recognition.

Ranked #2 on Speech Emotion Recognition on IEMOCAP (using extra training data)

Multi-Task Learning Speech Emotion Recognition

Paper
Code

3MASSIV: Multilingual, Multimodal and Multi-Aspect dataset of Social Media Short Videos

no code implementations • CVPR 2022 • Vikram Gupta, Trisha Mittal, Puneet Mathur, Vaibhav Mishra, Mayank Maheshwari, Aniket Bera, Debdoot Mukherjee, Dinesh Manocha

We present 3MASSIV, a multilingual, multimodal and multi-aspect, expertly-annotated dataset of diverse short videos extracted from a short-video social media platform, Moj.

Paper
Add Code

FAR: Fourier Aerial Video Recognition

1 code implementation • 21 Mar 2022 • Divya Kothandaraman, Tianrui Guan, Xijun Wang, Sean Hu, Ming Lin, Dinesh Manocha

Our formulation uses a novel Fourier object disentanglement method to innately separate out the human agent (which is typically small) from the background.

Ranked #1 on Action Recognition on UAV Human

Action Recognition Disentanglement +1

Paper
Code

SelfTune: Metrically Scaled Monocular Depth Estimation through Self-Supervised Learning

no code implementations • 10 Mar 2022 • Jaehoon Choi, Dongki Jung, Yonghan Lee, Deokhwa Kim, Dinesh Manocha, Donghwan Lee

Given these metric poses and monocular sequences, we propose a self-supervised learning method for the pre-trained supervised monocular depth networks to enable metrically scaled depth estimation.

Monocular Depth Estimation Robot Navigation +2

Paper
Add Code

Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models

no code implementations • 16 Feb 2022 • Sarala Padi, Seyed Omid Sadjadi, Dinesh Manocha, Ram D. Sriram

Experimental results indicate that both audio and text-based models improve the emotion recognition performance and that the proposed multimodal solution achieves state-of-the-art results on the IEMOCAP benchmark.

Data Augmentation Emotional Intelligence +3

Paper
Add Code

An Intelligent Self-driving Truck System For Highway Transportation

no code implementations • 31 Dec 2021 • Dawei Wang, Lingping Gao, Ziquan Lan, Wei Li, Jiaping Ren, Jiahui Zhang, Peng Zhang, Pei Zhou, Shengao Wang, Jia Pan, Dinesh Manocha, Ruigang Yang

Recently, there have been many advances in the autonomous driving community, attracting a lot of attention from academia and industry.

Autonomous Driving Decision Making

Paper
Add Code

N-Cloth: Predicting 3D Cloth Deformation with Mesh-Based Networks

no code implementations • 13 Dec 2021 • Yudi Li, Min Tang, Yun Yang, Zi Huang, Ruofeng Tong, Shuangcai Yang, Yao Li, Dinesh Manocha

We present a novel mesh-based learning approach (N-Cloth) for plausible 3D cloth deformation prediction.

Paper
Add Code

Active Learning of Neural Collision Handler for Complex 3D Mesh Deformations

no code implementations • 8 Oct 2021 • Qingyang Tan, Zherong Pan, Breannan Smith, Takaaki Shiratori, Dinesh Manocha

We present a robust learning algorithm to detect and handle collisions in 3D deforming meshes.

Active Learning

Paper
Add Code

FAST-RIR: Fast neural diffuse room impulse response generator

2 code implementations • 7 Oct 2021 • Anton Ratnarajah, Shi-Xiong Zhang, Meng Yu, Zhenyu Tang, Dinesh Manocha, Dong Yu

We present a neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Code

HighlightMe: Detecting Highlights from Human-Centric Videos

no code implementations • ICCV 2021 • Uttaran Bhattacharya, Gang Wu, Stefano Petrangeli, Viswanathan Swaminathan, Dinesh Manocha

We train our network to map the activity- and interaction-based latent structural representations of the different modalities to per-frame highlight scores based on the representativeness of the frames.

Paper
Add Code

METEOR: A Dense, Heterogeneous, and Unstructured Traffic Dataset With Rare Behaviors

no code implementations • 16 Sep 2021 • Rohan Chandra, Xijun Wang, Mridul Mahajan, Rahul Kala, Rishitha Palugulla, Chandrababu Naidu, Alok Jain, Dinesh Manocha

We present a new traffic dataset, METEOR, which captures traffic patterns and multi-agent driving behaviors in unstructured scenarios.

Autonomous Driving object-detection +1

Paper
Add Code

MotionHint: Self-Supervised Monocular Visual Odometry with Motion Constraints

1 code implementation • 14 Sep 2021 • Cong Wang, Yu-Ping Wang, Dinesh Manocha

A key aspect of our approach is to use an appropriate motion model that can help existing self-supervised monocular VO (SSM-VO) algorithms to overcome issues related to the local minima within their self-supervised loss functions.

Monocular Visual Odometry

Paper
Code

DnD: Dense Depth Estimation in Crowded Dynamic Indoor Scenes

no code implementations • ICCV 2021 • Dongki Jung, Jaehoon Choi, Yonghan Lee, Deokhwa Kim, Changick Kim, Dinesh Manocha, Donghwan Lee

We present a novel approach for estimating depth from a monocular camera as it moves through complex and crowded indoor environments, e.g., a department store or a metro station.

3D Reconstruction Depth Estimation

Paper
Add Code

Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation

no code implementations • 5 Aug 2021 • Sarala Padi, Seyed Omid Sadjadi, Dinesh Manocha, Ram D. Sriram

Automatic speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.

Emotion Classification Speaker Recognition +2

Paper
Add Code

TIMERS: Document-level Temporal Relation Extraction

no code implementations • ACL 2021 • Puneet Mathur, Rajiv Jain, Franck Dernoncourt, Vlad Morariu, Quan Hung Tran, Dinesh Manocha

We present TIMERS - a TIME, Rhetorical and Syntactic-aware model for document-level temporal relation classification in the English language.

Ranked #3 on Temporal Relation Classification on TB-Dense

Relation Relation Classification +1

Paper
Add Code

Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning

1 code implementation • 31 Jul 2021 • Uttaran Bhattacharya, Elizabeth Childs, Nicholas Rewkowski, Dinesh Manocha

Our network consists of two components: a generator to synthesize gestures from a joint embedding space of features encoded from the input speech and the seed poses, and a discriminator to distinguish between the synthesized pose sequences and real 3D pose sequences.

Ranked #4 on Gesture Generation on TED Gesture Dataset

Generative Adversarial Network Gesture Generation

Paper
Code

Improving Reverberant Speech Separation with Multi-stage Training and Curriculum Learning

no code implementations • 19 Jul 2021 • Rohith Aralikatti, Anton Ratnarajah, Zhenyu Tang, Dinesh Manocha

We present a novel approach that improves the performance of reverberant speech separation.

Speech Separation

Paper
Add Code

M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers

1 code implementation • 24 Apr 2021 • Tianrui Guan, Jun Wang, Shiyi Lan, Rohan Chandra, Zuxuan Wu, Larry Davis, Dinesh Manocha

We present a novel architecture for 3D object detection, M3DeTR, which combines different point cloud representations (raw, voxels, bird's-eye view) with different feature scales based on multi-scale feature pyramids.

Ranked #1 on 3D Object Detection on KITTI Cars Hard val

3D Object Detection object-detection +1

Paper
Code

XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision Trees

1 code implementation • 22 Apr 2021 • Aaron M. Roth, Jing Liang, Dinesh Manocha

In order to increase the reliability and handle the failure cases of the expert policy, we combine with a policy extraction technique to transform the resulting policy into a decision tree format.

Explainable Artificial Intelligence (XAI) Robot Navigation

Paper
Code

Scene-aware Far-field Automatic Speech Recognition

no code implementations • 21 Apr 2021 • Zhenyu Tang, Dinesh Manocha

We use a deep learning-based estimator to non-intrusively compute the sub-band reverberation time of an environment from its speech samples.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Robust 2D/3D Vehicle Parsing in CVIS

no code implementations • 11 Mar 2021 • Hui Miao, Feixiang Lu, Zongdai Liu, Liangjun Zhang, Dinesh Manocha, Bin Zhou

We combine these novel algorithms and datasets to develop a robust approach for 2D/3D vehicle parsing for CVIS.

Benchmarking Data Augmentation +4

Paper
Add Code

Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality

2 code implementations • CVPR 2021 • Trisha Mittal, Puneet Mathur, Aniket Bera, Dinesh Manocha

We use an LSTM-based learning model for emotion perception.

Scene Understanding Time Series +1

Paper
Code

GANav: Efficient Terrain Segmentation for Robot Navigation in Unstructured Outdoor Environments

1 code implementation • 7 Mar 2021 • Tianrui Guan, Divya Kothandaraman, Rohan Chandra, Adarsh Jagan Sathyamoorthy, Kasun Weerakoon, Dinesh Manocha

We interface GANav with a deep reinforcement learning-based navigation algorithm and highlight its benefits in terms of navigation in real-world unstructured terrains.

Ranked #1 on Semantic Segmentation on RUGD

Robot Navigation Semantic Segmentation

Paper
Code

Dynamic Graph Modeling of Simultaneous EEG and Eye-tracking Data for Reading Task Identification

no code implementations • 21 Feb 2021 • Puneet Mathur, Trisha Mittal, Dinesh Manocha

We present a new approach, which we call AdaGTCN, for identifying human reader intent from Electroencephalogram (EEG) and Eye movement (EM) data in order to help differentiate between normal reading and task-oriented reading.

EEG Graph Learning

Paper
Add Code

Example-based Real-time Clothing Synthesis for Virtual Agents

no code implementations • 8 Jan 2021 • Nannan Wu, Qianwen Chao, Yanzhen Chen, Weiwei Xu, Chen Liu, Dinesh Manocha, Wenxin Sun, Yi Han, Xinran Yao, Xiaogang Jin

Given a query shape and pose of the virtual agent, we synthesize the resulting clothing deformation by blending the Taylor expansion results of nearby anchoring points.

Graphics

Paper
Add Code

Robust 2D/3D Vehicle Parsing in Arbitrary Camera Views for CVIS

1 code implementation • ICCV 2021 • Hui Miao, Feixiang Lu, Zongdai Liu, Liangjun Zhang, Dinesh Manocha, Bin Zhou

We combine these novel algorithms and datasets to develop a robust approach for 2D/3D vehicle parsing for CVIS.

Benchmarking Data Augmentation +4

Paper
Code

Fast 3D Acoustic Scattering via Discrete Laplacian Based Implicit Function Encoders

no code implementations • 1 Jan 2021 • Hsien-Yu Meng, Zhenyu Tang, Dinesh Manocha

Acoustic properties of objects corresponding to scattering characteristics are frequently used for 3D audio content creation, environmental acoustic effects, localization and acoustic scene analysis, etc.

Paper
Add Code

Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation

1 code implementation • 15 Dec 2020 • Feixiang Lu, Zongdai Liu, Hui Miao, Peng Wang, Liangjun Zhang, Ruigang Yang, Dinesh Manocha, Bin Zhou

For autonomous driving, the dynamics and states of vehicle parts such as doors, the trunk, and the bonnet can provide meaningful semantic information and interaction states, which are essential to ensuring the safety of the self-driving vehicle.

Autonomous Driving Data Augmentation +3

Paper
Code

SS-SFDA: Self-Supervised Source-Free Domain Adaptation for Road Segmentation in Hazardous Environments

2 code implementations • 27 Nov 2020 • Divya Kothandaraman, Rohan Chandra, Dinesh Manocha

We present a novel approach for unsupervised road segmentation in adverse weather conditions such as rain or fog.

Autonomous Vehicles Road Segmentation +4

Paper
Code

Developing an Effective and Automated Patient Engagement Estimator for Telehealth: A Machine Learning Approach

no code implementations • 17 Nov 2020 • Pooja Guhan, Naman Awasthi, Kathryn McDonald, Kristin Bussell, Dinesh Manocha, Gloria Reeves, Aniket Bera

We discuss MET, a learning-based algorithm proposed for perceiving a patient's level of engagement during telehealth sessions.

regression

Paper
Add Code

Sound Synthesis, Propagation, and Rendering: A Survey

no code implementations • 11 Nov 2020 • Shiguang Liu, Dinesh Manocha

To the best of our knowledge, this is the first attempt to provide a comprehensive summary of sound research in the field of computer graphics.

Sound Graphics

Paper
Add Code

SelfDeco: Self-Supervised Monocular Depth Completion in Challenging Indoor Environments

no code implementations • 10 Nov 2020 • Jaehoon Choi, Dongki Jung, Yonghan Lee, Deokhwa Kim, Dinesh Manocha, Donghwan Lee

We present a novel algorithm for self-supervised monocular depth completion.

Depth Completion

Paper
Add Code

B-GAP: Behavior-Rich Simulation and Navigation for Autonomous Driving

3 code implementations • 7 Nov 2020 • Angelos Mavrogiannis, Rohan Chandra, Dinesh Manocha

We address the problem of ego-vehicle navigation in dense simulated traffic environments populated by road agents with varying driver behaviors.

Robotics

Paper
Code

Multi-Window Data Augmentation Approach for Speech Emotion Recognition

no code implementations • 19 Oct 2020 • Sarala Padi, Dinesh Manocha, Ram D. Sriram

MWA-SER is a unimodal approach that focuses on two key concepts: designing the speech augmentation method and building the deep learning model to recognize the underlying emotion of an audio signal.

Data Augmentation Speech Emotion Recognition

Paper
Add Code

BoMuDANet: Unsupervised Adaptation for Visual Scene Understanding in Unstructured Driving Environments

1 code implementation • 22 Sep 2020 • Divya Kothandaraman, Rohan Chandra, Dinesh Manocha

We present an unsupervised adaptation approach for visual scene understanding in unstructured traffic environments.

Scene Understanding Semantic Segmentation +1

Paper
Code

Multi-Agent Coverage in Urban Environments

1 code implementation • 17 Aug 2020 • Shivang Patel, Senthil Hariharan, Pranav Dhulipala, Ming C Lin, Dinesh Manocha, Huan Xu, Michael Otte

We study multi-agent coverage algorithms for autonomous monitoring and patrol in urban environments.

Robotics

Paper
Code

PerMO: Perceiving More at Once from a Single Image for Autonomous Driving

no code implementations • 16 Jul 2020 • Feixiang Lu, Zongdai Liu, Xibin Song, Dingfu Zhou, Wei Li, Hui Miao, Miao Liao, Liangjun Zhang, Bin Zhou, Ruigang Yang, Dinesh Manocha

We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image for autonomous driving.

3D Reconstruction Autonomous Driving +3

Paper
Add Code

AutoTrajectory: Label-free Trajectory Extraction and Prediction from Videos using Dynamic Points

no code implementations • ECCV 2020 • Yuexin Ma, Xinge Zhu, Xinjing Cheng, Ruigang Yang, Jiming Liu, Dinesh Manocha

Then we aggregate dynamic points to instance points, which stand for moving objects such as pedestrians in videos.

Image Reconstruction Trajectory Prediction

Paper
Add Code

MCQA: Multimodal Co-attention Based Network for Question Answering

no code implementations • 25 Apr 2020 • Abhishek Kumar, Trisha Mittal, Dinesh Manocha

We present MCQA, a learning-based algorithm for multimodal question answering.

Question Answering

Paper
Add Code

OF-VO: Efficient Navigation among Pedestrians Using Commodity Sensors

no code implementations • 23 Apr 2020 • Jing Liang, Yi-Ling Qiao, Dinesh Manocha

Overall, our OF-VO algorithm using learning-based perception and model-based planning methods offers better performance than prior algorithms in terms of navigation time and success rate of collision avoidance.

Robotics

Paper
Add Code

Emotions Don't Lie: An Audio-Visual Deepfake Detection Method Using Affective Cues

no code implementations • 14 Mar 2020 • Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha

Additionally, we extract and compare affective cues corresponding to perceived emotion from the two modalities within a video to infer whether the input video is "real" or "fake".

DeepFake Detection Face Swapping

Paper
Add Code

EmotiCon: Context-Aware Multimodal Emotion Recognition using Frege's Principle

no code implementations • CVPR 2020 • Trisha Mittal, Pooja Guhan, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha

We report an AP of 65.83 across 4 categories on GroupWalk, which is also an improvement over prior methods.

Ranked #2 on Emotion Recognition in Context on CAER

Emotion Recognition in Context Multimodal Emotion Recognition

Paper
Add Code

ProxEmo: Gait-based Emotion Learning and Multi-view Proxemic Fusion for Socially-Aware Robot Navigation

1 code implementation • 2 Mar 2020 • Venkatraman Narayanan, Bala Murali Manoghar, Vishnu Sashank Dorbala, Dinesh Manocha, Aniket Bera

Our approach predicts the perceived emotions of a pedestrian from walking gaits, which is then used for emotion-guided navigation taking into account social and proxemic constraints.

Ranked #1 on Emotion Classification on EWALK

Emotion Classification Emotion Recognition +3

Paper
Code

SPA: Verbal Interactions between Agents and Avatars in Shared Virtual Environments using Propositional Planning

no code implementations • 8 Feb 2020 • Andrew Best, Sahil Narang, Dinesh Manocha

We present a novel approach for generating plausible verbal interactions between virtual human-like agents and user avatars in shared virtual environments.

Single Particle Analysis

Paper
Add Code

Deep Differentiable Grasp Planner for High-DOF Grippers

no code implementations • 4 Feb 2020 • Min Liu, Zherong Pan, Kai Xu, Kanishka Ganguly, Dinesh Manocha

We present an end-to-end algorithm for training deep neural networks to grasp novel objects.

Robotics

Paper
Add Code

The Liar's Walk: Detecting Deception with Gait and Gesture

no code implementations • 14 Dec 2019 • Tanmay Randhavane, Uttaran Bhattacharya, Kyra Kapsaskis, Kurt Gray, Aniket Bera, Dinesh Manocha

We present a data-driven deep neural algorithm for detecting deceptive walking behavior using nonverbal cues like gaits and gestures.

Action Classification

Paper
Add Code

Reinforcement Learning-based Visual Navigation with Information-Theoretic Regularization

1 code implementation • 9 Dec 2019 • Qiaoyun Wu, Kai Xu, Jun Wang, Mingliang Xu, Dinesh Manocha

The regularization maximizes the mutual information between navigation actions and visual observation transforms of an agent, thus promoting more informed navigation decisions.

Robotics

Paper
Code

Forecasting Trajectory and Behavior of Road-Agents Using Spectral Clustering in Graph-LSTMs

no code implementations • arXiv 2019 • Rohan Chandra, Tianrui Guan, Srujan Panuganti, Trisha Mittal, Uttaran Bhattacharya, Aniket Bera, Dinesh Manocha

In practice, our approach reduces the average prediction error by more than 54% over prior algorithms and achieves a weighted average accuracy of 91.2% for behavior prediction.

Ranked #1 on Trajectory Prediction on ApolloScape

Robotics

Paper
Add Code

Take an Emotion Walk: Perceiving Emotions from Gaits Using Hierarchical Attention Pooling and Affective Mapping

no code implementations • ECCV 2020 • Uttaran Bhattacharya, Christian Roncal, Trisha Mittal, Rohan Chandra, Kyra Kapsaskis, Kurt Gray, Aniket Bera, Dinesh Manocha

For the annotated data, we also train a classifier to map the latent embeddings to emotion labels.

Action Recognition Emotion Recognition

Paper
Add Code

Scene-Aware Audio Rendering via Deep Acoustic Analysis

no code implementations • 14 Nov 2019 • Zhenyu Tang, Nicholas J. Bryan, Dingzeyu Li, Timothy R. Langlois, Dinesh Manocha

We present a new method to capture the acoustic characteristics of real-world rooms using commodity devices, and use the captured characteristics to generate similar sounding sources with virtual models.

Sound Graphics Multimedia Audio and Speech Processing

Paper
Add Code

M3ER: Multiplicative Multimodal Emotion Recognition Using Facial, Textual, and Speech Cues

no code implementations • 9 Nov 2019 • Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha

Our approach combines cues from multiple co-occurring modalities (such as face, text, and speech) and also is more robust than other methods to sensor noise in any of the individual modalities.

Multimodal Emotion Recognition

Paper
Add Code

Personality-Aware Probabilistic Map for Trajectory Prediction of Pedestrians

no code implementations • 1 Nov 2019 • Chaochao Li, Pei Lv, Mingliang Xu, Xinyu Wang, Dinesh Manocha, Bing Zhou, Meng Wang

We update this map dynamically based on the agents in the environment and prior trajectory of a pedestrian.

Trajectory Prediction

Paper
Add Code

STEP: Spatial Temporal Graph Convolutional Networks for Emotion Perception from Gaits

1 code implementation • 28 Oct 2019 • Uttaran Bhattacharya, Trisha Mittal, Rohan Chandra, Tanmay Randhavane, Aniket Bera, Dinesh Manocha

We use hundreds of annotated real-world gait videos and augment them with thousands of annotated synthetic gaits generated using a novel generative network called STEP-Gen, built on an ST-GCN based Conditional Variational Autoencoder (CVAE).

General Classification

Paper
Code

Learning Resilient Behaviors for Navigation Under Uncertainty

no code implementations • 22 Oct 2019 • Tingxiang Fan, Pinxin Long, Wenxi Liu, Jia Pan, Ruigang Yang, Dinesh Manocha

Deep reinforcement learning has great potential to acquire complex, adaptive behaviors for autonomous agents automatically.

Autonomous Driving

Paper
Add Code

DeepMNavigate: Deep Reinforced Multi-Robot Navigation Unifying Local & Global Collision Avoidance

no code implementations • 4 Oct 2019 • Qingyang Tan, Tingxiang Fan, Jia Pan, Dinesh Manocha

We present a novel algorithm (DeepMNavigate) for global multi-agent navigation in dense scenarios using deep reinforcement learning (DRL).

Collision Avoidance Position +3

Paper
Add Code

Realtime Simulation of Thin-Shell Deformable Materials using CNN-Based Mesh Embedding

no code implementations • 26 Sep 2019 • Qingyang Tan, Zherong Pan, Lin Gao, Dinesh Manocha

We present a new algorithm to embed a high-dimensional configuration space of deformable objects in a low-dimensional feature space, where the configurations of objects and feature points have approximate one-to-one mapping.

Dimensionality Reduction Robot Manipulation

Paper
Add Code

Training a Constrained Natural Media Painting Agent using Reinforcement Learning

no code implementations • 25 Sep 2019 • Biao Jia, Jonathan Brandt, Radomir Mech, Ning Xu, Byungmoon Kim, Dinesh Manocha

We present a novel approach to train a natural media painting agent using reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

RobustTP: End-to-End Trajectory Prediction for Heterogeneous Road-Agents in Dense Traffic with Noisy Sensor Inputs

1 code implementation • 20 Jul 2019 • Rohan Chandra, Uttaran Bhattacharya, Christian Roncal, Aniket Bera, Dinesh Manocha

RobustTP is an approach that first computes trajectories using a combination of a non-linear motion model and a deep learning-based instance segmentation algorithm.

Robotics

Paper
Code

Improving Reverberant Speech Training Using Diffuse Acoustic Simulation

no code implementations • 9 Jul 2019 • Zhenyu Tang, Lian-Wu Chen, Bo Wu, Dong Yu, Dinesh Manocha

We present an efficient and realistic geometric acoustic simulation approach for generating and augmenting training data in speech-related machine learning tasks.

BIG-bench Machine Learning Keyword Spotting +2

Paper
Add Code

FVA: Modeling Perceived Friendliness of Virtual Agents Using Movement Characteristics

no code implementations • 30 Jun 2019 • Tanmay Randhavane, Aniket Bera, Kyra Kapsaskis, Kurt Gray, Dinesh Manocha

We also investigate the perception of a user in an AR setting and observe that an FVA has a statistically significant improvement in terms of the perceived friendliness and social presence of a user compared to an agent without the friendliness modeling.

Paper
Add Code

RoadTrack: Realtime Tracking of Road Agents in Dense and Heterogeneous Environments

1 code implementation • 25 Jun 2019 • Rohan Chandra, Uttaran Bhattacharya, Tanmay Randhavane, Aniket Bera, Dinesh Manocha

We present a realtime tracking algorithm, RoadTrack, to track heterogeneous road-agents in dense traffic videos.

Robotics

Paper
Code

NeoNav: Improving the Generalization of Visual Navigation via Generating Next Expected Observations

1 code implementation • 17 Jun 2019 • Qiaoyun Wu, Dinesh Manocha, Jun Wang, Kai Xu

First, the latent distribution is conditioned on current observations and the target view, leading to a model-based, target-driven navigation.

Visual Navigation

Paper
Code

LPaintB: Learning to Paint from Self-Supervision

no code implementations • 17 Jun 2019 • Biao Jia, Jonathan Brandt, Radomir Mech, Byungmoon Kim, Dinesh Manocha

We present a novel reinforcement learning-based natural media painting algorithm.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

Identifying Emotions from Walking using Affective and Deep Features

no code implementations • 14 Jun 2019 • Tanmay Randhavane, Uttaran Bhattacharya, Kyra Kapsaskis, Kurt Gray, Aniket Bera, Dinesh Manocha

We also present an EWalk (Emotion Walk) dataset that consists of videos of walking individuals with gaits and labeled emotions.

Emotion Recognition

Paper
Add Code

Regression and Classification for Direction-of-Arrival Estimation with Convolutional Recurrent Neural Networks

1 code implementation • 17 Apr 2019 • Zhenyu Tang, John D. Kanu, Kevin Hogan, Dinesh Manocha

We present a novel learning-based approach to estimate the direction-of-arrival (DOA) of a sound source using a convolutional recurrent neural network (CRNN) trained via regression on synthetic data and Cartesian labels.

Ranked #1 on Direction of Arrival Estimation on SOFA (using extra training data)

Direction of Arrival Estimation General Classification +1

Paper
Code

PaintBot: A Reinforcement Learning Approach for Natural Media Painting

no code implementations • 3 Apr 2019 • Biao Jia, Chen Fang, Jonathan Brandt, Byungmoon Kim, Dinesh Manocha

Action selection is guided by a given reference image, which the agent attempts to replicate subject to the limitations of the action space and the agent's learned policy.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Generating Grasp Poses for a High-DOF Gripper Using Neural Networks

no code implementations • 1 Mar 2019 • Min Liu, Zherong Pan, Kai Xu, Kanishka Ganguly, Dinesh Manocha

The quality of the grasp poses is on par with the groundtruth poses in the dataset.

Robotics

Paper
Add Code

AADS: Augmented Autonomous Driving Simulation using Data-driven Algorithms

1 code implementation • 23 Jan 2019 • Wei Li, Chengwei Pan, Rong Zhang, Jiaping Ren, Yuexin Ma, Jin Fang, Feilong Yan, Qichuan Geng, Xinyu Huang, Huajun Gong, Weiwei Xu, Guoping Wang, Dinesh Manocha, Ruigang Yang

Our augmented approach combines the flexibility in a virtual environment (e.g., vehicle movements) with the richness of the real world to allow effective simulation of anywhere on earth.

Autonomous Driving

Paper
Code

TraPHic: Trajectory Prediction in Dense and Heterogeneous Traffic Using Weighted Interactions

2 code implementations • CVPR 2019 • Rohan Chandra, Uttaran Bhattacharya, Aniket Bera, Dinesh Manocha

We evaluate the performance of our prediction algorithm, TraPHic, on the standard datasets and also introduce a new dense, heterogeneous traffic dataset corresponding to urban Asian videos and agent trajectories.

Ranked #1 on Trajectory Prediction on TRAF

Trajectory Prediction Robotics

Paper
Code

VV-Net: Voxel VAE Net with Group Convolutions for Point Cloud Segmentation

1 code implementation • ICCV 2019 • Hsien-Yu Meng, Lin Gao, Yu-Kun Lai, Dinesh Manocha

Our approach results in a good volumetric representation that effectively tackles noisy point cloud datasets and is more robust for learning.

Graphics

Paper
Code

TrafficPredict: Trajectory Prediction for Heterogeneous Traffic-Agents

1 code implementation • 6 Nov 2018 • Yuexin Ma, Xinge Zhu, Sibo Zhang, Ruigang Yang, Wenping Wang, Dinesh Manocha

To safely and efficiently navigate in complex urban traffic, autonomous vehicles must make responsible predictions in relation to surrounding traffic-agents (vehicles, bicycles, pedestrians, etc.).

Ranked #1 on Trajectory Prediction on Apolloscape Trajectory

Autonomous Vehicles Navigate +2

Paper
Code

Pedestrian Dominance Modeling for Socially-Aware Robot Navigation

no code implementations • 15 Oct 2018 • Tanmay Randhavane, Aniket Bera, Emily Kubin, Austin Wang, Kurt Gray, Dinesh Manocha

We present a Pedestrian Dominance Model (PDM) to identify the dominance characteristics of pedestrians for robot navigation.

Robotics

Paper
Add Code

TZC: Efficient Inter-Process Communication for Robotics Middleware with Partial Serialization

2 code implementations • 1 Oct 2018 • Yu-Ping Wang, Wende Tan, Xu-Qiang Hu, Dinesh Manocha, Shi-Min Hu

We show that by using TZC, the braking distance can be shortened by 16% compared with ROS.

Robotics

Paper
Code

Data-Driven Modeling of Group Entitativity in Virtual Environments

no code implementations • 28 Sep 2018 • Aniket Bera, Tanmay Randhavane, Emily Kubin, Husam Shaik, Kurt Gray, Dinesh Manocha

We also present a novel interactive multi-agent simulation algorithm to model entitative groups and conduct a VR user study to validate the socio-emotional predictive power of our algorithm.

Graphics Human-Computer Interaction

Paper
Add Code

Safe Navigation with Human Instructions in Complex Scenes

no code implementations • 12 Sep 2018 • Zhe Hu, Jia Pan, Tingxiang Fan, Ruigang Yang, Dinesh Manocha

In this paper, we present a robotic navigation algorithm with natural language interfaces, which enables a robot to safely walk through a changing environment with moving persons by following human instructions such as "go to the restaurant and keep away from people".

Collision Avoidance Motion Planning +2

Paper
Add Code

Transferring Grasp Configurations using Active Learning and Local Replanning

no code implementations • 22 Jul 2018 • Hao Tian, Changbo Wang, Dinesh Manocha, Xin-Yu Zhang

We compute a grasp space for each part of the example object using active learning.

Robotics

Paper
Add Code

PORCA: Modeling and Planning for Autonomous Driving among Many Pedestrians

no code implementations • 30 May 2018 • Yuanfu Luo, Panpan Cai, Aniket Bera, David Hsu, Wee Sun Lee, Dinesh Manocha

Our planning system combines a POMDP algorithm with the pedestrian motion model and runs in near real time.

Robotics

Paper
Add Code

Efficient Reciprocal Collision Avoidance between Heterogeneous Agents Using CTMAT

no code implementations • 7 Apr 2018 • Yuexin Ma, Dinesh Manocha, Wenping Wang

We present a novel algorithm for reciprocal collision avoidance between heterogeneous agents of different shapes and sizes.

Collision Avoidance

Paper
Add Code

Identifying Driver Behaviors using Trajectory Features for Vehicle Navigation

no code implementations • 2 Mar 2018 • Ernest Cheung, Aniket Bera, Emily Kubin, Kurt Gray, Dinesh Manocha

We present a novel approach to automatically identify driver behaviors from vehicle trajectories and use them for safe navigation of autonomous vehicles.

Robotics

Paper
Add Code

MixedPeds: Pedestrian Detection in Unannotated Videos using Synthetically Generated Human-agents for Training

no code implementations • 28 Jul 2017 • Ernest C. Cheung, Tsan Kwong Wong, Aniket Bera, Dinesh Manocha

We present a new method for training pedestrian detectors on an unannotated set of images.

Mixed Reality Pedestrian Detection

Paper
Add Code

Efficient Generation of Motion Plans from Attribute-Based Natural Language Instructions Using Dynamic Constraint Mapping

no code implementations • 8 Jul 2017 • Jae Sung Park, Biao Jia, Mohit Bansal, Dinesh Manocha

From natural language instructions, we generate a factor graph called the Dynamic Grounding Graph (DGG), which takes latent parameters into account.

Robotics

Paper
Add Code

Recurrent 3D Attentional Networks for End-to-End Active Object Recognition

no code implementations • 14 Oct 2016 • Min Liu, Yifei Shi, Lintao Zheng, Kai Xu, Hui Huang, Dinesh Manocha

Active vision is inherently attention-driven: the agent actively selects views to attend to in order to quickly achieve the vision task while improving its internal representation of the scene being observed.

Object Recognition

Paper
Add Code

LCrowdV: Generating Labeled Videos for Simulation-based Crowd Behavior Learning

no code implementations • 29 Jun 2016 • Ernest Cheung, Tsan Kwong Wong, Aniket Bera, Xiaogang Wang, Dinesh Manocha

We present a novel procedural framework to generate an arbitrary number of labeled crowd videos (LCrowdV).

General Classification Pedestrian Detection +1

Paper
Add Code

Exemplar-AMMs: Recognizing Crowd Movements from Pedestrian Trajectories

no code implementations • 31 Mar 2016 • Wenxi Liu, Rynson W. H. Lau, Xiaogang Wang, Dinesh Manocha

Specifically, we propose an optimization framework that filters out the unknown noise in the crowd trajectories and measures their similarity to the exemplar-AMMs to produce a crowd motion feature.

Multi-Label Classification

Paper
Add Code

3D Reconstruction in the Presence of Glasses by Acoustic and Stereo Fusion

no code implementations • CVPR 2015 • Mao Ye, Yu Zhang, Ruigang Yang, Dinesh Manocha

We present a novel sensor fusion algorithm that first segments the depth map into different categories such as opaque/transparent/infinity (e.g., too far to measure) and then updates the depth map based on the segmentation outcome.

3D Reconstruction Sensor Fusion +1

Paper
Add Code

Real-time Crowd Tracking using Parameter Optimized Mixture of Motion Models

no code implementations • 16 Sep 2014 • Aniket Bera, David Wolinski, Julien Pettré, Dinesh Manocha

We automatically compute the optimal parameters for each of these different models based on prior tracked data and use the best model as motion prior for our particle-filter based tracking algorithm.

Combinatorial Optimization

Paper
Add Code

Spoke-Darts for High-Dimensional Blue-Noise Sampling

1 code implementation • 5 Aug 2014 • Scott A. Mitchell, Mohamed S. Ebeida, Muhammad A. Awad, Chonhyon Park, Anjul Patney, Ahmad A. Rushdi, Laura P. Swiler, Dinesh Manocha, Li-Yi Wei

Blue noise sampling has proved useful for many graphics applications, but remains underexplored in high-dimensional spaces due to the difficulty of generating distributions and proving properties about them.

Graphics

Paper
Code

Realtime Multilevel Crowd Tracking using Reciprocal Velocity Obstacles

no code implementations • 11 Feb 2014 • Aniket Bera, Dinesh Manocha

We present a novel, realtime algorithm to compute the trajectory of each pedestrian in moderately dense crowd scenes.

Paper
Add Code

Leveraging Long-Term Predictions and Online-Learning in Agent-based Multiple Person Tracking

no code implementations • 10 Feb 2014 • Wenxi Liu, Antoni B. Chan, Rynson W. H. Lau, Dinesh Manocha

We present a multiple-person tracking algorithm based on combining particle filters and RVO, an agent-based crowd model that infers collision-free velocities so as to predict pedestrians' motion.

Position

Paper
Add Code
