1 code implementation • 18 Mar 2021 • Karnik Ram, Chaitanya Kharyal, Sudarshan S. Harithas, K. Madhava Krishna
We evaluate our approach on this dataset and on three diverse sequences from standard datasets, including two real-world dynamic sequences, and show a significant improvement in robustness and accuracy over a state-of-the-art monocular visual-inertial odometry system.
3 code implementations • 19 Feb 2020 • Kaustubh Mani, Swapnil Daga, Shubhika Garg, N. Sai Shankar, Krishna Murthy Jatavallabhula, K. Madhava Krishna
We dub this problem amodal scene layout estimation, which involves "hallucinating" scene layout for even parts of the world that are occluded in the image.
1 code implementation • 26 Feb 2018 • Sarthak Sharma, Junaid Ahmed Ansari, J. Krishna Murthy, K. Madhava Krishna
This paper introduces geometry and object shape and pose costs for multi-object tracking in urban driving scenarios.
Ranked #2 on 3D Multi-Object Tracking on KITTI
1 code implementation • 15 Mar 2021 • Udit Singh Parihar, Aniket Gujarathi, Kinal Mehta, Satyajit Tourani, Sourav Garg, Michael Milford, K. Madhava Krishna
The use of local detectors and descriptors in typical computer vision pipelines works well until variations in viewpoint and appearance become extreme.
1 code implementation • 9 May 2020 • Sravan Mylavarapu, Mahtab Sandhu, Priyesh Vijayan, K. Madhava Krishna, Balaraman Ravindran, Anoop Namboodiri
We present a novel Multi-Relational Graph Convolutional Network (MRGCN) based framework to model on-road vehicle behaviors from a sequence of temporally ordered frames as grabbed by a moving monocular camera.
Ranked #1 on Test results on KITTI
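The multi-relational message passing behind such an MRGCN can be sketched as follows. This is an illustrative single layer, not the paper's exact architecture; the relation set, layer sizes, and ReLU choice are assumptions:

```python
import numpy as np

def mrgcn_layer(H, adjs, W_rel, W_self):
    """One multi-relational graph convolution step.

    H      : (N, F) node features (one node per vehicle/landmark)
    adjs   : {relation: (N, N) row-normalized adjacency matrix}
    W_rel  : {relation: (F, Fo) weight matrix per relation type}
    W_self : (F, Fo) self-loop weight matrix
    """
    out = H @ W_self
    for rel, A in adjs.items():
        out = out + A @ H @ W_rel[rel]  # aggregate per-relation messages
    return np.maximum(out, 0.0)         # ReLU non-linearity
```

Stacking a few such layers over per-frame interaction graphs, then pooling over time, yields per-node features from which behaviors can be classified.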
1 code implementation • 16 Feb 2020 • Sai Shubodh Puligilla, Satyajit Tourani, Tushar Vaidya, Udit Singh Parihar, Ravi Kiran Sarvadevabhatla, K. Madhava Krishna
At the intermediate level, the map is represented as a Manhattan Graph where the nodes and edges are characterized by Manhattan properties and as a Pose Graph at the lower-most level of detail.
1 code implementation • 3 Feb 2020 • Sravan Mylavarapu, Mahtab Sandhu, Priyesh Vijayan, K. Madhava Krishna, Balaraman Ravindran, Anoop Namboodiri
Understanding on-road vehicle behaviour from a temporal sequence of sensor data is gaining in popularity.
1 code implementation • 16 Mar 2021 • Meher Shashwat Nigam, Avinash Prabhu, Anurag Sahu, Puru Gupta, Tanvi Karandikar, N. Sai Shankar, Ravi Kiran Sarvadevabhatla, K. Madhava Krishna
Given a monocular colour image of a warehouse rack, we aim to predict the bird's-eye view layout for each shelf in the rack, which we term as multi-layer layout prediction.
4 code implementations • 6 Mar 2018 • Junaid Ahmed Ansari, Sarthak Sharma, Anshuman Majumdar, J. Krishna Murthy, K. Madhava Krishna
The proposed approach significantly improves the state-of-the-art for monocular object localization on arbitrarily-shaped roads.
1 code implementation • 13 Sep 2020 • Nivedita Rufus, Unni Krishnan R Nair, K. Madhava Krishna, Vineet Gandhi
In this paper, we present a simple baseline for visual grounding for autonomous driving which outperforms state-of-the-art methods while retaining minimal design choices.
Ranked #6 on Referring Expression Comprehension on Talk2Car
1 code implementation • 25 Nov 2020 • Rahul Sajnani, AadilMehdi Sanchawala, Krishna Murthy Jatavallabhula, Srinath Sridhar, K. Madhava Krishna
We present DRACO, a method for Dense Reconstruction And Canonicalization of Object shape from one or more RGB images.
1 code implementation • 22 Mar 2018 • Ganesh Iyer, R. Karnik Ram, J. Krishna Murthy, K. Madhava Krishna
During training, the network only takes as input a LiDAR point cloud, the corresponding monocular image, and the camera calibration matrix K. At train time, we do not impose direct supervision (i.e., we do not, for example, directly regress to the calibration parameters).
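A toy version of this indirect supervision (hypothetical function names; the actual system uses richer photometric and point-cloud distance losses) scores a predicted extrinsic transform by how well it aligns the LiDAR cloud, rather than by comparing calibration parameters directly:

```python
import numpy as np

def alignment_loss(pred_T, points, target_points):
    """Mean squared 3D distance after applying a predicted 4x4 rigid
    transform -- the supervision signal is alignment quality, not the
    calibration parameters themselves."""
    P_h = np.c_[points, np.ones(len(points))]  # homogeneous coordinates
    P = (P_h @ pred_T.T)[:, :3]                # transformed cloud
    return float(np.mean(np.sum((P - target_points) ** 2, axis=1)))
```

Any transform that aligns the clouds drives this loss to zero, so gradients flow to the network without ever exposing the true calibration parameters as labels.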
2 code implementations • 12 Mar 2020 • Aasheesh Singh, Aditya Kamireddypalli, Vineet Gandhi, K. Madhava Krishna
In this paper, we present a method to reliably detect such obstacles through a multi-modal framework of sparse LiDAR (VLP-16) and monocular vision.
no code implementations • 11 Apr 2018 • Ganesh Iyer, J. Krishna Murthy, Gunshi Gupta, K. Madhava Krishna, Liam Paull
We show that using a noisy teacher, which could be a standard VO pipeline, and by designing a loss term that enforces geometric consistency of the trajectory, we can train accurate deep models for VO that do not require ground-truth labels.
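One minimal reading of the geometric-consistency idea (a sketch, not the paper's exact loss term) penalizes composed relative poses that disagree with the directly estimated longer-range pose:

```python
import numpy as np

def compose_consistency(T_ab, T_bc, T_ac):
    """Residual enforcing T_ac ~= T_ab @ T_bc for 4x4 SE(3) poses."""
    err = np.linalg.inv(T_ac) @ (T_ab @ T_bc)        # identity if consistent
    t_err = np.linalg.norm(err[:3, 3])               # translation residual
    r_err = np.linalg.norm(err[:3, :3] - np.eye(3))  # rotation residual
    return t_err + r_err
```

Because the residual vanishes for any self-consistent trajectory, it can be minimized without ground-truth poses.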
no code implementations • 17 Mar 2018 • Krishnam Gupta, Syed Ashar Javed, Vineet Gandhi, K. Madhava Krishna
We present here, a novel network architecture called MergeNet for discovering small obstacles for on-road scenes in the context of autonomous driving.
no code implementations • 27 Feb 2018 • Parijat Dewangan, S Phaniteja, K. Madhava Krishna, Abhishek Sarkar, Balaraman Ravindran
In this paper, we propose a new approach for the simultaneous training of multiple tasks sharing a set of common actions in continuous action spaces, which we call DiGrad (Differential Policy Gradient).
no code implementations • 26 Feb 2018 • Parv Parkhiya, Rishabh Khawad, J. Krishna Murthy, Brojeshwar Bhowmick, K. Madhava Krishna
These category models are instance-independent and aid in the design of object landmark observations that can be incorporated into a generic monocular SLAM framework.
no code implementations • 6 Oct 2017 • Roopal Nahar, Akanksha Baranwal, K. Madhava Krishna
Efficient, real-time segmentation of color images is important in many fields of computer vision, such as image compression, medical imaging, mapping, and autonomous navigation.
no code implementations • 10 Jun 2017 • Aseem Saxena, Harit Pandya, Gourav Kumar, Ayush Gaud, K. Madhava Krishna
In this paper, we present an end-to-end learning based approach for visual servoing in diverse scenes where the knowledge of camera parameters and scene geometry is not available a priori.
no code implementations • 18 Apr 2017 • Nazrul Haque, N. Dinesh Reddy, K. Madhava Krishna
This paper proposes an approach to fuse semantic features and motion clues using CNNs, to address the problem of monocular semantic motion segmentation.
no code implementations • 29 Sep 2016 • J. Krishna Murthy, G. V. Sai Krishna, Falak Chhaya, K. Madhava Krishna
We then formulate a shape-aware adjustment problem that uses the learnt shape priors to recover the 3D pose and shape of a query object from an image.
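If the learnt shape prior is taken to be a linear basis of deformation directions (an assumption for illustration, with hypothetical names), the shape part of such an adjustment reduces to least squares once the pose is fixed:

```python
import numpy as np

def fit_shape_coeffs(mean_shape, basis, R, obs2d):
    """Least-squares shape-basis coefficients under a known rotation R
    and orthographic projection (toy version of shape-aware adjustment;
    the full problem also optimizes the pose)."""
    proj = lambda S: (R @ S.T)[:2].T                       # drop depth row
    A = np.stack([proj(B).ravel() for B in basis], axis=1)  # linear in c
    b = (obs2d - proj(mean_shape)).ravel()
    c, *_ = np.linalg.lstsq(A, b, rcond=None)
    return c
```

The full adjustment alternates between (or jointly optimizes) the pose and these shape coefficients.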
no code implementations • 27 Apr 2015 • N. Dinesh Reddy, Prateek Singhal, Visesh Chari, K. Madhava Krishna
We show results on the challenging KITTI urban dataset for accuracy of motion segmentation and reconstruction of the trajectory and shape of moving objects relative to ground truth.
no code implementations • 24 Apr 2015 • N. Dinesh Reddy, Prateek Singhal, K. Madhava Krishna
We propose an algorithm that jointly infers the semantic class and motion labels of an object.
no code implementations • 23 Dec 2013 • Prateek Singhal, Aditya Deshpande, N. Dinesh Reddy, K. Madhava Krishna
to perform better classification and merging.
no code implementations • 17 Nov 2018 • Meha Kaushik, Phaniteja S, K. Madhava Krishna
This paper proposes a novel architecture to learn multiple driving behaviors in a traffic scenario.
no code implementations • 23 Dec 2018 • Vignesh Prasad, Karmesh Yadav, Rohitashva Singh Saurabh, Swapnil Daga, Nahas Pareekutty, K. Madhava Krishna, Balaraman Ravindran, Brojeshwar Bhowmick
Monocular SLAM refers to using a single camera to estimate robot ego motion while building a map of the environment.
Robotics
no code implementations • 9 Jun 2019 • Sriram N. N., Gourav Kumar, Abhay Singh, M. Siva Karthik, Saket Saurav, Brojeshwar Bhowmick, K. Madhava Krishna
In the indoor setting, we use an autonomous drone to navigate various scenarios and also a ground robot which can explore the environment using the trajectories proposed by our framework.
no code implementations • 1 Dec 2018 • Akanksha Baranwal, Ishan Bansal, Roopal Nahar, K. Madhava Krishna
However, with the limited on-chip memory and computation resources of FPGAs, meeting the high memory-throughput requirement and exploiting the parallelism of CNNs are major challenges.
Distributed, Parallel, and Cluster Computing
no code implementations • 2 Oct 2019 • Ayush Gaud, Y V S Harish, K. Madhava Krishna
We leverage the expressiveness of the popular stacked hourglass architecture and augment it by adopting memory units between intermediate layers of the network with weights shared across stages for video frames.
no code implementations • 10 Feb 2020 • Gokul B. Nair, Swapnil Daga, Rahul Sajnani, Anirudha Ramesh, Junaid Ahmed Ansari, Krishna Murthy Jatavallabhula, K. Madhava Krishna
In this paper, we tackle the problem of multibody SLAM from a monocular camera.
no code implementations • 8 Mar 2020 • Y V S Harish, Harit Pandya, Ayush Gaud, Shreya Terupally, Sai Shankar, K. Madhava Krishna
We further present an extensive benchmark in a photo-realistic 3D simulation across diverse scenes to study the convergence and generalisation of visual servoing approaches.
no code implementations • 12 May 2019 • Nayan Joshi, Yogesh Sharma, Parv Parkhiya, Rishabh Khawad, K. Madhava Krishna, Brojeshwar Bhowmick
The proposed parameterization associates a category-specific 3D CAD model with the object under consideration using a dictionary-based RANSAC method that takes object viewpoints as a prior, along with edges detected in the corresponding intensity image of the scene.
Robotics
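Dictionary-based RANSAC follows the standard hypothesize-and-verify loop, with hypotheses drawn from a viewpoint-indexed dictionary rather than raw correspondences; the generic loop (with hypothetical `fit`/`residual` callables) looks like:

```python
import numpy as np

def ransac(data, fit, residual, n_min, iters, thresh, seed=0):
    """Generic RANSAC: repeatedly fit a model to a minimal sample and
    keep the hypothesis with the largest inlier set."""
    rng = np.random.default_rng(seed)
    best_model, best_inliers = None, -1
    for _ in range(iters):
        idx = rng.choice(len(data), size=n_min, replace=False)
        model = fit(data[idx])                             # hypothesize
        inliers = int(np.sum(residual(model, data) < thresh))  # verify
        if inliers > best_inliers:
            best_model, best_inliers = model, inliers
    return best_model
```

Swapping the minimal-sample `fit` for a lookup into a viewpoint-conditioned CAD dictionary gives the flavor of model selection described above.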
no code implementations • 25 Apr 2020 • Aniket Pokale, Aditya Aggarwal, K. Madhava Krishna
This paper presents a new system to obtain dense object reconstructions along with 6-DoF poses from a single image.
no code implementations • 3 Oct 2020 • Satyajit Tourani, Dhagash Desai, Udit Singh Parihar, Sourav Garg, Ravi Kiran Sarvadevabhatla, Michael Milford, K. Madhava Krishna
In particular, our integration of VPR with SLAM by leveraging the robustness of deep-learned features and our homography-based extreme viewpoint invariance significantly boosts the performance of VPR, feature correspondence, and pose graph submodules of the SLAM pipeline.
no code implementations • 15 Nov 2020 • Swapnil Daga, Gokul B. Nair, Anirudha Ramesh, Rahul Sajnani, Junaid Ahmed Ansari, K. Madhava Krishna
In this paper, we present BirdSLAM, a novel simultaneous localization and mapping (SLAM) system for the challenging scenario of autonomous driving platforms equipped with only a monocular camera.
no code implementations • 20 Aug 2021 • Kaustubh Mani, N. Sai Shankar, Krishna Murthy Jatavallabhula, K. Madhava Krishna
Given an image or a video captured from a monocular camera, amodal layout estimation is the task of predicting semantics and occupancy in bird's eye view.
no code implementations • 10 Mar 2022 • Abhishek Peri, Kinal Mehta, Avneesh Mishra, Michael Milford, Sourav Garg, K. Madhava Krishna
Sparse local feature matching is pivotal for many computer vision and robotics tasks.
no code implementations • 24 Sep 2022 • Kanishk Jain, Varun Chhangani, Amogh Tiwari, K. Madhava Krishna, Vineet Gandhi
We investigate the Vision-and-Language Navigation (VLN) problem in the context of autonomous driving in outdoor settings.
no code implementations • 27 Sep 2022 • Kushagra Srivastava, Dhruv Patel, Aditya Kumar Jha, Mohhit Kumar Jha, Jaskirat Singh, Ravi Kiran Sarvadevabhatla, Pradeep Kumar Ramancharla, Harikumar Kandath, K. Madhava Krishna
Unmanned Aerial Vehicle (UAV)-based remote sensing systems incorporating computer vision have demonstrated potential for assisting building construction and disaster management tasks such as damage assessment during earthquakes.
no code implementations • 30 Nov 2022 • Pranjali Pathre, Anurag Sahu, Ashwin Rao, Avinash Prabhu, Meher Shashwat Nigam, Tanvi Karandikar, Harit Pandya, K. Madhava Krishna
To the best of our knowledge, this is the first such work to portray a 3D rendering of a warehouse scene in terms of its semantic components (Racks, Shelves, and Objects), all from a single monocular camera.
no code implementations • 3 Oct 2023 • Tushar Choudhary, Vikrant Dewangan, Shivam Chandhok, Shubham Priyadarshan, Anushka Jain, Arun K. Singh, Siddharth Srivastava, Krishna Murthy Jatavallabhula, K. Madhava Krishna
Talk2BEV is a large vision-language model (LVLM) interface for bird's-eye view (BEV) maps in autonomous driving contexts.
no code implementations • 24 Nov 2023 • Dhruv Patel, Shivani Chepuri, Sarvesh Thakur, K. Harikumar, Ravi Kiran S., K. Madhava Krishna
Despite the technological advancements in the construction and surveying sector, the inspection of salient features like windows in an under-construction or existing building is predominantly a manual process.