no code implementations • 12 Apr 2024 • James F. Mullen Jr, Prasoon Goyal, Robinson Piramuthu, Michael Johnston, Dinesh Manocha, Reza Ghanadan
Our work assists in this goal by enabling robots to inform their users of dangerous or unsanitary anomalies in their home.
no code implementations • 28 Nov 2023 • Jacob Zhiyuan Fang, Skyler Zheng, Vasu Sharma, Robinson Piramuthu
Regardless of their effectiveness, larger architectures unavoidably prevent the models from being extended to real-world applications, so building a lightweight VL architecture and an efficient learning schema is of great practical value.
no code implementations • 27 Nov 2023 • Shiyuan Huang, Robinson Piramuthu, Vicente Ordonez, Shih-Fu Chang, Gunnar A. Sigurdsson
From our experiments, we have observed only 5.2%-5.8% loss of performance with only 10% of video lengths, which corresponds to 2-4 frames selected from each video.
no code implementations • 12 Mar 2023 • Siddharth Singi, Zhanpeng He, Alvin Pan, Sandip Patel, Gunnar A. Sigurdsson, Robinson Piramuthu, Shuran Song, Matei Ciocarlie
In a Human-in-the-Loop paradigm, a robotic agent is able to act mostly autonomously in solving a task, but can request help from an external expert when needed.
no code implementations • 30 Jan 2023 • Gunnar A. Sigurdsson, Jesse Thomason, Gaurav S. Sukhatme, Robinson Piramuthu
Armed with this intuition, using only a generic vision-language scoring model with minor modifications for 3D encoding and operating in an embodied environment, we demonstrate an absolute performance gain of 9.84% on remote object grounding above state-of-the-art models for REVERIE and of 5.04% on FAO.
no code implementations • 30 Nov 2022 • Vishnu Sashank Dorbala, Gunnar Sigurdsson, Robinson Piramuthu, Jesse Thomason, Gaurav S. Sukhatme
Our results on the coarse-grained instruction following task of REVERIE demonstrate the navigational capability of CLIP, surpassing the supervised baseline in terms of both success rate (SR) and success weighted by path length (SPL).
1 code implementation • 15 Oct 2022 • Shiyuan Huang, Robinson Piramuthu, Shih-Fu Chang, Gunnar A. Sigurdsson
Specifically, we insert a lightweight Feature Compression Module (FeatComp) into a VideoQA model which learns to extract task-specific tiny features as little as 10 bits, which are optimal for answering certain types of questions.
no code implementations • 21 Jun 2022 • Brandon Trabucco, Gunnar Sigurdsson, Robinson Piramuthu, Gaurav S. Sukhatme, Ruslan Salakhutdinov
Physically rearranging objects is an important capability for embodied agents.
3 code implementations • 1 Oct 2021 • Aishwarya Padmakumar, Jesse Thomason, Ayush Shrivastava, Patrick Lange, Anjali Narayan-Chen, Spandana Gella, Robinson Piramuthu, Gokhan Tur, Dilek Hakkani-Tur
Robots operating in human spaces must be able to engage in natural language interaction with people, both understanding and executing instructions, and using conversation to resolve ambiguity and recover from mistakes.
1 code implementation • Findings (ACL) 2022 • Ayush Shrivastava, Karthik Gopalakrishnan, Yang Liu, Robinson Piramuthu, Gokhan Tür, Devi Parikh, Dilek Hakkani-Tür
Interactive robots navigating photo-realistic environments need to be trained to effectively leverage and handle the dynamic nature of dialogue in addition to the challenges underlying vision-and-language navigation (VLN).
no code implementations • 26 Mar 2021 • Yun-Chun Chen, Marco Piccirilli, Robinson Piramuthu, Ming-Hsuan Yang
The key insights of our method are two-fold.
Ranked #53 on 3D Human Pose Estimation on MPI-INF-3DHP
1 code implementation • CVPR 2020 • Yu-Ting Chang, Qiaosong Wang, Wei-Chih Hung, Robinson Piramuthu, Yi-Hsuan Tsai, Ming-Hsuan Yang
Existing weakly-supervised semantic segmentation methods using image-level annotations typically rely on initial responses to locate object regions.
no code implementations • 3 Aug 2020 • Yu-Ting Chang, Qiaosong Wang, Wei-Chih Hung, Robinson Piramuthu, Yi-Hsuan Tsai, Ming-Hsuan Yang
Obtaining object response maps is one important step to achieve weakly-supervised semantic segmentation using image-level labels.
1 code implementation • 18 Dec 2018 • Muratcan Cicek, Jinrong Xie, Qiaosong Wang, Robinson Piramuthu
Unlike desktop and laptop computers, they are also much easier to carry indoors and outdoors. To address this, we implement and open-source a button that is sensitive to head movements tracked from the front camera of the iPhone X.
Human-Computer Interaction
no code implementations • 26 Nov 2018 • Xiao Ma, Lina Mezghani, Kimberly Wilber, Hui Hong, Robinson Piramuthu, Mor Naaman, Serge Belongie
In this work, we conducted a large-scale study on the quality of user-generated images in peer-to-peer marketplaces.
1 code implementation • 23 Oct 2018 • M. Hadi Kiapour, Robinson Piramuthu
In this work, we analyze learned visual representations by deep networks that are trained to recognize fashion brands.
no code implementations • 24 Sep 2018 • Bryan A. Plummer, M. Hadi Kiapour, Shuai Zheng, Robinson Piramuthu
In this paper, we introduce an attribute-based interactive image search which can leverage human-in-the-loop feedback to iteratively refine image search results.
no code implementations • 6 Jul 2018 • Kevin Lin, Fan Yang, Qiaosong Wang, Robinson Piramuthu
Fine-grained image search is still a challenging problem due to the difficulty in capturing subtle differences regardless of pose variations of objects from fine-grained categories.
2 code implementations • 3 Jul 2018 • Shuai Zheng, Fan Yang, M. Hadi Kiapour, Robinson Piramuthu
Understanding clothes from a single image has strong commercial and cultural impacts on modern societies.
1 code implementation • ECCV 2018 • Bryan A. Plummer, Paige Kordas, M. Hadi Kiapour, Shuai Zheng, Robinson Piramuthu, Svetlana Lazebnik
This paper presents an approach for grounding phrases in images which jointly learns multiple text-conditioned embeddings in a single end-to-end model.
no code implementations • 31 Jul 2017 • Mahyar Najibi, Fan Yang, Qiaosong Wang, Robinson Piramuthu
In this work, we propose an efficient and effective approach for unconstrained salient object detection in images using deep convolutional neural networks.
no code implementations • 10 Jun 2017 • Fan Yang, Ajinkya Kale, Yury Bubnov, Leon Stein, Qiaosong Wang, Hadi Kiapour, Robinson Piramuthu
We harness the availability of a large image collection of eBay listings and state-of-the-art deep learning techniques to perform visual search at scale.
no code implementations • CVPR 2016 • Qiaosong Wang, Wen Zheng, Robinson Piramuthu
We propose an unsupervised bottom-up saliency detection approach by exploiting a novel graph structure and background priors.
no code implementations • ICCV 2015 • Zhicheng Yan, Hao Zhang, Robinson Piramuthu, Vignesh Jagadeesh, Dennis Decoste, Wei Di, Yizhou Yu
In this paper, we introduce hierarchical deep CNNs (HD-CNNs) by embedding deep CNNs into a category hierarchy.
no code implementations • 19 Nov 2014 • Kevin Shih, Wei Di, Vignesh Jagadeesh, Robinson Piramuthu
Text is ubiquitous in the artificial world and easily attainable when it comes to book titles and author names.
no code implementations • 19 Nov 2014 • Kota Hara, Vignesh Jagadeesh, Robinson Piramuthu
In this work, we propose and address a new computer vision task, which we call fashion item detection, where the aim is to detect various fashion items a person in the image is wearing or carrying.
no code implementations • CVPR 2015 • Bolei Zhou, Vignesh Jagadeesh, Robinson Piramuthu
Discovering visual knowledge from weakly labeled data is crucial to scale up computer vision recognition systems, since it is expensive to obtain fully labeled data for a large number of concept categories.
no code implementations • 3 Oct 2014 • Qiaosong Wang, Vignesh Jagadeesh, Bryan Ressler, Robinson Piramuthu
In this paper, we propose a method for capturing accurate human body shape and anthropometrics from a single consumer grade depth sensor.
4 code implementations • 3 Oct 2014 • Zhicheng Yan, Hao Zhang, Robinson Piramuthu, Vignesh Jagadeesh, Dennis Decoste, Wei Di, Yizhou Yu
In this paper, we introduce hierarchical deep CNNs (HD-CNNs) by embedding deep CNNs into a category hierarchy.
Ranked #174 on Image Classification on CIFAR-100
no code implementations • 13 Jun 2014 • Wei Di, Anurag Bhardwaj, Vignesh Jagadeesh, Robinson Piramuthu, Elizabeth Churchill
This study aims to address the effectiveness of different types of images in showcasing fashion apparel in terms of their attractiveness, i.e., the ability to draw consumers' attention, interest, and in return their engagement.
Human-Computer Interaction K.4.4; H.2.8
no code implementations • CVPR 2014 • Chen-Yu Lee, Anurag Bhardwaj, Wei Di, Vignesh Jagadeesh, Robinson Piramuthu
We present a new feature representation method for the scene text recognition problem, particularly focusing on improving scene character recognition.
no code implementations • 15 Mar 2014 • Zixuan Wang, Wei Di, Anurag Bhardwaj, Vignesh Jagadeesh, Robinson Piramuthu
We present a novel compact image descriptor for large scale image search.
no code implementations • 8 Jan 2014 • Vignesh Jagadeesh, Robinson Piramuthu, Anurag Bhardwaj, Wei Di, Neel Sundaresan
We describe a completely automated large scale visual recommendation system for fashion.