Search Results for author: Fu-Jen Chu

Found 13 papers, 5 papers with code

OmniPose6D: Towards Short-Term Object Pose Tracking in Dynamic Scenes from Monocular RGB

no code implementations9 Oct 2024 Yunzhi Lin, Yipu Zhao, Fu-Jen Chu, Xingyu Chen, Weiyao Wang, Hao Tang, Patricio A. Vela, Matt Feiszli, Kevin Liang

To address the challenge of short-term object pose tracking in dynamic environments with monocular RGB input, we introduce a large-scale synthetic dataset OmniPose6D, crafted to mirror the diversity of real-world conditions.

Benchmarking · Diversity +2

Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos

no code implementations30 Sep 2024 Md Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang, Fu-Jen Chu, Kris Kitani, Gedas Bertasius, Xitong Yang

Goal-oriented planning, or anticipating a series of actions that transition an agent from its current state to a predefined objective, is crucial for developing intelligent assistants that aid users in daily procedural tasks.
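The paper's title points to a propose-assess-search loop: an LLM proposes candidate next actions, a scoring stage assesses them against the goal, and a search keeps the most promising partial plans. The sketch below illustrates that pattern only; the `propose` and `assess` stubs, the toy scoring heuristic, and the beam width are assumptions for illustration, not the paper's actual prompts or components.

```python
# Hypothetical propose-assess-search planning loop (illustrative stubs, not the paper's method).

def propose(plan, goal, k=3):
    """Stand-in for an LLM proposing k candidate next actions given the partial plan."""
    return [f"action_{len(plan)}_{i}" for i in range(k)]

def assess(plan, goal):
    """Stand-in for an LLM/critic scoring how well a partial plan advances the goal."""
    return -abs(len(plan) - 4)  # toy heuristic: prefer roughly 4-step plans

def search(goal, horizon=4, beam=2):
    """Beam search over action sequences built from the proposed and assessed steps."""
    beams = [[]]  # each entry is a partial plan (list of action strings)
    for _ in range(horizon):
        candidates = [plan + [a] for plan in beams for a in propose(plan, goal)]
        beams = sorted(candidates, key=lambda p: assess(p, goal), reverse=True)[:beam]
    return beams[0]

print(search(goal="serve an omelette"))
```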

Unlocking Exocentric Video-Language Data for Egocentric Video Representation Learning

no code implementations7 Aug 2024 Zi-Yi Dou, Xitong Yang, Tushar Nagarajan, Huiyu Wang, Jing Huang, Nanyun Peng, Kris Kitani, Fu-Jen Chu

We present EMBED (Egocentric Models Built with Exocentric Data), a method designed to transform exocentric video-language data for egocentric video representation learning.

Multi-Instance Retrieval · Representation Learning +1

HyperMix: Out-of-Distribution Detection and Classification in Few-Shot Settings

no code implementations22 Dec 2023 Nikhil Mehta, Kevin J Liang, Jing Huang, Fu-Jen Chu, Li Yin, Tal Hassner

Out-of-distribution (OOD) detection is an important topic for real-world machine learning systems, but settings with limited in-distribution samples have been underexplored.

Out-of-Distribution Detection · Out of Distribution (OOD) Detection

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

2 code implementations CVPR 2024 Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei HUANG, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray

We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge.

Video Understanding

GKNet: grasp keypoint network for grasp candidates detection

no code implementations16 Jun 2021 Ruinian Xu, Fu-Jen Chu, Patricio A. Vela

Decreasing the detection difficulty by grouping keypoints into pairs boosts performance.

Keypoint Detection · Triplet
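The snippet above frames grasp detection as keypoint detection with keypoints grouped into pairs. A rough sketch of how a pair of fingertip keypoints can be decoded into a grasp center, angle, and width follows; the greedy nearest-neighbor pairing is an illustrative assumption, not GKNet's learned grouping.

```python
# Illustrative decoding of grasp keypoint pairs into (center, angle, width).
# Greedy nearest-neighbor pairing is an assumption, not GKNet's actual grouping scheme.
import math

def decode_grasp(p_left, p_right):
    """Turn two fingertip keypoints (x, y) into a grasp center, angle (radians), and width."""
    cx = (p_left[0] + p_right[0]) / 2.0
    cy = (p_left[1] + p_right[1]) / 2.0
    angle = math.atan2(p_right[1] - p_left[1], p_right[0] - p_left[0])
    width = math.dist(p_left, p_right)
    return (cx, cy), angle, width

def pair_keypoints(lefts, rights):
    """Greedily pair detected left/right keypoints by image distance and decode each pair."""
    grasps, used = [], set()
    for left in lefts:
        options = [(i, r) for i, r in enumerate(rights) if i not in used]
        if not options:
            break
        i, right = min(options, key=lambda ir: math.dist(left, ir[1]))
        used.add(i)
        grasps.append(decode_grasp(left, right))
    return grasps

print(pair_keypoints([(40, 60), (120, 90)], [(70, 62), (150, 92)]))
```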

Recognizing Object Affordances to Support Scene Reasoning for Manipulation Tasks

1 code implementation12 Sep 2019 Fu-Jen Chu, Ruinian Xu, Chao Tang, Patricio A. Vela

Unfortunately, the top-performing affordance recognition methods use object category priors to boost the accuracy of affordance detection and segmentation.

Affordance Detection · Affordance Recognition +4

Using Synthetic Data and Deep Networks to Recognize Primitive Shapes for Object Grasping

1 code implementation12 Sep 2019 Yunzhi Lin, Chao Tang, Fu-Jen Chu, Patricio A. Vela

Each primitive shape is designed with parametrized grasp families, permitting the pipeline to identify multiple grasp candidates per shape primitive region.
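A parametrized grasp family attaches a sampled set of grasp poses to each primitive shape, indexed by a few parameters. The toy example below does this for a cylinder primitive, parametrizing grasps by angle around the axis and height along it; the specific parametrization, standoff, and sampling density are illustrative assumptions, not the paper's definitions.

```python
# Toy grasp family for a cylinder primitive: side-approach grasps parametrized by
# angle around the axis and height along it. The parametrization is illustrative only.
import math

def cylinder_grasp_family(radius, height, n_angles=8, n_heights=3, standoff=0.02):
    """Sample grasp poses (position + approach direction) around a cylinder primitive."""
    grasps = []
    for i in range(n_angles):
        theta = 2 * math.pi * i / n_angles
        for j in range(n_heights):
            z = height * (j + 1) / (n_heights + 1)
            # gripper sits just outside the surface and approaches toward the axis
            pos = ((radius + standoff) * math.cos(theta),
                   (radius + standoff) * math.sin(theta),
                   z)
            approach = (-math.cos(theta), -math.sin(theta), 0.0)
            grasps.append({"position": pos, "approach": approach, "width": 2 * radius})
    return grasps

print(len(cylinder_grasp_family(radius=0.03, height=0.12)))  # 8 angles x 3 heights = 24 grasps
```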

Deep Grasp: Detection and Localization of Grasps with Deep Neural Networks

4 code implementations1 Feb 2018 Fu-Jen Chu, Ruinian Xu, Patricio A. Vela

By defining the learning problem as classification with null-hypothesis competition instead of regression, the deep neural network takes an RGB-D image as input and predicts multiple grasp candidates for a single object or multiple objects in a single shot.

Robotics
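The snippet describes recasting grasp orientation estimation as classification in which a null ("no grasp") class competes with discretized orientation bins. Below is a minimal PyTorch sketch of such a head; the bin count, feature size, layer shapes, and placement of the null class at the last index are assumptions, not the paper's architecture. At inference, candidates whose top-scoring class is the null hypothesis are simply discarded, which is what the `keep` mask illustrates.

```python
# Sketch of an orientation-classification grasp head with a competing "no grasp" class.
# Feature size, bin count, layer widths, and null-class index are illustrative assumptions.
import torch
import torch.nn as nn

NUM_ANGLE_BINS = 19  # discretized grasp orientations; one extra class below is "no grasp"

class GraspOrientationHead(nn.Module):
    def __init__(self, in_features=256):
        super().__init__()
        # One logit per orientation bin plus one for the null (no-grasp) hypothesis.
        self.cls = nn.Linear(in_features, NUM_ANGLE_BINS + 1)
        self.box = nn.Linear(in_features, 4)  # grasp rectangle (x, y, w, h)

    def forward(self, region_features):
        logits = self.cls(region_features)  # null class competes with the angle bins
        rect = self.box(region_features)
        return logits, rect

head = GraspOrientationHead()
feats = torch.randn(5, 256)                    # 5 candidate regions
logits, rect = head(feats)
keep = logits.argmax(dim=1) != NUM_ANGLE_BINS  # drop regions whose top class is "no grasp"
print(keep.tolist(), rect.shape)
```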

When Crowdsourcing Meets Mobile Sensing: A Social Network Perspective

no code implementations3 Aug 2015 Pin-Yu Chen, Shin-Ming Cheng, Pai-Shun Ting, Chia-Wei Lien, Fu-Jen Chu

Mobile sensing is an emerging technology that utilizes agent-participatory data for decision making or state estimation, including multimedia applications.

Decision Making

Supervised Collective Classification for Crowdsourcing

no code implementations23 Jul 2015 Pin-Yu Chen, Chia-Wei Lien, Fu-Jen Chu, Pai-Shun Ting, Shin-Ming Cheng

Crowdsourcing utilizes the wisdom of crowds for collective classification via information (e.g., labels of an item) provided by labelers.

Classification · General Classification
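To make "collective classification via labels provided by labelers" concrete, the sketch below aggregates crowd labels with an (optionally weighted) majority vote. This is a simple baseline aggregator for illustration only, not the supervised collective classification method proposed in the paper.

```python
# Weighted majority vote over crowd labels -- a baseline aggregator for illustration,
# not the paper's supervised collective classification method.
from collections import defaultdict

def aggregate(labels_by_item, worker_weights=None):
    """labels_by_item: {item_id: {worker_id: label}}; returns {item_id: consensus label}."""
    worker_weights = worker_weights or {}
    consensus = {}
    for item, votes in labels_by_item.items():
        score = defaultdict(float)
        for worker, label in votes.items():
            score[label] += worker_weights.get(worker, 1.0)  # all weights 1.0 = plain majority vote
        consensus[item] = max(score, key=score.get)
    return consensus

labels = {"img1": {"w1": "cat", "w2": "cat", "w3": "dog"},
          "img2": {"w1": "dog", "w3": "dog"}}
print(aggregate(labels))  # {'img1': 'cat', 'img2': 'dog'}
```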
