no code implementations • 1 Oct 2024 • Yunze Liu, Li Yi
We find that using the right autoregressive pretraining recipe can significantly boost the performance of the Mamba architecture.
no code implementations • 6 Sep 2024 • Yecheng Wu, Zhuoyang Zhang, Junyu Chen, Haotian Tang, Dacheng Li, Yunhao Fang, Ligeng Zhu, Enze Xie, Hongxu Yin, Li Yi, Song Han, Yao Lu
VILA-U is a Unified foundation model that integrates Video, Image, Language understanding and generation.
1 code implementation • 27 Jun 2024 • Chengwen Zhang, Yun Liu, Ruofan Xing, Bingda Tang, Li Yi
With 1K human-object-human motion sequences captured in the real world, we enrich CORE4D by contributing an iterative collaboration retargeting strategy to augment motions to a variety of novel objects.
no code implementations • 15 Jun 2024 • Zhikai Zhang, Yitang Li, Haofeng Huang, Mingxian Lin, Li Yi
At the same time, foundation models trained with internet-scale image and text data have demonstrated surprising world knowledge and reasoning ability for various downstream tasks.
no code implementations • 14 Jun 2024 • Xiaoyan Cong, Haitao Yang, Liyan Chen, Kaifeng Zhang, Li Yi, Chandrajit Bajaj, QiXing Huang
To this end, we introduce a novel approach to compute correspondences between adjacent textured implicit surfaces, which are used to define the ARAP regularization term.
no code implementations • CVPR 2024 • Haowen Luo, Yunze Liu, Li Yi
The credibility and practicality of a reconstructed hand-object interaction sequence depend largely on its physical plausibility.
1 code implementation • 11 Apr 2024 • Xueyi Liu, Kangbo Lyu, Jieqiong Zhang, Tao Du, Li Yi
We explore the dexterous manipulation transfer problem by designing simulators.
no code implementations • CVPR 2024 • Xiangyue Liu, Han Xue, Kunming Luo, Ping Tan, Li Yi
We present GenN2N, a unified NeRF-to-NeRF translation framework for various NeRF translation tasks such as text-driven NeRF editing, colorization, super-resolution, inpainting, etc.
no code implementations • 1 Apr 2024 • Yunze Liu, Changxi Chen, Chenjing Ding, Li Yi
Humanoid Reaction Synthesis is pivotal for creating highly interactive and empathetic robots that can seamlessly integrate into human environments, enhancing the way we live, work, and communicate.
3 code implementations • 27 Feb 2024 • Zekun Qi, Runpei Dong, Shaochen Zhang, Haoran Geng, Chunrui Han, Zheng Ge, Li Yi, Kaisheng Ma
This paper presents ShapeLLM, the first 3D Multimodal Large Language Model (LLM) designed for embodied interaction, exploring a universal 3D object understanding with 3D point clouds and languages.
Ranked #1 on 3D Question Answering (3D-QA) on 3D MM-Vet
1 code implementation • 22 Feb 2024 • Xueyi Liu, Li Yi
We tackle those challenges through a novel approach, GeneOH Diffusion, incorporating two key designs: an innovative contact-centric HOI representation named GeneOH and a new domain-generalizable denoising scheme.
1 code implementation • 19 Feb 2024 • Keyang Xuan, Li Yi, Fan Yang, Ruochen Wu, Yi R. Fung, Heng Ji
In this paper, we first investigate the potential of LVLM on multimodal misinformation detection.
1 code implementation • 22 Jan 2024 • Feiyu Yao, Zongkai Wu, Li Yi
In this paper, we use a well-designed Body Pose Graph (BPG) to represent the human body and cast the challenge as a graph missing-node prediction problem.
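As a rough illustration of the graph missing-node idea (not the paper's BPG model; the joint names, edges, and coordinates below are made up for the example), a missing joint can be imputed from its observed graph neighbors:

```python
import numpy as np

# Toy body graph: adjacency maps each joint to its neighbors.
edges = {0: [1, 2], 1: [0], 2: [0]}
joints = np.array([[0.0, 0.0, 0.0],   # 0: pelvis (treated as missing)
                   [0.2, 1.0, 0.0],   # 1: spine
                   [-0.2, 1.0, 0.0]]) # 2: hip

def impute_missing(joints, edges, missing_id):
    """Predict a missing node as the mean of its observed neighbors --
    the simplest stand-in for a learned graph-completion network."""
    neighbors = edges[missing_id]
    return joints[neighbors].mean(axis=0)

pred = impute_missing(joints, edges, 0)
```

A learned model would replace the neighbor mean with message passing over the graph, but the prediction target is the same: the masked node's position.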
no code implementations • 17 Jan 2024 • Yunze Liu, Changxi Chen, Zifan Wang, Li Yi
This paper introduces a novel approach named CrossVideo, which aims to enhance self-supervised cross-modal contrastive learning in the field of point cloud video understanding.
no code implementations • CVPR 2024 • Yun Liu, Haolin Yang, Xu Si, Ling Liu, Zipeng Li, Yuxiang Zhang, Yebin Liu, Li Yi
Humans commonly work with multiple objects in daily life and can intuitively transfer manipulation skills to novel objects by understanding object functional regularities.
no code implementations • 1 Jan 2024 • Zifan Wang, Junyu Chen, Ziqing Chen, Pengwei Xie, Rui Chen, Li Yi
We further introduce a distillation-friendly demonstration generation method that automatically generates a million high-quality demonstrations suitable for learning.
no code implementations • CVPR 2024 • Zifan Wang, Junyu Chen, Ziqing Chen, Pengwei Xie, Rui Chen, Li Yi
This paper presents GenH2R, a framework for learning generalizable vision-based human-to-robot (H2R) handover skills.
1 code implementation • 14 Dec 2023 • Yunze Liu, Changxi Chen, Li Yi
To support this task, we construct two datasets named HHI and CoChair and propose a unified method.
no code implementations • 13 Dec 2023 • Zifan Wang, Zhuorui Ye, Haoran Wu, Junyu Chen, Li Yi
To tackle this challenging problem, we properly model the synergetic relationship between future forecasting and semantic scene completion through a novel network named SCSFNet.
no code implementations • 29 Nov 2023 • Yingdong Hu, Fanqi Lin, Tong Zhang, Li Yi, Yang Gao
In this study, we are interested in imbuing robots with the capability of physically-grounded task planning.
no code implementations • 12 Oct 2023 • Yuhao Dong, Zhuoyang Zhang, Yunze Liu, Li Yi
We integrate NSM4D with state-of-the-art 4D perception backbones, demonstrating significant improvements on various online perception benchmarks in indoor and outdoor settings.
1 code implementation • 20 Sep 2023 • Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, Jinrong Yang, Liang Zhao, Jianjian Sun, HongYu Zhou, Haoran Wei, Xiangwen Kong, Xiangyu Zhang, Kaisheng Ma, Li Yi
This paper presents DreamLLM, a learning framework that first achieves versatile Multimodal Large Language Models (MLLMs) empowered with the frequently overlooked synergy between multimodal comprehension and creation.
Ranked #5 on Visual Question Answering on MMBench
no code implementations • 18 Sep 2023 • Liuyu Bian, Pengyang Shi, Weihang Chen, Jing Xu, Li Yi, Rui Chen
By approximating and optimizing the utility function, we can choose probing locations under a fixed touching budget that best improve the network's performance on real objects.
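A minimal sketch of budgeted probe selection, assuming a greedy strategy and a stand-in coverage utility (neither is claimed to be the paper's actual method):

```python
def select_probes(candidates, utility, budget):
    """Greedily pick probing locations that maximize a utility
    function until the fixed touching budget is spent."""
    chosen = []
    remaining = list(candidates)
    for _ in range(budget):
        best = max(remaining, key=lambda loc: utility(loc, chosen))
        chosen.append(best)
        remaining.remove(best)
    return chosen

def coverage_utility(loc, chosen):
    """Toy utility: prefer locations far from already-chosen probes."""
    if not chosen:
        return 0.0
    return min(abs(loc - c) for c in chosen)

probes = select_probes([0.0, 1.0, 2.0, 5.0], coverage_utility, budget=2)
```

Any differentiable approximation of the utility could replace `coverage_utility`; the budget constraint stays the same.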
1 code implementation • ICCV 2023 • Chengliang Zhong, Yuhang Zheng, Yupeng Zheng, Hao Zhao, Li Yi, Xiaodong Mu, Ling Wang, Pengfei Li, Guyue Zhou, Chao Yang, Xinliang Zhang, Jian Zhao
To address this issue, the Transporter method, which reconstructs the target frame from the source frame to incorporate both spatial and temporal information, was introduced for 2D data.
1 code implementation • ICCV 2023 • Xueyi Liu, Bin Wang, He Wang, Li Yi
By observing an articulated object dataset containing only a few examples, we wish to learn a model that can generate diverse meshes with high visual fidelity and physical validity.
no code implementations • ICCV 2023 • Weikang Wan, Haoran Geng, Yun Liu, Zikang Shan, Yaodong Yang, Li Yi, He Wang
We propose a novel, object-agnostic method for learning a universal policy for dexterous object grasping from realistic point cloud observations and proprioceptive information under a table-top setting, namely UniDexGrasp++.
1 code implementation • CVPR 2023 • Xiaomeng Xu, Yanchao Yang, Kaichun Mo, Boxiao Pan, Li Yi, Leonidas Guibas
We propose a method that trains a neural radiance field (NeRF) to encode not only the appearance of the scene but also semantic correlations between scene points, regions, or entities -- aiming to capture their mutual co-variation patterns.
no code implementations • CVPR 2023 • Gengxin Liu, Qian Sun, Haibin Huang, Chongyang Ma, Yulan Guo, Li Yi, Hui Huang, Ruizhen Hu
First, although 3D datasets with fully annotated motion labels are limited, there exist large-scale datasets and methods for object part semantic segmentation.
1 code implementation • CVPR 2023 • Xuanyao Chen, Zhijian Liu, Haotian Tang, Li Yi, Hang Zhao, Song Han
High-resolution images enable neural networks to learn richer visual representations.
no code implementations • CVPR 2023 • Juntian Zheng, Qingyuan Zheng, Lixing Fang, Yun Liu, Li Yi
In this work, we focus on a novel task of category-level functional hand-object manipulation synthesis covering both rigid and articulated object categories.
no code implementations • 20 Mar 2023 • Li Yi
The phrase connector then selects a series of phrases from the phrase pool that minimizes a multi-term loss accounting for rhyme, song structure, and fluency.
1 code implementation • CVPR 2023 • Shixiang Tang, Cheng Chen, Qingsong Xie, Meilin Chen, Yizhou Wang, Yuanzheng Ci, Lei Bai, Feng Zhu, Haiyang Yang, Li Yi, Rui Zhao, Wanli Ouyang
Specifically, we propose HumanBench, built on existing datasets, to comprehensively evaluate on common ground the generalization abilities of different pretraining methods across 19 datasets from 6 diverse downstream tasks, including person ReID, pose estimation, human parsing, pedestrian attribute recognition, pedestrian detection, and crowd counting.
Ranked #1 on Pedestrian Attribute Recognition on PA-100K (using extra training data)
1 code implementation • CVPR 2023 • Yinzhen Xu, Weikang Wan, Jialiang Zhang, Haoran Liu, Zikang Shan, Hao Shen, Ruicheng Wang, Haoran Geng, Yijia Weng, Jiayi Chen, Tengyu Liu, Li Yi, He Wang
Trained on our synthesized large-scale dexterous grasp dataset, this model enables us to sample diverse and high-quality dexterous grasp poses for the object point cloud. For the second stage, because dexterous grasping execution is more complex, we propose to replace the motion planning used in parallel-gripper grasping with a goal-conditioned grasp policy.
1 code implementation • 28 Feb 2023 • Xueyi Liu, Ji Zhang, Ruizhen Hu, Haibin Huang, He Wang, Li Yi
Category-level articulated object pose estimation aims to estimate a hierarchy of articulation-aware object poses of an unseen articulated object from a known category.
4 code implementations • 5 Feb 2023 • Zekun Qi, Runpei Dong, Guofan Fan, Zheng Ge, Xiangyu Zhang, Kaisheng Ma, Li Yi
This motivates us to learn 3D representations by sharing the merits of both paradigms, which is non-trivial due to the pattern difference between the two paradigms.
Ranked #1 on Zero-Shot Transfer 3D Point Cloud Classification on ModelNet10 (using extra training data)
no code implementations • 31 Jan 2023 • Li Yi, Gezheng Xu, Pengcheng Xu, Jiaqi Li, Ruizhi Pu, Charles Ling, A. Ian McLeod, Boyu Wang
We also prove that this difference makes existing learning-with-label-noise (LLN) methods, which rely on these distribution assumptions, unable to address the label noise in SFDA.
no code implementations • ICCV 2023 • Yunze Liu, Junyu Chen, Zekai Zhang, Jingwei Huang, Li Yi
With such frames, we can factorize geometry and motion to facilitate a feature-space geometric reconstruction for more effective 4D learning.
4 code implementations • 16 Dec 2022 • Runpei Dong, Zekun Qi, Linfeng Zhang, Junbo Zhang, Jianjian Sun, Zheng Ge, Li Yi, Kaisheng Ma
The success of deep learning heavily relies on large-scale data with comprehensive labels, which is more expensive and time-consuming to fetch in 3D compared to 2D images or natural languages.
Ranked #7 on Few-Shot 3D Point Cloud Classification on ModelNet40 10-way (10-shot) (using extra training data)
no code implementations • CVPR 2023 • Zhuoyang Zhang, Yuhao Dong, Yunze Liu, Li Yi
Recent work on 4D point cloud sequences has attracted a lot of attention.
1 code implementation • 25 Nov 2022 • Junbo Zhang, Guofan Fan, Guanghan Wang, Zhengyuan Su, Kaisheng Ma, Li Yi
To guide 3D feature learning toward important geometric attributes and scene context, we explore the help of textual scene descriptions.
1 code implementation • CVPR 2023 • Haoran Geng, Helin Xu, Chengyang Zhao, Chao Xu, Li Yi, Siyuan Huang, He Wang
Based on GAPartNet, we investigate three cross-category tasks: part segmentation, part pose estimation, and part-based object manipulation.
1 code implementation • 17 Oct 2022 • Zhan Xu, Yang Zhou, Li Yi, Evangelos Kalogerakis
We present MoRig, a method that automatically rigs character meshes driven by single-view point cloud streams capturing the motion of performing characters.
1 code implementation • 8 Oct 2022 • Yun Liu, Xiaomeng Xu, Weihang Chen, Haocheng Yuan, He Wang, Jing Xu, Rui Chen, Li Yi
When manipulating an object to accomplish complex tasks, humans rely on both vision and touch to keep track of the object's 6D pose.
no code implementations • 24 Sep 2022 • Jiayi Chen, Mi Yan, Jiazhao Zhang, Yinzhen Xu, Xiaolong Li, Yijia Weng, Li Yi, Shuran Song, He Wang
We propose, for the first time, a point cloud based hand joint tracking network, HandTrackNet, to estimate the inter-frame hand joint motion.
1 code implementation • 1 Sep 2022 • Li Yi, Haochen Hu, Jingwei Zhao, Gus Xia
We propose AccoMontage2, a system capable of doing full-length song harmonization and accompaniment arrangement based on a lead melody.
1 code implementation • 30 Jul 2022 • Hao Wen, Yunze Liu, Jingwei Huang, Bo Duan, Li Yi
This paper proposes a 4D backbone for long-term point cloud video understanding.
no code implementations • 21 Jun 2022 • Jiafei Duan, Samson Yu, Nicholas Tan, Li Yi, Cheston Tan
Humans with an average level of social cognition can infer the beliefs of others based solely on the nonverbal communication signals (e.g., gaze, gesture, pose, and contextual information) exhibited during social interactions.
1 code implementation • CVPR 2022 • Zhan Xu, Matthew Fisher, Yang Zhou, Deepali Aneja, Rushikesh Dudhat, Li Yi, Evangelos Kalogerakis
Rigged puppets are one of the most prevalent representations to create 2D character animations.
no code implementations • CVPR 2022 • Yining Hong, Kaichun Mo, Li Yi, Leonidas J. Guibas, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan
Specifically, FixNet consists of a perception module to extract the structured representation from the 3D point cloud, a physical dynamics prediction module to simulate the results of interactions on 3D objects, and a functionality prediction module to evaluate the functionality and choose the correct fix.
no code implementations • CVPR 2022 • Hong-Xing Yu, Jiajun Wu, Li Yi
To incorporate object-level rotation equivariance into 3D object detectors, we need a mechanism to extract equivariant features with local object-level spatial support while being able to model cross-object context information.
no code implementations • CVPR 2022 • Kai Ye, Siyan Dong, Qingnan Fan, He Wang, Li Yi, Fei Xia, Jue Wang, Baoquan Chen
Previous approaches either choose the frontier as the goal position via a myopic solution that hinders time efficiency, or maximize the long-term value via reinforcement learning to directly regress the goal position, which does not guarantee complete map construction.
no code implementations • CVPR 2022 • Tianchen Zhao, Niansong Zhang, Xuefei Ning, He Wang, Li Yi, Yu Wang
We propose CodedVTR (Codebook-based Voxel TRansformer), which improves data efficiency and generalization ability for 3D sparse voxel transformers.
1 code implementation • CVPR 2022 • Xueyi Liu, Xiaomeng Xu, Anyi Rao, Chuang Gan, Li Yi
To solve the above issues, we propose AutoGPart, a generic method enabling training generalizable 3D part segmentation networks with the task prior considered.
1 code implementation • CVPR 2022 • Yunze Liu, Yun Liu, Che Jiang, Kangbo Lyu, Weikang Wan, Hao Shen, Boqiang Liang, Zhoujie Fu, He Wang, Li Yi
We present HOI4D, a large-scale 4D egocentric dataset with rich annotations, to catalyze the research of category-level human-object interaction.
1 code implementation • CVPR 2022 • Li Yi, Sheng Liu, Qi She, A. Ian McLeod, Boyu Wang
To address this issue, we focus on learning robust contrastive representations of the data, on which it is hard for the classifier to memorize the label noise under the CE loss.
no code implementations • NeurIPS 2021 • Yining Hong, Li Yi, Joshua B. Tenenbaum, Antonio Torralba, Chuang Gan
A critical aspect of human visual perception is the ability to parse visual scenes into individual objects and further into object parts, forming part-whole hierarchies.
no code implementations • NeurIPS 2021 • Xiaolong Li, Yijia Weng, Li Yi, Leonidas Guibas, A. Lynn Abbott, Shuran Song, He Wang
Category-level object pose estimation aims to find 6D object poses of previously unseen object instances from known categories without access to object CAD models.
1 code implementation • ICCV 2021 • Yunze Liu, Qingnan Fan, Shanghang Zhang, Hao Dong, Thomas Funkhouser, Li Yi
Another approach is to concatenate all the modalities into a tuple and then contrast positive and negative tuple correspondences.
Ranked #75 on Semantic Segmentation on NYU Depth v2
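As a minimal sketch of the tuple-contrast idea (assumed, not the paper's code): per-modality features are concatenated into one tuple embedding, and matching tuples are contrasted against mismatched ones with an InfoNCE-style loss. The feature dimensions and noise-based second "view" here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
rgb   = rng.normal(size=(4, 8))   # batch of 4 samples, 8-dim RGB features
depth = rng.normal(size=(4, 8))   # corresponding depth features

def infonce(anchors, positives, temperature=0.1):
    """InfoNCE-style loss: row i of `anchors` should match row i of
    `positives` more than any other row."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return float(-np.log(np.diag(probs)).mean())

tuples = np.concatenate([rgb, depth], axis=1)           # modality tuple
views  = tuples + 0.01 * rng.normal(size=tuples.shape)  # a second "view"
loss = infonce(tuples, views)
```

In practice the second view would come from augmentations rather than additive noise, and the encoders would be learned.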
no code implementations • NeurIPS 2021 • Xiaolong Li, Yijia Weng, Li Yi, Leonidas Guibas, A. Lynn Abbott, Shuran Song, He Wang
To reduce the huge amount of pose annotations needed for category-level learning, we propose, for the first time, a self-supervised learning framework to estimate category-level 6D object pose from single 3D point clouds.
no code implementations • 24 Dec 2020 • Yunze Liu, Li Yi, Shanghang Zhang, Qingnan Fan, Thomas Funkhouser, Hao Dong
Self-supervised representation learning is a critical problem in computer vision, as it provides a way to pretrain feature extractors on large unlabeled datasets that can be used as an initialization for more efficient and effective training on downstream tasks.
1 code implementation • CVPR 2021 • Siyan Dong, Qingnan Fan, He Wang, Ji Shi, Li Yi, Thomas Funkhouser, Baoquan Chen, Leonidas Guibas
Localizing the camera in a known indoor environment is a key building block for scene mapping, robot navigation, AR, etc.
1 code implementation • 4 Dec 2020 • Songfang Han, Jiayuan Gu, Kaichun Mo, Li Yi, Siyu Hu, Xuejin Chen, Hao Su
However, there remains a much more difficult and under-explored issue of how to generalize the learned skills over unseen object categories that have very different shape geometry distributions.
no code implementations • CVPR 2021 • Li Yi, Boqing Gong, Thomas Funkhouser
We study an unsupervised domain adaptation problem for the semantic labeling of 3D point clouds, with a particular focus on domain discrepancies induced by different LiDAR sensors.
no code implementations • 12 Jun 2020 • He Wang, Zetian Jiang, Li Yi, Kaichun Mo, Hao Su, Leonidas J. Guibas
We further study how different evaluation metrics weigh the sampling pattern against the geometry and propose several perceptual metrics forming a sampling spectrum of metrics.
1 code implementation • CVPR 2020 • Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, Hao Su
To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable.
1 code implementation • ECCV 2020 • Yueqi Duan, Haidong Zhu, He Wang, Li Yi, Ram Nevatia, Leonidas J. Guibas
When learning to sketch, beginners start with simple and flexible shapes, and then gradually strive for more complex and accurate ones in the subsequent training sessions.
2 code implementations • CVPR 2020 • Xiaolong Li, He Wang, Li Yi, Leonidas Guibas, A. Lynn Abbott, Shuran Song
We develop a deep network based on PointNet++ that predicts ANCSH from a single depth point cloud, including part segmentation, normalized coordinates, and joint parameters in the canonical object space.
1 code implementation • CVPR 2020 • Kaichun Mo, Paul Guerrero, Li Yi, Hao Su, Peter Wonka, Niloy Mitra, Leonidas J. Guibas
Learning to encode differences in the geometry and (topological) structure of the shapes of ordinary objects is key to generating semantically plausible variations of a given shape, transferring edits from one shape to another, and many other applications in 3D content creation.
2 code implementations • 1 Aug 2019 • Kaichun Mo, Paul Guerrero, Li Yi, Hao Su, Peter Wonka, Niloy Mitra, Leonidas J. Guibas
We introduce StructureNet, a hierarchical graph network which (i) can directly encode shapes represented as such n-ary graphs; (ii) can be robustly trained on large and complex shape families; and (iii) can be used to generate a great diversity of realistic structured shape geometries.
no code implementations • CVPR 2020 • Chenyang Zhu, Kai Xu, Siddhartha Chaudhuri, Li Yi, Leonidas Guibas, Hao Zhang
While the part prior network can be trained with noisy and inconsistently segmented shapes, the final output of AdaCoSeg is a consistent part labeling for the input set, with each shape segmented into up to (a user-specified) K parts.
no code implementations • CVPR 2019 • Tong He, Haibin Huang, Li Yi, Yuqian Zhou, Chi-Hao Wu, Jue Wang, Stefano Soatto
Surface-based geodesic topology provides strong cues for object semantic analysis and geometric modeling.
1 code implementation • CVPR 2019 • Li Yi, Wang Zhao, He Wang, Minhyuk Sung, Leonidas Guibas
We introduce a novel 3D object proposal approach named Generative Shape Proposal Network (GSPN) for instance segmentation in point cloud data.
Ranked #28 on 3D Object Detection on ScanNetV2
5 code implementations • CVPR 2019 • Kaichun Mo, Shilin Zhu, Angel X. Chang, Li Yi, Subarna Tripathi, Leonidas J. Guibas, Hao Su
We present PartNet: a consistent, large-scale dataset of 3D objects annotated with fine-grained, instance-level, and hierarchical 3D part information.
Ranked #3 on 3D Instance Segmentation on PartNet
1 code implementation • CVPR 2019 • Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Nießner, Leonidas Guibas
We introduce TextureNet, a neural network architecture designed to extract features from high-resolution signals associated with 3D surface meshes (e.g., color texture maps).
Ranked #23 on Semantic Segmentation on ScanNet (test mIoU metric)
2 code implementations • CVPR 2019 • Lingxiao Li, Minhyuk Sung, Anastasia Dubrovina, Li Yi, Leonidas Guibas
Fitting geometric primitives to 3D point cloud data bridges a gap between low-level digitized 3D data and high-level structural information on the underlying 3D shapes.
1 code implementation • 19 Sep 2018 • Li Yi, Haibin Huang, Difan Liu, Evangelos Kalogerakis, Hao Su, Leonidas Guibas
In this paper, we explore how the observation of different articulation states provides evidence for part structure and motion of 3D objects.
1 code implementation • 17 Oct 2017 • Li Yi, Lin Shao, Manolis Savva, Haibin Huang, Yang Zhou, Qirui Wang, Benjamin Graham, Martin Engelcke, Roman Klokov, Victor Lempitsky, Yuan Gan, Pengyu Wang, Kun Liu, Fenggen Yu, Panpan Shui, Bingyang Hu, Yan Zhang, Yangyan Li, Rui Bu, Mingchao Sun, Wei Wu, Minki Jeong, Jaehoon Choi, Changick Kim, Angom Geetchandra, Narasimha Murthy, Bhargava Ramu, Bharadwaj Manda, M. Ramanathan, Gautam Kumar, P Preetham, Siddharth Srivastava, Swati Bhugra, Brejesh lall, Christian Haene, Shubham Tulsiani, Jitendra Malik, Jared Lafer, Ramsey Jones, Siyuan Li, Jie Lu, Shi Jin, Jingyi Yu, Qi-Xing Huang, Evangelos Kalogerakis, Silvio Savarese, Pat Hanrahan, Thomas Funkhouser, Hao Su, Leonidas Guibas
We introduce a large-scale 3D shape understanding benchmark using data and annotation from ShapeNet 3D object database.
64 code implementations • NeurIPS 2017 • Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas
By exploiting metric space distances, our network is able to learn local features with increasing contextual scales.
Ranked #2 on Semantic Segmentation on Toronto-3D L002
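The "increasing contextual scales" idea can be sketched with a ball query around a centroid at growing radii, a rough stand-in for PointNet++-style multi-scale grouping (real implementations add farthest-point sampling, per-group MLPs, and pooling):

```python
import numpy as np

rng = np.random.default_rng(1)
points = rng.uniform(-1, 1, size=(100, 3))  # toy point cloud
centroid = np.zeros(3)

def ball_query(points, center, radius):
    """Gather all points within `radius` of `center`."""
    dist = np.linalg.norm(points - center, axis=1)
    return points[dist < radius]

small = ball_query(points, centroid, 0.3)   # fine local geometry
large = ball_query(points, centroid, 0.8)   # broader context
```

Features pooled over the small ball capture local detail; the larger ball supplies context, and concatenating scales gives the hierarchy the abstract describes.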
no code implementations • CVPR 2018 • Cewu Lu, Hao Su, Yongyi Lu, Li Yi, Chi-Keung Tang, Leonidas Guibas
Important high-level vision tasks such as human-object interaction, image captioning and robotic manipulation require rich semantic descriptions of objects at part level.
no code implementations • CVPR 2017 • Li Yi, Hao Su, Xingwen Guo, Leonidas Guibas
To enable the prediction of vertex functions on them by convolutional neural networks, we resort to a spectral CNN method that enables weight sharing by parameterizing kernels in the spectral domain spanned by graph Laplacian eigenbases.
Ranked #55 on 3D Part Segmentation on ShapeNet-Part
no code implementations • 13 Oct 2016 • Yiping Song, Lili Mou, Rui Yan, Li Yi, Zinan Zhu, Xiaohua Hu, Ming Zhang
In human-computer conversation systems, the context of a user-issued utterance is particularly important because it provides useful background information of the conversation.
15 code implementations • 9 Dec 2015 • Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qi-Xing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, Jianxiong Xiao, Li Yi, Fisher Yu
We present ShapeNet: a richly-annotated, large-scale repository of shapes represented by 3D CAD models of objects.
no code implementations • 18 Nov 2015 • Jiang Aiwen, Li Hanxi, Li Yi, Wang Mingwen
As a result, an efficient linear semantic down mapping is jointly learned for multimodal data, leading to a common space where they can be compared.
no code implementations • 26 Nov 2014 • Hao Su, Fan Wang, Li Yi, Leonidas Guibas
In this paper, given a single input image of an object, we synthesize new features for other views of the same object.