Search Results for author: Li Yi

Found 84 papers, 43 papers with code

MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining

no code implementations 1 Oct 2024 Yunze Liu, Li Yi

We found that using the correct autoregressive pretraining can significantly boost the performance of the Mamba architecture.

Mamba

CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement

1 code implementation 27 Jun 2024 Chengwen Zhang, Yun Liu, Ruofan Xing, Bingda Tang, Li Yi

With 1K human-object-human motion sequences captured in the real world, we enrich CORE4D by contributing an iterative collaboration retargeting strategy to augment motions to a variety of novel objects.

Human-Object Interaction Detection Human-Object Interaction Generation +2

FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models

no code implementations 15 Jun 2024 Zhikai Zhang, Yitang Li, Haofeng Huang, Mingxian Lin, Li Yi

At the same time, foundation models trained with internet-scale image and text data have demonstrated surprising world knowledge and reasoning ability for various downstream tasks.

Motion Synthesis World Knowledge

4DRecons: 4D Neural Implicit Deformable Objects Reconstruction from a single RGB-D Camera with Geometrical and Topological Regularizations

no code implementations 14 Jun 2024 Xiaoyan Cong, Haitao Yang, Liyan Chen, Kaifeng Zhang, Li Yi, Chandrajit Bajaj, QiXing Huang

To this end, we introduce a novel approach to compute correspondences between adjacent textured implicit surfaces, which are used to define the ARAP regularization term.

Physics-aware Hand-object Interaction Denoising

no code implementations CVPR 2024 Haowen Luo, Yunze Liu, Li Yi

The credibility and practicality of a reconstructed hand-object interaction sequence depend largely on its physical plausibility.

Denoising Object

QuasiSim: Parameterized Quasi-Physical Simulators for Dexterous Manipulations Transfer

1 code implementation 11 Apr 2024 Xueyi Liu, Kangbo Lyu, Jieqiong Zhang, Tao Du, Li Yi

We explore the dexterous manipulation transfer problem by designing simulators.

GenN2N: Generative NeRF2NeRF Translation

no code implementations CVPR 2024 Xiangyue Liu, Han Xue, Kunming Luo, Ping Tan, Li Yi

We present GenN2N, a unified NeRF-to-NeRF translation framework for various NeRF translation tasks such as text-driven NeRF editing, colorization, super-resolution, inpainting, etc.

Colorization Contrastive Learning +2

PhysReaction: Physically Plausible Real-Time Humanoid Reaction Synthesis via Forward Dynamics Guided 4D Imitation

no code implementations 1 Apr 2024 Yunze Liu, Changxi Chen, Chenjing Ding, Li Yi

Humanoid Reaction Synthesis is pivotal for creating highly interactive and empathetic robots that can seamlessly integrate into human environments, enhancing the way we live, work, and communicate.

ShapeLLM: Universal 3D Object Understanding for Embodied Interaction

3 code implementations 27 Feb 2024 Zekun Qi, Runpei Dong, Shaochen Zhang, Haoran Geng, Chunrui Han, Zheng Ge, Li Yi, Kaisheng Ma

This paper presents ShapeLLM, the first 3D Multimodal Large Language Model (LLM) designed for embodied interaction, exploring a universal 3D object understanding with 3D point clouds and languages.

3D geometry 3D Object Captioning +12

GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion

1 code implementation 22 Feb 2024 Xueyi Liu, Li Yi

We tackle those challenges through a novel approach, GeneOH Diffusion, incorporating two key designs: an innovative contact-centric HOI representation named GeneOH and a new domain-generalizable denoising scheme.

Denoising

Full-Body Motion Reconstruction with Sparse Sensing from Graph Perspective

1 code implementation 22 Jan 2024 Feiyu Yao, Zongkai Wu, Li Yi

In this paper, we use a well-designed Body Pose Graph (BPG) to represent the human body and translate the challenge into the problem of predicting missing graph nodes.

Graph Neural Network

CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding

no code implementations 17 Jan 2024 Yunze Liu, Changxi Chen, Zifan Wang, Li Yi

This paper introduces a novel approach named CrossVideo, which aims to enhance self-supervised cross-modal contrastive learning in the field of point cloud video understanding.

Contrastive Learning point cloud video understanding +2

TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding

no code implementations CVPR 2024 Yun Liu, Haolin Yang, Xu Si, Ling Liu, Zipeng Li, Yuxiang Zhang, Yebin Liu, Li Yi

Humans commonly work with multiple objects in daily life and can intuitively transfer manipulation skills to novel objects by understanding object functional regularities.

Action Recognition Benchmarking +2

GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation, Demonstration, and Imitation

no code implementations 1 Jan 2024 Zifan Wang, Junyu Chen, Ziqing Chen, Pengwei Xie, Rui Chen, Li Yi

We further introduce a distillation-friendly demonstration generation method that automatically generates a million high-quality demonstrations suitable for learning.

Grasp Generation Imitation Learning

Interactive Humanoid: Online Full-Body Motion Reaction Synthesis with Social Affordance Canonicalization and Forecasting

1 code implementation 14 Dec 2023 Yunze Liu, Changxi Chen, Li Yi

To support this task, we construct two datasets named HHI and CoChair and propose a unified method.

Semantic Complete Scene Forecasting from a 4D Dynamic Point Cloud Sequence

no code implementations 13 Dec 2023 Zifan Wang, Zhuorui Ye, Haoran Wu, Junyu Chen, Li Yi

To tackle this challenging problem, we properly model the synergetic relationship between future forecasting and semantic scene completion through a novel network named SCSFNet.

Look Before You Leap: Unveiling the Power of GPT-4V in Robotic Vision-Language Planning

no code implementations 29 Nov 2023 Yingdong Hu, Fanqi Lin, Tong Zhang, Li Yi, Yang Gao

In this study, we are interested in imbuing robots with the capability of physically-grounded task planning.

NSM4D: Neural Scene Model Based Online 4D Point Cloud Sequence Understanding

no code implementations 12 Oct 2023 Yuhao Dong, Zhuoyang Zhang, Yunze Liu, Li Yi

We integrate NSM4D with state-of-the-art 4D perception backbones, demonstrating significant improvements on various online perception benchmarks in indoor and outdoor settings.

Action Segmentation Autonomous Driving +1

DreamLLM: Synergistic Multimodal Comprehension and Creation

1 code implementation 20 Sep 2023 Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, Jinrong Yang, Liang Zhao, Jianjian Sun, HongYu Zhou, Haoran Wei, Xiangwen Kong, Xiangyu Zhang, Kaisheng Ma, Li Yi

This paper presents DreamLLM, a learning framework that first achieves versatile Multimodal Large Language Models (MLLMs) empowered with frequently overlooked synergy between multimodal comprehension and creation.

multimodal generation Visual Question Answering +2

TransTouch: Learning Transparent Objects Depth Sensing Through Sparse Touches

no code implementations 18 Sep 2023 Liuyu Bian, Pengyang Shi, Weihang Chen, Jing Xu, Li Yi, Rui Chen

By approximating and optimizing the utility function, we can optimize the probing locations given a fixed touching budget to better improve the network's performance on real objects.

Transparent objects

3D Implicit Transporter for Temporally Consistent Keypoint Discovery

1 code implementation ICCV 2023 Chengliang Zhong, Yuhang Zheng, Yupeng Zheng, Hao Zhao, Li Yi, Xiaodong Mu, Ling Wang, Pengfei Li, Guyue Zhou, Chao Yang, Xinliang Zhang, Jian Zhao

To address this issue, the Transporter method was introduced for 2D data, which reconstructs the target frame from the source frame to incorporate both spatial and temporal information.

Few-Shot Physically-Aware Articulated Mesh Generation via Hierarchical Deformation

1 code implementation ICCV 2023 Xueyi Liu, Bin Wang, He Wang, Li Yi

By observing an articulated object dataset containing only a few examples, we wish to learn a model that can generate diverse meshes with high visual fidelity and physical validity.

Diversity Philosophy

UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning

no code implementations ICCV 2023 Weikang Wan, Haoran Geng, Yun Liu, Zikang Shan, Yaodong Yang, Li Yi, He Wang

We propose a novel, object-agnostic method for learning a universal policy for dexterous object grasping from realistic point cloud observations and proprioceptive information under a table-top setting, namely UniDexGrasp++.

Object

JacobiNeRF: NeRF Shaping with Mutual Information Gradients

1 code implementation CVPR 2023 Xiaomeng Xu, Yanchao Yang, Kaichun Mo, Boxiao Pan, Li Yi, Leonidas Guibas

We propose a method that trains a neural radiance field (NeRF) to encode not only the appearance of the scene but also semantic correlations between scene points, regions, or entities -- aiming to capture their mutual co-variation patterns.

Instance Segmentation Semantic Segmentation

Semi-Weakly Supervised Object Kinematic Motion Prediction

no code implementations CVPR 2023 Gengxin Liu, Qian Sun, Haibin Huang, Chongyang Ma, Yulan Guo, Li Yi, Hui Huang, Ruizhen Hu

First, although 3D datasets with fully annotated motion labels are limited, there are existing datasets and methods for object part semantic segmentation at large scale.

Graph Neural Network motion prediction +4

CAMS: CAnonicalized Manipulation Spaces for Category-Level Functional Hand-Object Manipulation Synthesis

no code implementations CVPR 2023 Juntian Zheng, Qingyuan Zheng, Lixing Fang, Yun Liu, Li Yi

In this work, we focus on a novel task of category-level functional hand-object manipulation synthesis covering both rigid and articulated object categories.

Object

Controllable Ancient Chinese Lyrics Generation Based on Phrase Prototype Retrieving

no code implementations 20 Mar 2023 Li Yi

The phrase connector then selects a series of phrases from the phrase pool that minimizes a multi-term loss function that considers rhyme, song structure, and fluency.

HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining

1 code implementation CVPR 2023 Shixiang Tang, Cheng Chen, Qingsong Xie, Meilin Chen, Yizhou Wang, Yuanzheng Ci, Lei Bai, Feng Zhu, Haiyang Yang, Li Yi, Rui Zhao, Wanli Ouyang

Specifically, we propose a HumanBench based on existing datasets to comprehensively evaluate on the common ground the generalization abilities of different pretraining methods on 19 datasets from 6 diverse downstream tasks, including person ReID, pose estimation, human parsing, pedestrian attribute recognition, pedestrian detection, and crowd counting.

Ranked #1 on Pedestrian Attribute Recognition on PA-100K (using extra training data)

Attribute Autonomous Driving +5

UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy

1 code implementation CVPR 2023 Yinzhen Xu, Weikang Wan, Jialiang Zhang, Haoran Liu, Zikang Shan, Hao Shen, Ruicheng Wang, Haoran Geng, Yijia Weng, Jiayi Chen, Tengyu Liu, Li Yi, He Wang

Trained on our synthesized large-scale dexterous grasp dataset, this model enables us to sample diverse and high-quality dexterous grasp poses for the object point cloud. For the second stage, we propose to replace the motion planning used in parallel gripper grasping with a goal-conditioned grasp policy, due to the complexity involved in dexterous grasping execution.

Motion Planning

Self-Supervised Category-Level Articulated Object Pose Estimation with Part-Level SE(3) Equivariance

1 code implementation 28 Feb 2023 Xueyi Liu, Ji Zhang, Ruizhen Hu, Haibin Huang, He Wang, Li Yi

Category-level articulated object pose estimation aims to estimate a hierarchy of articulation-aware object poses of an unseen articulated object from a known category.

Disentanglement Object +1

Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining

4 code implementations 5 Feb 2023 Zekun Qi, Runpei Dong, Guofan Fan, Zheng Ge, Xiangyu Zhang, Kaisheng Ma, Li Yi

This motivates us to learn 3D representations by sharing the merits of both paradigms, which is non-trivial due to the pattern difference between the two paradigms.

3D Point Cloud Linear Classification Decoder +3

When Source-Free Domain Adaptation Meets Learning with Noisy Labels

no code implementations 31 Jan 2023 Li Yi, Gezheng Xu, Pengcheng Xu, Jiaqi Li, Ruizhi Pu, Charles Ling, A. Ian McLeod, Boyu Wang

We also prove that such a difference makes existing LLN methods that rely on their distribution assumptions unable to address the label noise in SFDA.

Learning with noisy labels Source-Free Domain Adaptation

LeaF: Learning Frames for 4D Point Cloud Sequence Understanding

no code implementations ICCV 2023 Yunze Liu, Junyu Chen, Zekai Zhang, Jingwei Huang, Li Yi

With such frames, we can factorize geometry and motion to facilitate a feature-space geometric reconstruction for more effective 4D learning.

Descriptive

Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?

4 code implementations 16 Dec 2022 Runpei Dong, Zekun Qi, Linfeng Zhang, Junbo Zhang, Jianjian Sun, Zheng Ge, Li Yi, Kaisheng Ma

The success of deep learning heavily relies on large-scale data with comprehensive labels, which is more expensive and time-consuming to fetch in 3D compared to 2D images or natural languages.

Few-Shot 3D Point Cloud Classification Knowledge Distillation +1

Language-Assisted 3D Feature Learning for Semantic Scene Understanding

1 code implementation 25 Nov 2022 Junbo Zhang, Guofan Fan, Guanghan Wang, Zhengyuan Su, Kaisheng Ma, Li Yi

To guide 3D feature learning toward important geometric attributes and scene context, we explore the help of textual scene descriptions.

Descriptive Instance Segmentation +5

MoRig: Motion-Aware Rigging of Character Meshes from Point Clouds

1 code implementation 17 Oct 2022 Zhan Xu, Yang Zhou, Li Yi, Evangelos Kalogerakis

We present MoRig, a method that automatically rigs character meshes driven by single-view point cloud streams capturing the motion of performing characters.

Enhancing Generalizable 6D Pose Tracking of an In-Hand Object with Tactile Sensing

1 code implementation 8 Oct 2022 Yun Liu, Xiaomeng Xu, Weihang Chen, Haocheng Yuan, He Wang, Jing Xu, Rui Chen, Li Yi

When manipulating an object to accomplish complex tasks, humans rely on both vision and touch to keep track of the object's 6D pose.

hand-object pose Object +1

Tracking and Reconstructing Hand Object Interactions from Point Cloud Sequences in the Wild

no code implementations 24 Sep 2022 Jiayi Chen, Mi Yan, Jiazhao Zhang, Yinzhen Xu, Xiaolong Li, Yijia Weng, Li Yi, Shuran Song, He Wang

For the first time, we propose a point-cloud-based hand joint tracking network, HandTrackNet, to estimate the inter-frame hand joint motion.

hand-object pose Object +2

AccoMontage2: A Complete Harmonization and Accompaniment Arrangement System

1 code implementation 1 Sep 2022 Li Yi, Haochen Hu, Jingwei Zhao, Gus Xia

We propose AccoMontage2, a system capable of doing full-length song harmonization and accompaniment arrangement based on a lead melody.

Retrieval Template Matching

BOSS: A Benchmark for Human Belief Prediction in Object-context Scenarios

no code implementations 21 Jun 2022 Jiafei Duan, Samson Yu, Nicholas Tan, Li Yi, Cheston Tan

Humans with an average level of social cognition can infer the beliefs of others based solely on the nonverbal communication signals (e.g., gaze, gesture, pose and contextual information) exhibited during social interactions.

Object

Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction

no code implementations CVPR 2022 Yining Hong, Kaichun Mo, Li Yi, Leonidas J. Guibas, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan

Specifically, FixNet consists of a perception module to extract the structured representation from the 3D point cloud, a physical dynamics prediction module to simulate the results of interactions on 3D objects, and a functionality prediction module to evaluate the functionality and choose the correct fix.

Rotationally Equivariant 3D Object Detection

no code implementations CVPR 2022 Hong-Xing Yu, Jiajun Wu, Li Yi

To incorporate object-level rotation equivariance into 3D object detectors, we need a mechanism to extract equivariant features with local object-level spatial support while being able to model cross-object context information.

3D Object Detection Autonomous Driving +2

Multi-Robot Active Mapping via Neural Bipartite Graph Matching

no code implementations CVPR 2022 Kai Ye, Siyan Dong, Qingnan Fan, He Wang, Li Yi, Fei Xia, Jue Wang, Baoquan Chen

Previous approaches either choose the frontier as the goal position via a myopic solution that hinders time efficiency, or maximize the long-term value via reinforcement learning to directly regress the goal position, but do not guarantee complete map construction.

Graph Matching Graph Neural Network +4

CodedVTR: Codebook-based Sparse Voxel Transformer with Geometric Guidance

no code implementations CVPR 2022 Tianchen Zhao, Niansong Zhang, Xuefei Ning, He Wang, Li Yi, Yu Wang

We propose CodedVTR (Codebook-based Voxel TRansformer), which improves data efficiency and generalization ability for 3D sparse voxel transformers.

3D Semantic Segmentation

AutoGPart: Intermediate Supervision Search for Generalizable 3D Part Segmentation

1 code implementation CVPR 2022 Xueyi Liu, Xiaomeng Xu, Anyi Rao, Chuang Gan, Li Yi

To solve the above issues, we propose AutoGPart, a generic method enabling training generalizable 3D part segmentation networks with the task prior considered.

3D Part Segmentation Domain Generalization +1

HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction

1 code implementation CVPR 2022 Yunze Liu, Yun Liu, Che Jiang, Kangbo Lyu, Weikang Wan, Hao Shen, Boqiang Liang, Zhoujie Fu, He Wang, Li Yi

We present HOI4D, a large-scale 4D egocentric dataset with rich annotations, to catalyze the research of category-level human-object interaction.

Action Segmentation Benchmarking +6

On Learning Contrastive Representations for Learning with Noisy Labels

1 code implementation CVPR 2022 Li Yi, Sheng Liu, Qi She, A. Ian McLeod, Boyu Wang

To address this issue, we focus on learning robust contrastive representations of data on which it is hard for the classifier to memorize the label noise under the CE loss.

Learning with noisy labels Memorization +1

PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning

no code implementations NeurIPS 2021 Yining Hong, Li Yi, Joshua B. Tenenbaum, Antonio Torralba, Chuang Gan

A critical aspect of human visual perception is the ability to parse visual scenes into individual objects and further into object parts, forming part-whole hierarchies.

Instance Segmentation Object +2

Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object Pose Estimation

no code implementations NeurIPS 2021 Xiaolong Li, Yijia Weng, Li Yi, Leonidas Guibas, A. Lynn Abbott, Shuran Song, He Wang

Category-level object pose estimation aims to find 6D object poses of previously unseen object instances from known categories without access to object CAD models.

Object Pose Estimation +1

Leveraging SE(3) Equivariance for Self-supervised Category-Level Object Pose Estimation from Point Clouds

no code implementations NeurIPS 2021 Xiaolong Li, Yijia Weng, Li Yi, Leonidas Guibas, A. Lynn Abbott, Shuran Song, He Wang

To reduce the huge amount of pose annotations needed for category-level learning, we propose for the first time a self-supervised learning framework to estimate category-level 6D object pose from single 3D point clouds.

Object Pose Estimation +1

P4Contrast: Contrastive Learning with Pairs of Point-Pixel Pairs for RGB-D Scene Understanding

no code implementations 24 Dec 2020 Yunze Liu, Li Yi, Shanghang Zhang, Qingnan Fan, Thomas Funkhouser, Hao Dong

Self-supervised representation learning is a critical problem in computer vision, as it provides a way to pretrain feature extractors on large unlabeled datasets that can be used as an initialization for more efficient and effective training on downstream tasks.

Contrastive Learning Representation Learning +1

Compositionally Generalizable 3D Structure Prediction

1 code implementation 4 Dec 2020 Songfang Han, Jiayuan Gu, Kaichun Mo, Li Yi, Siyu Hu, Xuejin Chen, Hao Su

However, there remains a much more difficult and under-explored issue on how to generalize the learned skills over unseen object categories that have very different shape geometry distributions.

3D Shape Reconstruction Object +1

Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds

no code implementations CVPR 2021 Li Yi, Boqing Gong, Thomas Funkhouser

We study an unsupervised domain adaptation problem for the semantic labeling of 3D point clouds, with a particular focus on domain discrepancies induced by different LiDAR sensors.

Semantic Segmentation Unsupervised Domain Adaptation

Rethinking Sampling in 3D Point Cloud Generative Adversarial Networks

no code implementations 12 Jun 2020 He Wang, Zetian Jiang, Li Yi, Kaichun Mo, Hao Su, Leonidas J. Guibas

We further study how different evaluation metrics weigh the sampling pattern against the geometry and propose several perceptual metrics forming a sampling spectrum of metrics.

Clustering valid

SAPIEN: A SimulAted Part-based Interactive ENvironment

1 code implementation CVPR 2020 Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, Hao Su

To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable.

Attribute Reinforcement Learning

Curriculum DeepSDF

1 code implementation ECCV 2020 Yueqi Duan, Haidong Zhu, He Wang, Li Yi, Ram Nevatia, Leonidas J. Guibas

When learning to sketch, beginners start with simple and flexible shapes, and then gradually strive for more complex and accurate ones in the subsequent training sessions.

3D Shape Representation Representation Learning

Category-Level Articulated Object Pose Estimation

2 code implementations CVPR 2020 Xiaolong Li, He Wang, Li Yi, Leonidas Guibas, A. Lynn Abbott, Shuran Song

We develop a deep network based on PointNet++ that predicts ANCSH from a single depth point cloud, including part segmentation, normalized coordinates, and joint parameters in the canonical object space.

Object Pose Estimation

StructEdit: Learning Structural Shape Variations

1 code implementation CVPR 2020 Kaichun Mo, Paul Guerrero, Li Yi, Hao Su, Peter Wonka, Niloy Mitra, Leonidas J. Guibas

Learning to encode differences in the geometry and (topological) structure of the shapes of ordinary objects is key to generating semantically plausible variations of a given shape, transferring edits from one shape to another, and many other applications in 3D content creation.

StructureNet: Hierarchical Graph Networks for 3D Shape Generation

2 code implementations 1 Aug 2019 Kaichun Mo, Paul Guerrero, Li Yi, Hao Su, Peter Wonka, Niloy Mitra, Leonidas J. Guibas

We introduce StructureNet, a hierarchical graph network which (i) can directly encode shapes represented as such n-ary graphs; (ii) can be robustly trained on large and complex shape families; and (iii) can be used to generate a great diversity of realistic structured shape geometries.

3D Shape Generation

AdaCoSeg: Adaptive Shape Co-Segmentation with Group Consistency Loss

no code implementations CVPR 2020 Chenyang Zhu, Kai Xu, Siddhartha Chaudhuri, Li Yi, Leonidas Guibas, Hao Zhang

While the part prior network can be trained with noisy and inconsistently segmented shapes, the final output of AdaCoSeg is a consistent part labeling for the input set, with each shape segmented into up to (a user-specified) K parts.

Instance Segmentation Segmentation +1

TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes

1 code implementation CVPR 2019 Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Nießner, Leonidas Guibas

We introduce TextureNet, a neural network architecture designed to extract features from high-resolution signals associated with 3D surface meshes (e.g., color texture maps).

Ranked #23 on Semantic Segmentation on ScanNet (test mIoU metric)

3D Semantic Segmentation

Supervised Fitting of Geometric Primitives to 3D Point Clouds

2 code implementations CVPR 2019 Lingxiao Li, Minhyuk Sung, Anastasia Dubrovina, Li Yi, Leonidas Guibas

Fitting geometric primitives to 3D point cloud data bridges a gap between low-level digitized 3D data and high-level structural information on the underlying 3D shapes.

Shape Representation Of 3D Point Clouds

Deep Part Induction from Articulated Object Pairs

1 code implementation 19 Sep 2018 Li Yi, Haibin Huang, Difan Liu, Evangelos Kalogerakis, Hao Su, Leonidas Guibas

In this paper, we explore how the observation of different articulation states provides evidence for part structure and motion of 3D objects.

Object

Beyond Holistic Object Recognition: Enriching Image Understanding with Part States

no code implementations CVPR 2018 Cewu Lu, Hao Su, Yongyi Lu, Li Yi, Chi-Keung Tang, Leonidas Guibas

Important high-level vision tasks such as human-object interaction, image captioning and robotic manipulation require rich semantic descriptions of objects at part level.

Human-Object Interaction Detection Image Captioning +1

SyncSpecCNN: Synchronized Spectral CNN for 3D Shape Segmentation

no code implementations CVPR 2017 Li Yi, Hao Su, Xingwen Guo, Leonidas Guibas

To enable the prediction of vertex functions on them by convolutional neural networks, we resort to a spectral CNN method that enables weight sharing by parameterizing kernels in the spectral domain spanned by graph Laplacian eigenbases.

3D Part Segmentation

Dialogue Session Segmentation by Embedding-Enhanced TextTiling

no code implementations 13 Oct 2016 Yiping Song, Lili Mou, Rui Yan, Li Yi, Zinan Zhu, Xiaohua Hu, Ming Zhang

In human-computer conversation systems, the context of a user-issued utterance is particularly important because it provides useful background information of the conversation.

Word Embeddings

Learning Discriminative Representations for Semantic Cross Media Retrieval

no code implementations 18 Nov 2015 Jiang Aiwen, Li Hanxi, Li Yi, Wang Mingwen

As a result, an efficient linear semantic down mapping is jointly learned for multimodal data, leading to a common space where they can be compared.

Representation Learning Retrieval

3D-Assisted Image Feature Synthesis for Novel Views of an Object

no code implementations 26 Nov 2014 Hao Su, Fan Wang, Li Yi, Leonidas Guibas

In this paper, given a single input image of an object, we synthesize new features for other views of the same object.

Image Retrieval Object +1
