Search Results for author: Li Yi

Found 62 papers, 33 papers with code

DreamLLM: Synergistic Multimodal Comprehension and Creation

1 code implementation 20 Sep 2023 Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, Jinrong Yang, Liang Zhao, Jianjian Sun, HongYu Zhou, Haoran Wei, Xiangwen Kong, Xiangyu Zhang, Kaisheng Ma, Li Yi

This paper presents DreamLLM, a learning framework that first achieves versatile Multimodal Large Language Models (MLLMs) empowered with frequently overlooked synergy between multimodal comprehension and creation.

TransTouch: Learning Transparent Objects Depth Sensing Through Sparse Touches

no code implementations 18 Sep 2023 Liuyu Bian, Pengyang Shi, Weihang Chen, Jing Xu, Li Yi, Rui Chen

By approximating and optimizing the utility function, we can optimize the probing locations given a fixed touching budget to better improve the network's performance on real objects.

Transparent objects

3D Implicit Transporter for Temporally Consistent Keypoint Discovery

1 code implementation 10 Sep 2023 Chengliang Zhong, Yuhang Zheng, Yupeng Zheng, Hao Zhao, Li Yi, Xiaodong Mu, Ling Wang, Pengfei Li, Guyue Zhou, Chao Yang, Xinliang Zhang, Jian Zhao

To address this issue, the Transporter method was introduced for 2D data, which reconstructs the target frame from the source frame to incorporate both spatial and temporal information.

Few-Shot Physically-Aware Articulated Mesh Generation via Hierarchical Deformation

1 code implementation 21 Aug 2023 Xueyi Liu, Bin Wang, He Wang, Li Yi

By observing an articulated object dataset containing only a few examples, we wish to learn a model that can generate diverse meshes with high visual fidelity and physical validity.


UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning

no code implementations 2 Apr 2023 Weikang Wan, Haoran Geng, Yun Liu, Zikang Shan, Yaodong Yang, Li Yi, He Wang

We propose a novel, object-agnostic method for learning a universal policy for dexterous object grasping from realistic point cloud observations and proprioceptive information under a table-top setting, namely UniDexGrasp++.

JacobiNeRF: NeRF Shaping with Mutual Information Gradients

1 code implementation CVPR 2023 Xiaomeng Xu, Yanchao Yang, Kaichun Mo, Boxiao Pan, Li Yi, Leonidas Guibas

We propose a method that trains a neural radiance field (NeRF) to encode not only the appearance of the scene but also semantic correlations between scene points, regions, or entities -- aiming to capture their mutual co-variation patterns.

Instance Segmentation Semantic Segmentation

Semi-Weakly Supervised Object Kinematic Motion Prediction

no code implementations CVPR 2023 Gengxin Liu, Qian Sun, Haibin Huang, Chongyang Ma, Yulan Guo, Li Yi, Hui Huang, Ruizhen Hu

First, although 3D datasets with fully annotated motion labels are limited, there are existing datasets and methods for object part semantic segmentation at large scale.

motion prediction Semantic Segmentation +1

CAMS: CAnonicalized Manipulation Spaces for Category-Level Functional Hand-Object Manipulation Synthesis

no code implementations CVPR 2023 Juntian Zheng, Qingyuan Zheng, Lixing Fang, Yun Liu, Li Yi

In this work, we focus on a novel task of category-level functional hand-object manipulation synthesis covering both rigid and articulated object categories.

Controllable Ancient Chinese Lyrics Generation Based on Phrase Prototype Retrieving

no code implementations 20 Mar 2023 Li Yi

The phrase connector then selects a series of phrases from the phrase pool that minimizes a multi-term loss function that considers rhyme, song structure, and fluency.

HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining

1 code implementation CVPR 2023 Shixiang Tang, Cheng Chen, Qingsong Xie, Meilin Chen, Yizhou Wang, Yuanzheng Ci, Lei Bai, Feng Zhu, Haiyang Yang, Li Yi, Rui Zhao, Wanli Ouyang

Specifically, we propose HumanBench, a benchmark based on existing datasets, to comprehensively evaluate on common ground the generalization abilities of different pretraining methods on 19 datasets from 6 diverse downstream tasks, including person ReID, pose estimation, human parsing, pedestrian attribute recognition, pedestrian detection, and crowd counting.

Autonomous Driving Crowd Counting +4

UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy

no code implementations CVPR 2023 Yinzhen Xu, Weikang Wan, Jialiang Zhang, Haoran Liu, Zikang Shan, Hao Shen, Ruicheng Wang, Haoran Geng, Yijia Weng, Jiayi Chen, Tengyu Liu, Li Yi, He Wang

Trained on our synthesized large-scale dexterous grasp dataset, this model enables us to sample diverse and high-quality dexterous grasp poses for the object point cloud. For the second stage, we propose to replace the motion planning used in parallel gripper grasping with a goal-conditioned grasp policy, due to the complexity involved in dexterous grasping execution.

Motion Planning

Self-Supervised Category-Level Articulated Object Pose Estimation with Part-Level SE(3) Equivariance

1 code implementation 28 Feb 2023 Xueyi Liu, Ji Zhang, Ruizhen Hu, Haibin Huang, He Wang, Li Yi

Category-level articulated object pose estimation aims to estimate a hierarchy of articulation-aware object poses of an unseen articulated object from a known category.

Disentanglement Pose Estimation

Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining

2 code implementations 5 Feb 2023 Zekun Qi, Runpei Dong, Guofan Fan, Zheng Ge, Xiangyu Zhang, Kaisheng Ma, Li Yi

This motivates us to learn 3D representations by sharing the merits of both paradigms, which is non-trivial due to the pattern difference between the two paradigms.

Ranked #1 on 3D Point Cloud Linear Classification on ModelNet40 (using extra training data)

3D Point Cloud Linear Classification Few-Shot 3D Point Cloud Classification +2

When Source-Free Domain Adaptation Meets Learning with Noisy Labels

no code implementations 31 Jan 2023 Li Yi, Gezheng Xu, Pengcheng Xu, Jiaqi Li, Ruizhi Pu, Charles Ling, A. Ian McLeod, Boyu Wang

We also prove that such a difference makes existing LLN methods that rely on their distribution assumptions unable to address the label noise in SFDA.

Learning with noisy labels Source-Free Domain Adaptation

Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?

2 code implementations 16 Dec 2022 Runpei Dong, Zekun Qi, Linfeng Zhang, Junbo Zhang, Jianjian Sun, Zheng Ge, Li Yi, Kaisheng Ma

The success of deep learning heavily relies on large-scale data with comprehensive labels, which is more expensive and time-consuming to fetch in 3D compared to 2D images or natural languages.

Few-Shot 3D Point Cloud Classification Knowledge Distillation +1

Language-Assisted 3D Feature Learning for Semantic Scene Understanding

1 code implementation 25 Nov 2022 Junbo Zhang, Guofan Fan, Guanghan Wang, Zhengyuan Su, Kaisheng Ma, Li Yi

To guide 3D feature learning toward important geometric attributes and scene context, we explore the help of textual scene descriptions.

Descriptive Instance Segmentation +4

MoRig: Motion-aware rigging of character meshes from point clouds

1 code implementation 17 Oct 2022 Zhan Xu, Yang Zhou, Li Yi, Evangelos Kalogerakis

We present MoRig, a method that automatically rigs character meshes driven by single-view point cloud streams capturing the motion of performing characters.

Enhancing Generalizable 6D Pose Tracking of an In-Hand Object with Tactile Sensing

no code implementations 8 Oct 2022 Xiaomeng Xu, Yun Liu, Weihang Chen, Haocheng Yuan, He Wang, Jing Xu, Rui Chen, Li Yi

To test our method in real scenarios and enable future studies on generalizable visual-tactile tracking, we collect a real visual-tactile in-hand object pose tracking dataset.

hand-object pose Pose Tracking

Tracking and Reconstructing Hand Object Interactions from Point Cloud Sequences in the Wild

no code implementations 24 Sep 2022 Jiayi Chen, Mi Yan, Jiazhao Zhang, Yinzhen Xu, Xiaolong Li, Yijia Weng, Li Yi, Shuran Song, He Wang

We propose, for the first time, a point-cloud-based hand joint tracking network, HandTrackNet, to estimate the inter-frame hand joint motion.

hand-object pose Object Tracking +1

AccoMontage2: A Complete Harmonization and Accompaniment Arrangement System

1 code implementation 1 Sep 2022 Li Yi, Haochen Hu, Jingwei Zhao, Gus Xia

We propose AccoMontage2, a system capable of doing full-length song harmonization and accompaniment arrangement based on a lead melody.

Retrieval Template Matching

BOSS: A Benchmark for Human Belief Prediction in Object-context Scenarios

no code implementations 21 Jun 2022 Jiafei Duan, Samson Yu, Nicholas Tan, Li Yi, Cheston Tan

Humans with an average level of social cognition can infer the beliefs of others based solely on the nonverbal communication signals (e.g., gaze, gesture, pose and contextual information) exhibited during social interactions.

Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction

no code implementations CVPR 2022 Yining Hong, Kaichun Mo, Li Yi, Leonidas J. Guibas, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan

Specifically, FixNet consists of a perception module to extract the structured representation from the 3D point cloud, a physical dynamics prediction module to simulate the results of interactions on 3D objects, and a functionality prediction module to evaluate the functionality and choose the correct fix.

Rotationally Equivariant 3D Object Detection

no code implementations CVPR 2022 Hong-Xing Yu, Jiajun Wu, Li Yi

To incorporate object-level rotation equivariance into 3D object detectors, we need a mechanism to extract equivariant features with local object-level spatial support while being able to model cross-object context information.

3D Object Detection Autonomous Driving +1

Multi-Robot Active Mapping via Neural Bipartite Graph Matching

no code implementations CVPR 2022 Kai Ye, Siyan Dong, Qingnan Fan, He Wang, Li Yi, Fei Xia, Jue Wang, Baoquan Chen

Previous approaches either choose the frontier as the goal position via a myopic solution that hinders time efficiency, or maximize the long-term value via reinforcement learning to directly regress the goal position, which does not guarantee complete map construction.

Graph Matching reinforcement-learning +1

CodedVTR: Codebook-based Sparse Voxel Transformer with Geometric Guidance

no code implementations CVPR 2022 Tianchen Zhao, Niansong Zhang, Xuefei Ning, He Wang, Li Yi, Yu Wang

We propose CodedVTR (Codebook-based Voxel TRansformer), which improves data efficiency and generalization ability for 3D sparse voxel transformers.

3D Semantic Segmentation

AutoGPart: Intermediate Supervision Search for Generalizable 3D Part Segmentation

1 code implementation CVPR 2022 Xueyi Liu, Xiaomeng Xu, Anyi Rao, Chuang Gan, Li Yi

To solve the above issues, we propose AutoGPart, a generic method enabling training generalizable 3D part segmentation networks with the task prior considered.

3D Part Segmentation Domain Generalization

On Learning Contrastive Representations for Learning with Noisy Labels

1 code implementation CVPR 2022 Li Yi, Sheng Liu, Qi She, A. Ian McLeod, Boyu Wang

To address this issue, we focus on learning robust contrastive representations of data on which it is hard for the classifier to memorize the label noise under the CE loss.

Learning with noisy labels Memorization +1

HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction

1 code implementation CVPR 2022 Yunze Liu, Yun Liu, Che Jiang, Kangbo Lyu, Weikang Wan, Hao Shen, Boqiang Liang, Zhoujie Fu, He Wang, Li Yi

We present HOI4D, a large-scale 4D egocentric dataset with rich annotations, to catalyze the research of category-level human-object interaction.

Action Segmentation Benchmarking +4

PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning

no code implementations NeurIPS 2021 Yining Hong, Li Yi, Joshua B. Tenenbaum, Antonio Torralba, Chuang Gan

A critical aspect of human visual perception is the ability to parse visual scenes into individual objects and further into object parts, forming part-whole hierarchies.

Instance Segmentation Semantic Segmentation +1

Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object Pose Estimation

no code implementations NeurIPS 2021 Xiaolong Li, Yijia Weng, Li Yi, Leonidas Guibas, A. Lynn Abbott, Shuran Song, He Wang

Category-level object pose estimation aims to find 6D object poses of previously unseen object instances from known categories without access to object CAD models.

Pose Estimation Self-Supervised Learning

Leveraging SE(3) Equivariance for Self-supervised Category-Level Object Pose Estimation from Point Clouds

no code implementations NeurIPS 2021 Xiaolong Li, Yijia Weng, Li Yi, Leonidas Guibas, A. Lynn Abbott, Shuran Song, He Wang

To reduce the huge amount of pose annotations needed for category-level learning, we propose for the first time a self-supervised learning framework to estimate category-level 6D object pose from single 3D point clouds.

Pose Estimation Self-Supervised Learning

P4Contrast: Contrastive Learning with Pairs of Point-Pixel Pairs for RGB-D Scene Understanding

no code implementations 24 Dec 2020 Yunze Liu, Li Yi, Shanghang Zhang, Qingnan Fan, Thomas Funkhouser, Hao Dong

Self-supervised representation learning is a critical problem in computer vision, as it provides a way to pretrain feature extractors on large unlabeled datasets that can be used as an initialization for more efficient and effective training on downstream tasks.

Contrastive Learning Representation Learning +1

Compositionally Generalizable 3D Structure Prediction

1 code implementation 4 Dec 2020 Songfang Han, Jiayuan Gu, Kaichun Mo, Li Yi, Siyu Hu, Xuejin Chen, Hao Su

However, there remains a much more difficult and under-explored issue on how to generalize the learned skills over unseen object categories that have very different shape geometry distributions.

3D Shape Reconstruction Translation

Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds

no code implementations CVPR 2021 Li Yi, Boqing Gong, Thomas Funkhouser

We study an unsupervised domain adaptation problem for the semantic labeling of 3D point clouds, with a particular focus on domain discrepancies induced by different LiDAR sensors.

Semantic Segmentation Unsupervised Domain Adaptation

Rethinking Sampling in 3D Point Cloud Generative Adversarial Networks

no code implementations 12 Jun 2020 He Wang, Zetian Jiang, Li Yi, Kaichun Mo, Hao Su, Leonidas J. Guibas

We further study how different evaluation metrics weigh the sampling pattern against the geometry and propose several perceptual metrics forming a sampling spectrum of metrics.


Curriculum DeepSDF

1 code implementation ECCV 2020 Yueqi Duan, Haidong Zhu, He Wang, Li Yi, Ram Nevatia, Leonidas J. Guibas

When learning to sketch, beginners start with simple and flexible shapes, and then gradually strive for more complex and accurate ones in the subsequent training sessions.

3D Shape Representation Representation Learning

SAPIEN: A SimulAted Part-based Interactive ENvironment

1 code implementation CVPR 2020 Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, Hao Su

To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable.

Category-Level Articulated Object Pose Estimation

2 code implementations CVPR 2020 Xiaolong Li, He Wang, Li Yi, Leonidas Guibas, A. Lynn Abbott, Shuran Song

We develop a deep network based on PointNet++ that predicts ANCSH from a single depth point cloud, including part segmentation, normalized coordinates, and joint parameters in the canonical object space.

Pose Estimation

StructEdit: Learning Structural Shape Variations

1 code implementation CVPR 2020 Kaichun Mo, Paul Guerrero, Li Yi, Hao Su, Peter Wonka, Niloy Mitra, Leonidas J. Guibas

Learning to encode differences in the geometry and (topological) structure of the shapes of ordinary objects is key to generating semantically plausible variations of a given shape, transferring edits from one shape to another, and many other applications in 3D content creation.

StructureNet: Hierarchical Graph Networks for 3D Shape Generation

2 code implementations 1 Aug 2019 Kaichun Mo, Paul Guerrero, Li Yi, Hao Su, Peter Wonka, Niloy Mitra, Leonidas J. Guibas

We introduce StructureNet, a hierarchical graph network which (i) can directly encode shapes represented as such n-ary graphs; (ii) can be robustly trained on large and complex shape families; and (iii) can be used to generate a great diversity of realistic structured shape geometries.

3D Shape Generation

AdaCoSeg: Adaptive Shape Co-Segmentation with Group Consistency Loss

no code implementations CVPR 2020 Chenyang Zhu, Kai Xu, Siddhartha Chaudhuri, Li Yi, Leonidas Guibas, Hao Zhang

While the part prior network can be trained with noisy and inconsistently segmented shapes, the final output of AdaCoSeg is a consistent part labeling for the input set, with each shape segmented into up to (a user-specified) K parts.

Instance Segmentation Semantic Segmentation

TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes

1 code implementation CVPR 2019 Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Nießner, Leonidas Guibas

We introduce TextureNet, a neural network architecture designed to extract features from high-resolution signals associated with 3D surface meshes (e.g., color texture maps).

Ranked #17 on Semantic Segmentation on ScanNet (test mIoU metric)

3D Semantic Segmentation

Supervised Fitting of Geometric Primitives to 3D Point Clouds

2 code implementations CVPR 2019 Lingxiao Li, Minhyuk Sung, Anastasia Dubrovina, Li Yi, Leonidas Guibas

Fitting geometric primitives to 3D point cloud data bridges a gap between low-level digitized 3D data and high-level structural information on the underlying 3D shapes.

Shape Representation Of 3D Point Clouds

Deep Part Induction from Articulated Object Pairs

1 code implementation 19 Sep 2018 Li Yi, Haibin Huang, Difan Liu, Evangelos Kalogerakis, Hao Su, Leonidas Guibas

In this paper, we explore how the observation of different articulation states provides evidence for part structure and motion of 3D objects.

Beyond Holistic Object Recognition: Enriching Image Understanding with Part States

no code implementations CVPR 2018 Cewu Lu, Hao Su, Yongyi Lu, Li Yi, Chi-Keung Tang, Leonidas Guibas

Important high-level vision tasks such as human-object interaction, image captioning and robotic manipulation require rich semantic descriptions of objects at part level.

Human-Object Interaction Detection Image Captioning +1

SyncSpecCNN: Synchronized Spectral CNN for 3D Shape Segmentation

no code implementations CVPR 2017 Li Yi, Hao Su, Xingwen Guo, Leonidas Guibas

To enable the prediction of vertex functions on them by convolutional neural networks, we resort to a spectral CNN method that enables weight sharing by parameterizing kernels in the spectral domain spanned by graph Laplacian eigenbases.

3D Part Segmentation

Dialogue Session Segmentation by Embedding-Enhanced TextTiling

no code implementations 13 Oct 2016 Yiping Song, Lili Mou, Rui Yan, Li Yi, Zinan Zhu, Xiaohua Hu, Ming Zhang

In human-computer conversation systems, the context of a user-issued utterance is particularly important because it provides useful background information about the conversation.

Word Embeddings

Learning Discriminative Representations for Semantic Cross Media Retrieval

no code implementations 18 Nov 2015 Jiang Aiwen, Li Hanxi, Li Yi, Wang Mingwen

As a result, an efficient linear semantic down mapping is jointly learned for multimodal data, leading to a common space where they can be compared.

Representation Learning Retrieval

3D-Assisted Image Feature Synthesis for Novel Views of an Object

no code implementations 26 Nov 2014 Hao Su, Fan Wang, Li Yi, Leonidas Guibas

In this paper, given a single input image of an object, we synthesize new features for other views of the same object.

Image Retrieval Retrieval
