Search Results for author: Longlong Jing

Found 24 papers, 8 papers with code

3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation

no code implementations • 4 Jan 2024 • Zihao Xiao, Longlong Jing, Shangxuan Wu, Alex Zihao Zhu, Jingwei Ji, Chiyu Max Jiang, Wei-Chih Hung, Thomas Funkhouser, Weicheng Kuo, Anelia Angelova, Yin Zhou, Shiwei Sheng

3D panoptic segmentation is a challenging perception task, especially in autonomous driving.

Autonomous Driving Classification +3


Point Cloud Self-supervised Learning via 3D to Multi-view Masked Autoencoder

1 code implementation • 17 Nov 2023 • Zhimin Chen, Yingwei Li, Longlong Jing, Liang Yang, Bing Li

However, a notable limitation of these approaches is that they do not fully utilize the multi-view attributes inherent in 3D point clouds, which are crucial for a deeper understanding of 3D structures.

3D Object Classification 3D Object Detection +3


Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models

1 code implementation • NeurIPS 2023 • Zhimin Chen, Longlong Jing, Yingwei Li, Bing Li

Foundation models have achieved remarkable results in 2D and language tasks like image segmentation, object detection, and visual-language understanding.

3D Object Detection Image Captioning +7


AsyInst: Asymmetric Affinity with DepthGrad and Color for Box-Supervised Instance Segmentation

no code implementations • 7 Dec 2022 • Siwei Yang, Longlong Jing, Junfei Xiao, Hang Zhao, Alan Yuille, Yingwei Li

Through systematic analysis, we found that the commonly used pairwise affinity loss has two limitations: (1) it works with color affinity but leads to inferior performance with other modalities such as depth gradient, and (2) the original affinity loss does not prevent trivial predictions as intended but actually accelerates this process because the affinity loss term is symmetric.

Box-supervised Instance Segmentation Segmentation +2


Class-Level Confidence Based 3D Semi-Supervised Learning

1 code implementation • 18 Oct 2022 • Zhimin Chen, Longlong Jing, Liang Yang, Yingwei Li, Bing Li

Firstly, a dynamic thresholding strategy is proposed to utilize more unlabeled data, especially for classes with low learning status.


R4D: Utilizing Reference Objects for Long-Range Distance Estimation

no code implementations • ICLR 2022 • Yingwei Li, Tiffany Chen, Maya Kabkab, Ruichi Yu, Longlong Jing, Yurong You, Hang Zhao

An edge in the graph encodes the relative distance information between a pair of target and reference objects.

Autonomous Driving


Depth Estimation Matters Most: Improving Per-Object Depth Estimation for Monocular 3D Detection and Tracking

no code implementations • 8 Jun 2022 • Longlong Jing, Ruichi Yu, Henrik Kretzschmar, Kang Li, Charles R. Qi, Hang Zhao, Alper Ayvaci, Xu Chen, Dillon Cower, Yingwei Li, Yurong You, Han Deng, CongCong Li, Dragomir Anguelov

Monocular image-based 3D perception has become an active research area in recent years owing to its applications in autonomous driving.

Autonomous Driving Depth Estimation +1


Disentangling Object Motion and Occlusion for Unsupervised Multi-frame Monocular Depth

1 code implementation • 29 Mar 2022 • Ziyue Feng, Liang Yang, Longlong Jing, HaiYan Wang, YingLi Tian, Bing Li

Conventional self-supervised monocular depth prediction methods are based on a static environment assumption, which leads to accuracy degradation in dynamic scenes due to the mismatch and occlusion problems introduced by object motions.

Ranked #1 on Unsupervised Monocular Depth Estimation on KITTI Eigen Split Improved Ground Truth

Depth Prediction Disentanglement +4



Learning from Temporal Gradient for Semi-supervised Action Recognition

1 code implementation • CVPR 2022 • Junfei Xiao, Longlong Jing, Lin Zhang, Ju He, Qi She, Zongwei Zhou, Alan Yuille, Yingwei Li

Our method achieves state-of-the-art performance on three video action recognition benchmarks (i.e., Kinetics-400, UCF-101, and HMDB-51) under several typical semi-supervised settings (i.e., different ratios of labeled data).

Action Recognition Temporal Action Localization


Multimodal Semi-Supervised Learning for 3D Objects

1 code implementation • 22 Oct 2021 • Zhimin Chen, Longlong Jing, Yang Liang, YingLi Tian, Bing Li

This paper explores how the coherence of different modalities of 3D data (e.g., point cloud, image, and mesh) can be used to improve data efficiency for both 3D classification and retrieval tasks.

3D Classification Retrieval


Self-Supervised Modality-Invariant and Modality-Specific Feature Learning for 3D Objects

no code implementations • 29 Sep 2021 • Longlong Jing, Zhimin Chen, Bing Li, YingLi Tian

Our proposed novel self-supervised model learns two types of distinct features: modality-invariant features and modality-specific features.

3D Object Recognition Cross-Modal Retrieval +1


Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR

2 code implementations • 20 Sep 2021 • Ziyue Feng, Longlong Jing, Peng Yin, YingLi Tian, Bing Li

Unlike existing methods that use sparse LiDAR mainly through time-consuming iterative post-processing, our model fuses monocular image features and sparse LiDAR features to predict initial depth maps.

Ranked #1 on Depth Completion on KITTI

Depth Completion Depth Prediction +3


Cross-Modal Center Loss for 3D Cross-Modal Retrieval

no code implementations • CVPR 2021 • Longlong Jing, Elahe Vahdani, Jiaxing Tan, YingLi Tian

Cross-modal retrieval aims to learn discriminative and modal-invariant features for data from different modalities.

Cross-Modal Retrieval Retrieval


Cross-modal Center Loss

no code implementations • 8 Aug 2020 • Longlong Jing, Elahe Vahdani, Jiaxing Tan, YingLi Tian

Cross-modal retrieval aims to learn discriminative and modal-invariant features for data from different modalities.

Cross-Modal Retrieval Retrieval


Self-supervised Modal and View Invariant Feature Learning

no code implementations • 28 May 2020 • Longlong Jing, Yu-cheng Chen, Ling Zhang, Mingyi He, YingLi Tian

By exploring the inherent multi-modality attributes of 3D objects, in this paper, we propose to jointly learn modal-invariant and view-invariant features from different modalities including image, point cloud, and mesh with heterogeneous networks for 3D data.

Cross-Modal Retrieval Retrieval


An Isolated-Signing RGBD Dataset of 100 American Sign Language Signs Produced by Fluent ASL Signers

no code implementations • LREC 2020 • Saad Hassan, Larwan Berke, Elahe Vahdani, Longlong Jing, YingLi Tian, Matt Huenerfauth

We have collected a new dataset consisting of color and depth videos, recorded with a Kinect v2 sensor, of fluent American Sign Language (ASL) signers performing sequences of 100 ASL signs.


Recognizing American Sign Language Nonmanual Signal Grammar Errors in Continuous Videos

no code implementations • 1 May 2020 • Elahe Vahdani, Longlong Jing, YingLi Tian, Matt Huenerfauth

Our system is able to recognize grammatical elements on ASL-HW-RGBD from manual gestures, facial expressions, and head movements and successfully detect 8 ASL grammatical mistakes.


Self-supervised Feature Learning by Cross-modality and Cross-view Correspondences

no code implementations • 13 Apr 2020 • Longlong Jing, Yu-cheng Chen, Ling Zhang, Mingyi He, YingLi Tian

Specifically, 2D image features of rendered images from different views are extracted by a 2D convolutional neural network, and 3D point cloud features are extracted by a graph convolution neural network.

3D Part Segmentation 3D Shape Classification +4


VideoSSL: Semi-Supervised Learning for Video Classification

no code implementations • 29 Feb 2020 • Longlong Jing, Toufiq Parag, Zhe Wu, YingLi Tian, Hongcheng Wang

To minimize the dependence on a large annotated dataset, our proposed semi-supervised method trains from a small number of labeled examples and exploits two regulatory signals from unlabeled data.

Classification General Classification +1


Recognizing American Sign Language Manual Signs from RGB-D Videos

no code implementations • 7 Jun 2019 • Longlong Jing, Elahe Vahdani, Matt Huenerfauth, YingLi Tian

In this paper, we propose a 3D Convolutional Neural Network (3DCNN) based multi-stream framework to recognize American Sign Language (ASL) manual signs (consisting of movements of the hands, as well as non-manual face movements in some cases) in real time from RGB-D videos, by fusing multimodal features including hand gestures, facial expressions, and body poses from multiple channels (RGB, depth, motion, and skeleton joints).


Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey

no code implementations • 16 Feb 2019 • Longlong Jing, YingLi Tian

This paper provides an extensive review of deep learning-based self-supervised general visual feature learning methods from images or videos.

Self-Supervised Image Classification Self-Supervised Learning


LGAN: Lung Segmentation in CT Scans Using Generative Adversarial Network

1 code implementation • 11 Jan 2019 • Jiaxing Tan, Longlong Jing, Yumei Huo, YingLi Tian, Oguz Akin

Lung segmentation in computerized tomography (CT) images is an important procedure in the diagnosis of various lung diseases.

Generative Adversarial Network Segmentation


Coarse-to-fine Semantic Segmentation from Image-level Labels

no code implementations • 28 Dec 2018 • Longlong Jing, Yu-cheng Chen, YingLi Tian

The enhanced coarse mask is fed to a fully convolutional neural network to be recursively refined.

Foreground Segmentation Object +2


Self-Supervised Spatiotemporal Feature Learning via Video Rotation Prediction

no code implementations • 28 Nov 2018 • Longlong Jing, Xiaodong Yang, Jingen Liu, YingLi Tian

The success of deep neural networks generally requires a vast amount of labeled training data, which is expensive and infeasible at scale, especially for video collections.

Ranked #42 on Self-Supervised Action Recognition on HMDB51

Self-Supervised Action Recognition Temporal Action Localization +1

