Search Results for author: Junsong Yuan

Found 111 papers, 34 papers with code

3D Hand Shape and Pose Estimation from a Single RGB Image

2 code implementations CVPR 2019 Liuhao Ge, Zhou Ren, Yuncheng Li, Zehao Xue, Yingying Wang, Jianfei Cai, Junsong Yuan

This work addresses a novel and challenging problem of estimating the full 3D hand shape and pose from a single RGB image.

3D Hand Pose Estimation

A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image

2 code implementations ICCV 2019 Fu Xiong, Boshen Zhang, Yang Xiao, Zhiguo Cao, Taidong Yu, Joey Tianyi Zhou, Junsong Yuan

For 3D hand and body pose estimation task in depth image, a novel anchor-based approach termed Anchor-to-Joint regression network (A2J) with the end-to-end learning ability is proposed.

3D Pose Estimation Depth Estimation +1

NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions

1 code implementation ICCV 2023 Zhang Chen, Zhong Li, Liangchen Song, Lele Chen, Jingyi Yu, Junsong Yuan, Yi Xu

The spatial positions of their neural features are fixed on grid nodes and cannot well adapt to target signals.

GRiT: A Generative Region-to-text Transformer for Object Understanding

1 code implementation 1 Dec 2022 Jialian Wu, JianFeng Wang, Zhengyuan Yang, Zhe Gan, Zicheng Liu, Junsong Yuan, Lijuan Wang

Specifically, GRiT consists of a visual encoder to extract image features, a foreground object extractor to localize objects, and a text decoder to generate open-set object descriptions.

Dense Captioning Descriptive +3

MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video

1 code implementation CVPR 2022 Jinlu Zhang, Zhigang Tu, Jianyu Yang, Yujin Chen, Junsong Yuan

Recent transformer-based solutions have been introduced to estimate 3D human pose from 2D keypoint sequence by considering body joints among all frames globally to learn spatio-temporal correlation.

Monocular 3D Human Pose Estimation

Model-based 3D Hand Reconstruction via Self-Supervised Learning

1 code implementation CVPR 2021 Yujin Chen, Zhigang Tu, Di Kang, Linchao Bao, Ying Zhang, Xuefei Zhe, Ruizhi Chen, Junsong Yuan

For the first time, we demonstrate the feasibility of training an accurate 3D hand reconstruction network without relying on manual annotations.

Self-Supervised Learning

AiATrack: Attention in Attention for Transformer Visual Tracking

1 code implementation 20 Jul 2022 Shenyuan Gao, Chunluan Zhou, Chao Ma, Xinggang Wang, Junsong Yuan

However, the independent correlation computation in the attention mechanism could result in noisy and ambiguous attention weights, which inhibits further performance improvement.

Visual Object Tracking Visual Tracking

Kernel Cross-Correlator

3 code implementations 12 Sep 2017 Chen Wang, Le Zhang, Lihua Xie, Junsong Yuan

Cross-correlator plays a significant role in many visual perception tasks, such as object detection and tracking.

Human Activity Recognition object-detection +2

Non-iterative SLAM for Warehouse Robots Using Ground Textures

2 code implementations 16 Oct 2017 Chen Wang, Minh-Chung Hoang, Lihua Xie, Junsong Yuan

We present a novel visual SLAM method for the warehouse robot with a single downward-facing camera using ground textures.

Robotics

PointCloud Saliency Maps

3 code implementations ICCV 2019 Tianhang Zheng, Changyou Chen, Junsong Yuan, Bo Li, Kui Ren

Our motivation for constructing a saliency map is by point dropping, which is a non-differentiable operator.

Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation

1 code implementation 13 Aug 2020 Jialian Wu, Liangchen Song, Tiancai Wang, Qian Zhang, Junsong Yuan

In the classification tree, as the number of parent class nodes is significantly smaller, their logits are less noisy and can be utilized to suppress the wrong/noisy logits existing in the fine-grained class nodes.

Classification Few-Shot Object Detection +7

SPAGAN: Shortest Path Graph Attention Network

1 code implementation 10 Jan 2021 Yiding Yang, Xinchao Wang, Mingli Song, Junsong Yuan, DaCheng Tao

SPAGAN therefore allows for a more informative and intact exploration of the graph structure and further a more effective aggregation of information from distant neighbors into the center node, as compared to node-based GCN methods.

Graph Attention

Kervolutional Neural Networks

6 code implementations CVPR 2019 Chen Wang, Jianfei Yang, Lihua Xie, Junsong Yuan

Convolutional neural networks (CNNs) have enabled the state-of-the-art performance in many computer vision tasks.

Structure-Aware Human-Action Generation

1 code implementation ECCV 2020 Ping Yu, Yang Zhao, Chunyuan Li, Junsong Yuan, Changyou Chen

Generating long-range skeleton-based human actions has been a challenging problem since small deviations of one frame can cause a malformed action sequence.

Action Generation graph construction +1

Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene

1 code implementation 11 Aug 2020 Xinke Li, Chongshou Li, Zekun Tong, Andrew Lim, Junsong Yuan, Yuwei Wu, Jing Tang, Raymond Huang

Based on it, we formulate a hierarchical learning problem for 3D point cloud segmentation and propose a measurement evaluating consistency across various hierarchies.

Instance Segmentation Point Cloud Segmentation +3

ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection

2 code implementations 14 Aug 2020 Ye Liu, Junsong Yuan, Chang Wen Chen

We consider the problem of Human-Object Interaction (HOI) Detection, which aims to locate and recognize HOI instances in the form of <human, action, object> in images.

Human-Object Interaction Detection Object +1

Language-guided Human Motion Synthesis with Atomic Actions

1 code implementation 18 Aug 2023 Yuanhao Zhai, Mingzhen Huang, Tianyu Luan, Lu Dong, Ifeoma Nwogu, Siwei Lyu, David Doermann, Junsong Yuan

In this paper, we propose ATOM (ATomic mOtion Modeling) to mitigate this problem, by decomposing actions into atomic actions, and employing a curriculum learning strategy to learn atomic action composition.

Motion Synthesis

Learning Diverse Stochastic Human-Action Generators by Learning Smooth Latent Transitions

1 code implementation AAAI 2019 Zhenyi Wang, Ping Yu, Yang Zhao, Ruiyi Zhang, Yufan Zhou, Junsong Yuan, Changyou Chen

In this paper, we focus on skeleton-based action generation and propose to model smooth and diverse transitions on a latent space of action sequences with much lower dimensionality.

Action Generation

Learning Transferable Human-Object Interaction Detector With Natural Language Supervision

1 code implementation CVPR 2022 Suchen Wang, Yueqi Duan, Henghui Ding, Yap-Peng Tan, Kim-Hui Yap, Junsong Yuan

More specifically, we propose a new HOI visual encoder to detect the interacting humans and objects, and map them to a joint feature space to perform interaction recognition.

Human-Object Interaction Detection

Source-Free Domain Adaptation for Medical Image Segmentation via Prototype-Anchored Feature Alignment and Contrastive Learning

1 code implementation 19 Jul 2023 Qinji Yu, Nan Xi, Junsong Yuan, Ziyu Zhou, Kang Dang, Xiaowei Ding

To tackle the source data-absent problem, we present a novel two-stage source-free domain adaptation (SFDA) framework for medical image segmentation, where only a well-trained source segmentation model and unlabeled target data are available during domain adaptation.

Contrastive Learning Image Segmentation +5

Motion-driven Visual Tempo Learning for Video-based Action Recognition

2 code implementations TIP 2022 Yuanzhong Liu, Junsong Yuan, Zhigang Tu

Action visual tempo characterizes the dynamics and the temporal scale of an action, which is helpful to distinguish human actions that share high similarities in visual dynamics and appearance.

Action Recognition

Harnessing Low-Frequency Neural Fields for Few-Shot View Synthesis

1 code implementation 15 Mar 2023 Liangchen Song, Zhong Li, Xuan Gong, Lele Chen, Zhang Chen, Yi Xu, Junsong Yuan

We further propose a simple-yet-effective strategy for tuning the frequency to avoid overfitting few-shot inputs: enforcing consistency among the frequency domain of rendered 2D images.

Novel View Synthesis

Uncertainty-aware State Space Transformer for Egocentric 3D Hand Trajectory Forecasting

1 code implementation ICCV 2023 Wentao Bao, Lele Chen, Libing Zeng, Zhong Li, Yi Xu, Junsong Yuan, Yu Kong

In this paper, we set up an egocentric 3D hand trajectory forecasting task that aims to predict hand trajectories in a 3D space from early observed RGB videos in a first-person view.

3D Human Pose Tracking Trajectory Forecasting +1

Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation

1 code implementation 18 Mar 2024 Zixin Zhu, Xuelu Feng, Dongdong Chen, Junsong Yuan, Chunming Qiao, Gang Hua

We hypothesize that the latent representation learned from a pretrained generative T2V model encapsulates rich semantics and coherent temporal correspondences, thereby naturally facilitating video understanding.

Referring Video Object Segmentation Semantic Segmentation +2

Spatio-Temporal Naive-Bayes Nearest-Neighbor (ST-NBNN) for Skeleton-Based Action Recognition

1 code implementation CVPR 2017 Junwu Weng, Chaoqun Weng, Junsong Yuan

Moreover, by identifying key skeleton joints and temporal stages for each action class, our ST-NBNN can capture the essential spatio-temporal patterns that play key roles in recognizing actions, which is not always achievable by using end-to-end models.

Action Classification Action Recognition +2

High Fidelity 3D Hand Shape Reconstruction via Scalable Graph Frequency Decomposition

1 code implementation CVPR 2023 Tianyu Luan, Yuanhao Zhai, Jingjing Meng, Zhong Li, Zhang Chen, Yi Xu, Junsong Yuan

To capture high-frequency personalized details, we transform the 3D mesh into the frequency domain, and propose a novel frequency decomposition loss to supervise each frequency component.

Deformable VisTR: Spatio temporal deformable attention for video instance segmentation

1 code implementation 12 Mar 2022 Sudhir Yarram, Jialian Wu, Pan Ji, Yi Xu, Junsong Yuan

To improve the training efficiency, we propose Deformable VisTR, leveraging spatio-temporal deformable attention module that only attends to a small fixed set of key spatio-temporal sampling points around a reference point.

Instance Segmentation Semantic Segmentation +1

Actor-Action Semantic Segmentation with Region Masks

no code implementations 23 Jul 2018 Kang Dang, Chunluan Zhou, Zhigang Tu, Michael Hoy, Justin Dauwels, Junsong Yuan

One major challenge for this task is that when an actor performs an action, different body parts of the actor provide different types of cues for the action category and may receive inconsistent action labeling when they are labeled independently.

Action Segmentation Instance Segmentation +2

Exploiting Local Feature Patterns for Unsupervised Domain Adaptation

no code implementations 12 Nov 2018 Jun Wen, Risheng Liu, Nenggan Zheng, Qian Zheng, Zhefeng Gong, Junsong Yuan

In this paper, we present a method for learning domain-invariant local feature patterns and jointly aligning holistic and local feature statistics.

Unsupervised Domain Adaptation

Recognizing Human Actions as the Evolution of Pose Estimation Maps

no code implementations CVPR 2018 Mengyuan Liu, Junsong Yuan

Specifically, the evolution of pose estimation maps can be decomposed as an evolution of heatmaps, e.g., probabilistic maps, and an evolution of estimated 2D human poses, which denote the changes of body shape and body pose, respectively.

Action Recognition Multimodal Activity Recognition +3

Conditional Generative Adversarial Network for Structured Domain Adaptation

no code implementations CVPR 2018 Weixiang Hong, Zhenzhen Wang, Ming Yang, Junsong Yuan

In recent years, deep neural nets have triumphed over many computer vision problems, including semantic segmentation, which is a critical task in emerging autonomous driving and medical image diagnostics applications.

Autonomous Driving Domain Adaptation +2

Salience Guided Depth Calibration for Perceptually Optimized Compressive Light Field 3D Display

no code implementations CVPR 2018 Shizheng Wang, Wenjuan Liao, Phil Surman, Zhigang Tu, Yuanjin Zheng, Junsong Yuan

Multi-layer light field displays are a type of computational three-dimensional (3D) display which has recently gained increasing interest for its holographic-like effect and natural compatibility with 2D displays.

Bi-box Regression for Pedestrian Detection and Occlusion Estimation

no code implementations ECCV 2018 Chunluan Zhou, Junsong Yuan

The full body estimation branch is trained to regress full body regions for positive pedestrian proposals, while the visible part estimation branch is trained to regress visible part regions for both positive and negative pedestrian proposals.

Occlusion Estimation Pedestrian Detection +1

Product Quantization Network for Fast Image Retrieval

no code implementations ECCV 2018 Tan Yu, Junsong Yuan, Chen Fang, Hailin Jin

Product quantization has been widely used in fast image retrieval due to its effectiveness of coding high-dimensional visual features.

Image Retrieval Quantization +1

Deformable Pose Traversal Convolution for 3D Action and Gesture Recognition

no code implementations ECCV 2018 Junwu Weng, Mengyuan Liu, Xudong Jiang, Junsong Yuan

This deformable convolution can better utilize contextual joints for action and gesture recognition and is more robust to noisy joints.

Hand Gesture Recognition Hand-Gesture Recognition

Point-to-Point Regression PointNet for 3D Hand Pose Estimation

no code implementations ECCV 2018 Liuhao Ge, Zhou Ren, Junsong Yuan

Convolutional Neural Networks (CNNs)-based methods for 3D hand pose estimation with depth cameras usually take 2D depth images as input and directly regress holistic 3D hand pose.

3D Hand Pose Estimation regression

Weakly-supervised 3D Hand Pose Estimation from Monocular RGB Images

no code implementations ECCV 2018 Yujun Cai, Liuhao Ge, Jianfei Cai, Junsong Yuan

Compared with depth-based 3D hand pose estimation, it is more challenging to infer 3D hand pose from monocular RGB images, due to substantial depth ambiguity and the difficulty of obtaining fully-annotated training data.

3D Hand Pose Estimation

Topical Video Object Discovery from Key Frames by Modeling Word Co-occurrence Prior

no code implementations CVPR 2013 Gangqiang Zhao, Junsong Yuan, Gang Hua

We show that such data driven co-occurrence information from bottom-up can conveniently be incorporated in LDA with a Gaussian Markov prior, which combines top down probabilistic topic modeling with bottom up priors in a unified model.

Object Object Discovery +1

Multi-feature Spectral Clustering with Minimax Optimization

no code implementations CVPR 2014 Hongxing Wang, Chaoqun Weng, Junsong Yuan

To find a consensus clustering result that is agreeable to all feature modalities, our objective is to find a universal feature embedding, which not only fits each individual feature modality well, but also unifies different feature modalities by minimizing their pairwise disagreements.

Clustering

Fast Action Proposals for Human Action Detection and Search

no code implementations CVPR 2015 Gang Yu, Junsong Yuan

Assuming each action is performed by a human with meaningful motion, both appearance and motion cues are utilized to measure the actionness of the video tubes.

Action Detection Video Segmentation +1

From Keyframes to Key Objects: Video Summarization by Representative Object Proposal Selection

no code implementations CVPR 2016 Jingjing Meng, Hongxing Wang, Junsong Yuan, Yap-Peng Tan

This representative selection problem is formulated as a sparse dictionary selection problem, i.e., choosing a few representative object proposals to reconstruct the whole proposal pool.

Object Video Summarization

3D Convolutional Neural Networks for Efficient and Robust Hand Pose Estimation From Single Depth Images

no code implementations CVPR 2017 Liuhao Ge, Hui Liang, Junsong Yuan, Daniel Thalmann

We propose a simple, yet effective approach for real-time hand pose estimation from single depth images using three-dimensional Convolutional Neural Networks (3D CNNs).

3D Hand Pose Estimation Data Augmentation

Fried Binary Embedding for High-Dimensional Visual Features

no code implementations CVPR 2017 Weixiang Hong, Junsong Yuan, Sreyasee Das Bhattacharjee

We argue that long binary codes ($b \sim O(d)$) are critical to fully utilize the discriminative power of high-dimensional visual features, and can achieve better results in various tasks such as approximate nearest neighbour search.

Vocal Bursts Intensity Prediction

Object Co-Skeletonization With Co-Segmentation

no code implementations CVPR 2017 Koteswar Rao Jerripothula, Jianfei Cai, Jiangbo Lu, Junsong Yuan

Recent advances in the joint processing of images have certainly shown its advantages over individual processing.

Object Segmentation

Adaptive Exponential Smoothing for Online Filtering of Pixel Prediction Maps

no code implementations ICCV 2015 Kang Dang, Jiong Yang, Junsong Yuan

We propose an efficient online video filtering method, called adaptive exponential smoothing (AES), to refine pixel prediction maps.

Saliency Detection Scene Parsing

Compressive Quantization for Fast Object Instance Search in Videos

no code implementations ICCV 2017 Tan Yu, Zhenzhen Wang, Junsong Yuan

Most current visual search systems focus on image-to-image (point-to-point) search such as image and object retrieval.

Instance Search Object +3

Common Action Discovery and Localization in Unconstrained Videos

no code implementations ICCV 2017 Jiong Yang, Junsong Yuan

Similar to common object discovery in images or videos, it is of great interest to discover and locate common actions in videos, which can benefit many video analytics applications such as video summarization, search, and understanding.

Object Discovery Video Summarization

Multi-Label Learning of Part Detectors for Heavily Occluded Pedestrian Detection

no code implementations ICCV 2017 Chunluan Zhou, Junsong Yuan

Detecting pedestrians that are partially occluded remains a challenging problem due to variations and uncertainties of partial occlusion patterns.

Multi-Label Learning Pedestrian Detection

Towards Real-time Eyeblink Detection in The Wild: Dataset, Theory and Practices

no code implementations 21 Feb 2019 Guilei Hu, Yang Xiao, Zhiguo Cao, Lubin Meng, Zhiwen Fang, Joey Tianyi Zhou, Junsong Yuan

Effective and real-time eyeblink detection has a wide range of applications, such as deception detection, driver fatigue detection, and face anti-spoofing.

Attribute Deception Detection +1

Progress Regression RNN for Online Spatial-Temporal Action Localization in Unconstrained Videos

no code implementations 1 Mar 2019 Bo Hu, Jianfei Cai, Tat-Jen Cham, Junsong Yuan

Previous spatial-temporal action localization methods commonly follow the pipeline of object detection to estimate bounding boxes and labels of actions.

object-detection Object Detection +3

Bayesian Uncertainty Matching for Unsupervised Domain Adaptation

no code implementations 24 Jun 2019 Jun Wen, Nenggan Zheng, Junsong Yuan, Zhefeng Gong, Changyou Chen

By imposing distribution matching on both features and labels (via uncertainty), label distribution mismatching in source and target data is effectively alleviated, encouraging the classifier to produce consistent predictions across domains.

Unsupervised Domain Adaptation

Context-Integrated and Feature-Refined Network for Lightweight Object Parsing

no code implementations 26 Jul 2019 Bin Jiang, Wenxuan Tu, Chao Yang, Junsong Yuan

The core components of CIFReNet are the Long-skip Refinement Module (LRM) and the Multi-scale Context Integration Module (MCIM).

Scene Parsing Semantic Segmentation

Temporal Pulses Driven Spiking Neural Network for Fast Object Recognition in Autonomous Driving

no code implementations 24 Jan 2020 Wei Wang, Shibo Zhou, Jingxi Li, Xiaohua LI, Junsong Yuan, Zhanpeng Jin

Accurate real-time object recognition from sensory data has long been a crucial and challenging task for autonomous driving.

Autonomous Driving Object +1

Image Co-skeletonization via Co-segmentation

no code implementations 12 Apr 2020 Koteswar Rao Jerripothula, Jianfei Cai, Jiangbo Lu, Junsong Yuan

Object skeletonization in a single natural image is a challenging problem because there is hardly any prior knowledge about the object.

Object Segmentation

Towards Understanding the Adversarial Vulnerability of Skeleton-based Action Recognition

no code implementations 14 May 2020 Tianhang Zheng, Sheng Liu, Changyou Chen, Junsong Yuan, Baochun Li, Kui Ren

We first formulate generation of adversarial skeleton actions as a constrained optimization problem by representing or approximating the physiological and physical constraints with mathematical formulations.

Action Recognition Skeleton Based Action Recognition

Temporal Distinct Representation Learning for Action Recognition

no code implementations ECCV 2020 Junwu Weng, Donghao Luo, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Xudong Jiang, Junsong Yuan

Motivated by the previous success of Two-Dimensional Convolutional Neural Network (2D CNN) on image recognition, researchers endeavor to leverage it to characterize videos.

Action Recognition Representation Learning

Revisiting Modified Greedy Algorithm for Monotone Submodular Maximization with a Knapsack Constraint

no code implementations 12 Aug 2020 Jing Tang, Xueyan Tang, Andrew Lim, Kai Han, Chongshou Li, Junsong Yuan

Second, we enhance the modified greedy algorithm to derive a data-dependent upper bound on the optimum.

Learning Progressive Joint Propagation for Human Motion Prediction

no code implementations ECCV 2020 Yujun Cai, Lin Huang, Yiwei Wang, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Xu Yang, Yiheng Zhu, Xiaohui Shen, Ding Liu, Jing Liu, Nadia Magnenat Thalmann

Last, in order to incorporate a general motion space for high-quality prediction, we build a memory-based dictionary, which aims to preserve the global motion patterns in training data to guide the predictions.

Human motion prediction motion prediction

Clustering Driven Deep Autoencoder for Video Anomaly Detection

no code implementations ECCV 2020 Yunpeng Chang, Zhigang Tu, Wei Xie, Junsong Yuan

Because of the ambiguous definition of anomaly and the complexity of real data, anomaly detection in videos is one of the most challenging problems in intelligent video surveillance.

Anomaly Detection Clustering +1

Hand-Transformer: Non-Autoregressive Structured Modeling for 3D Hand Pose Estimation

no code implementations ECCV 2020 Lin Huang, Jianchao Tan, Ji Liu, Junsong Yuan

To address this issue, we connect this structured output learning problem with the structured modeling framework in the sequence transduction field.

3D Hand Pose Estimation

Attention-Aware Noisy Label Learning for Image Classification

no code implementations 30 Sep 2020 Zhenzhen Wang, Chunyan Xu, Yap-Peng Tan, Junsong Yuan

In this paper, the attention-aware noisy label learning approach ($A^2NL$) is proposed to improve the discriminative capability of the network trained on datasets with potential label noise.

Classification General Classification +2

Interventional Domain Adaptation

no code implementations 7 Nov 2020 Jun Wen, Changjian Shui, Kun Kuang, Junsong Yuan, Zenan Huang, Zhefeng Gong, Nenggan Zheng

To address this issue, we intervene in the learning of feature discriminability using unlabeled target data to guide it to get rid of the domain-specific part and be safely transferable.

counterfactual Unsupervised Domain Adaptation

NeuLF: Efficient Novel View Synthesis with Neural 4D Light Field

no code implementations 15 May 2021 Zhong Li, Liangchen Song, Celong Liu, Junsong Yuan, Yi Xu

In this paper, we present an efficient and robust deep learning solution for novel view synthesis of complex scenes.

Novel View Synthesis

Two-Stream Consensus Network: Submission to HACS Challenge 2021 Weakly-Supervised Learning Track

no code implementations 21 Jun 2021 Yuanhao Zhai, Le Wang, David Doermann, Junsong Yuan

The base model training encourages the model to produce reliable predictions based on a single modality (i.e., RGB or optical flow), based on the fusion of which a pseudo ground truth is generated and in turn used as supervision to train the base models.

Optical Flow Estimation Weakly-supervised Learning +2

High Quality Disparity Remapping With Two-Stage Warping

no code implementations ICCV 2021 Bing Li, Chia-Wen Lin, Cheng Zheng, Shan Liu, Junsong Yuan, Bernard Ghanem, C.-C. Jay Kuo

In the second stage, we derive another warping model to refine warping results in less important regions by eliminating serious distortions in shape, disparity and 3D structure.

Vocal Bursts Intensity Prediction Vocal Bursts Valence Prediction

Stacked Homography Transformations for Multi-View Pedestrian Detection

no code implementations ICCV 2021 Liangchen Song, Jialian Wu, Ming Yang, Qian Zhang, Yuan Li, Junsong Yuan

This task is confronted with two challenges: how to establish the 3D correspondences from views to the BEV map and how to assemble occupancy information across views.

Multiview Detection Pedestrian Detection

A Unified 3D Human Motion Synthesis Model via Conditional Variational Auto-Encoder

no code implementations ICCV 2021 Yujun Cai, Yiwei Wang, Yiheng Zhu, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Chuanxia Zheng, Sijie Yan, Henghui Ding, Xiaohui Shen, Ding Liu, Nadia Magnenat Thalmann

Notably, by considering this problem as a conditional generation process, we estimate a parametric distribution of the missing regions based on the input conditions, from which to sample and synthesize the full motion series.

motion prediction Motion Synthesis

Pseudo Supervised Monocular Depth Estimation with Teacher-Student Network

no code implementations 22 Oct 2021 Huan Liu, Junsong Yuan, Chen Wang, Jun Chen

Despite recent improvement of supervised monocular depth estimation, the lack of high quality pixel-wise ground truth annotations has become a major hurdle for further progress.

Knowledge Distillation Monocular Depth Estimation +1

Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition

no code implementations 8 Feb 2022 Zhigang Tu, Jiaxu Zhang, Hongyan Li, Yujin Chen, Junsong Yuan

In recent years, graph convolutional networks (GCNs) have played an increasingly critical role in skeleton-based human action recognition.

Action Recognition Pose Prediction +2

Efficient Video Instance Segmentation via Tracklet Query and Proposal

no code implementations CVPR 2022 Jialian Wu, Sudhir Yarram, Hui Liang, Tian Lan, Junsong Yuan, Jayan Eledath, Gerard Medioni

In addition, VisTR is not fully end-to-end learnable in multiple video clips as it requires a hand-crafted data association to link instance tracklets between successive clips.

Instance Segmentation Segmentation +2

Optical Flow for Video Super-Resolution: A Survey

no code implementations 20 Mar 2022 Zhigang Tu, Hongyan Li, Wei Xie, Yuanzhong Liu, Shifu Zhang, Baoxin Li, Junsong Yuan

Video super-resolution is currently one of the most active research topics in computer vision as it plays an important role in many visual applications.

Motion Compensation Optical Flow Estimation +1

Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth

no code implementations 21 Jun 2022 Nitin Bansal, Pan Ji, Junsong Yuan, Yi Xu

The multi-task learning (MTL) paradigm focuses on jointly learning two or more tasks, aiming for significant improvement w.r.t. the model's generalizability, performance, and training/inference memory footprint.

Data Augmentation Depth Estimation +3

Neural Correspondence Field for Object Pose Estimation

no code implementations 30 Jul 2022 Lin Huang, Tomas Hodan, Lingni Ma, Linguang Zhang, Luan Tran, Christopher Twigg, Po-Chen Wu, Junsong Yuan, Cem Keskin, Robert Wang

Unlike classical correspondence-based methods which predict 3D object coordinates at pixels of the input image, the proposed method predicts 3D object coordinates at 3D query points sampled in the camera frustum.

3D Reconstruction Object +1

Progressive Multi-view Human Mesh Recovery with Self-Supervision

no code implementations 10 Dec 2022 Xuan Gong, Liangchen Song, Meng Zheng, Benjamin Planche, Terrence Chen, Junsong Yuan, David Doermann, Ziyan Wu

To date, little attention has been given to multi-view 3D human mesh estimation, despite real-life applicability (e.g., motion capture, sport analysis) and robustness to single-view ambiguities.

Benchmarking Human Mesh Recovery

SOAR: Scene-debiasing Open-set Action Recognition

no code implementations ICCV 2023 Yuanhao Zhai, Ziyi Liu, Zhenyu Wu, Yi Wu, Chunluan Zhou, David Doermann, Junsong Yuan, Gang Hua

Deep models have the risk of utilizing spurious clues to make predictions, e.g., recognizing actions via classifying the background scene.

Open Set Action Recognition Scene Classification

Efficient-NeRF2NeRF: Streamlining Text-Driven 3D Editing with Multiview Correspondence-Enhanced Diffusion Models

no code implementations 13 Dec 2023 Liangchen Song, Liangliang Cao, Jiatao Gu, Yifan Jiang, Junsong Yuan, Hao Tang

In this work, we propose that by incorporating correspondence regularization into diffusion models, the process of 3D editing can be significantly accelerated.

AMuSE: Adaptive Multimodal Analysis for Speaker Emotion Recognition in Group Conversations

no code implementations 26 Jan 2024 Naresh Kumar Devulapally, Sidharth Anand, Sreyasee Das Bhattacharjee, Junsong Yuan, Yu-Ping Chang

This difficulty is compounded in group settings, where the emotion and its temporal evolution are not only influenced by the individual but also by external contexts like audience reaction and context of the ongoing conversation.

Emotion Recognition

Spectrum AUC Difference (SAUCD): Human-aligned 3D Shape Evaluation

no code implementations 3 Mar 2024 Tianyu Luan, Zhong Li, Lele Chen, Xuan Gong, Lichang Chen, Yi Xu, Junsong Yuan

Then, we calculate the Area Under the Curve (AUC) difference between the two spectrums, so that each frequency band that captures either the overall or detailed shape is equitably considered.

FSC: Few-point Shape Completion

no code implementations 12 Mar 2024 Xianzu Wu, Xianfeng Wu, Tianyu Luan, Yajing Bai, Zhongyuan Lai, Junsong Yuan

While previous studies have demonstrated successful 3D object shape completion with a sufficient number of points, they often fail in scenarios where only a few points, e.g., tens of points, are observed.

Object
