Search Results for author: Junsong Yuan

Found 111 papers, 34 papers with code

Rethinking Soft Labels for Knowledge Distillation: A Bias-Variance Tradeoff Perspective

4 code implementations • 1 Feb 2021 • Helong Zhou, Liangchen Song, Jiajie Chen, Ye Zhou, Guoli Wang, Junsong Yuan, Qian Zhang

The outputs from the teacher network are used as soft labels for supervising the training of a new network.

Ranked #25 on Knowledge Distillation on ImageNet

Knowledge Distillation

1,357

3D Hand Shape and Pose Estimation from a Single RGB Image

2 code implementations • CVPR 2019 • Liuhao Ge, Zhou Ren, Yuncheng Li, Zehao Xue, Yingying Wang, Jianfei Cai, Junsong Yuan

This work addresses a novel and challenging problem of estimating the full 3D hand shape and pose from a single RGB image.

3D Hand Pose Estimation

585

Track to Detect and Segment: An Online Multi-Object Tracker

1 code implementation • CVPR 2021 • Jialian Wu, Jiale Cao, Liangchen Song, Yu Wang, Ming Yang, Junsong Yuan

Most online multi-object trackers perform object detection stand-alone in a neural net without any input from tracking.

Ranked #1 on Instance Segmentation on nuScenes

3D Multi-Object Tracking Instance Segmentation +7

545

Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals

1 code implementation • CVPR 2018 • Shanxin Yuan, Guillermo Garcia-Hernando, Bjorn Stenger, Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee, Pavlo Molchanov, Jan Kautz, Sina Honari, Liuhao Ge, Junsong Yuan, Xinghao Chen, Guijin Wang, Fan Yang, Kai Akiyama, Yang Wu, Qingfu Wan, Meysam Madadi, Sergio Escalera, Shile Li, Dongheui Lee, Iason Oikonomidis, Antonis Argyros, Tae-Kyun Kim

Official Torch7 implementation of "V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map", CVPR 2018

Ranked #5 on Hand Pose Estimation on HANDS 2017

3D Hand Pose Estimation 3D Pose Estimation

373

A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image

2 code implementations • ICCV 2019 • Fu Xiong, Boshen Zhang, Yang Xiao, Zhiguo Cao, Taidong Yu, Joey Tianyi Zhou, Junsong Yuan

For the task of 3D hand and body pose estimation in depth images, a novel anchor-based approach termed Anchor-to-Joint regression network (A2J), with end-to-end learning ability, is proposed.

Ranked #1 on Hand Pose Estimation on K2HPD

3D Pose Estimation Depth Estimation +1

285

NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions

1 code implementation • ICCV 2023 • Zhang Chen, Zhong Li, Liangchen Song, Lele Chen, Jingyi Yu, Junsong Yuan, Yi Xu

The spatial positions of their neural features are fixed on grid nodes and cannot adapt well to target signals.

279

GRiT: A Generative Region-to-text Transformer for Object Understanding

1 code implementation • 1 Dec 2022 • Jialian Wu, JianFeng Wang, Zhengyuan Yang, Zhe Gan, Zicheng Liu, Junsong Yuan, Lijuan Wang

Specifically, GRiT consists of a visual encoder to extract image features, a foreground object extractor to localize objects, and a text decoder to generate open-set object descriptions.

Ranked #2 on Dense Captioning on Visual Genome

Dense Captioning Descriptive +3

271

MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video

1 code implementation • CVPR 2022 • Jinlu Zhang, Zhigang Tu, Jianyu Yang, Yujin Chen, Junsong Yuan

Recent transformer-based solutions have been introduced to estimate 3D human pose from a 2D keypoint sequence by considering body joints among all frames globally to learn spatio-temporal correlation.

Ranked #6 on Monocular 3D Human Pose Estimation on Human3.6M

Monocular 3D Human Pose Estimation

179

Model-based 3D Hand Reconstruction via Self-Supervised Learning

1 code implementation • CVPR 2021 • Yujin Chen, Zhigang Tu, Di Kang, Linchao Bao, Ying Zhang, Xuefei Zhe, Ruizhi Chen, Junsong Yuan

For the first time, we demonstrate the feasibility of training an accurate 3D hand reconstruction network without relying on manual annotations.

Self-Supervised Learning

103

AiATrack: Attention in Attention for Transformer Visual Tracking

1 code implementation • 20 Jul 2022 • Shenyuan Gao, Chunluan Zhou, Chao Ma, Xinggang Wang, Junsong Yuan

However, the independent correlation computation in the attention mechanism could result in noisy and ambiguous attention weights, which inhibits further performance improvement.

Ranked #2 on Visual Object Tracking on NeedForSpeed

Visual Object Tracking Visual Tracking

103

Kernel Cross-Correlator

3 code implementations • 12 Sep 2017 • Chen Wang, Le Zhang, Lihua Xie, Junsong Yuan

The cross-correlator plays a significant role in many visual perception tasks, such as object detection and tracking.

Human Activity Recognition object-detection +2

Non-iterative SLAM for Warehouse Robots Using Ground Textures

2 code implementations • 16 Oct 2017 • Chen Wang, Minh-Chung Hoang, Lihua Xie, Junsong Yuan

We present a novel visual SLAM method for the warehouse robot with a single downward-facing camera using ground textures.

Robotics

PointCloud Saliency Maps

3 code implementations • ICCV 2019 • Tianhang Zheng, Changyou Chen, Junsong Yuan, Bo Li, Kui Ren

Our approach constructs a saliency map by point dropping, which is a non-differentiable operator.

3DV: 3D Dynamic Voxel for Action Recognition in Depth Video

1 code implementation • CVPR 2020 • Yancheng Wang, Yang Xiao, Fu Xiong, Wenxiang Jiang, Zhiguo Cao, Joey Tianyi Zhou, Junsong Yuan

Each available 3DV voxel intrinsically involves 3D spatial and motion features jointly.

3D Action Recognition

Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation

1 code implementation • 13 Aug 2020 • Jialian Wu, Liangchen Song, Tiancai Wang, Qian Zhang, Junsong Yuan

In the classification tree, because there are significantly fewer parent class nodes, their logits are less noisy and can be used to suppress the wrong or noisy logits in the fine-grained class nodes.

Ranked #5 on Few-Shot Object Detection on LVIS v1.0 val

Classification Few-Shot Object Detection +7

Hand PointNet: 3D Hand Pose Estimation Using Point Sets

1 code implementation • CVPR 2018 • Liuhao Ge, Yujun Cai, Junwu Weng, Junsong Yuan

Convolutional Neural Networks (CNNs) have shown promising results for 3D hand pose estimation in depth images.

Ranked #7 on Hand Pose Estimation on HANDS 2017

3D Hand Pose Estimation regression

SPAGAN: Shortest Path Graph Attention Network

1 code implementation • 10 Jan 2021 • Yiding Yang, Xinchao Wang, Mingli Song, Junsong Yuan, DaCheng Tao

SPAGAN therefore allows for a more informative and intact exploration of the graph structure and further a more effective aggregation of information from distant neighbors into the center node, as compared to node-based GCN methods.

Graph Attention

Kervolutional Neural Networks

6 code implementations • CVPR 2019 • Chen Wang, Jianfei Yang, Lihua Xie, Junsong Yuan

Convolutional neural networks (CNNs) have enabled state-of-the-art performance in many computer vision tasks.

Structure-Aware Human-Action Generation

1 code implementation • ECCV 2020 • Ping Yu, Yang Zhao, Chunyuan Li, Junsong Yuan, Changyou Chen

Generating long-range skeleton-based human actions has been a challenging problem since small deviations of one frame can cause a malformed action sequence.

Ranked #2 on Human action generation on NTU RGB+D 2D

Action Generation graph construction +1

Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene

1 code implementation • 11 Aug 2020 • Xinke Li, Chongshou Li, Zekun Tong, Andrew Lim, Junsong Yuan, Yuwei Wu, Jing Tang, Raymond Huang

Based on it, we formulate a hierarchical learning problem for 3D point cloud segmentation and propose a measurement evaluating consistency across various hierarchies.

Instance Segmentation Point Cloud Segmentation +3

ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection

2 code implementations • 14 Aug 2020 • Ye Liu, Junsong Yuan, Chang Wen Chen

We consider the problem of Human-Object Interaction (HOI) Detection, which aims to locate and recognize HOI instances in the form of <human, action, object> in images.

Ranked #3 on Zero-Shot Human-Object Interaction Detection on HICO-DET

Human-Object Interaction Detection Object +1

Language-guided Human Motion Synthesis with Atomic Actions

1 code implementation • 18 Aug 2023 • Yuanhao Zhai, Mingzhen Huang, Tianyu Luan, Lu Dong, Ifeoma Nwogu, Siwei Lyu, David Doermann, Junsong Yuan

In this paper, we propose ATOM (ATomic mOtion Modeling) to mitigate this problem, by decomposing actions into atomic actions, and employing a curriculum learning strategy to learn atomic action composition.

Motion Synthesis

Learning Diverse Stochastic Human-Action Generators by Learning Smooth Latent Transitions

1 code implementation • AAAI 2019 • Zhenyi Wang, Ping Yu, Yang Zhao, Ruiyi Zhang, Yufan Zhou, Junsong Yuan, Changyou Chen

In this paper, we focus on skeleton-based action generation and propose to model smooth and diverse transitions on a latent space of action sequences with much lower dimensionality.

Ranked #4 on Human action generation on NTU RGB+D 2D

Action Generation

Learning Transferable Human-Object Interaction Detector With Natural Language Supervision

1 code implementation • CVPR 2022 • Suchen Wang, Yueqi Duan, Henghui Ding, Yap-Peng Tan, Kim-Hui Yap, Junsong Yuan

More specifically, we propose a new HOI visual encoder to detect the interacting humans and objects, and map them to a joint feature space to perform interaction recognition.

Human-Object Interaction Detection

Source-Free Domain Adaptation for Medical Image Segmentation via Prototype-Anchored Feature Alignment and Contrastive Learning

1 code implementation • 19 Jul 2023 • Qinji Yu, Nan Xi, Junsong Yuan, Ziyu Zhou, Kang Dang, Xiaowei Ding

To tackle the source data-absent problem, we present a novel two-stage source-free domain adaptation (SFDA) framework for medical image segmentation, where only a well-trained source segmentation model and unlabeled target data are available during domain adaptation.

Contrastive Learning Image Segmentation +5

Motion-driven Visual Tempo Learning for Video-based Action Recognition

2 code implementations • TIP 2022 • Yuanzhong Liu, Junsong Yuan, Zhigang Tu

Action visual tempo characterizes the dynamics and the temporal scale of an action, which is helpful to distinguish human actions that share high similarities in visual dynamics and appearance.

Ranked #14 on Action Recognition on Something-Something V1

Action Recognition

Harnessing Low-Frequency Neural Fields for Few-Shot View Synthesis

1 code implementation • 15 Mar 2023 • Liangchen Song, Zhong Li, Xuan Gong, Lele Chen, Zhang Chen, Yi Xu, Junsong Yuan

We further propose a simple-yet-effective strategy for tuning the frequency to avoid overfitting few-shot inputs: enforcing consistency among the frequency domain of rendered 2D images.

Novel View Synthesis

Uncertainty-aware State Space Transformer for Egocentric 3D Hand Trajectory Forecasting

1 code implementation • ICCV 2023 • Wentao Bao, Lele Chen, Libing Zeng, Zhong Li, Yi Xu, Junsong Yuan, Yu Kong

In this paper, we set up an egocentric 3D hand trajectory forecasting task that aims to predict hand trajectories in a 3D space from early observed RGB videos in a first-person view.

3D Human Pose Tracking Trajectory Forecasting +1

Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation

1 code implementation • 18 Mar 2024 • Zixin Zhu, Xuelu Feng, Dongdong Chen, Junsong Yuan, Chunming Qiao, Gang Hua

We hypothesize that the latent representation learned from a pretrained generative T2V model encapsulates rich semantics and coherent temporal correspondences, thereby naturally facilitating video understanding.

Referring Video Object Segmentation Semantic Segmentation +2

Spatio-Temporal Naive-Bayes Nearest-Neighbor (ST-NBNN) for Skeleton-Based Action Recognition

1 code implementation • CVPR 2017 • Junwu Weng, Chaoqun Weng, Junsong Yuan

Moreover, by identifying key skeleton joints and temporal stages for each action class, our ST-NBNN can capture the essential spatio-temporal patterns that play key roles in recognizing actions, which is not always achievable with end-to-end models.

Action Classification Action Recognition +2

High Fidelity 3D Hand Shape Reconstruction via Scalable Graph Frequency Decomposition

1 code implementation • CVPR 2023 • Tianyu Luan, Yuanhao Zhai, Jingjing Meng, Zhong Li, Zhang Chen, Yi Xu, Junsong Yuan

To capture high-frequency personalized details, we transform the 3D mesh into the frequency domain, and propose a novel frequency decomposition loss to supervise each frequency component.

Relit-NeuLF: Efficient Relighting and Novel View Synthesis via Neural 4D Light Field

1 code implementation • 23 Oct 2023 • Zhong Li, Liangchen Song, Zhang Chen, Xiangyu Du, Lele Chen, Junsong Yuan, Yi Xu

A DecomposeNet learns to map each ray to its SVBRDF components: albedo, normal, and roughness.

Novel View Synthesis

Deformable VisTR: Spatio temporal deformable attention for video instance segmentation

1 code implementation • 12 Mar 2022 • Sudhir Yarram, Jialian Wu, Pan Ji, Yi Xu, Junsong Yuan

To improve training efficiency, we propose Deformable VisTR, which leverages a spatio-temporal deformable attention module that attends only to a small fixed set of key spatio-temporal sampling points around a reference point.

Instance Segmentation Semantic Segmentation +1

PointACL: Adversarial Contrastive Learning for Robust Point Clouds Representation under Adversarial Attack

1 code implementation • 14 Sep 2022 • Junxuan Huang, Yatong An, Lu Cheng, Bai Chen, Junsong Yuan, Chunming Qiao

Adversarial contrastive learning (ACL) is considered an effective way to improve the robustness of pre-trained models.

3D Classification Adversarial Attack +2

Robust 3D Hand Pose Estimation in Single Depth Images: from Single-View CNN to Multi-View CNNs

no code implementations • CVPR 2016 • Liuhao Ge, Hui Liang, Junsong Yuan, Daniel Thalmann

Articulated hand pose estimation plays an important role in human-computer interaction.

3D Hand Pose Estimation

Actor-Action Semantic Segmentation with Region Masks

no code implementations • 23 Jul 2018 • Kang Dang, Chunluan Zhou, Zhigang Tu, Michael Hoy, Justin Dauwels, Junsong Yuan

One major challenge for this task is that when an actor performs an action, different body parts of the actor provide different types of cues for the action category and may receive inconsistent action labeling when they are labeled independently.

Action Segmentation Instance Segmentation +2

Exploiting Local Feature Patterns for Unsupervised Domain Adaptation

no code implementations • 12 Nov 2018 • Jun Wen, Risheng Liu, Nenggan Zheng, Qian Zheng, Zhefeng Gong, Junsong Yuan

In this paper, we present a method for learning domain-invariant local feature patterns and jointly aligning holistic and local feature statistics.

Unsupervised Domain Adaptation

Max-Margin Structured Output Regression for Spatio-Temporal Action Localization

no code implementations • NeurIPS 2012 • Du Tran, Junsong Yuan

The mapping between a video and a spatio-temporal action trajectory is learned.

Object Localization regression +2

Multi-View Harmonized Bilinear Network for 3D Object Recognition

no code implementations • CVPR 2018 • Tan Yu, Jingjing Meng, Junsong Yuan

View-based methods have achieved considerable success in 3D object recognition tasks.

3D Object Recognition Object

Recognizing Human Actions as the Evolution of Pose Estimation Maps

no code implementations • CVPR 2018 • Mengyuan Liu, Junsong Yuan

Specifically, the evolution of pose estimation maps can be decomposed as an evolution of heatmaps, e.g., probabilistic maps, and an evolution of estimated 2D human poses, which denote the changes of body shape and body pose, respectively.

Ranked #1 on Multimodal Activity Recognition on UTD-MHAD

Action Recognition Multimodal Activity Recognition +3

Conditional Generative Adversarial Network for Structured Domain Adaptation

no code implementations • CVPR 2018 • Weixiang Hong, Zhenzhen Wang, Ming Yang, Junsong Yuan

In recent years, deep neural nets have triumphed over many computer vision problems, including semantic segmentation, which is a critical task in emerging autonomous driving and medical image diagnostics applications.

Autonomous Driving Domain Adaptation +2

Salience Guided Depth Calibration for Perceptually Optimized Compressive Light Field 3D Display

no code implementations • CVPR 2018 • Shizheng Wang, Wenjuan Liao, Phil Surman, Zhigang Tu, Yuanjin Zheng, Junsong Yuan

Multi-layer light field displays are a type of computational three-dimensional (3D) display which has recently gained increasing interest for its holographic-like effect and natural compatibility with 2D displays.

Bi-box Regression for Pedestrian Detection and Occlusion Estimation

no code implementations • ECCV 2018 • Chunluan Zhou, Junsong Yuan

The full body estimation branch is trained to regress full body regions for positive pedestrian proposals, while the visible part estimation branch is trained to regress visible part regions for both positive and negative pedestrian proposals.

Occlusion Estimation Pedestrian Detection +1

Product Quantization Network for Fast Image Retrieval

no code implementations • ECCV 2018 • Tan Yu, Junsong Yuan, Chen Fang, Hailin Jin

Product quantization has been widely used in fast image retrieval due to its effectiveness in coding high-dimensional visual features.

Image Retrieval Quantization +1

Deformable Pose Traversal Convolution for 3D Action and Gesture Recognition

no code implementations • ECCV 2018 • Junwu Weng, Mengyuan Liu, Xudong Jiang, Junsong Yuan

This deformable convolution can better utilize contextual joints for action and gesture recognition and is more robust to noisy joints.

Hand Gesture Recognition Hand-Gesture Recognition

Point-to-Point Regression PointNet for 3D Hand Pose Estimation

no code implementations • ECCV 2018 • Liuhao Ge, Zhou Ren, Junsong Yuan

Convolutional Neural Networks (CNNs)-based methods for 3D hand pose estimation with depth cameras usually take 2D depth images as input and directly regress holistic 3D hand pose.

3D Hand Pose Estimation regression

Weakly-supervised 3D Hand Pose Estimation from Monocular RGB Images

no code implementations • ECCV 2018 • Yujun Cai, Liuhao Ge, Jianfei Cai, Junsong Yuan

Compared with depth-based 3D hand pose estimation, it is more challenging to infer 3D hand pose from monocular RGB images, due to substantial depth ambiguity and the difficulty of obtaining fully-annotated training data.

3D Hand Pose Estimation

Topical Video Object Discovery from Key Frames by Modeling Word Co-occurrence Prior

no code implementations • CVPR 2013 • Gangqiang Zhao, Junsong Yuan, Gang Hua

We show that such data-driven, bottom-up co-occurrence information can conveniently be incorporated into LDA with a Gaussian Markov prior, which combines top-down probabilistic topic modeling with bottom-up priors in a unified model.

Object Object Discovery +1

Multi-feature Spectral Clustering with Minimax Optimization

no code implementations • CVPR 2014 • Hongxing Wang, Chaoqun Weng, Junsong Yuan

To find a consensus clustering result that is agreeable to all feature modalities, our objective is to find a universal feature embedding, which not only fits each individual feature modality well, but also unifies different feature modalities by minimizing their pairwise disagreements.

Clustering

Fast Action Proposals for Human Action Detection and Search

no code implementations • CVPR 2015 • Gang Yu, Junsong Yuan

Assuming each action is performed by a human with meaningful motion, both appearance and motion cues are utilized to measure the actionness of the video tubes.

Action Detection Video Segmentation +1

From Keyframes to Key Objects: Video Summarization by Representative Object Proposal Selection

no code implementations • CVPR 2016 • Jingjing Meng, Hongxing Wang, Junsong Yuan, Yap-Peng Tan

This representative selection problem is formulated as a sparse dictionary selection problem, i.e., choosing a few representative object proposals to reconstruct the whole proposal pool.

Object Video Summarization

3D Convolutional Neural Networks for Efficient and Robust Hand Pose Estimation From Single Depth Images

no code implementations • CVPR 2017 • Liuhao Ge, Hui Liang, Junsong Yuan, Daniel Thalmann

We propose a simple, yet effective approach for real-time hand pose estimation from single depth images using three-dimensional Convolutional Neural Networks (3D CNNs).

3D Hand Pose Estimation Data Augmentation

HOPE: Hierarchical Object Prototype Encoding for Efficient Object Instance Search in Videos

no code implementations • CVPR 2017 • Tan Yu, Yuwei Wu, Junsong Yuan

This paper tackles the problem of efficient and effective object instance search in videos.

Instance Search Object

Fried Binary Embedding for High-Dimensional Visual Features

no code implementations • CVPR 2017 • Weixiang Hong, Junsong Yuan, Sreyasee Das Bhattacharjee

We argue that long binary codes ($b \sim O(d)$) are critical to fully utilize the discriminative power of high-dimensional visual features, and can achieve better results in various tasks such as approximate nearest neighbour search.

Vocal Bursts Intensity Prediction

Object Co-Skeletonization With Co-Segmentation

no code implementations • CVPR 2017 • Koteswar Rao Jerripothula, Jianfei Cai, Jiangbo Lu, Junsong Yuan

Recent advances in the joint processing of images have clearly shown its advantages over individual processing.

Object Segmentation

Adaptive Exponential Smoothing for Online Filtering of Pixel Prediction Maps

no code implementations • ICCV 2015 • Kang Dang, Jiong Yang, Junsong Yuan

We propose an efficient online video filtering method, called adaptive exponential smoothing (AES), to refine pixel prediction maps.

Saliency Detection Scene Parsing

Compressive Quantization for Fast Object Instance Search in Videos

no code implementations • ICCV 2017 • Tan Yu, Zhenzhen Wang, Junsong Yuan

Most current visual search systems focus on image-to-image (point-to-point) search such as image and object retrieval.

Instance Search Object +3

Common Action Discovery and Localization in Unconstrained Videos

no code implementations • ICCV 2017 • Jiong Yang, Junsong Yuan

Similar to common object discovery in images or videos, it is of great interest to discover and locate common actions in videos, which can benefit many video analytics applications such as video summarization, search, and understanding.

Object Discovery Video Summarization

Multi-Label Learning of Part Detectors for Heavily Occluded Pedestrian Detection

no code implementations • ICCV 2017 • Chunluan Zhou, Junsong Yuan

Detecting pedestrians that are partially occluded remains a challenging problem due to variations and uncertainties of partial occlusion patterns.

Multi-Label Learning Pedestrian Detection

Towards Real-time Eyeblink Detection in The Wild: Dataset, Theory and Practices

no code implementations • 21 Feb 2019 • Guilei Hu, Yang Xiao, Zhiguo Cao, Lubin Meng, Zhiwen Fang, Joey Tianyi Zhou, Junsong Yuan

Effective, real-time eyeblink detection has a wide range of applications, such as deception detection, driver fatigue detection, and face anti-spoofing.

Attribute Deception Detection +1

Progress Regression RNN for Online Spatial-Temporal Action Localization in Unconstrained Videos

no code implementations • 1 Mar 2019 • Bo Hu, Jianfei Cai, Tat-Jen Cham, Junsong Yuan

Previous spatial-temporal action localization methods commonly follow the pipeline of object detection to estimate bounding boxes and labels of actions.

object-detection Object Detection +3

Bayesian Uncertainty Matching for Unsupervised Domain Adaptation

no code implementations • 24 Jun 2019 • Jun Wen, Nenggan Zheng, Junsong Yuan, Zhefeng Gong, Changyou Chen

By imposing distribution matching on both features and labels (via uncertainty), label distribution mismatching in source and target data is effectively alleviated, encouraging the classifier to produce consistent predictions across domains.

Unsupervised Domain Adaptation

Context-Integrated and Feature-Refined Network for Lightweight Object Parsing

no code implementations • 26 Jul 2019 • Bin Jiang, Wenxuan Tu, Chao Yang, Junsong Yuan

The core components of CIFReNet are the Long-skip Refinement Module (LRM) and the Multi-scale Context Integration Module (MCIM).

Scene Parsing Semantic Segmentation

Temporal Pulses Driven Spiking Neural Network for Fast Object Recognition in Autonomous Driving

no code implementations • 24 Jan 2020 • Wei Wang, Shibo Zhou, Jingxi Li, Xiaohua LI, Junsong Yuan, Zhanpeng Jin

Accurate real-time object recognition from sensory data has long been a crucial and challenging task for autonomous driving.

Autonomous Driving Object +1

Measuring Generalisation to Unseen Viewpoints, Articulations, Shapes and Objects for 3D Hand Pose Estimation under Hand-Object Interaction

no code implementations • ECCV 2020 • Anil Armagan, Guillermo Garcia-Hernando, Seungryul Baek, Shreyas Hampali, Mahdi Rad, Zhaohui Zhang, Shipeng Xie, Mingxiu Chen, Boshen Zhang, Fu Xiong, Yang Xiao, Zhiguo Cao, Junsong Yuan, Pengfei Ren, Weiting Huang, Haifeng Sun, Marek Hrúz, Jakub Kanis, Zdeněk Krňoul, Qingfu Wan, Shile Li, Linlin Yang, Dongheui Lee, Angela Yao, Weiguo Zhou, Sijia Mei, Yun-hui Liu, Adrian Spurr, Umar Iqbal, Pavlo Molchanov, Philippe Weinzaepfel, Romain Brégier, Grégory Rogez, Vincent Lepetit, Tae-Kyun Kim

To address these issues, we designed a public challenge (HANDS'19) to evaluate the abilities of current 3D hand pose estimators (HPEs) to interpolate and extrapolate the poses of a training set.

3D Hand Pose Estimation

Image Co-skeletonization via Co-segmentation

no code implementations • 12 Apr 2020 • Koteswar Rao Jerripothula, Jianfei Cai, Jiangbo Lu, Junsong Yuan

Object skeletonization in a single natural image is a challenging problem because there is hardly any prior knowledge about the object.

Object Segmentation

Towards Understanding the Adversarial Vulnerability of Skeleton-based Action Recognition

no code implementations • 14 May 2020 • Tianhang Zheng, Sheng Liu, Changyou Chen, Junsong Yuan, Baochun Li, Kui Ren

We first formulate generation of adversarial skeleton actions as a constrained optimization problem by representing or approximating the physiological and physical constraints with mathematical formulations.

Action Recognition Skeleton Based Action Recognition

Joint Hand-object 3D Reconstruction from a Single Image with Cross-branch Feature Fusion

no code implementations • 28 Jun 2020 • Yujin Chen, Zhigang Tu, Di Kang, Ruizhi Chen, Linchao Bao, Zhengyou Zhang, Junsong Yuan

In this work, we propose to consider hand and object jointly in feature space and explore the reciprocity of the two branches.

3D Reconstruction Depth Estimation +3

Temporal Distinct Representation Learning for Action Recognition

no code implementations • ECCV 2020 • Junwu Weng, Donghao Luo, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Xudong Jiang, Junsong Yuan

Motivated by the previous success of Two-Dimensional Convolutional Neural Network (2D CNN) on image recognition, researchers endeavor to leverage it to characterize videos.

Action Recognition Representation Learning

Deep Reinforcement Learning with Label Embedding Reward for Supervised Image Hashing

no code implementations • 10 Aug 2020 • Zhenzhen Wang, Weixiang Hong, Junsong Yuan

Deep hashing has shown promising results in image retrieval and recognition.

Binarization Decision Making +4

Revisiting Modified Greedy Algorithm for Monotone Submodular Maximization with a Knapsack Constraint

no code implementations • 12 Aug 2020 • Jing Tang, Xueyan Tang, Andrew Lim, Kai Han, Chongshou Li, Junsong Yuan

Second, we enhance the modified greedy algorithm to derive a data-dependent upper bound on the optimum.

Learning Progressive Joint Propagation for Human Motion Prediction

no code implementations • ECCV 2020 • Yujun Cai, Lin Huang, Yiwei Wang, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Xu Yang, Yiheng Zhu, Xiaohui Shen, Ding Liu, Jing Liu, Nadia Magnenat Thalmann

Last, in order to incorporate a general motion space for high-quality prediction, we build a memory-based dictionary, which aims to preserve the global motion patterns in training data to guide the predictions.

Human motion prediction motion prediction

Clustering Driven Deep Autoencoder for Video Anomaly Detection

no code implementations • ECCV 2020 • Yunpeng Chang, Zhigang Tu, Wei Xie, Junsong Yuan

Because of the ambiguous definition of anomaly and the complexity of real data, anomaly detection in videos is one of the most challenging problems in intelligent video surveillance.

Anomaly Detection Clustering +1

Hand-Transformer: Non-Autoregressive Structured Modeling for 3D Hand Pose Estimation

no code implementations • ECCV 2020 • Lin Huang, Jianchao Tan, Ji Liu, Junsong Yuan

To address this issue, we connect this structured output learning problem with the structured modeling framework in the sequence transduction field.

3D Hand Pose Estimation

Attention-Aware Noisy Label Learning for Image Classification

no code implementations • 30 Sep 2020 • Zhenzhen Wang, Chunyan Xu, Yap-Peng Tan, Junsong Yuan

In this paper, the attention-aware noisy label learning approach ($A^2NL$) is proposed to improve the discriminative capability of the network trained on datasets with potential label noise.

Classification General Classification +2

Rethinking Soft Labels for Knowledge Distillation: A Bias–Variance Tradeoff Perspective

no code implementations • ICLR 2021 • Helong Zhou, Liangchen Song, Jiajie Chen, Ye Zhou, Guoli Wang, Junsong Yuan, Qian Zhang

In this paper, we investigate the bias-variance tradeoff brought by distillation with soft labels.

Knowledge Distillation

Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization

no code implementations • ECCV 2020 • Yuanhao Zhai, Le Wang, Wei Tang, Qilin Zhang, Junsong Yuan, Gang Hua

Weakly-supervised Temporal Action Localization (W-TAL) aims to classify and localize all action instances in an untrimmed video under only video-level supervision.

Ranked #12 on Weakly Supervised Action Localization on THUMOS14

Vocal Bursts Valence Prediction Weakly Supervised Action Localization +2

Interventional Domain Adaptation

no code implementations • 7 Nov 2020 • Jun Wen, Changjian Shui, Kun Kuang, Junsong Yuan, Zenan Huang, Zhefeng Gong, Nenggan Zheng

To address this issue, we intervene in the learning of feature discriminability using unlabeled target data to guide it to get rid of the domain-specific part and be safely transferable.

counterfactual Unsupervised Domain Adaptation

Generation For Adaption: A GAN-Based Approach for 3D Domain Adaption with Point Cloud Data

no code implementations • 15 Feb 2021 • Junxuan Huang, Junsong Yuan, Chunming Qiao

Recent deep networks have achieved good performance on a variety of 3D point classification tasks.

General Classification Generative Adversarial Network +1

ACSNet: Action-Context Separation Network for Weakly Supervised Temporal Action Localization

no code implementations • 28 Mar 2021 • Ziyi Liu, Le Wang, Qilin Zhang, Wei Tang, Junsong Yuan, Nanning Zheng, Gang Hua

In this paper, we introduce an Action-Context Separation Network (ACSNet) that explicitly takes into account context for accurate action localization.

Ranked #7 on Weakly Supervised Action Localization on THUMOS’14

Video Polyp Segmentation Weakly Supervised Action Localization +2

Weakly Supervised Temporal Action Localization Through Learning Explicit Subspaces for Action and Context

no code implementations • 30 Mar 2021 • Ziyi Liu, Le Wang, Wei Tang, Junsong Yuan, Nanning Zheng, Gang Hua

To address this challenge, we introduce a framework that learns two feature subspaces respectively for actions and their context.

Action Recognition Weakly-supervised Temporal Action Localization +1

NeuLF: Efficient Novel View Synthesis with Neural 4D Light Field

no code implementations • 15 May 2021 • Zhong Li, Liangchen Song, Celong Liu, Junsong Yuan, Yi Xu

In this paper, we present an efficient and robust deep learning solution for novel view synthesis of complex scenes.

Novel View Synthesis

Two-Stream Consensus Network: Submission to HACS Challenge 2021 Weakly-Supervised Learning Track

no code implementations • 21 Jun 2021 • Yuanhao Zhai, Le Wang, David Doermann, Junsong Yuan

The base model training encourages the model to make reliable predictions based on a single modality (i.e., RGB or optical flow); a pseudo ground truth is generated from the fusion of these predictions and in turn used as supervision to train the base models.

Optical Flow Estimation Weakly-supervised Learning +2

OVIS: Open-Vocabulary Visual Instance Search via Visual-Semantic Aligned Representation Learning

no code implementations • 8 Aug 2021 • Sheng Liu, Kevin Lin, Lijuan Wang, Junsong Yuan, Zicheng Liu

We introduce the task of open-vocabulary visual instance search (OVIS).

Instance Search Representation Learning

High Quality Disparity Remapping With Two-Stage Warping

no code implementations • ICCV 2021 • Bing Li, Chia-Wen Lin, Cheng Zheng, Shan Liu, Junsong Yuan, Bernard Ghanem, C.-C. Jay Kuo

In the second stage, we derive another warping model to refine warping results in less important regions by eliminating serious distortions in shape, disparity and 3D structure.

Vocal Bursts Intensity Prediction Vocal Bursts Valence Prediction

Discovering Human Interactions With Large-Vocabulary Objects via Query and Multi-Scale Detection

no code implementations • ICCV 2021 • Suchen Wang, Kim-Hui Yap, Henghui Ding, Jiyan Wu, Junsong Yuan, Yap-Peng Tan

In this work, we study the problem of human-object interaction (HOI) detection with large vocabulary object categories.

Human-Object Interaction Detection Object +2

Stacked Homography Transformations for Multi-View Pedestrian Detection

no code implementations • ICCV 2021 • Liangchen Song, Jialian Wu, Ming Yang, Qian Zhang, Yuan Li, Junsong Yuan

This task is confronted with two challenges: how to establish the 3D correspondences from views to the BEV map and how to assemble occupancy information across views.

Ranked #7 on Multiview Detection on MultiviewX

Multiview Detection Pedestrian Detection

A Unified 3D Human Motion Synthesis Model via Conditional Variational Auto-Encoder

no code implementations • ICCV 2021 • Yujun Cai, Yiwei Wang, Yiheng Zhu, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Chuanxia Zheng, Sijie Yan, Henghui Ding, Xiaohui Shen, Ding Liu, Nadia Magnenat Thalmann

Notably, by considering this problem as a conditional generation process, we estimate a parametric distribution of the missing regions based on the input conditions, from which to sample and synthesize the full motion series.

motion prediction Motion Synthesis

Pseudo Supervised Monocular Depth Estimation with Teacher-Student Network

no code implementations • 22 Oct 2021 • Huan Liu, Junsong Yuan, Chen Wang, Jun Chen

Despite recent improvement of supervised monocular depth estimation, the lack of high quality pixel-wise ground truth annotations has become a major hurdle for further progress.

Knowledge Distillation Monocular Depth Estimation +1

Consistent 3D Hand Reconstruction in Video via self-supervised Learning

no code implementations • 24 Jan 2022 • Zhigang Tu, Zhisheng Huang, Yujin Chen, Di Kang, Linchao Bao, Bisheng Yang, Junsong Yuan

We present a method for reconstructing accurate and consistent 3D hands from a monocular video.

Self-Supervised Learning

Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition

no code implementations • 8 Feb 2022 • Zhigang Tu, Jiaxu Zhang, Hongyan Li, Yujin Chen, Junsong Yuan

In recent years, graph convolutional networks (GCNs) have played an increasingly critical role in skeleton-based human action recognition.

Action Recognition Pose Prediction +2

Efficient Video Instance Segmentation via Tracklet Query and Proposal

no code implementations • CVPR 2022 • Jialian Wu, Sudhir Yarram, Hui Liang, Tian Lan, Junsong Yuan, Jayan Eledath, Gerard Medioni

In addition, VisTR is not fully end-to-end learnable in multiple video clips as it requires a hand-crafted data association to link instance tracklets between successive clips.

Instance Segmentation Segmentation +2

Optical Flow for Video Super-Resolution: A Survey

no code implementations • 20 Mar 2022 • Zhigang Tu, Hongyan Li, Wei Xie, Yuanzhong Liu, Shifu Zhang, Baoxin Li, Junsong Yuan

Video super-resolution is currently one of the most active research topics in computer vision as it plays an important role in many visual applications.

Motion Compensation Optical Flow Estimation +1

Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth

no code implementations • 21 Jun 2022 • Nitin Bansal, Pan Ji, Junsong Yuan, Yi Xu

The multi-task learning (MTL) paradigm focuses on jointly learning two or more tasks, aiming for significant improvement w.r.t. the model's generalizability, performance, and training/inference memory footprint.

Data Augmentation Depth Estimation +3

Neural Correspondence Field for Object Pose Estimation

no code implementations • 30 Jul 2022 • Lin Huang, Tomas Hodan, Lingni Ma, Linguang Zhang, Luan Tran, Christopher Twigg, Po-Chen Wu, Junsong Yuan, Cem Keskin, Robert Wang

Unlike classical correspondence-based methods which predict 3D object coordinates at pixels of the input image, the proposed method predicts 3D object coordinates at 3D query points sampled in the camera frustum.

3D Reconstruction Object +1

PREF: Predictability Regularized Neural Motion Fields

no code implementations • 21 Sep 2022 • Liangchen Song, Xuan Gong, Benjamin Planche, Meng Zheng, David Doermann, Junsong Yuan, Terrence Chen, Ziyan Wu

We propose to regularize the estimated motion to be predictable.

Federated Learning with Privacy-Preserving Ensemble Attention Distillation

no code implementations • 16 Oct 2022 • Xuan Gong, Liangchen Song, Rishi Vedula, Abhishek Sharma, Meng Zheng, Benjamin Planche, Arun Innanje, Terrence Chen, Junsong Yuan, David Doermann, Ziyan Wu

We propose a privacy-preserving FL framework leveraging unlabeled public data for one-way offline knowledge distillation in this work.

Federated Learning Image Classification +2

NeRFPlayer: A Streamable Dynamic Scene Representation with Decomposed Neural Radiance Fields

no code implementations • 28 Oct 2022 • Liangchen Song, Anpei Chen, Zhong Li, Zhang Chen, Lele Chen, Junsong Yuan, Yi Xu, Andreas Geiger

Visually exploring in a real-world 4D spatiotemporal space freely in VR has been a long-term quest.

Progressive Multi-view Human Mesh Recovery with Self-Supervision

no code implementations • 10 Dec 2022 • Xuan Gong, Liangchen Song, Meng Zheng, Benjamin Planche, Terrence Chen, Junsong Yuan, David Doermann, Ziyan Wu

To date, little attention has been given to multi-view 3D human mesh estimation, despite real-life applicability (e.g., motion capture, sport analysis) and robustness to single-view ambiguities.

Benchmarking Human Mesh Recovery

Dynamic Voxel Grid Optimization for High-Fidelity RGB-D Supervised Surface Reconstruction

no code implementations • 12 Apr 2023 • Xiangyu Xu, Lichang Chen, Changjiang Cai, Huangying Zhan, Qingan Yan, Pan Ji, Junsong Yuan, Heng Huang, Yi Xu

Direct optimization of interpolated features on multi-resolution voxel grids has emerged as a more efficient alternative to MLP-like modules.

Computational Efficiency Surface Reconstruction

Neural Voting Field for Camera-Space 3D Hand Pose Estimation

no code implementations • CVPR 2023 • Lin Huang, Chung-Ching Lin, Kevin Lin, Lin Liang, Lijuan Wang, Junsong Yuan, Zicheng Liu

We present a unified framework for camera-space 3D hand pose estimation from a single RGB image based on 3D implicit representation.

Ranked #4 on 3D Hand Pose Estimation on HO-3D

3D Hand Pose Estimation regression

3D-Aware Facial Landmark Detection via Multi-View Consistent Training on Synthetic Data

no code implementations • CVPR 2023 • Libing Zeng, Lele Chen, Wentao Bao, Zhong Li, Yi Xu, Junsong Yuan, Nima Khademi Kalantari

Accurate facial landmark detection on wild images plays an essential role in human-computer interaction, entertainment, and medical applications.

Facial Landmark Detection Image Generation +1

RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture

no code implementations • 18 May 2023 • Liangchen Song, Liangliang Cao, Hongyu Xu, Kai Kang, Feng Tang, Junsong Yuan, Yang Zhao

The proposed framework consists of two significant components: Geometry Guided Diffusion and Mesh Optimization.

Image Generation Indoor Scene Synthesis

SOAR: Scene-debiasing Open-set Action Recognition

no code implementations • ICCV 2023 • Yuanhao Zhai, Ziyi Liu, Zhenyu Wu, Yi Wu, Chunluan Zhou, David Doermann, Junsong Yuan, Gang Hua

Deep models have the risk of utilizing spurious clues to make predictions, e.g., recognizing actions via classifying the background scene.

Open Set Action Recognition Scene Classification

Open Set Video HOI detection from Action-Centric Chain-of-Look Prompting

no code implementations • ICCV 2023 • Nan Xi, Jingjing Meng, Junsong Yuan

To this end, we propose ACoLP, a model of Action-centric Chain-of-Look Prompting for open set video HOI detection.

Human-Object Interaction Detection Language Modelling +1

Towards Generic Image Manipulation Detection with Weakly-Supervised Self-Consistency Learning

no code implementations • ICCV 2023 • Yuanhao Zhai, Tianyu Luan, David Doermann, Junsong Yuan

To improve the generalization ability, we propose weakly-supervised self-consistency learning (WSCL) to leverage the weakly annotated images.

Image Manipulation Image Manipulation Detection +1

Efficient-NeRF2NeRF: Streamlining Text-Driven 3D Editing with Multiview Correspondence-Enhanced Diffusion Models

no code implementations • 13 Dec 2023 • Liangchen Song, Liangliang Cao, Jiatao Gu, Yifan Jiang, Junsong Yuan, Hao Tang

In this work, we propose that by incorporating correspondence regularization into diffusion models, the process of 3D editing can be significantly accelerated.

Multi-label Emotion Analysis in Conversation via Multimodal Knowledge Distillation

no code implementations • ACM International Conference on Multimedia 2023 • Sidharth Anand, Naresh Kumar Devulapally, Sreyasee Das Bhattacharjee, Junsong Yuan

Evaluating speaker emotion in conversations is crucial for various applications requiring human-computer interaction.

Ranked #1 on Multimodal Sentiment Analysis on CMU-MOSEI

Emotion Recognition Knowledge Distillation +1

AMuSE: Adaptive Multimodal Analysis for Speaker Emotion Recognition in Group Conversations

no code implementations • 26 Jan 2024 • Naresh Kumar Devulapally, Sidharth Anand, Sreyasee Das Bhattacharjee, Junsong Yuan, Yu-Ping Chang

This difficulty is compounded in group settings, where the emotion and its temporal evolution are not only influenced by the individual but also by external contexts like audience reaction and context of the ongoing conversation.

Emotion Recognition

Spectrum AUC Difference (SAUCD): Human-aligned 3D Shape Evaluation

no code implementations • 3 Mar 2024 • Tianyu Luan, Zhong Li, Lele Chen, Xuan Gong, Lichang Chen, Yi Xu, Junsong Yuan

Then, we calculate the Area Under the Curve (AUC) difference between the two spectrums, so that each frequency band that captures either the overall or detailed shape is equitably considered.

FSC: Few-point Shape Completion

no code implementations • 12 Mar 2024 • Xianzu Wu, Xianfeng Wu, Tianyu Luan, Yajing Bai, Zhongyuan Lai, Junsong Yuan

While previous studies have demonstrated successful 3D object shape completion with a sufficient number of points, they often fail in scenarios where only a few points, e.g., tens of points, are observed.

Object
