Search Results for author: Shenghua Gao

Found 79 papers, 46 papers with code

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

2 code implementations 10 Apr 2024 Jiale Xu, Weihao Cheng, Yiming Gao, Xintao Wang, Shenghua Gao, Ying Shan

We present InstantMesh, a feed-forward framework for instant 3D mesh generation from a single image, featuring state-of-the-art generation quality and significant training scalability.

Image to 3D

2D Gaussian Splatting for Geometrically Accurate Radiance Fields

no code implementations 26 Mar 2024 Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, Shenghua Gao

3D Gaussian Splatting (3DGS) has recently revolutionized radiance field reconstruction, achieving high quality novel view synthesis and fast rendering speed without baking.

Novel View Synthesis

Bridging 3D Gaussian and Mesh for Freeview Video Rendering

no code implementations 18 Mar 2024 Yuting Xiao, Xuan Wang, Jiafei Li, Hongrui Cai, Yanbo Fan, Nan Xue, Minghui Yang, Yujun Shen, Shenghua Gao

To this end, we propose a novel approach, GauMesh, to bridge the 3D Gaussian and Mesh for modeling and rendering the dynamic scenes.

Novel View Synthesis

MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning

2 code implementations 19 Jan 2024 Chenyu Wang, Weixin Luo, Qianyu Chen, Haonan Mai, Jindi Guo, Sixun Dong, Xiaohua Xuan, Zhengxin Li, Lin Ma, Shenghua Gao

Recently, the astonishing performance of large language models (LLMs) in natural language comprehension and generation tasks has triggered extensive exploration of using them as central controllers to build agent systems.

Language Modelling Large Language Model

TSP-Transformer: Task-Specific Prompts Boosted Transformer for Holistic Scene Understanding

1 code implementation 6 Nov 2023 Shuo Wang, Jing Li, Zibo Zhao, Dongze Lian, Binbin Huang, Xiaomei Wang, Zhengxin Li, Shenghua Gao

Holistic scene understanding includes semantic segmentation, surface normal estimation, object boundary detection, depth estimation, etc.

Boundary Detection Depth Estimation +5

RoomDesigner: Encoding Anchor-latents for Style-consistent and Shape-compatible Indoor Scene Generation

1 code implementation 16 Oct 2023 Yiqun Zhao, Zibo Zhao, Jing Li, Sixun Dong, Shenghua Gao

Indoor scene generation aims at creating shape-compatible, style-consistent furniture arrangements within a spatially reasonable layout.

Quantization Scene Generation

LivelySpeaker: Towards Semantic-Aware Co-Speech Gesture Generation

1 code implementation ICCV 2023 YiHao Zhi, Xiaodong Cun, Xuelin Chen, Xi Shen, Wen Guo, Shaoli Huang, Shenghua Gao

While previous methods are able to generate speech rhythm-synchronized gestures, the semantic context of the speech is generally lacking in the gesticulations.

Gesture Generation

DebSDF: Delving into the Details and Bias of Neural Indoor Scene Reconstruction

no code implementations 29 Aug 2023 Yuting Xiao, Jingwei Xu, Zehao Yu, Shenghua Gao

This paper presents DebSDF to address these challenges, focusing on the utilization of uncertainty in monocular priors and the bias in SDF-based volume rendering.

Indoor Scene Reconstruction Surface Reconstruction

Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation

1 code implementation NeurIPS 2023 Zibo Zhao, Wen Liu, Xin Chen, Xianfang Zeng, Rui Wang, Pei Cheng, Bin Fu, Tao Chen, Gang Yu, Shenghua Gao

We present a novel alignment-before-generation approach to tackle the challenging task of generating general 3D shapes based on 2D images or texts.

3D Shape Generation

Omni-Line-of-Sight Imaging for Holistic Shape Reconstruction

no code implementations 21 Apr 2023 Binbin Huang, Xingyue Peng, Siyuan Shen, Suan Xia, Ruiqian Li, Yanhua Yu, Yuehan Wang, Shenghua Gao, Wenzheng Chen, Shiying Li, Jingyi Yu

The core of our method is to place the object near diffuse walls and augment the LOS scan in the front view with the NLOS scans from the surrounding walls, which serve as virtual "mirrors" to trap light toward the object.

Object

Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos

1 code implementation CVPR 2023 Sixun Dong, Huazhang Hu, Dongze Lian, Weixin Luo, Yicheng Qian, Shenghua Gao

Sequential video understanding, an emerging video understanding task, has drawn considerable attention from researchers because of its goal-oriented nature.

Representation Learning Sentence +1

High-level Feature Guided Decoding for Semantic Segmentation

no code implementations 15 Mar 2023 Ye Huang, Di Kang, Shenghua Gao, Wen Li, Lixin Duan

One crucial design of the HFG is to protect the high-level features from being contaminated by using proper stop-gradient operations so that the backbone does not update according to the noisy gradient from the upsampler.

Semantic Segmentation Vocal Bursts Intensity Prediction

P$^2$SDF for Neural Indoor Scene Reconstruction

no code implementations 1 Mar 2023 Jing Li, Jinpeng Yu, Ruoyu Wang, Zhengxin Li, Zhengyu Zhang, Lina Cao, Shenghua Gao

As the unsupervised plane segments are usually noisy and inaccurate, we propose to assign different weights to the sampled points on the plane in plane estimation as well as the regularization loss.

Indoor Scene Reconstruction Surface Reconstruction

VertMatch: A Semi-supervised Framework for Vertebral Structure Detection in 3D Ultrasound Volume

no code implementations 28 Dec 2022 Hongye Zeng, Kang Zhou, Songhan Ge, Yuchong Gao, Jianhao Zhao, Shenghua Gao, Rui Zheng

We propose VertMatch, a two-step framework to detect vertebral structures in 3D ultrasound volumes by utilizing unlabeled data in a semi-supervised manner.

Lifelong Person Re-Identification via Knowledge Refreshing and Consolidation

1 code implementation 29 Nov 2022 Chunlin Yu, Ye Shi, Zimo Liu, Shenghua Gao, Jingya Wang

Lifelong person re-identification (LReID) is in significant demand for real-world development as a large amount of ReID data is captured from diverse locations over time and cannot be accessed at once inherently.

Continual Learning Person Re-Identification

ResNeRF: Geometry-Guided Residual Neural Radiance Field for Indoor Scene Novel View Synthesis

no code implementations 26 Nov 2022 Yuting Xiao, Yiqun Zhao, Yanyu Xu, Shenghua Gao

In the first stage, we focus on geometry reconstruction based on SDF representation, which would lead to a good geometry surface of the scene and also a sharp density.

Novel View Synthesis

Dual-Space NeRF: Learning Animatable Avatars and Scene Lighting in Separate Spaces

1 code implementation 31 Aug 2022 YiHao Zhi, Shenhan Qian, Xinhao Yan, Shenghua Gao

Previous methods alleviate the inconsistency of lighting by learning a per-frame embedding, but this operation does not generalize to unseen poses.

UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation

1 code implementation 20 Jul 2022 Shenhan Qian, Jiale Xu, Ziwei Liu, Liqian Ma, Shenghua Gao

We propose united implicit functions (UNIF), a part-based method for clothed human reconstruction and animation with raw scans and skeletons as the input.

Position

PREF: Phasorial Embedding Fields for Compact Neural Representations

1 code implementation 26 May 2022 Binbin Huang, Xinhao Yan, Anpei Chen, Shenghua Gao, Jingyi Yu

We present an efficient frequency-based neural representation termed PREF: a shallow MLP augmented with a phasor volume that covers significantly broader spectra than previous Fourier feature mapping or Positional Encoding.

DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers

no code implementations CVPR 2022 Xianing Chen, Qiong Cao, Yujie Zhong, Jing Zhang, Shenghua Gao, DaCheng Tao

Our DearKD is a two-stage framework that first distills the inductive biases from the early intermediate layers of a CNN and then gives the transformer full play by training without distillation.

Knowledge Distillation

TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting

1 code implementation CVPR 2022 Huazhang Hu, Sixun Dong, Yiqun Zhao, Dongze Lian, Zhengxin Li, Shenghua Gao

Existing methods focus on repetitive action counting in short videos and struggle with the longer videos found in more realistic scenarios.

Repetitive Action Counting

Taylor3DNet: Fast 3D Shape Inference With Landmark Points Based Taylor Series

no code implementations 18 Jan 2022 Yuting Xiao, Jiale Xu, Shenghua Gao

Taylor3DNet exploits a set of discrete landmark points and their corresponding Taylor series coefficients to represent the implicit field of a 3D shape, and the number of landmark points is independent of the resolution of the iso-surface extraction.

3D Shape Reconstruction 3D Shape Representation

SVIP: Sequence VerIfication for Procedures in Videos

1 code implementation CVPR 2022 Yicheng Qian, Weixin Luo, Dongze Lian, Xu Tang, Peilin Zhao, Shenghua Gao

In this paper, we propose a novel sequence verification task that aims to distinguish positive video pairs performing the same action sequence from negative ones with step-level transformations but still conducting the same task.

Action Detection Action Recognition

Raw Bayer Pattern Image Synthesis for Computer Vision-oriented Image Signal Processing Pipeline Design

no code implementations 25 Oct 2021 Wei Zhou, Xiangyu Zhang, Hongyu Wang, Shenghua Gao, Xin Lou

It is shown that by adding another transformation, the proposed method is able to synthesize high-quality RAW Bayer images with arbitrary size.

Demosaicking Image Generation +3

Cross-domain Trajectory Prediction with CTP-Net

no code implementations 22 Oct 2021 Pingxuan Huang, Zhenhua Cui, Jing Li, Shenghua Gao, Bo Hu, Yanyan Fang

Further, considering the consistency between the observed and the predicted trajectories, a target domain offset discriminator is utilized to adversarially regularize the future trajectory predictions to be in line with the observed trajectories.

Domain Adaptation Pedestrian Trajectory Prediction +1

Proxy-bridged Image Reconstruction Network for Anomaly Detection in Medical Images

no code implementations 5 Oct 2021 Kang Zhou, Jing Li, Weixin Luo, Zhengxin Li, Jianlong Yang, Huazhu Fu, Jun Cheng, Jiang Liu, Shenghua Gao

To mitigate this problem, in this paper, we propose a novel Proxy-bridged Image Reconstruction Network (ProxyAno) for anomaly detection in medical images.

Anomaly Detection Image Reconstruction

Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates

1 code implementation ICCV 2021 Shenhan Qian, Zhi Tu, YiHao Zhi, Wen Liu, Shenghua Gao

Co-speech gesture generation aims to synthesize a gesture sequence that not only looks real but also matches the input speech audio.

Gesture Generation

AS-MLP: An Axial Shifted MLP Architecture for Vision

2 code implementations ICLR 2022 Dongze Lian, Zehao Yu, Xing Sun, Shenghua Gao

Our proposed AS-MLP obtains 51.5 mAP on the COCO validation set and 49.5 MS mIoU on the ADE20K dataset, which is competitive compared to the transformer-based architectures.

object-detection Object Detection +1

Look Before You Leap: Learning Landmark Features for One-Stage Visual Grounding

1 code implementation CVPR 2021 Binbin Huang, Dongze Lian, Weixin Luo, Shenghua Gao

Then we combine the contextual information from the landmark feature convolution module with the target's visual features for grounding.

Descriptive Object +1

Layout-Guided Novel View Synthesis from a Single Indoor Panorama

1 code implementation CVPR 2021 Jiale Xu, Jia Zheng, Yanyu Xu, Rui Tang, Shenghua Gao

Then, we leverage the room layout prior, a strong structural constraint of the indoor scene, to guide the generation of target views.

Novel View Synthesis

Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

1 code implementation CVPR 2021 Zhaoyuan Yin, Jia Zheng, Weixin Luo, Shenhan Qian, Hanling Zhang, Shenghua Gao

This paper proposes a framework for the interactive video object segmentation (VOS) in the wild where users can choose some frames for annotations iteratively.

Interactive Video Object Segmentation

Crowd Counting With Partial Annotations in an Image

1 code implementation ICCV 2021 Yanyu Xu, Ziming Zhong, Dongze Lian, Jing Li, Zhengxin Li, Xinxing Xu, Shenghua Gao

To fully leverage the data captured from different scenes with different view angles while reducing the annotation cost, this paper studies a novel crowd counting setting, i.e., only using partial annotations in each image as training data.

Active Learning Crowd Counting

Amodal Segmentation Based on Visible Region Segmentation and Shape Prior

1 code implementation 10 Dec 2020 Yuting Xiao, Yanyu Xu, Ziming Zhong, Weixin Luo, Jiawei Li, Shenghua Gao

In this way, features corresponding to background and occlusion can be suppressed for amodal mask estimation.

Segmentation

Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis

2 code implementations 18 Nov 2020 Wen Liu, Zhixin Piao, Zhi Tu, Wenhan Luo, Lin Ma, Shenghua Gao

Also, we build a new dataset, namely iPER dataset, for the evaluation of human motion imitation, appearance transfer, and novel view synthesis.

Denoising Image Generation +1

SIRI: Spatial Relation Induced Network For Spatial Description Resolution

no code implementations NeurIPS 2020 Peiyao Wang, Weixin Luo, Yanyu Xu, Haojie Li, Shugong Xu, Jianyu Yang, Shenghua Gao

Spatial Description Resolution, as a language-guided localization task, is proposed for target location in a panoramic street view, given corresponding language descriptions.

Relation

Encoding Structure-Texture Relation with P-Net for Anomaly Detection in Retinal Images

1 code implementation ECCV 2020 Kang Zhou, Yuting Xiao, Jianlong Yang, Jun Cheng, Wen Liu, Weixin Luo, Zaiwang Gu, Jiang Liu, Shenghua Gao

In the end, we further utilize the reconstructed image to extract the structure and measure the difference between the structures extracted from the original and the reconstructed images.

Anatomy Anomaly Detection +2

P$^{2}$Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation

1 code implementation 15 Jul 2020 Zehao Yu, Lei Jin, Shenghua Gao

Furthermore, because those textureless regions in indoor scenes (e.g., wall, floor, roof, etc.) usually correspond to planar regions, we propose to leverage superpixels as a plane prior.

Monocular Depth Estimation Superpixels

Towards Fast Adaptation of Neural Architectures with Meta Learning

1 code implementation ICLR 2020 Dongze Lian, Yin Zheng, Yintao Xu, Yanxiong Lu, Leyu Lin, Peilin Zhao, Junzhou Huang, Shenghua Gao

Recently, Neural Architecture Search (NAS) has been successfully applied to multiple artificial intelligence areas and shows better performance compared with hand-designed networks.

Few-Shot Learning Neural Architecture Search

Fast-MVSNet: Sparse-to-Dense Multi-View Stereo With Learned Propagation and Gauss-Newton Refinement

1 code implementation CVPR 2020 Zehao Yu, Shenghua Gao

On one hand, the high-resolution depth map, the data-adaptive propagation method and the Gauss-Newton layer jointly guarantee the effectiveness of our method.

Depth Estimation

Sparse-GAN: Sparsity-constrained Generative Adversarial Network for Anomaly Detection in Retinal OCT Image

no code implementations 28 Nov 2019 Kang Zhou, Shenghua Gao, Jun Cheng, Zaiwang Gu, Huazhu Fu, Zhi Tu, Jianlong Yang, Yitian Zhao, Jiang Liu

With the development of convolutional neural networks, deep learning has shown its success for retinal disease detection from optical coherence tomography (OCT) images.

Anomaly Detection Generative Adversarial Network

Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer and Novel View Synthesis

2 code implementations ICCV 2019 Wen Liu, Zhixin Piao, Jie Min, Wenhan Luo, Lin Ma, Shenghua Gao

In this paper, we propose to use a 3D body mesh recovery module to disentangle the pose and shape, which can not only model the joint location and rotation but also characterize the personalized body shape.

Denoising Novel View Synthesis

Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling

1 code implementation ECCV 2020 Jia Zheng, Junfei Zhang, Jing Li, Rui Tang, Shenghua Gao, Zihan Zhou

Recently, there has been growing interest in developing learning-based methods to detect and utilize salient semi-global or global structures, such as junctions, lines, planes, cuboids, smooth surfaces, and all types of symmetries, for 3D scene modeling and understanding.

Room Layout Estimation

Locality-constrained Spatial Transformer Network for Video Crowd Counting

1 code implementation 18 Jul 2019 Yanyan Fang, Biyun Zhan, Wandi Cai, Shenghua Gao, Bo Hu

Then, to relate the density maps between neighbouring frames, a Locality-constrained Spatial Transformer (LST) module is introduced to estimate the density map of the next frame from that of the current frame.

Crowd Counting Translation

Believe It or Not, We Know What You Are Looking at!

1 code implementation 4 Jul 2019 Dongze Lian, Zehao Yu, Shenghua Gao

Our two-stage gaze-following solution has two merits: i) it mimics human behavior in gaze following, and is therefore more psychologically plausible; ii) besides using a heatmap to supervise the output of our network, we can also leverage gaze direction to facilitate the training of the gaze direction pathway, so our network can be trained more robustly.

Learning Semantics-aware Distance Map with Semantics Layering Network for Amodal Instance Segmentation

1 code implementation 30 May 2019 Ziheng Zhang, Anpei Chen, Ling Xie, Jingyi Yu, Shenghua Gao

Specifically, we first introduce a new representation, namely a semantics-aware distance map (sem-dist map), to serve as our target for amodal segmentation instead of the commonly used masks and heatmaps.

Amodal Instance Segmentation Segmentation +1

Domain Adversarial Reinforcement Learning for Partial Domain Adaptation

no code implementations 10 May 2019 Jin Chen, Xinxiao Wu, Lixin Duan, Shenghua Gao

In this more general and practical scenario, a major challenge is how to select source instances in the shared classes across different domains for positive transfer.

Partial Domain Adaptation Q-Learning +2

Generic Multiview Visual Tracking

no code implementations 4 Apr 2019 Minye Wu, Haibin Ling, Ning Bi, Shenghua Gao, Hao Sheng, Jingyi Yu

A natural solution to these challenges is to use multiple cameras with multiview inputs, though existing systems are mostly limited to specific targets (e.g., humans), static cameras, and/or camera calibration.

Camera Calibration Trajectory Prediction +1

CE-Net: Context Encoder Network for 2D Medical Image Segmentation

3 code implementations 7 Mar 2019 Zaiwang Gu, Jun Cheng, Huazhu Fu, Kang Zhou, Huaying Hao, Yitian Zhao, Tianyang Zhang, Shenghua Gao, Jiang Liu

In this paper, we propose a context encoder network (referred to as CE-Net) to capture more high-level information and preserve spatial information for 2D medical image segmentation.

Cell Segmentation Image Segmentation +4

Deep Surface Light Fields

no code implementations 15 Oct 2018 Anpei Chen, Minye Wu, Yingliang Zhang, Nianyi Li, Jie Lu, Shenghua Gao, Jingyi Yu

A surface light field represents the radiance of rays originating from any points on the surface in any directions.

Data Compression Image Registration

Evaluating Capability of Deep Neural Networks for Image Classification via Information Plane

no code implementations ECCV 2018 Hao Cheng, Dongze Lian, Shenghua Gao, Yanlin Geng

Inspired by the pioneering work of information bottleneck principle for Deep Neural Networks (DNNs) analysis, we design an information plane based framework to evaluate the capability of DNNs for image classification tasks, which not only helps understand the capability of DNNs, but also helps us choose a neural network which leads to higher classification accuracy more efficiently.

General Classification Image Classification +1

Multi-Cell Multi-Task Convolutional Neural Networks for Diabetic Retinopathy Grading

no code implementations 31 Aug 2018 Kang Zhou, Zaiwang Gu, Wen Liu, Weixin Luo, Jun Cheng, Shenghua Gao, Jiang Liu

To consider the relationships among images at different stages, we propose a Multi-Task learning strategy that predicts the label with both classification and regression.

Diabetic Retinopathy Grading General Classification +1

Gaze Prediction in Dynamic 360° Immersive Videos

no code implementations CVPR 2018 Yanyu Xu, Yanbing Dong, Junru Wu, Zhengzhong Sun, Zhiru Shi, Jingyi Yu, Shenghua Gao

This paper explores gaze prediction in dynamic 360° immersive videos, i.e., based on the history scan path and VR contents, we predict where a viewer will look at an upcoming time.

Gaze Prediction

Face Aging With Identity-Preserved Conditional Generative Adversarial Networks

2 code implementations CVPR 2018 Zongwei Wang, Xu Tang, Weixin Luo, Shenghua Gao

By grouping faces with target age together, the objective of face aging is equivalent to transferring aging patterns of faces within the target age group to the face whose aged face is to be synthesized.

Future Frame Prediction for Anomaly Detection – A New Baseline

1 code implementation CVPR 2018 Wen Liu, Weixin Luo, Dongze Lian, Shenghua Gao

To predict a future frame with higher quality for normal events, other than the commonly used appearance (spatial) constraints on intensity and gradient, we also introduce a motion (temporal) constraint in video prediction by enforcing the optical flow between predicted frames and ground truth frames to be consistent, and this is the first work that introduces a temporal constraint into the video prediction task.

Anomaly Detection Optical Flow Estimation +1

Encoding Crowd Interaction With Deep Neural Network for Pedestrian Trajectory Prediction

1 code implementation CVPR 2018 Yanyu Xu, Zhixin Piao, Shenghua Gao

Specifically, motivated by the residual learning in deep learning, we propose to predict displacement between neighboring frames for each pedestrian sequentially.

Pedestrian Trajectory Prediction Trajectory Prediction

Future Frame Prediction for Anomaly Detection -- A New Baseline

1 code implementation 28 Dec 2017 Wen Liu, Weixin Luo, Dongze Lian, Shenghua Gao

To predict a future frame with higher quality for normal events, other than the commonly used appearance (spatial) constraints on intensity and gradient, we also introduce a motion (temporal) constraint in video prediction by enforcing the optical flow between predicted frames and ground truth frames to be consistent, and this is the first work that introduces a temporal constraint into the video prediction task.

Anomaly Detection Optical Flow Estimation +2

Personalized Saliency and its Prediction

1 code implementation 9 Oct 2017 Yanyu Xu, Shenghua Gao, Junru Wu, Nianyi Li, Jingyi Yu

Specifically, we propose to decompose a personalized saliency map (referred to as PSM) into a universal saliency map (referred to as USM) predictable by existing saliency detection models and a new discrepancy map across users that characterizes personalized saliency.

Saliency Detection

A Revisit of Sparse Coding Based Anomaly Detection in Stacked RNN Framework

1 code implementation ICCV 2017 Weixin Luo, Wen Liu, Shenghua Gao

Motivated by the capability of sparse coding based anomaly detection, we propose a Temporally-coherent Sparse Coding (TSC) where we enforce similar neighbouring frames to be encoded with similar reconstruction coefficients.

Anomaly Detection

Graph Construction with Label Information for Semi-Supervised Learning

no code implementations 8 Jul 2016 Liansheng Zhuang, Zihan Zhou, Jingwen Yin, Shenghua Gao, Zhouchen Lin, Yi Ma, Nenghai Yu

In the literature, most existing graph-based semi-supervised learning (SSL) methods only use the label information of observed samples in the label propagation stage, while ignoring such valuable information when learning the graph.

graph construction Graph Learning

Progressively Parsing Interactional Objects for Fine Grained Action Detection

no code implementations CVPR 2016 Bingbing Ni, Xiaokang Yang, Shenghua Gao

Fine grained video action analysis often requires reliable detection and tracking of various interacting objects and human body parts, denoted as interactional object parsing.

Action Analysis Action Recognition +5

Single-Image Crowd Counting via Multi-Column Convolutional Neural Network

5 code implementations Conference 2016 Yingying Zhang, Desen Zhou, Siqin Chen, Shenghua Gao, Yi Ma

To this end, we have proposed a simple but effective Multi-column Convolutional Neural Network (MCNN) architecture to map the image to its crowd density map.

Crowd Counting

Constructing a Non-Negative Low Rank and Sparse Graph with Data-Adaptive Features

no code implementations 3 Sep 2014 Liansheng Zhuang, Shenghua Gao, Jinhui Tang, Jingjing Wang, Zhouchen Lin, Yi Ma

This paper aims at constructing a good graph for discovering intrinsic data structures in a semi-supervised learning setting.

graph construction

PCANet: A Simple Deep Learning Baseline for Image Classification?

2 code implementations 14 Apr 2014 Tsung-Han Chan, Kui Jia, Shenghua Gao, Jiwen Lu, Zinan Zeng, Yi Ma

In this work, we propose a very simple deep learning network for image classification which comprises only the very basic data processing components: cascaded principal component analysis (PCA), binary hashing, and block-wise histograms.

Classification Face Recognition +5

Learning by Associating Ambiguously Labeled Images

no code implementations CVPR 2013 Zinan Zeng, Shijie Xiao, Kui Jia, Tsung-Han Chan, Shenghua Gao, Dong Xu, Yi Ma

Our framework is motivated by the observation that samples from the same class repetitively appear in the collection of ambiguously labeled training images, while they are just ambiguously labeled in each image.
