Search Results for author: Shi-Min Hu

Found 40 papers, 20 papers with code

CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization

no code implementations 27 Feb 2024 Hao-Yang Peng, Jia-Peng Zhang, Meng-Hao Guo, Yan-Pei Cao, Shi-Min Hu

In the field of digital content creation, generating high-quality 3D characters from single images is challenging, especially given the complexities of various body poses and the issues of self-occlusion and pose ambiguity.

Semantic-Aware Transformation-Invariant RoI Align

no code implementations 15 Dec 2023 Guo-Ye Yang, George Kiyohiro Nakayama, Zi-Kai Xiao, Tai-Jiang Mu, Xiaolei Huang, Shi-Min Hu

In this paper, we propose a novel RoI feature extractor, termed Semantic RoI Align (SRA), which is capable of extracting invariant RoI features under a variety of transformations for two-stage detectors.

object-detection Object Detection +1

DiffFacto: Controllable Part-Based 3D Point Cloud Generation with Cross Diffusion

no code implementations ICCV 2023 Kiyohiro Nakayama, Mikaela Angelina Uy, Jiahui Huang, Shi-Min Hu, Ke Li, Leonidas J Guibas

We propose a factorization that models independent part style and part configuration distributions, and present a novel cross-diffusion network that enables us to generate coherent and plausible shapes under our proposed factorization.

Point Cloud Generation

Long Range Pooling for 3D Large-Scale Scene Understanding

no code implementations CVPR 2023 Xiang-Li Li, Meng-Hao Guo, Tai-Jiang Mu, Ralph R. Martin, Shi-Min Hu

To achieve the above properties, we propose a simple yet effective long range pooling (LRP) module using dilation max pooling, which provides a network with a large adaptive receptive field.

Scene Understanding
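
The abstract above names dilated max pooling as the building block of LRP. As a rough illustration of that operation only (not the authors' module, which works on 3D scene data and is not specified in this listing), here is a minimal 1D sketch in plain Python. The function name `dilated_max_pool_1d`, its parameters, and the choice to skip out-of-range taps are all assumptions made for this example.

```python
from typing import List


def dilated_max_pool_1d(values: List[float], kernel_size: int = 3, dilation: int = 2) -> List[float]:
    """Hypothetical helper: stride-1 max pooling with taps spaced `dilation` apart.

    Assumes an odd kernel_size. Taps that fall outside the sequence are
    skipped, which behaves like padding with negative infinity.
    """
    half = kernel_size // 2
    pooled = []
    for i in range(len(values)):
        # Gather the taps centred on position i, spaced `dilation` apart.
        taps = [
            values[i + k * dilation]
            for k in range(-half, half + 1)
            if 0 <= i + k * dilation < len(values)
        ]
        pooled.append(max(taps))
    return pooled


signal = [1, 5, 2, 0, 3, 9, 4]
dense = dilated_max_pool_1d(signal, kernel_size=3, dilation=1)
dilated = dilated_max_pool_1d(signal, kernel_size=3, dilation=2)
```

With the same three taps, the window spans `(kernel_size - 1) * dilation + 1` positions, so raising the dilation widens the receptive field without adding parameters or taps.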

SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation

3 code implementations 18 Sep 2022 Meng-Hao Guo, Cheng-Ze Lu, Qibin Hou, ZhengNing Liu, Ming-Ming Cheng, Shi-Min Hu

Notably, SegNeXt outperforms EfficientNet-L2 w/ NAS-FPN and achieves 90.6% mIoU on the Pascal VOC 2012 test leaderboard using only 1/10 of its parameters.

Segmentation Semantic Segmentation

DeepPortraitDrawing: Generating Human Body Images from Freehand Sketches

no code implementations 4 May 2022 Xian Wu, Chen Wang, Hongbo Fu, Ariel Shamir, Song-Hai Zhang, Shi-Min Hu

Researchers have explored various ways to generate realistic images from freehand sketches, e.g., for objects and human faces.

Image Generation Sketch-to-Image Translation

Visual Attention Network

17 code implementations 20 Feb 2022 Meng-Hao Guo, Cheng-Ze Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shi-Min Hu

In this paper, we propose a novel linear attention named large kernel attention (LKA) to enable self-adaptive and long-range correlations in self-attention while avoiding its shortcomings.

Image Classification Instance Segmentation +5

NeRF-SR: High-Quality Neural Radiance Fields using Supersampling

1 code implementation 3 Dec 2021 Chen Wang, Xian Wu, Yuan-Chen Guo, Song-Hai Zhang, Yu-Wing Tai, Shi-Min Hu

We present NeRF-SR, a solution for high-resolution (HR) novel view synthesis with mostly low-resolution (LR) inputs.

Novel View Synthesis Vocal Bursts Intensity Prediction

CIRCLE: Convolutional Implicit Reconstruction and Completion for Large-scale Indoor Scene

no code implementations 25 Nov 2021 Haoxiang Chen, Jiahui Huang, Tai-Jiang Mu, Shi-Min Hu

We present CIRCLE, a framework for large-scale scene completion and geometric refinement based on local implicit signed distance functions.

Multiway Non-rigid Point Cloud Registration via Learned Functional Map Synchronization

1 code implementation 25 Nov 2021 Jiahui Huang, Tolga Birdal, Zan Gojcic, Leonidas J. Guibas, Shi-Min Hu

We present SyNoRiM, a novel way to jointly register multiple non-rigid shapes by synchronizing the maps relating learned functions defined on the point clouds.

Point Cloud Registration

Sampling Equivariant Self-attention Networks for Object Detection in Aerial Images

no code implementations 5 Nov 2021 Guo-Ye Yang, Xiang-Li Li, Ralph R. Martin, Shi-Min Hu

Sampling equivariant networks can adjust sampling from input feature maps according to the transformation of the object, allowing a kernel to extract features of an object under different transformations.

object-detection Object Detection In Aerial Images

Subdivision-Based Mesh Convolution Networks

1 code implementation 4 Jun 2021 Shi-Min Hu, Zheng-Ning Liu, Meng-Hao Guo, Jun-Xiong Cai, Jiahui Huang, Tai-Jiang Mu, Ralph R. Martin

Meshes with arbitrary connectivity can be remeshed to have Loop subdivision sequence connectivity via self-parameterization, making SubdivNet a general approach.

3D Classification

Can Attention Enable MLPs To Catch Up With CNNs?

no code implementations 31 May 2021 Meng-Hao Guo, Zheng-Ning Liu, Tai-Jiang Mu, Dun Liang, Ralph R. Martin, Shi-Min Hu

In the first week of May 2021, researchers from four different institutions (Google, Tsinghua University, Oxford University and Facebook) shared their latest work [16, 7, 12, 17] on arXiv.org almost at the same time, each proposing new learning architectures, consisting mainly of linear layers, claiming them to be comparable, or even superior to convolutional-based models.

Recursive-NeRF: An Efficient and Dynamically Growing NeRF

1 code implementation 19 May 2021 Guo-Wei Yang, Wen-Yang Zhou, Hao-Yang Peng, Dun Liang, Tai-Jiang Mu, Shi-Min Hu

Only query coordinates with high uncertainties are forwarded to the next level, a bigger neural network with a more powerful representational capability.

Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks

7 code implementations 5 May 2021 Meng-Hao Guo, Zheng-Ning Liu, Tai-Jiang Mu, Shi-Min Hu

Attention mechanisms, especially self-attention, have played an increasingly important role in deep feature representation for visual tasks.

Image Classification Image Generation +5

MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

1 code implementation CVPR 2021 Jiahui Huang, He Wang, Tolga Birdal, Minhyuk Sung, Federica Arrigoni, Shi-Min Hu, Leonidas Guibas

We present MultiBodySync, a novel, end-to-end trainable multi-body motion segmentation and rigid registration framework for multiple input 3D point clouds.

Motion Estimation Motion Segmentation +1

PCT: Point cloud transformer

11 code implementations 17 Dec 2020 Meng-Hao Guo, Jun-Xiong Cai, Zheng-Ning Liu, Tai-Jiang Mu, Ralph R. Martin, Shi-Min Hu

It is inherently permutation invariant for processing a sequence of points, making it well-suited for point cloud learning.

3D Part Segmentation 3D Point Cloud Classification +1

DI-Fusion: Online Implicit 3D Reconstruction with Deep Priors

1 code implementation CVPR 2021 Jiahui Huang, Shi-Sheng Huang, Haoxuan Song, Shi-Min Hu

Previous online 3D dense reconstruction methods struggle to achieve the balance between memory storage and surface quality, largely due to the usage of stagnant underlying geometry representation, such as TSDF (truncated signed distance functions) or surfels, without any knowledge of the scene priors.

3D Reconstruction

Alternating ConvLSTM: Learning Force Propagation with Alternate State Updates

no code implementations 14 Jun 2020 Congyue Deng, Tai-Jiang Mu, Shi-Min Hu

Experimental results show that Alt-ConvLSTM efficiently models the material kinetic features and greatly outperforms vanilla ConvLSTM with only a single state update.

ClusterVO: Clustering Moving Instances and Estimating Visual Odometry for Self and Surroundings

no code implementations CVPR 2020 Jiahui Huang, Sheng Yang, Tai-Jiang Mu, Shi-Min Hu

We present ClusterVO, a stereo Visual Odometry which simultaneously clusters and estimates the motion of both ego and surrounding rigid clusters/objects.

Autonomous Driving Clustering +2

Shallow2Deep: Indoor Scene Modeling by Single Image Understanding

no code implementations 22 Feb 2020 Yinyu Nie, Shihui Guo, Jian Chang, Xiaoguang Han, Jiahui Huang, Shi-Min Hu, Jian Jun Zhang

Particularly, we design a shallow-to-deep architecture on the basis of convolutional networks for semantic scene understanding and modeling.

Relation Network Scene Understanding

Morphing and Sampling Network for Dense Point Cloud Completion

2 code implementations 30 Nov 2019 Minghua Liu, Lu Sheng, Sheng Yang, Jing Shao, Shi-Min Hu

3D point cloud completion, the task of inferring the complete geometric shape from a partial point cloud, has been attracting attention in the community.

Point Cloud Completion

Example-Guided Style Consistent Image Synthesis from Semantic Labeling

1 code implementation 4 Jun 2019 Miao Wang, Guo-Ye Yang, Rui-Long Li, Run-Ze Liang, Song-Hai Zhang, Peter M. Hall, Shi-Min Hu

Example-guided image synthesis aims to synthesize an image from a semantic label map and an exemplary image indicating style.

Image Generation Scene Segmentation

FaceShapeGene: A Disentangled Shape Representation for Flexible Face Image Editing

no code implementations 6 May 2019 Sen-Zhe Xu, Hao-Zhi Huang, Shi-Min Hu, Wei Liu

On the basis of the FaceShapeGene, a novel part-wise face image editing system is developed, which contains a shape-remix network and a conditional label-to-face transformer.

Image Manipulation

What and Where: A Context-based Recommendation System for Object Insertion

no code implementations 24 Nov 2018 Song-Hai Zhang, Zhengping Zhou, Bin Liu, Xin Dong, Dun Liang, Peter Hall, Shi-Min Hu

In this work, we propose a novel topic consisting of two dual tasks: 1) given a scene, recommend objects to insert, 2) given an object category, retrieve suitable background scenes.

Object

Temporally Coherent Video Harmonization Using Adversarial Networks

1 code implementation 5 Sep 2018 Hao-Zhi Huang, Senzhe Xu, Junxiong Cai, Wei Liu, Shi-Min Hu

Since existing video datasets which have ground-truth foreground masks and optical flows are not sufficiently large, we propose a simple yet efficient method to build up a synthetic dataset supporting supervised training of the proposed adversarial network.

Video Harmonization

Learning to Reconstruct High-quality 3D Shapes with Cascaded Fully Convolutional Networks

no code implementations ECCV 2018 Yan-Pei Cao, Zheng-Ning Liu, Zheng-Fei Kuang, Leif Kobbelt, Shi-Min Hu

We present a data-driven approach to reconstructing high-resolution and detailed volumetric representations of 3D shapes.

Associating Inter-Image Salient Instances for Weakly Supervised Semantic Segmentation

no code implementations ECCV 2018 Ruochen Fan, Qibin Hou, Ming-Ming Cheng, Gang Yu, Ralph R. Martin, Shi-Min Hu

We also combine our method with Mask R-CNN for instance segmentation, and demonstrate for the first time the ability of weakly supervised instance segmentation using only keyword annotations.

Clustering graph partitioning +6

Deep Portrait Image Completion and Extrapolation

no code implementations 23 Aug 2018 Xian Wu, Rui-Long Li, Fang-Lue Zhang, Jian-Cheng Liu, Jue Wang, Ariel Shamir, Shi-Min Hu

We evaluate our method on public portrait image datasets, and show that it outperforms other state-of-the-art general image completion methods.

Graphics

Pose2Seg: Detection Free Human Instance Segmentation

6 code implementations CVPR 2019 Song-Hai Zhang, Rui-Long Li, Xin Dong, Paul L. Rosin, Zixi Cai, Han Xi, Dingcheng Yang, Hao-Zhi Huang, Shi-Min Hu

We demonstrate that our pose-based framework can achieve better accuracy than the state-of-the-art detection-based approach on the human instance segmentation problem, and can moreover better handle occlusion.

2D Human Pose Estimation Human Instance Segmentation +5

Chinese Text in the Wild

5 code implementations 28 Feb 2018 Tai-Ling Yuan, Zhe Zhu, Kun Xu, Cheng-Jun Li, Shi-Min Hu

[python3.6] Natural scene text detection implemented with TensorFlow; variable-length scene text OCR recognition implemented with CTPN + CRNN + CTC in Keras/PyTorch.

Optical Character Recognition (OCR)

Deep Online Video Stabilization

3 code implementations 22 Feb 2018 Miao Wang, Guo-Ye Yang, Jin-Kun Lin, Ariel Shamir, Song-Hai Zhang, Shao-Ping Lu, Shi-Min Hu

In this paper, we solve the video stabilization problem using a convolutional neural network (ConvNet).

Graphics

S4Net: Single Stage Salient-Instance Segmentation

1 code implementation CVPR 2019 Ruochen Fan, Ming-Ming Cheng, Qibin Hou, Tai-Jiang Mu, Jingdong Wang, Shi-Min Hu

Taking into account the category-independent property of each target, we design a single stage salient instance segmentation framework, with a novel segmentation branch.

Instance Segmentation Segmentation +1

A Comparative Study of Algorithms for Realtime Panoramic Video Blending

no code implementations 1 Jun 2016 Zhe Zhu, Jiaming Lu, Minxuan Wang, Song-Hai Zhang, Ralph Martin, Hantao Liu, Shi-Min Hu

In this paper, we investigate 6 popular blending algorithms: feather blending, multi-band blending, modified Poisson blending, mean value coordinate blending, multi-spline blending and convolution pyramid blending.

Appearance Harmonization for Single Image Shadow Removal

no code implementations 21 Mar 2016 Liqian Ma, Jue Wang, Eli Shechtman, Kalyan Sunkavalli, Shi-Min Hu

In this work we propose a fully automatic shadow region harmonization approach that improves the appearance compatibility of the de-shadowed region as typically produced by previous methods.

Image Generation Image Shadow Removal +1

Sketch2Photo: Internet Image Montage

no code implementations ACM Transactions on Graphics 2009 Tao Chen, Ming-Ming Cheng, Ping Tan, Ariel Shamir, Shi-Min Hu

The composed picture is generated by seamlessly stitching several photographs in agreement with the sketch and text labels; these are found by searching the Internet.

Image Retrieval
