Search Results for author: Shi-Min Hu

Found 40 papers, 20 papers with code

Theoretically Achieving Continuous Representation of Oriented Bounding Boxes

2 code implementations • 29 Feb 2024 • Zi-Kai Xiao, Guo-Ye Yang, Xue Yang, Tai-Jiang Mu, Junchi Yan, Shi-Min Hu

Considerable efforts have been devoted to Oriented Object Detection (OOD).

Fairness object-detection +2

167


CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization

no code implementations • 27 Feb 2024 • Hao-Yang Peng, Jia-Peng Zhang, Meng-Hao Guo, Yan-Pei Cao, Shi-Min Hu

In the field of digital content creation, generating high-quality 3D characters from single images is challenging, especially given the complexities of various body poses and the issues of self-occlusion and pose ambiguity.


Semantic-Aware Transformation-Invariant RoI Align

no code implementations • 15 Dec 2023 • Guo-Ye Yang, George Kiyohiro Nakayama, Zi-Kai Xiao, Tai-Jiang Mu, Xiaolei Huang, Shi-Min Hu

In this paper, we propose a novel RoI feature extractor, termed Semantic RoI Align (SRA), which is capable of extracting invariant RoI features under a variety of transformations for two-stage detectors.

object-detection Object Detection +1


DiffFacto: Controllable Part-Based 3D Point Cloud Generation with Cross Diffusion

no code implementations • ICCV 2023 • Kiyohiro Nakayama, Mikaela Angelina Uy, Jiahui Huang, Shi-Min Hu, Ke Li, Leonidas J Guibas

We propose a factorization that models independent part style and part configuration distributions, and present a novel cross-diffusion network that enables us to generate coherent and plausible shapes under our proposed factorization.

Point Cloud Generation


Long Range Pooling for 3D Large-Scale Scene Understanding

no code implementations • CVPR 2023 • Xiang-Li Li, Meng-Hao Guo, Tai-Jiang Mu, Ralph R. Martin, Shi-Min Hu

To achieve the above properties, we propose a simple yet effective long range pooling (LRP) module using dilation max pooling, which provides a network with a large adaptive receptive field.

Scene Understanding


SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation

3 code implementations • 18 Sep 2022 • Meng-Hao Guo, Cheng-Ze Lu, Qibin Hou, ZhengNing Liu, Ming-Ming Cheng, Shi-Min Hu

Notably, SegNeXt outperforms EfficientNet-L2 w/ NAS-FPN and achieves 90.6% mIoU on the Pascal VOC 2012 test leaderboard using only 1/10 of its parameters.

Ranked #1 on Semantic Segmentation on iSAID

Segmentation Semantic Segmentation

7,370


DeepPortraitDrawing: Generating Human Body Images from Freehand Sketches

no code implementations • 4 May 2022 • Xian Wu, Chen Wang, Hongbo Fu, Ariel Shamir, Song-Hai Zhang, Shi-Min Hu

Researchers have explored various ways to generate realistic images from freehand sketches, e.g., for objects and human faces.

Image Generation Sketch-to-Image Translation


Visual Attention Network

17 code implementations • 20 Feb 2022 • Meng-Hao Guo, Cheng-Ze Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shi-Min Hu

In this paper, we propose a novel linear attention named large kernel attention (LKA) to enable self-adaptive and long-range correlations in self-attention while avoiding its shortcomings.

Ranked #1 on Panoptic Segmentation on COCO panoptic

Image Classification Instance Segmentation +5

124,527


NeRF-SR: High-Quality Neural Radiance Fields using Supersampling

1 code implementation • 3 Dec 2021 • Chen Wang, Xian Wu, Yuan-Chen Guo, Song-Hai Zhang, Yu-Wing Tai, Shi-Min Hu

We present NeRF-SR, a solution for high-resolution (HR) novel view synthesis with mostly low-resolution (LR) inputs.

Novel View Synthesis Vocal Bursts Intensity Prediction

127


CIRCLE: Convolutional Implicit Reconstruction and Completion for Large-scale Indoor Scene

no code implementations • 25 Nov 2021 • Haoxiang Chen, Jiahui Huang, Tai-Jiang Mu, Shi-Min Hu

We present CIRCLE, a framework for large-scale scene completion and geometric refinement based on local implicit signed distance functions.


Multiway Non-rigid Point Cloud Registration via Learned Functional Map Synchronization

1 code implementation • 25 Nov 2021 • Jiahui Huang, Tolga Birdal, Zan Gojcic, Leonidas J. Guibas, Shi-Min Hu

We present SyNoRiM, a novel way to jointly register multiple non-rigid shapes by synchronizing the maps relating learned functions defined on the point clouds.

Point Cloud Registration

112


Attention Mechanisms in Computer Vision: A Survey

1 code implementation • 15 Nov 2021 • Meng-Hao Guo, Tian-Xing Xu, Jiang-Jiang Liu, Zheng-Ning Liu, Peng-Tao Jiang, Tai-Jiang Mu, Song-Hai Zhang, Ralph R. Martin, Ming-Ming Cheng, Shi-Min Hu

Humans can naturally and effectively find salient regions in complex scenes.

Image Classification Image Generation +5

2,673


Sampling Equivariant Self-attention Networks for Object Detection in Aerial Images

no code implementations • 5 Nov 2021 • Guo-Ye Yang, Xiang-Li Li, Ralph R. Martin, Shi-Min Hu

Sampling equivariant networks can adjust sampling from input feature maps according to the transformation of the object, allowing a kernel to extract features of an object under different transformations.

object-detection Object Detection In Aerial Images


Subdivision-Based Mesh Convolution Networks

1 code implementation • 4 Jun 2021 • Shi-Min Hu, Zheng-Ning Liu, Meng-Hao Guo, Jun-Xiong Cai, Jiahui Huang, Tai-Jiang Mu, Ralph R. Martin

Meshes with arbitrary connectivity can be remeshed to have Loop subdivision sequence connectivity via self-parameterization, making SubdivNet a general approach.

3D Classification

235


Can Attention Enable MLPs To Catch Up With CNNs?

no code implementations • 31 May 2021 • Meng-Hao Guo, Zheng-Ning Liu, Tai-Jiang Mu, Dun Liang, Ralph R. Martin, Shi-Min Hu

In the first week of May 2021, researchers from four different institutions (Google, Tsinghua University, Oxford University and Facebook) shared their latest work [16, 7, 12, 17] on arXiv.org almost at the same time, each proposing new learning architectures consisting mainly of linear layers and claiming them to be comparable, or even superior, to convolution-based models.


Recursive-NeRF: An Efficient and Dynamically Growing NeRF

1 code implementation • 19 May 2021 • Guo-Wei Yang, Wen-Yang Zhou, Hao-Yang Peng, Dun Liang, Tai-Jiang Mu, Shi-Min Hu

Only query coordinates with high uncertainties are forwarded to the next level, a bigger neural network with more powerful representational capability.


Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks

7 code implementations • 5 May 2021 • Meng-Hao Guo, Zheng-Ning Liu, Tai-Jiang Mu, Shi-Min Hu

Attention mechanisms, especially self-attention, have played an increasingly important role in deep feature representation for visual tasks.

Ranked #16 on Semantic Segmentation on PASCAL VOC 2012 test

Image Classification Image Generation +5

10,797


MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

1 code implementation • CVPR 2021 • Jiahui Huang, He Wang, Tolga Birdal, Minhyuk Sung, Federica Arrigoni, Shi-Min Hu, Leonidas Guibas

We present MultiBodySync, a novel, end-to-end trainable multi-body motion segmentation and rigid registration framework for multiple input 3D point clouds.

Motion Estimation Motion Segmentation +1


PCT: Point cloud transformer

11 code implementations • 17 Dec 2020 • Meng-Hao Guo, Jun-Xiong Cai, Zheng-Ning Liu, Tai-Jiang Mu, Ralph R. Martin, Shi-Min Hu

It is inherently permutation invariant for processing a sequence of points, making it well-suited for point cloud learning.

Ranked #2 on 3D Point Cloud Classification on IntrA

3D Part Segmentation 3D Point Cloud Classification +1

634


DI-Fusion: Online Implicit 3D Reconstruction with Deep Priors

1 code implementation • CVPR 2021 • Jiahui Huang, Shi-Sheng Huang, Haoxuan Song, Shi-Min Hu

Previous online 3D dense reconstruction methods struggle to achieve the balance between memory storage and surface quality, largely due to the usage of stagnant underlying geometry representation, such as TSDF (truncated signed distance functions) or surfels, without any knowledge of the scene priors.

3D Reconstruction

122


Alternating ConvLSTM: Learning Force Propagation with Alternate State Updates

no code implementations • 14 Jun 2020 • Congyue Deng, Tai-Jiang Mu, Shi-Min Hu

Experimental results show that Alt-ConvLSTM efficiently models the material kinetic features and greatly outperforms vanilla ConvLSTM with only the single state update.


ClusterVO: Clustering Moving Instances and Estimating Visual Odometry for Self and Surroundings

no code implementations • CVPR 2020 • Jiahui Huang, Sheng Yang, Tai-Jiang Mu, Shi-Min Hu

We present ClusterVO, a stereo Visual Odometry which simultaneously clusters and estimates the motion of both ego and surrounding rigid clusters/objects.

Autonomous Driving Clustering +2


Shallow2Deep: Indoor Scene Modeling by Single Image Understanding

no code implementations • 22 Feb 2020 • Yinyu Nie, Shihui Guo, Jian Chang, Xiaoguang Han, Jiahui Huang, Shi-Min Hu, Jian Jun Zhang

Particularly, we design a shallow-to-deep architecture on the basis of convolutional networks for semantic scene understanding and modeling.

Relation Network Scene Understanding


Morphing and Sampling Network for Dense Point Cloud Completion

2 code implementations • 30 Nov 2019 • Minghua Liu, Lu Sheng, Sheng Yang, Jing Shao, Shi-Min Hu

3D point cloud completion, the task of inferring the complete geometric shape from a partial point cloud, has been attracting attention in the community.

Ranked #8 on Point Cloud Completion on ShapeNet

Point Cloud Completion

382


Example-Guided Style Consistent Image Synthesis from Semantic Labeling

1 code implementation • 4 Jun 2019 • Miao Wang, Guo-Ye Yang, Rui-Long Li, Run-Ze Liang, Song-Hai Zhang, Peter M. Hall, Shi-Min Hu

Example-guided image synthesis aims to synthesize an image from a semantic label map and an exemplary image indicating style.

Image Generation Scene Segmentation


FaceShapeGene: A Disentangled Shape Representation for Flexible Face Image Editing

no code implementations • 6 May 2019 • Sen-Zhe Xu, Hao-Zhi Huang, Shi-Min Hu, Wei Liu

On the basis of the FaceShapeGene, a novel part-wise face image editing system is developed, which contains a shape-remix network and a conditional label-to-face transformer.

Image Manipulation


What and Where: A Context-based Recommendation System for Object Insertion

no code implementations • 24 Nov 2018 • Song-Hai Zhang, Zhengping Zhou, Bin Liu, Xin Dong, Dun Liang, Peter Hall, Shi-Min Hu

In this work, we propose a novel topic consisting of two dual tasks: 1) given a scene, recommend objects to insert, 2) given an object category, retrieve suitable background scenes.

Object


TZC: Efficient Inter-Process Communication for Robotics Middleware with Partial Serialization

2 code implementations • 1 Oct 2018 • Yu-Ping Wang, Wende Tan, Xu-Qiang Hu, Dinesh Manocha, Shi-Min Hu

We show that by using TZC, the braking distance can be shortened by 16% compared with ROS.

Robotics


Temporally Coherent Video Harmonization Using Adversarial Networks

1 code implementation • 5 Sep 2018 • Hao-Zhi Huang, Senzhe Xu, Junxiong Cai, Wei Liu, Shi-Min Hu

Since existing video datasets which have ground-truth foreground masks and optical flows are not sufficiently large, we propose a simple yet efficient method to build up a synthetic dataset supporting supervised training of the proposed adversarial network.

Video Harmonization


Learning to Reconstruct High-quality 3D Shapes with Cascaded Fully Convolutional Networks

no code implementations • ECCV 2018 • Yan-Pei Cao, Zheng-Ning Liu, Zheng-Fei Kuang, Leif Kobbelt, Shi-Min Hu

We present a data-driven approach to reconstructing high-resolution and detailed volumetric representations of 3D shapes.


Associating Inter-Image Salient Instances for Weakly Supervised Semantic Segmentation

no code implementations • ECCV 2018 • Ruochen Fan, Qibin Hou, Ming-Ming Cheng, Gang Yu, Ralph R. Martin, Shi-Min Hu

We also combine our method with Mask R-CNN for instance segmentation, and demonstrate for the first time the ability of weakly supervised instance segmentation using only keyword annotations.

Ranked #4 on Image-level Supervised Instance Segmentation on COCO test-dev (using extra training data)

Clustering graph partitioning +6


Deep Portrait Image Completion and Extrapolation

no code implementations • 23 Aug 2018 • Xian Wu, Rui-Long Li, Fang-Lue Zhang, Jian-Cheng Liu, Jue Wang, Ariel Shamir, Shi-Min Hu

We evaluate our method on public portrait image datasets, and show that it outperforms other state-of-the-art general image completion methods.

Graphics


LineNet: a Zoomable CNN for Crowdsourced High Definition Maps Modeling in Urban Environments

no code implementations • 16 Jul 2018 • Dun Liang, Yuanchen Guo, Shaokui Zhang, Song-Hai Zhang, Peter Hall, Min Zhang, Shi-Min Hu

Combining LineNet and TTLane, we proposed a pipeline to model HD maps with crowdsourced data for the first time.

Lane Detection


Pose2Seg: Detection Free Human Instance Segmentation

6 code implementations • CVPR 2019 • Song-Hai Zhang, Rui-Long Li, Xin Dong, Paul L. Rosin, Zixi Cai, Han Xi, Dingcheng Yang, Hao-Zhi Huang, Shi-Min Hu

We demonstrate that our pose-based framework can achieve better accuracy than the state-of-the-art detection-based approach on the human instance segmentation problem, and can moreover better handle occlusion.

Ranked #1 on Human Instance Segmentation on OCHuman

2D Human Pose Estimation Human Instance Segmentation +5

4,966


Chinese Text in the Wild

5 code implementations • 28 Feb 2018 • Tai-Ling Yuan, Zhe Zhu, Kun Xu, Cheng-Jun Li, Shi-Min Hu

[python3.6] Natural-scene text detection implemented with tf, and variable-length scene-text OCR recognition implemented with ctpn+crnn+ctc in keras/pytorch.

Optical Character Recognition (OCR)

2,891


Deep Online Video Stabilization

3 code implementations • 22 Feb 2018 • Miao Wang, Guo-Ye Yang, Jin-Kun Lin, Ariel Shamir, Song-Hai Zhang, Shao-Ping Lu, Shi-Min Hu

In this paper, we solve the video stabilization problem using a convolutional neural network (ConvNet).

Graphics

228


S4Net: Single Stage Salient-Instance Segmentation

1 code implementation • CVPR 2019 • Ruochen Fan, Ming-Ming Cheng, Qibin Hou, Tai-Jiang Mu, Jingdong Wang, Shi-Min Hu

Taking into account the category-independent property of each target, we design a single stage salient instance segmentation framework, with a novel segmentation branch.

Instance Segmentation Segmentation +1


A Comparative Study of Algorithms for Realtime Panoramic Video Blending

no code implementations • 1 Jun 2016 • Zhe Zhu, Jiaming Lu, Minxuan Wang, Song-Hai Zhang, Ralph Martin, Hantao Liu, Shi-Min Hu

In this paper, we investigate six popular blending algorithms: feather blending, multi-band blending, modified Poisson blending, mean value coordinate blending, multi-spline blending and convolution pyramid blending.


Appearance Harmonization for Single Image Shadow Removal

no code implementations • 21 Mar 2016 • Liqian Ma, Jue Wang, Eli Shechtman, Kalyan Sunkavalli, Shi-Min Hu

In this work we propose a fully automatic shadow region harmonization approach that improves the appearance compatibility of the de-shadowed region as typically produced by previous methods.

Image Generation Image Shadow Removal +1


Sketch2Photo: Internet Image Montage

no code implementations • ACM Transactions on Graphics 2009 • Tao Chen, Ming-Ming Cheng, Ping Tan, Ariel Shamir, Shi-Min Hu

The composed picture is generated by seamlessly stitching several photographs in agreement with the sketch and text labels; these are found by searching the Internet.

Image Retrieval

