Search Results for author: Guosheng Lin

Found 143 papers, 50 papers with code

Human Interaction Learning on 3D Skeleton Point Clouds for Video Violence Recognition

no code implementations ECCV 2020 Yukun Su, Guosheng Lin, Jinhui Zhu, Qingyao Wu

This paper introduces a new method for recognizing violent behavior by learning contextual relationships between related people from human skeleton points.

Activity Recognition

Splitting vs. Merging: Mining Object Regions with Discrepancy and Intersection Loss for Weakly Supervised Semantic Segmentation

no code implementations ECCV 2020 Tianyi Zhang, Guosheng Lin, Weide Liu, Jianfei Cai, Alex Kot

Finally, by training the segmentation model with the masks generated by our Splitting vs Merging strategy, we achieve the state-of-the-art weakly-supervised segmentation results on the Pascal VOC 2012 benchmark.

Segmentation Weakly supervised segmentation +2

Eliminating Feature Ambiguity for Few-Shot Segmentation

1 code implementation 13 Jul 2024 Qianxiong Xu, Guosheng Lin, Chen Change Loy, Cheng Long, Ziyue Li, Rui Zhao

Recent advancements in few-shot segmentation (FSS) have exploited pixel-by-pixel matching between query and support features, typically based on cross attention, which selectively activates query foreground (FG) features that correspond to the same-class support FG features.

Instance Consistency Regularization for Semi-Supervised 3D Instance Segmentation

1 code implementation 24 Jun 2024 Yizheng Wu, Zhiyu Pan, Kewei Wang, Xingyi Li, Jiahao Cui, Liwen Xiao, Guosheng Lin, Zhiguo Cao

To leverage unlabeled data, previous semi-supervised 3D instance segmentation approaches have explored self-training frameworks, which rely on high-quality pseudo labels for consistency regularization.

3D Instance Segmentation Segmentation +1

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

1 code implementation 14 Jun 2024 YiWen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Xin Chen, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, Chi Zhang

Recently, 3D assets created via reconstruction and generation have matched the quality of manually crafted assets, highlighting their potential for replacement.

Decoder

Text-to-Image Rectified Flow as Plug-and-Play Priors

2 code implementations 5 Jun 2024 Xiaofeng Yang, Cheng Chen, Xulei Yang, Fayao Liu, Guosheng Lin

Besides the generative capabilities of diffusion priors, motivated by the unique time-symmetry properties of rectified flow models, a variant of our method can additionally perform image inversion.

3D Generation Text to 3D

Sync4D: Video Guided Controllable Dynamics for Physics-Based 4D Generation

no code implementations 27 May 2024 Zhoujie Fu, Jiacheng Wei, Wenhao Shen, Chaoyue Song, Xiaofeng Yang, Fayao Liu, Xulei Yang, Guosheng Lin

In this work, we introduce a novel approach for creating controllable dynamics in 3D-generated Gaussians using casually captured reference videos.

Video Generation

REACTO: Reconstructing Articulated Objects from a Single Video

no code implementations CVPR 2024 Chaoyue Song, Jiacheng Wei, Chuan-Sheng Foo, Guosheng Lin, Fayao Liu

In this paper, we address the challenge of reconstructing general articulated 3D objects from a single video.

Magic-Boost: Boost 3D Generation with Mutli-View Conditioned Diffusion

no code implementations 9 Apr 2024 Fan Yang, Jianfeng Zhang, Yichun Shi, Bowen Chen, Chenxu Zhang, Huichao Zhang, Xiaofeng Yang, Jiashi Feng, Guosheng Lin

Benefiting from the rapid development of 2D diffusion models, 3D content creation has made significant progress recently.

3D Generation

Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior

no code implementations CVPR 2024 Cheng Chen, Xiaofeng Yang, Fan Yang, Chengzeng Feng, Zhoujie Fu, Chuan-Sheng Foo, Guosheng Lin, Fayao Liu

In this paper, we present a new framework, Sculpt3D, that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.

3D Generation Text to 3D

S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes

no code implementations CVPR 2024 Xingyi Li, Zhiguo Cao, Yizheng Wu, Kewei Wang, Ke Xian, Zhe Wang, Guosheng Lin

To address this limitation, we present S-DyRF, a reference-based spatio-temporal stylization method for dynamic neural radiance fields.

Style Transfer

Fine Structure-Aware Sampling: A New Sampling Training Scheme for Pixel-Aligned Implicit Models in Single-View Human Reconstruction

1 code implementation 29 Feb 2024 Kennard Yanting Chan, Fayao Liu, Guosheng Lin, Chuan Sheng Foo, Weisi Lin

Lastly, to further improve the training process, FSS proposes a mesh thickness loss signal for pixel-aligned implicit models.

Style-Consistent 3D Indoor Scene Synthesis with Decoupled Objects

no code implementations 24 Jan 2024 Yunfan Zhang, Hong Huang, Zhiwei Xiong, Zhiqi Shen, Guosheng Lin, Hao Wang, Nicholas Vun

The core strength of our pipeline lies in its ability to generate 3D scenes that are not only visually impressive but also exhibit features like photorealism, multi-view consistency, and diversity.

Diversity Indoor Scene Synthesis

R-Cyclic Diffuser: Reductive and Cyclic Latent Diffusion for 3D Clothed Human Digitalization

no code implementations CVPR 2024 Kennard Yanting Chan, Fayao Liu, Guosheng Lin, Chuan Sheng Foo, Weisi Lin

To this end, we propose R-Cyclic Diffuser, a framework that adapts Zero-1-to-3's novel approach to clothed human data by fusing it with a pixel-aligned implicit model.

3D Object Reconstruction Novel View Synthesis

AttriHuman-3D: Editable 3D Human Avatar Generation with Attribute Decomposition and Indexing

no code implementations CVPR 2024 Fan Yang, Tianyi Chen, Xiaosheng He, Zhongang Cai, Lei Yang, Si Wu, Guosheng Lin

We propose AttriHuman-3D, an editable 3D human generation model, which addresses the aforementioned problems with attribute decomposition and indexing.

Attribute Disentanglement

SARA: Controllable Makeup Transfer with Spatial Alignment and Region-Adaptive Normalization

no code implementations 28 Nov 2023 Xiaojing Zhong, Xinyi Huang, Zhonghua Wu, Guosheng Lin, Qingyao Wu

To address this problem, we propose a novel Spatial Alignment and Region-Adaptive normalization method (SARA) in this paper.

Learning-Based Biharmonic Augmentation for Point Cloud Classification

no code implementations 10 Nov 2023 Jiacheng Wei, Guosheng Lin, Henghui Ding, Jie Hu, Kim-Hui Yap

Point cloud datasets often suffer from inadequate sample sizes in comparison to image datasets, making data augmentation challenging.

Classification Data Augmentation +2

Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation

no code implementations 3 Nov 2023 Shichao Dong, Fayao Liu, Guosheng Lin

Recently, large-scale pre-trained models such as Segment-Anything Model (SAM) and Contrastive Language-Image Pre-training (CLIP) have demonstrated remarkable success and revolutionized the field of computer vision.

3D Semantic Segmentation Point Cloud Segmentation +5

Self-Supervised 3D Scene Flow Estimation and Motion Prediction using Local Rigidity Prior

1 code implementation 17 Oct 2023 Ruibo Li, Chi Zhang, Zhe Wang, Chunhua Shen, Guosheng Lin

By rigidly aligning each region with its potential counterpart in the target point cloud, we obtain a region-specific rigid transformation to generate its pseudo flow labels.

Motion Estimation motion prediction +2
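
The rigid-alignment step described in the entry above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes 2D points, assumes the point correspondences between a region and its counterpart are already known, and uses the closed-form least-squares rotation for the planar case. The function names `fit_rigid_2d` and `pseudo_flow` are hypothetical.

```python
import math


def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst.

    Assumption: src[i] corresponds to dst[i] (correspondences are given).
    """
    if len(src) != len(dst) or not src:
        raise ValueError("src and dst must be non-empty and the same length")
    n = len(src)
    # Centroids of both point sets.
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(q[0] for q in dst) / n
    cy_d = sum(q[1] for q in dst) / n
    # Accumulate dot and cross products of the centered pairs.
    dot = 0.0
    cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - cx_s, py - cy_s
        bx, by = qx - cx_d, qy - cy_d
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    # The angle that maximizes the alignment between the centered sets.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, (tx, ty)


def pseudo_flow(points, theta, t):
    """Per-point displacement induced by the rigid transform (R p + t - p)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0] - x, s * x + c * y + t[1] - y)
            for x, y in points]
```

Each region would get its own `(theta, t)` pair, and the displacement returned by `pseudo_flow` plays the role of that region's pseudo flow label in this simplified setting.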

Neural Vector Fields: Generalizing Distance Vector Fields by Codebooks and Zero-Curl Regularization

no code implementations 4 Sep 2023 Xianghui Yang, Guosheng Lin, Zhenghao Chen, Luping Zhou

Recent neural-network-based surface reconstruction methods can be roughly divided into two categories: one warps templates explicitly and the other represents 3D surfaces implicitly.

Surface Reconstruction

Improving Video Violence Recognition with Human Interaction Learning on 3D Skeleton Point Clouds

no code implementations 26 Aug 2023 Yukun Su, Guosheng Lin, Qingyao Wu

(ii) Global-SPIL: to better learn and refine the features of the unordered and unstructured skeleton points, Global-SPIL employs the self-attention layer that operates directly on the sampled points, which can help to make the output more permutation-invariant and well-suited for our task.

Action Recognition Temporal Action Localization

IT3D: Improved Text-to-3D Generation with Explicit View Synthesis

1 code implementation 22 Aug 2023 YiWen Chen, Chi Zhang, Xiaofeng Yang, Zhongang Cai, Gang Yu, Lei Yang, Guosheng Lin

Recent strides in Text-to-3D techniques have been propelled by distilling knowledge from powerful large text-to-image diffusion models (LDMs).

3D Generation Text to 3D

Make-It-4D: Synthesizing a Consistent Long-Term Dynamic Scene Video from a Single Image

no code implementations 20 Aug 2023 Liao Shen, Xingyi Li, Huiqiang Sun, Juewen Peng, Ke Xian, Zhiguo Cao, Guosheng Lin

To animate the visual content, the feature point cloud is displaced based on the scene flow derived from motion estimation and the corresponding camera pose.

Motion Estimation

Self-Calibrated Cross Attention Network for Few-Shot Segmentation

1 code implementation ICCV 2023 Qianxiong Xu, Wenting Zhao, Guosheng Lin, Cheng Long

Moreover, when calculating SCCA, we design a scaled-cosine mechanism to better utilize the support features for similarity calculation.

Few-Shot Semantic Segmentation

Weakly Supervised 3D Instance Segmentation without Instance-level Annotations

no code implementations 3 Aug 2023 Shichao Dong, Guosheng Lin

3D semantic scene understanding tasks have achieved great success with the emergence of deep learning, but often require a huge amount of manually annotated training data.

3D Instance Segmentation Scene Understanding +1

OR-NeRF: Object Removing from 3D Scenes Guided by Multiview Segmentation with Neural Radiance Fields

1 code implementation 17 May 2023 Youtan Yin, Zhoujie Fu, Fan Yang, Guosheng Lin

This paper proposes a novel object-removing pipeline, named OR-NeRF, that can remove objects from 3D scenes with user-given points or text prompts on a single view, achieving better performance in less time than previous works.

3D scene Editing Novel View Synthesis +1

MoDA: Modeling Deformable 3D Objects from Casual Videos

1 code implementation 17 Apr 2023 Chaoyue Song, Jiacheng Wei, Tianyi Chen, YiWen Chen, Chuan Sheng Foo, Fayao Liu, Guosheng Lin

To solve this problem, we propose neural dual quaternion blend skinning (NeuDBS) to achieve 3D point deformation, which can perform rigid transformation without skin-collapsing artifacts.

TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision

1 code implementation CVPR 2023 Jiacheng Wei, Hao Wang, Jiashi Feng, Guosheng Lin, Kim-Hui Yap

We conduct extensive experiments to analyze each of our proposed components and show the efficacy of our framework in generating high-fidelity 3D textured and text-relevant shapes.

Diversity

Reliability-Adaptive Consistency Regularization for Weakly-Supervised Point Cloud Segmentation

1 code implementation 9 Mar 2023 Zhonghua Wu, Yicheng Wu, Guosheng Lin, Jianfei Cai

Weakly-supervised point cloud segmentation with extremely limited labels is highly desirable to alleviate the expensive costs of collecting densely annotated 3D points.

Point Cloud Segmentation Segmentation +1

Neural Vector Fields: Implicit Representation by Explicit Learning

2 code implementations CVPR 2023 Xianghui Yang, Guosheng Lin, Zhenghao Chen, Luping Zhou

Deep neural networks (DNNs) are widely applied to today's 3D surface reconstruction tasks, and such methods can be divided into two categories: those that warp templates explicitly by moving vertices and those that represent 3D surfaces implicitly as signed or unsigned distance functions.

Quantization Surface Reconstruction

Effective End-to-End Vision Language Pretraining with Semantic Visual Loss

no code implementations 18 Jan 2023 Xiaofeng Yang, Fayao Liu, Guosheng Lin

Current vision language pretraining models are dominated by methods using region visual features extracted from object detectors.

Weakly Supervised Class-Agnostic Motion Prediction for Autonomous Driving

no code implementations CVPR 2023 Ruibo Li, Hanyu Shi, Ziang Fu, Zhe Wang, Guosheng Lin

To this end, we propose a two-stage weakly supervised approach, where the segmentation model trained with the incomplete binary masks in Stage1 will facilitate the self-supervised learning of the motion prediction network in Stage2 by estimating possible moving foregrounds in advance.

Autonomous Driving motion prediction +2

Label-Guided Knowledge Distillation for Continual Semantic Segmentation on 2D Images and 3D Point Clouds

1 code implementation ICCV 2023 Ze Yang, Ruibo Li, Evan Ling, Chi Zhang, Yiming Wang, Dezhao Huang, Keng Teck Ma, Minhoe Hur, Guosheng Lin

To address this issue, we propose a new label-guided knowledge distillation (LGKD) loss, where the old model output is expanded and transplanted (with the guidance of the ground truth label) to form a semantically appropriate class correspondence with the new model output.

Continual Semantic Segmentation Knowledge Distillation +1

Generalizable Person Re-Identification via Viewpoint Alignment and Fusion

no code implementations 5 Dec 2022 Bingliang Jiao, Lingqiao Liu, Liying Gao, Guosheng Lin, Ruiqi Wu, Shizhou Zhang, Peng Wang, Yanning Zhang

The key insight of this design is that the cross-attention mechanism in the transformer could be an ideal solution to align the discriminative texture clues from the original image with the canonical view image, which could compensate for the low-quality texture information of the canonical view image.

Domain Generalization Generalizable Person Re-identification +1

Unsupervised 3D Pose Transfer with Cross Consistency and Dual Reconstruction

1 code implementation 18 Nov 2022 Chaoyue Song, Jiacheng Wei, Ruibo Li, Fayao Liu, Guosheng Lin

With $G$ as the basic component, we propose a cross consistency learning scheme and a dual reconstruction objective to learn the pose transfer without supervision.

Pose Transfer

IntegratedPIFu: Integrated Pixel Aligned Implicit Function for Single-view Human Reconstruction

1 code implementation 15 Nov 2022 Kennard Yanting Chan, Guosheng Lin, Haiyu Zhao, Weisi Lin

We propose IntegratedPIFu, a new pixel aligned implicit model that builds on the foundation set by PIFuHD.

Human Parsing

ManiCLIP: Multi-Attribute Face Manipulation from Text

1 code implementation 2 Oct 2022 Hao Wang, Guosheng Lin, Ana García del Molino, Anran Wang, Jiashi Feng, Zhiqi Shen

In this paper we present a novel multi-attribute face manipulation method based on textual descriptions.

Attribute Text-based Image Editing

CRCNet: Few-shot Segmentation with Cross-Reference and Region-Global Conditional Networks

no code implementations 23 Aug 2022 Weide Liu, Chi Zhang, Guosheng Lin, Fayao Liu

Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images.

Segmentation

Single-view 3D Mesh Reconstruction for Seen and Unseen Categories

1 code implementation 4 Aug 2022 Xianghui Yang, Guosheng Lin, Luping Zhou

Single-view 3D object reconstruction is a fundamental and challenging computer vision task that aims at recovering 3D shapes from single-view RGB images.

3D Object Reconstruction

3D Cartoon Face Generation with Controllable Expressions from a Single GAN Image

no code implementations 29 Jul 2022 Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

To this end, we discover the semantic meanings of StyleGAN latent space, such that we are able to produce face images of various expressions, poses, and lighting by controlling the latent codes.

Face Generation Face Model

Paired Cross-Modal Data Augmentation for Fine-Grained Image-to-Text Retrieval

no code implementations 29 Jul 2022 Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

When we do online paired data augmentation, we first generate augmented text through random token replacement, then pass the augmented text into the latent space alignment module to output the latent codes, which are finally fed to StyleGAN2 to generate the augmented images.

Cross-Modal Retrieval Data Augmentation +2
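
The first stage of the augmentation pipeline in the entry above, random token replacement, can be sketched as follows. This is only an illustration under assumptions: the replacement probability `p`, the replacement vocabulary, and the function name `random_token_replacement` are hypothetical, and the later stages (latent space alignment and StyleGAN2 image generation) are not shown.

```python
import random


def random_token_replacement(tokens, vocab, p=0.15, rng=None):
    """Return a copy of tokens where each token is swapped, with probability p,
    for a token drawn uniformly from vocab.

    Assumptions: p and vocab are illustrative choices, not values from the paper.
    """
    if rng is None:
        rng = random.Random()
    augmented = []
    for tok in tokens:
        if rng.random() < p:
            augmented.append(rng.choice(vocab))  # replace this token
        else:
            augmented.append(tok)  # keep the original token
    return augmented


# Example call; the result varies with the random seed.
caption = ["add", "two", "cups", "of", "flour"]
ingredients = ["sugar", "salt", "butter"]
augmented_caption = random_token_replacement(caption, ingredients, p=0.3,
                                             rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which is convenient when the augmented text must be paired with a generated image.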

Dual Adaptive Transformations for Weakly Supervised Point Cloud Segmentation

no code implementations 19 Jul 2022 Zhonghua Wu, Yicheng Wu, Guosheng Lin, Jianfei Cai, Chen Qian

Weakly supervised point cloud segmentation, i.e., semantically segmenting a point cloud with only a few labeled points in the whole 3D scene, is highly desirable due to the heavy burden of collecting abundant dense annotations for the model training.

Point Cloud Segmentation Segmentation

Few-shot Open-set Recognition Using Background as Unknowns

no code implementations 19 Jul 2022 Nan Song, Chi Zhang, Guosheng Lin

First, instead of learning the decision boundaries between seen classes, as is done in standard close-set classification, we reserve space for unseen classes, such that images located in these areas are recognized as the unseen classes.

Open Set Learning

Long-tailed Recognition by Learning from Latent Categories

no code implementations 2 Jun 2022 Weide Liu, Zhonghua Wu, Yiming Wang, Henghui Ding, Fayao Liu, Jie Lin, Guosheng Lin

Previous long-tailed recognition methods commonly focus on the data augmentation or re-balancing strategy of the tail classes to give more attention to tail classes during the model training.

Data Augmentation Diversity +1

Efficient Few-Shot Object Detection via Knowledge Inheritance

1 code implementation 23 Mar 2022 Ze Yang, Chi Zhang, Ruibo Li, Yi Xu, Guosheng Lin

Upon this baseline, we devise an initializer named knowledge inheritance (KI) to reliably initialize the novel weights for the box classifier, which effectively facilitates the knowledge transfer process and boosts the adaptation speed.

Few-Shot Object Detection Object +2

Self-Training Vision Language BERTs with a Unified Conditional Model

no code implementations 6 Jan 2022 Xiaofeng Yang, Fengmao Lv, Fayao Liu, Guosheng Lin

We use the labeled image data to train a teacher model and use the trained model to generate pseudo captions on unlabeled image data.

Weakly Supervised Segmentation on Outdoor 4D Point Clouds With Temporal Matching and Spatial Graph Propagation

1 code implementation CVPR 2022 Hanyu Shi, Jiacheng Wei, Ruibo Li, Fayao Liu, Guosheng Lin

We propose a novel temporal-spatial framework for effective weakly supervised learning to generate high-quality pseudo labels from these limited annotated data.

Point Cloud Segmentation Scene Understanding +2

Expanding Large Pre-Trained Unimodal Models With Multimodal Information Injection for Image-Text Multimodal Classification

no code implementations CVPR 2022 Tao Liang, Guosheng Lin, Mingyang Wan, Tianrui Li, Guojun Ma, Fengmao Lv

Through the proposed MI2P unit, we can inject the language information into the vision backbone by attending the word-wise textual features to different visual channels, as well as inject the visual information into the language backbone by attending the channel-wise visual features to different textual words.

Improving Tail-Class Representation with Centroid Contrastive Learning

no code implementations 19 Oct 2021 Anthony Meng Huat Tiong, Junnan Li, Guosheng Lin, Boyang Li, Caiming Xiong, Steven C. H. Hoi

ICCL interpolates two images from a class-agnostic sampler and a class-aware sampler, and trains the model such that the representation of the interpolative image can be used to retrieve the centroids for both source classes.

Contrastive Learning Image Classification +2
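
The image interpolation mentioned in the entry above can be pictured with a minimal sketch. It assumes a simple pixel-wise linear blend with a fixed mixing weight `lam`, which may differ from the interpolation scheme the paper actually uses; images are represented as nested lists of floats, and the function name `interpolate_images` is hypothetical.

```python
def interpolate_images(img_a, img_b, lam):
    """Pixel-wise blend lam * img_a + (1 - lam) * img_b.

    Assumptions: both images are equal-sized 2D lists of floats, and a linear
    blend is used purely for illustration.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of rows")
    blended = []
    for row_a, row_b in zip(img_a, img_b):
        if len(row_a) != len(row_b):
            raise ValueError("rows must have the same length")
        blended.append([lam * a + (1.0 - lam) * b
                        for a, b in zip(row_a, row_b)])
    return blended


# One image from each (hypothetical) sampler, blended with weight 0.25.
mixed = interpolate_images([[0.0, 4.0]], [[8.0, 0.0]], 0.25)
```

In this simplified picture, the representation of `mixed` would then be compared against the class centroids of both source images.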

Learning Structural Representations for Recipe Generation and Food Retrieval

no code implementations 4 Oct 2021 Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

Our approach brings together several novel ideas in a systematic framework: (1) exploiting an unsupervised learning approach to obtain the sentence-level tree structure labels before training; (2) generating trees of target recipes from images with the supervision of tree structure labels learned from (1); and (3) integrating the learned tree structures into the recipe generation and food cross-modal retrieval procedure.

Cross-Modal Retrieval Image Captioning +3

3D Pose Transfer with Correspondence Learning and Mesh Refinement

1 code implementation NeurIPS 2021 Chaoyue Song, Jiacheng Wei, Ruibo Li, Fayao Liu, Guosheng Lin

It aims to transfer the pose of a source mesh to a target mesh and keep the identity (e.g., body shape) of the target mesh.

3D Generation Pose Transfer

Meta Navigator: Search for a Good Adaptation Policy for Few-shot Learning

no code implementations ICCV 2021 Chi Zhang, Henghui Ding, Guosheng Lin, Ruibo Li, Changhu Wang, Chunhua Shen

Inspired by the recent success of the Automated Machine Learning (AutoML) literature, in this paper we present Meta Navigator, a framework that attempts to solve the aforementioned limitation in few-shot learning by seeking a higher-level strategy and automating the selection among various few-shot learning designs.

AutoML Few-Shot Learning

Calibrating Class Activation Maps for Long-Tailed Visual Recognition

no code implementations 29 Aug 2021 Chi Zhang, Guosheng Lin, Lvlong Lai, Henghui Ding, Qingyao Wu

First, we present a Class Activation Map Calibration (CAMC) module to improve the learning and prediction of network classifiers, by enforcing network prediction based on important image regions.

Representation Learning

Few-shot Segmentation with Optimal Transport Matching and Message Flow

no code implementations 19 Aug 2021 Weide Liu, Chi Zhang, Henghui Ding, Tzu-Yi Hung, Guosheng Lin

In this work, we argue that every support pixel's information is desired to be transferred to all query pixels and propose a Correspondence Matching Network (CMNet) with an Optimal Transport Matching module to mine out the correspondence between the query and support images.

Few-Shot Semantic Segmentation Multi-Task Learning +2

Cross-Image Region Mining with Region Prototypical Network for Weakly Supervised Segmentation

1 code implementation 17 Aug 2021 Weide Liu, Xiangfei Kong, Tzu-Yi Hung, Guosheng Lin

To improve the generality of the objective activation maps, we propose a region prototypical network (RPNet) to explore the cross-image object diversity of the training set.

Diversity Image Segmentation +3

MV-TON: Memory-based Video Virtual Try-on network

no code implementations 17 Aug 2021 Xiaojing Zhong, Zhonghua Wu, Taizhe Tan, Guosheng Lin, Qingyao Wu

With the development of Generative Adversarial Network, image-based virtual try-on methods have made great progress.

Generative Adversarial Network Virtual Try-on

Cross-Modal Graph with Meta Concepts for Video Captioning

1 code implementation 14 Aug 2021 Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

Video captioning targets interpreting the complex visual contents as text descriptions, which requires the model to fully understand video scenes including objects and their interactions.

object-detection Object Detection +1

Few-Shot Segmentation with Global and Local Contrastive Learning

1 code implementation 11 Aug 2021 Weide Liu, Zhonghua Wu, Henghui Ding, Fayao Liu, Jie Lin, Guosheng Lin

To this end, we first propose a prior extractor to learn the query information from the unlabeled images with our proposed global-local contrastive learning.

Contrastive Learning Image Segmentation +2

Learning Meta-class Memory for Few-Shot Semantic Segmentation

1 code implementation ICCV 2021 Zhonghua Wu, Xiangxi Shi, Guosheng Lin, Jianfei Cai

To explicitly learn meta-class representations in few-shot segmentation task, we propose a novel Meta-class Memory based few-shot segmentation method (MM-Net), where we introduce a set of learnable memory embeddings to memorize the meta-class information during the base class training and transfer to novel classes during the inference stage.

Few-Shot Semantic Segmentation Segmentation +1

M2IOSR: Maximal Mutual Information Open Set Recognition

no code implementations 5 Aug 2021 Xin Sun, Henghui Ding, Chi Zhang, Guosheng Lin, Keck-Voon Ling

In this work, we aim to address the challenging task of open set recognition (OSR).

Open Set Learning

Point Discriminative Learning for Data-efficient 3D Point Cloud Analysis

no code implementations 4 Aug 2021 Fayao Liu, Guosheng Lin, Chuan-Sheng Foo, Chaitanya K. Joshi, Jie Lin

In this work we propose PointDisc, a point discriminative learning method to leverage self-supervisions for data-efficient 3D point cloud classification and segmentation.

3D Object Classification 3D Part Segmentation +5

Cycle-Consistent Inverse GAN for Text-to-Image Synthesis

no code implementations 3 Aug 2021 Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

In this paper, we propose a novel unified framework of Cycle-consistent Inverse GAN (CI-GAN) for both text-to-image generation and text-guided image manipulation tasks.

Diversity Image Manipulation +1

Remember What You have drawn: Semantic Image Manipulation with Memory

no code implementations 27 Jul 2021 Xiangxi Shi, Zhonghua Wu, Guosheng Lin, Jianfei Cai, Shafiq Joty

Therefore, in this paper, we propose a memory-based Image Manipulation Network (MIM-Net), where a set of memories learned from images is introduced to synthesize the texture information with the guidance of the textual description.

Image Manipulation

Dense Supervision Propagation for Weakly Supervised Semantic Segmentation on 3D Point Clouds

no code implementations 23 Jul 2021 Jiacheng Wei, Guosheng Lin, Kim-Hui Yap, Fayao Liu, Tzu-Yi Hung

While dense labeling on 3D data is expensive and time-consuming, only a few works address weakly supervised semantic point cloud segmentation methods to relieve the labeling cost by learning from simpler and cheaper labels.

Point Cloud Segmentation Scene Understanding +3

CT-Net: Complementary Transfering Network for Garment Transfer with Arbitrary Geometric Changes

no code implementations CVPR 2021 Fan Yang, Guosheng Lin

Garment transfer shows great potential in realistic applications with the goal of transferring outfits across images of different people.

Few-Shot Incremental Learning with Continually Evolved Classifiers

1 code implementation CVPR 2021 Chi Zhang, Nan Song, Guosheng Lin, Yun Zheng, Pan Pan, Yinghui Xu

First, we adopt a simple but effective decoupled learning strategy for representations and classifiers, in which only the classifiers are updated in each incremental session, which avoids knowledge forgetting in the representations.

Few-Shot Class-Incremental Learning Incremental Learning

Progressive Self-Guided Loss for Salient Object Detection

1 code implementation 7 Jan 2021 Sheng Yang, Weisi Lin, Guosheng Lin, Qiuping Jiang, Zichuan Liu

We present a simple yet effective progressive self-guided loss function to facilitate deep learning-based salient object detection (SOD) in images.

Object object-detection +2

CycleSegNet: Object Co-segmentation with Cycle Refinement and Region Correspondence

no code implementations 5 Jan 2021 Chi Zhang, Guankai Li, Guosheng Lin, Qingyao Wu, Rui Yao

Image co-segmentation is an active computer vision task that aims to segment the common objects from a set of images.

Segmentation

Attention Is Not Enough: Mitigating the Distribution Discrepancy in Asynchronous Multimodal Sequence Fusion

no code implementations ICCV 2021 Tao Liang, Guosheng Lin, Lei Feng, Yan Zhang, Fengmao Lv

To this end, both the marginal distribution and the elements with high-confidence correlations are aligned over the common space of the query and key vectors which are computed from different modalities.

Time Series Time Series Analysis +1

Self-Supervised 3D Skeleton Action Representation Learning With Motion Consistency and Continuity

no code implementations ICCV 2021 Yukun Su, Guosheng Lin, Qingyao Wu

Recently, self-supervised learning (SSL) has proved very effective and can help boost the performance of learning representations from unlabeled data in the image domain.

Action Recognition Representation Learning +2

Compositional Prototype Network with Multi-view Comparision for Few-Shot Point Cloud Semantic Segmentation

no code implementations 28 Dec 2020 Xiaoyu Chen, Chi Zhang, Guosheng Lin, Jing Han

Moreover, when we use our network to handle the long-tail problem in a fully supervised point cloud segmentation dataset, it can also effectively boost the performance of the few-shot classes.

Few-Shot Learning Point Cloud Segmentation +2

On Lightweight Privacy-Preserving Collaborative Learning for Internet of Things by Independent Random Projections

1 code implementation 11 Dec 2020 Linshan Jiang, Rui Tan, Xin Lou, Guosheng Lin

This paper considers the design and implementation of a practical privacy-preserving collaborative learning scheme, in which a curious learning coordinator trains a better machine learning model based on the data samples contributed by a number of IoT objects, while the confidentiality of the raw forms of the training data is protected against the coordinator.

BIG-bench Machine Learning Privacy Preserving

Feature Flow: In-network Feature Flow Estimation for Video Object Detection

no code implementations 21 Sep 2020 Ruibing Jin, Guosheng Lin, Changyun Wen, Jianliang Wang, Fayao Liu

Optical flow, which expresses pixel displacement, is widely used in many computer vision tasks to provide pixel-level motion information.

object-detection Optical Flow Estimation +1

Structure-Aware Generation Network for Recipe Generation from Images

1 code implementation ECCV 2020 Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

We investigate an open research task of generating cooking instructions based on only food images and ingredients, which is similar to the image captioning task.

Image Captioning Recipe Generation +1

Graph Edit Distance Reward: Learning to Edit Scene Graph

no code implementations ECCV 2020 Lichang Chen, Guosheng Lin, Shijie Wang, Qingyao Wu

Scene graphs, a vital tool for bridging the gap between the language and image domains, have been widely adopted in cross-modality tasks such as VQA.

Graph Matching Image Retrieval +2

Open Set Recognition with Conditional Probabilistic Generative Models

no code implementations 12 Aug 2020 Xin Sun, Chi Zhang, Guosheng Lin, Keck-Voon Ling

A typical challenge that hinders their real-world applications is that unknown samples may be fed into the system during the testing phase, but traditional deep neural networks will wrongly recognize these unknown samples as one of the known classes.

Open Set Learning

Decomposing Generation Networks with Structure Prediction for Recipe Generation

no code implementations 27 Jul 2020 Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

Recipe generation from food images and ingredients is a challenging task, which requires the interpretation of the information from another modality.

Image Captioning Recipe Generation +1

Multi-Path Region Mining For Weakly Supervised 3D Semantic Segmentation on Point Clouds

1 code implementation CVPR 2020 Jiacheng Wei, Guosheng Lin, Kim-Hui Yap, Tzu-Yi Hung, Lihua Xie

To the best of our knowledge, this is the first method that uses cloud-level weak labels on raw 3D space to train a point cloud semantic segmentation network.

3D Semantic Segmentation Point Cloud Segmentation +2

Exploring Bottom-up and Top-down Cues with Attentive Learning for Webly Supervised Object Detection

no code implementations CVPR 2020 Zhonghua Wu, Qingyi Tao, Guosheng Lin, Jianfei Cai

To reduce the human labeling effort, we propose a novel webly supervised object detection (WebSOD) method for novel classes which only requires the web images without further annotations.

Object object-detection +2

DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning

5 code implementations 15 Mar 2020 Chi Zhang, Yujun Cai, Guosheng Lin, Chunhua Shen

We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations to determine image relevance.

Classification Few-Shot Image Classification +4

Video Object Segmentation and Tracking: A Survey

no code implementations 19 Apr 2019 Rui Yao, Guosheng Lin, Shixiong Xia, Jiaqi Zhao, Yong Zhou

Second, we provide a detailed discussion and overview of the technical characteristics of the different methods.

Autonomous Vehicles Object +7

On Lightweight Privacy-Preserving Collaborative Learning for IoT Objects

no code implementations 13 Feb 2019 Linshan Jiang, Rui Tan, Xin Lou, Guosheng Lin

This paper considers the design and implementation of a practical privacy-preserving collaborative learning scheme, in which a curious learning coordinator trains a better machine learning model based on the data samples contributed by a number of IoT objects, while the confidentiality of the raw forms of the training data is protected against the coordinator.

BIG-bench Machine Learning Privacy Preserving

M2E-Try On Net: Fashion from Model to Everyone

no code implementations 21 Nov 2018 Zhonghua Wu, Guosheng Lin, Qingyi Tao, Jianfei Cai

Instead, we present a novel virtual Try-On network, M2E-Try On Net, which transfers the clothes from a model image to a person image without the need of any clean product images.

Virtual Try-on

Correlation Propagation Networks for Scene Text Detection

no code implementations 30 Sep 2018 Zichuan Liu, Guosheng Lin, Wang Ling Goh, Fayao Liu, Chunhua Shen, Xiaokang Yang

In this work, we propose a novel hybrid method for scene text detection, namely the Correlation Propagation Network (CPN).

Scene Text Detection Text Detection

Keypoint Based Weakly Supervised Human Parsing

no code implementations 14 Sep 2018 Zhonghua Wu, Guosheng Lin, Jianfei Cai

We develop an iterative learning method to generate pseudo part segmentation masks from keypoint labels.

Human Parsing Segmentation +1

Bootstrapping the Performance of Webly Supervised Semantic Segmentation

1 code implementation CVPR 2018 Tong Shen, Guosheng Lin, Chunhua Shen, Ian Reid

In this work, we focus on weak supervision, developing a method for training a high-quality pixel-level classifier for semantic segmentation, using only image-level class labels as the provided ground-truth.

Segmentation Transfer Learning +2

MoNet: Deep Motion Exploitation for Video Object Segmentation

no code implementations CVPR 2018 Huaxin Xiao, Jiashi Feng, Guosheng Lin, Yu Liu, Maojun Zhang

In this paper, we propose a novel MoNet model to deeply exploit motion cues for boosting video object segmentation performance from two aspects, i.e., frame representation learning and segmentation refinement.

Object Optical Flow Estimation +5

Learning Markov Clustering Networks for Scene Text Detection

no code implementations CVPR 2018 Zichuan Liu, Guosheng Lin, Sheng Yang, Jiashi Feng, Weisi Lin, Wang Ling Goh

MCN predicts instance-level bounding boxes by first converting an image into a Stochastic Flow Graph (SFG) and then performing Markov Clustering on this graph.

Clustering Scene Text Detection +1

Weakly Supervised Semantic Segmentation Based on Web Image Co-segmentation

no code implementations 25 May 2017 Tong Shen, Guosheng Lin, Lingqiao Liu, Chunhua Shen, Ian Reid

Training a Fully Convolutional Network (FCN) for semantic segmentation requires a large number of masks with pixel level labelling, which involves a large amount of human labour and time for annotation.

Segmentation Weakly supervised Semantic Segmentation +1

Structured Learning of Tree Potentials in CRF for Image Segmentation

no code implementations 26 Mar 2017 Fayao Liu, Guosheng Lin, Ruizhi Qiao, Chunhua Shen

In this fashion, we easily achieve nonlinear learning of potential functions on both unary and pairwise terms in CRFs.

Image Segmentation Semantic Segmentation

Efficient Dense Labeling of Human Activity Sequences from Wearables using Fully Convolutional Networks

no code implementations 20 Feb 2017 Rui Yao, Guosheng Lin, Qinfeng Shi, Damith Ranasinghe

We conduct extensive experiments and demonstrate that our proposed approach outperforms the state of the art in terms of classification and label misalignment measures on three challenging datasets: Opportunity, Hand Gesture, and our new dataset.

RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation

13 code implementations CVPR 2017 Guosheng Lin, Anton Milan, Chunhua Shen, Ian Reid

Recently, very deep convolutional neural networks (CNNs) have shown outstanding performance in object recognition and have also been the first choice for dense classification problems such as semantic segmentation.

3D Absolute Human Pose Estimation Semantic Segmentation +1

Crowd Counting via Weighted VLAD on Dense Attribute Feature Maps

no code implementations 29 Apr 2016 Biyun Sheng, Chunhua Shen, Guosheng Lin, Jun Li, Wankou Yang, Changyin Sun

Crowd counting is an important task in computer vision, which has many applications in video surveillance.

Attribute Crowd Counting

Exploring Context with Deep Structured models for Semantic Segmentation

no code implementations 10 Mar 2016 Guosheng Lin, Chunhua Shen, Anton Van Den Hengel, Ian Reid

We formulate deep structured models by combining CNNs and Conditional Random Fields (CRFs) for learning the patch-patch context between image regions.

Image Segmentation Segmentation +1

Fast Training of Triplet-based Deep Binary Embedding Networks

no code implementations CVPR 2016 Bohan Zhuang, Guosheng Lin, Chunhua Shen, Ian Reid

To solve the first stage, we design a large-scale high-order binary code inference algorithm that reduces the high-order objective to a standard binary quadratic problem, so that graph cuts can be used to efficiently infer the binary code, which serves as the label of each training datum.

Image Retrieval Multi-Label Classification +1

Structured Learning of Binary Codes with Column Generation

no code implementations 22 Feb 2016 Guosheng Lin, Fayao Liu, Chunhua Shen, Jianxin Wu, Heng Tao Shen

Our column generation based method can be further generalized from the triplet loss to a general structured learning based framework that allows one to directly optimize multivariate performance measures.

Image Retrieval Information Retrieval +1

Discriminative Training of Deep Fully-connected Continuous CRF with Task-specific Loss

no code implementations 28 Jan 2016 Fayao Liu, Guosheng Lin, Chunhua Shen

We exemplify the usefulness of the proposed model on multi-class semantic labelling (discrete) and the robust depth estimation (continuous) problems.

Depth Estimation Multi-class Classification

Deeply Learning the Messages in Message Passing Inference

no code implementations NeurIPS 2015 Guosheng Lin, Chunhua Shen, Ian Reid, Anton Van Den Hengel

The network output dimension for message estimation is the same as the number of classes, in contrast to the network output for general CNN potential functions in CRFs, which is exponential in the order of the potentials.

Image Segmentation Semantic Segmentation +1

CRF Learning with CNN Features for Image Segmentation

no code implementations 28 Mar 2015 Fayao Liu, Guosheng Lin, Chunhua Shen

The deep CNN is trained on the ImageNet dataset and transferred to image segmentation here for constructing potentials of superpixels.

Image Segmentation Segmentation +2

Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields

1 code implementation 26 Feb 2015 Fayao Liu, Chunhua Shen, Guosheng Lin, Ian Reid

Therefore, here we present a deep convolutional neural field model for estimating depths from single monocular images, aiming to jointly explore the capacity of deep CNN and continuous CRF.

Depth Estimation

Deep Convolutional Neural Fields for Depth Estimation from a Single Image

no code implementations CVPR 2015 Fayao Liu, Chunhua Shen, Guosheng Lin

Therefore, in this paper we present a deep convolutional neural field model for estimating depths from a single image, aiming to jointly explore the capacity of deep CNN and continuous CRF.

Depth Estimation

Supervised Hashing Using Graph Cuts and Boosted Decision Trees

1 code implementation 24 Aug 2014 Guosheng Lin, Chunhua Shen, Anton Van Den Hengel

The proposed framework allows a number of existing approaches to hashing to be placed in context, and simplifies the development of new problem-specific hashing methods.

Descriptive Image Retrieval +1

Fast Supervised Hashing with Decision Trees for High-Dimensional Data

1 code implementation CVPR 2014 Guosheng Lin, Chunhua Shen, Qinfeng Shi, Anton Van Den Hengel, David Suter

Here we propose to use boosted decision trees for achieving non-linearity in hashing, which are fast to train and evaluate, hence more suitable for hashing with high dimensional data.

Retrieval Vocal Bursts Intensity Prediction

Fast Training of Effective Multi-class Boosting Using Coordinate Descent Optimization

no code implementations 23 Nov 2013 Guosheng Lin, Chunhua Shen, Anton Van Den Hengel, David Suter

Different from most existing multi-class boosting methods, which use the same set of weak learners for all the classes, we train class specified weak learners (i.e., each class has a different set of weak learners).

Multi-class Classification

A General Two-Step Approach to Learning-Based Hashing

no code implementations 7 Sep 2013 Guosheng Lin, Chunhua Shen, David Suter, Anton Van Den Hengel

This framework allows a number of existing approaches to hashing to be placed in context, and simplifies the development of new problem-specific hashing methods.

Vocal Bursts Valence Prediction

StructBoost: Boosting Methods for Predicting Structured Output Variables

no code implementations 14 Feb 2013 Chunhua Shen, Guosheng Lin, Anton Van Den Hengel

Inspired by structured support vector machines (SSVM), here we propose a new boosting algorithm for structured output prediction, which we refer to as StructBoost.

Image Segmentation Multi-class Classification +2
