Search Results for author: Guoli Song

Found 11 papers, 4 papers with code

Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations

1 code implementation • 21 Nov 2022 • Peng Jin, Jinfa Huang, Fenglin Liu, Xian Wu, Shen Ge, Guoli Song, David A. Clifton, Jie Chen

Most video-and-language representation learning approaches employ contrastive learning, e.g., CLIP, to project the video and text features into a common latent space according to the semantic similarities of text-video pairs.

Contrastive Learning Representation Learning +2
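The contrastive setup these approaches share can be sketched as a symmetric InfoNCE objective over a batch of paired embeddings. The sketch below is a minimal illustration of that general recipe, not the paper's EMCL method; the function name and temperature value are illustrative assumptions.

```python
import numpy as np

def info_nce_loss(video_feats, text_feats, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired video/text embeddings.

    Matched pairs (row i of each matrix) are pulled together in the shared
    latent space; all other pairings in the batch act as negatives.
    """
    # L2-normalize so the dot product is cosine similarity
    v = video_feats / np.linalg.norm(video_feats, axis=1, keepdims=True)
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    logits = v @ t.T / temperature          # (B, B) similarity matrix
    labels = np.arange(len(v))              # diagonal entries are positives

    def ce(l):
        # numerically stable cross-entropy with the diagonal as targets
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # average the video-to-text and text-to-video directions
    return 0.5 * (ce(logits) + ce(logits.T))
```

Aligned pairs drive the diagonal of the similarity matrix up and everything else down, which is what "projecting into a common latent space according to semantic similarity" amounts to in practice.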

Fuzzy Positive Learning for Semi-supervised Semantic Segmentation

no code implementations • 16 Oct 2022 • Pengchong Qiao, Zhidan Wei, Yu Wang, Zhennan Wang, Guoli Song, Fan Xu, Xiangyang Ji, Chang Liu, Jie Chen

Semi-supervised learning (SSL) essentially pursues class boundary exploration with less dependence on human annotations.

Semi-Supervised Semantic Segmentation

ACSeg: Adaptive Conceptualization for Unsupervised Semantic Segmentation

no code implementations • 12 Oct 2022 • Kehan Li, Zhennan Wang, Zesen Cheng, Runyi Yu, Yian Zhao, Guoli Song, Chang Liu, Li Yuan, Jie Chen

Recently, self-supervised large-scale visual pre-training models have shown great promise in representing pixel-level semantic relationships, significantly promoting the development of unsupervised dense prediction tasks, e.g., unsupervised semantic segmentation (USS).

Image Segmentation Unsupervised Semantic Segmentation

Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering

no code implementations • 21 Sep 2022 • Hao Li, Jinfa Huang, Peng Jin, Guoli Song, Qi Wu, Jie Chen

Under this setting, these 2D spatial reasoning approaches cannot distinguish the fine-grained spatial relations between visual objects and scene texts on the same image plane, thereby impairing the interpretability and performance of TextVQA models.

Image Captioning Optical Character Recognition +3

Locality Guidance for Improving Vision Transformers on Tiny Datasets

1 code implementation • 20 Jul 2022 • Kehan Li, Runyi Yu, Zhennan Wang, Li Yuan, Guoli Song, Jie Chen

Therefore, our locality guidance approach is very simple and efficient, and can serve as a basic performance enhancement method for VTs on tiny datasets.

$L_2$BN: Enhancing Batch Normalization by Equalizing the $L_2$ Norms of Features

no code implementations • 6 Jul 2022 • Zhennan Wang, Kehan Li, Runyi Yu, Yian Zhao, Pengchong Qiao, Fan Xu, Guoli Song, Jie Chen

In this paper, we show that the difference in $l_2$ norms of sample features can hinder batch normalization from obtaining more distinguished inter-class features and more compact intra-class features.

Acoustic Scene Classification Image Classification +1
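The abstract's claim suggests a simple construction: equalize the $l_2$ norms of per-sample feature vectors before computing batch statistics. The sketch below is one reading of that idea under the assumption that "equalizing" means rescaling each sample to a common norm; it is not the paper's exact $L_2$BN formulation, and omits the learnable scale/shift of a full batch-norm layer.

```python
import numpy as np

def l2_equalized_batchnorm(x, eps=1e-5):
    """Sketch: scale each sample's feature vector to unit l2 norm,
    then apply standard per-feature batch normalization.

    x: (batch, features) activations.
    """
    # Step 1: equalize l2 norms across samples, so no single sample's
    # magnitude dominates the batch statistics.
    x = x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)
    # Step 2: vanilla batch normalization over the batch dimension.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)
```

With equalized norms, differences between samples come from feature direction rather than magnitude, which is consistent with the goal of more distinguished inter-class and more compact intra-class features.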

ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval

no code implementations • CVPR 2022 • Mengjun Cheng, Yipeng Sun, Longchao Wang, Xiongwei Zhu, Kun Yao, Jie Chen, Guoli Song, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

Visual appearance is considered to be the most important cue to understand images for cross-modal retrieval, while sometimes the scene text appearing in images can provide valuable information to understand the visual semantics.

Contrastive Learning Cross-Modal Retrieval +1

CDNet: Centripetal Direction Network for Nuclear Instance Segmentation

2 code implementations • ICCV 2021 • Hongliang He, Zhongyi Huang, Yao Ding, Guoli Song, Lin Wang, Qian Ren, Pengxu Wei, Zhiqiang Gao, Jie Chen

Specifically, we define the centripetal direction feature as a class of adjacent directions pointing to the nuclear center to represent the spatial relationship between pixels within the nucleus.

Instance Segmentation Semantic Segmentation
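The centripetal direction feature described above can be illustrated with a small construction: for each foreground pixel, take the direction toward the nucleus centroid and quantize it into a fixed set of direction classes. This is a sketch of the general idea only; the paper's exact definition (and how the feature is supervised) may differ, and the function name and bin count are assumptions.

```python
import numpy as np

def centripetal_direction_classes(mask, num_directions=8):
    """Sketch: for each foreground pixel of a binary nucleus mask, compute
    the direction pointing toward the nucleus centroid and quantize it into
    one of `num_directions` classes. Background pixels are labeled -1.
    """
    ys, xs = np.nonzero(mask)
    out = np.full(mask.shape, -1, dtype=int)
    if len(ys) == 0:
        return out
    cy, cx = ys.mean(), xs.mean()            # nucleus centroid
    # angle of the vector from each pixel toward the centroid, in (-pi, pi]
    angles = np.arctan2(cy - ys, cx - xs)
    # quantize angles into equal angular bins
    bins = ((angles + np.pi) / (2 * np.pi) * num_directions).astype(int)
    out[ys, xs] = bins % num_directions
    return out
```

Pixels on opposite sides of the center receive different direction classes, so the quantized map encodes each pixel's spatial relationship to its nucleus, which is what makes it useful for separating touching instances.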

Similarity Gaussian Process Latent Variable Model for Multi-Modal Data Analysis

no code implementations • ICCV 2015 • Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian

Data from real applications often involve multiple modalities that represent content with the same semantics and deliver rich information from complementary aspects.

Retrieval
