Search Results for author: Litong Feng

Found 20 papers, 14 papers with code

ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation

1 code implementation 9 Aug 2024 Mengcheng Lan, Chaofeng Chen, Yiping Ke, Xinjiang Wang, Litong Feng, Wayne Zhang

ProxyCLIP leverages the spatial feature correspondence from VFMs as a form of proxy attention to augment CLIP, thereby inheriting the VFMs' robust local consistency and maintaining CLIP's exceptional zero-shot transfer capacity.

Open-Vocabulary Semantic Segmentation +2

ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference

no code implementations 17 Jul 2024 Mengcheng Lan, Chaofeng Chen, Yiping Ke, Xinjiang Wang, Litong Feng, Wayne Zhang

Despite the success of large-scale pretrained Vision-Language Models (VLMs), especially CLIP, in various open-vocabulary tasks, their application to semantic segmentation remains challenging, producing noisy segmentation maps with mis-segmented regions.

Open-Vocabulary Semantic Segmentation +1

Towards Vision-Language Geo-Foundation Model: A Survey

1 code implementation 13 Jun 2024 Yue Zhou, Litong Feng, Yiping Ke, Xue Jiang, Junchi Yan, Xue Yang, Wayne Zhang

Vision-Language Foundation Models (VLFMs) have made remarkable progress on various multimodal tasks, such as image captioning, image-text retrieval, visual question answering, and visual grounding.

Earth Observation Image Captioning +5

H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model

1 code implementation 29 Mar 2024 Chao Pang, Jiang Wu, Jiayu Li, Yi Liu, Jiaxing Sun, Weijia Li, Xingxing Weng, Shuai Wang, Litong Feng, Gui-Song Xia, Conghui He

Generic large Vision-Language Models (VLMs) are developing rapidly but still perform poorly in the Remote Sensing (RS) domain, owing to the unique and specialized nature of RS imagery and the comparatively limited spatial perception of current VLMs.

Hallucination Language Modelling +2

Diverse Cotraining Makes Strong Semi-Supervised Segmentor

1 code implementation ICCV 2023 Yijiang Li, Xinjiang Wang, Lihe Yang, Litong Feng, Wayne Zhang, Ying Gao

Deep co-training has been introduced to semi-supervised segmentation and achieves impressive results, yet few studies have explored the working mechanism behind it.

Diversity

Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation

1 code implementation CVPR 2023 Lihe Yang, Lei Qi, Litong Feng, Wayne Zhang, Yinghuan Shi

In this work, we revisit the weak-to-strong consistency framework, popularized by FixMatch from semi-supervised classification, where the prediction of a weakly perturbed image serves as supervision for its strongly perturbed version.

Semi-supervised Change Detection Semi-supervised Medical Image Segmentation +1
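
The weak-to-strong consistency idea summarized in the entry above can be illustrated with a minimal NumPy sketch. This is a generic FixMatch-style loss written under stated assumptions, not the authors' implementation: the function name `weak_to_strong_loss`, the flat `(N, C)` logit layout (for segmentation, pixels would be flattened into `N`), and the confidence threshold of 0.95 are all illustrative choices that do not come from the listing.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def weak_to_strong_loss(weak_logits, strong_logits, threshold=0.95):
    """Hypothetical helper: FixMatch-style consistency loss.

    weak_logits, strong_logits: arrays of shape (N, C) holding the model's
    outputs for the weakly and strongly perturbed views of the same inputs.
    threshold: assumed confidence cut-off for accepting a pseudo-label.
    """
    weak_probs = softmax(weak_logits)
    pseudo_labels = weak_probs.argmax(axis=-1)   # hard labels from the weak view
    confidence = weak_probs.max(axis=-1)
    mask = confidence >= threshold               # keep only confident predictions

    # Cross-entropy of the strong view against the weak view's pseudo-labels.
    strong_log_probs = np.log(softmax(strong_logits) + 1e-12)
    per_sample_ce = -strong_log_probs[np.arange(len(pseudo_labels)), pseudo_labels]

    # Average over all samples; low-confidence samples contribute zero.
    return float((per_sample_ce * mask).sum() / len(pseudo_labels))


if __name__ == "__main__":
    weak = np.array([[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]])
    strong = np.array([[0.0, 10.0, 0.0], [0.0, 10.0, 0.0]])
    print(weak_to_strong_loss(weak, strong))
```

In a training loop the pseudo-labels would be treated as constants, so that gradients flow only through the strongly perturbed branch; the sketch computes the loss value only and does not model that.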

ViM: Out-Of-Distribution with Virtual-logit Matching

2 code implementations CVPR 2022 Haoqi Wang, Zhizhong Li, Litong Feng, Wayne Zhang

Most of the existing Out-Of-Distribution (OOD) detection algorithms depend on a single input source: the feature, the logit, or the softmax probability.

Out-of-Distribution Detection

Semantically Coherent Out-of-Distribution Detection

2 code implementations ICCV 2021 Jingkang Yang, Haoqi Wang, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang, Ziwei Liu

The proposed UDG can not only enrich the semantic knowledge of the model by exploiting unlabeled data in an unsupervised manner, but also distinguish ID/OOD samples to enhance ID classification and OOD detection tasks simultaneously.

Out-of-Distribution (OOD) Detection

Webly Supervised Image Classification with Metadata: Automatic Noisy Label Correction via Visual-Semantic Graph

1 code implementation 12 Oct 2020 Jingkang Yang, Weirong Chen, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang

VSGraph-LC starts with anchor selection based on the semantic similarity between metadata and correct label concepts, and then propagates correct labels from the anchors over a visual graph using a graph neural network (GNN).

General Classification Graph Neural Network +3

Scale-Equalizing Pyramid Convolution for Object Detection

2 code implementations CVPR 2020 Xinjiang Wang, Shilong Zhang, Zhuoran Yu, Litong Feng, Wayne Zhang

Inspired by this, a convolution across the pyramid level is proposed in this study, which is termed pyramid convolution and is a modified 3-D convolution.

Object object-detection +1

How Does BN Increase Collapsed Neural Network Filters?

no code implementations 30 Jan 2020 Sheng Zhou, Xinjiang Wang, Ping Luo, Litong Feng, Wenjie Li, Wei Zhang

This phenomenon is caused by the normalization effect of BN, which induces a non-trainable region in the parameter space and reduces the network capacity as a result.

Object Detection

Gradual Network for Single Image De-raining

no code implementations 20 Sep 2019 Zhe Huang, Weijiang Yu, Wayne Zhang, Litong Feng, Nong Xiao

Taking as input the residual (the coarse de-rained result) between the rainy image sample (i.e., the input data) and the output of the coarse stage (i.e., the learnt rain mask), the fine stage continues de-raining by removing the fine-grained rain streaks (e.g., light rain streaks and water mist) to produce a rain-free and well-reconstructed output image via a unified contextual merging sub-network with dense blocks and a merging block.

Rain Removal

Learning Efficient Detector with Semi-supervised Adaptive Distillation

1 code implementation 2 Jan 2019 Shitao Tang, Litong Feng, Wenqi Shao, Zhanghui Kuang, Wei Zhang, Yimin Chen

ADL enlarges the distillation loss for hard-to-learn and hard-to-mimic samples and reduces the distillation loss for the dominant easy samples, enabling distillation to work on a single-stage detector for the first time, even if the student and the teacher are identical.

Image Classification Knowledge Distillation +1

Temporal Sequence Distillation: Towards Few-Frame Action Recognition in Videos

no code implementations 15 Aug 2018 Zhaoyang Zhang, Zhanghui Kuang, Ping Luo, Litong Feng, Wei Zhang

Secondly, TSD significantly reduces the computations to run video action recognition with compressed frames on the cloud, while maintaining high recognition accuracies.

Action Recognition In Videos Temporal Action Localization

Fast Video Shot Transition Localization with Deep Structured Models

4 code implementations 13 Aug 2018 Shitao Tang, Litong Feng, Zhanghui Kuang, Yimin Chen, Wei Zhang

In order to train a high-performance shot transition detector, we contribute a new database ClipShots, which contains 128636 cut transitions and 38120 gradual transitions from 4039 online videos.

Ranked #3 on Camera shot boundary detection on ClipShots (using extra training data)

Camera shot boundary detection
