Search Results for author: Zonghao Guo

Found 8 papers, 5 papers with code

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

3 code implementations • 18 Mar 2024 • Ruyi Xu, Yuan Yao, Zonghao Guo, Junbo Cui, Zanlin Ni, Chunjiang Ge, Tat-Seng Chua, Zhiyuan Liu, Maosong Sun, Gao Huang

To address the challenges, we present LLaVA-UHD, a large multimodal model that can efficiently perceive images in any aspect ratio and high resolution.

4,243

ControlCap: Controllable Region-level Captioning

1 code implementation • 31 Jan 2024 • Yuzhong Zhao, Yue Liu, Zonghao Guo, Weijia Wu, Chen Gong, Fang Wan, Qixiang Ye

The multimodal model is constrained to generate captions within a few sub-spaces containing the control words, which increases the opportunity of hitting less frequent captions, alleviating the caption degeneration issue.

Ranked #1 on Dense Captioning on Visual Genome

Dense Captioning

AttentionShift: Iteratively Estimated Part-Based Attention Map for Pointly Supervised Instance Segmentation

no code implementations • CVPR 2023 • Mingxiang Liao, Zonghao Guo, Yuze Wang, Peng Yuan, Bailan Feng, Fang Wan

Pointly supervised instance segmentation (PSIS) learns to segment objects using a single point within the object extent as supervision.

Instance Segmentation • Object • +2

Bidirectional Feature Globalization for Few-shot Semantic Segmentation of 3D Point Cloud Scenes

no code implementations • 13 Aug 2022 • Yongqiang Mao, Zonghao Guo, Xiaonan Lu, Zhiqiang Yuan, Haowen Guo

With prototype-to-point globalization (Pr2PoG), the global perception is embedded to local point features based on similarity weights from sparse prototypes to dense point features.

Few-Shot Semantic Segmentation • Metric Learning • +2

Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection

3 code implementations • ICCV 2023 • Feng Liu, Xiaosong Zhang, Zhiliang Peng, Zonghao Guo, Fang Wan, Xiangyang Ji, Qixiang Ye

Except for the backbone networks, however, other components such as the detector head and the feature pyramid network (FPN) remain trained from scratch, which hinders fully tapping the potential of representation models.

Ranked #3 on Few-Shot Object Detection on MS-COCO (30-shot)

Decoder • Few-Shot Object Detection • +3

Semantic Segmentation for Point Cloud Scenes via Dilated Graph Feature Aggregation and Pyramid Decoders

no code implementations • 11 Apr 2022 • Yongqiang Mao, Xian Sun, Kaiqiang Chen, Wenhui Diao, Zonghao Guo, Xiaonan Lu, Kun Fu

Due to the unicity of receptive field, semantic segmentation of point clouds remains challenging for the expression of multi-receptive field features, which brings about the misclassification of instances with similar spatial structures.

Segmentation • Semantic Segmentation

Long-tailed Distribution Adaptation

1 code implementation • 6 Oct 2021 • Zhiliang Peng, Wei Huang, Zonghao Guo, Xiaosong Zhang, Jianbin Jiao, Qixiang Ye

We propose to jointly optimize empirical risks of the unbalanced and balanced domains and approximate their domain divergence by intra-class and inter-class distances, with the aim to adapt models trained on the long-tailed distribution to general distributions in an interpretable way.

Domain Adaptation • Instance Segmentation • +3

Beyond Bounding-Box: Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection

2 code implementations • CVPR 2021 • Zonghao Guo, Chang Liu, Xiaosong Zhang, Jianbin Jiao, Xiangyang Ji, Qixiang Ye

Detecting oriented and densely packed objects remains challenging for spatial feature aliasing caused by the intersection of receptive fields between objects.

Ranked #34 on Object Detection In Aerial Images on DOTA (using extra training data)

object-detection • Object Detection In Aerial Images

1,756
