Search Results for author: Pengpeng Zeng

Found 12 papers, 6 papers with code

Context-based Transfer and Efficient Iterative Learning for Unbiased Scene Graph Generation

no code implementations • 29 Dec 2023 • Qishen Chen, Xinyu Lyu, Haonan Zhang, Pengpeng Zeng, Lianli Gao, Jingkuan Song

Thus, we introduce a plug-and-play method named CITrans, which iteratively trains SGG models with progressively enhanced data.

Graph Generation Unbiased Scene Graph Generation

ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval

1 code implementation • 19 Dec 2023 • Kaipeng Fang, Jingkuan Song, Lianli Gao, Pengpeng Zeng, Zhi-Qi Cheng, Xiyao Li, Heng Tao Shen

Then, in the Context-aware Simulator Learning stage, we train a Content-aware Prompt Simulator under a simulated test scenario to produce the corresponding CaDP.

Few-Shot Learning Retrieval +2

Generalized Unbiased Scene Graph Generation

no code implementations • 9 Aug 2023 • Xinyu Lyu, Lianli Gao, Junlin Xie, Pengpeng Zeng, Yulu Tian, Jie Shao, Heng Tao Shen

To this end, we propose the Multi-Concept Learning (MCL) framework, which ensures a balanced learning process across rare/uncommon/common concepts.

Graph Generation Unbiased Scene Graph Generation

A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval

2 code implementations • NeurIPS 2022 • Hao Li, Jingkuan Song, Lianli Gao, Pengpeng Zeng, Haonan Zhang, Gongfu Li

To verify the effectiveness of our approach, extensive experiments are conducted on MS-COCO, CUB Captions, and Flickr30K, which are commonly used in cross-modal retrieval.

Image-text matching Image-to-Text Retrieval +1

Visual Commonsense-aware Representation Network for Video Captioning

1 code implementation • 17 Nov 2022 • Pengpeng Zeng, Haonan Zhang, Lianli Gao, Xiangpeng Li, Jin Qian, Heng Tao Shen

Generating consecutive descriptions for videos, i.e., Video Captioning, requires taking full advantage of visual representation along with the generation process.

Caption Generation Question Answering +2

Progressive Tree-Structured Prototype Network for End-to-End Image Captioning

1 code implementation • 17 Nov 2022 • Pengpeng Zeng, Jinkuan Zhu, Jingkuan Song, Lianli Gao

Specifically, we design a novel embedding method called tree-structured prototype, producing a set of hierarchical representative embeddings which capture the hierarchical semantic structure in textual space.

Image Captioning

Dual-branch Hybrid Learning Network for Unbiased Scene Graph Generation

1 code implementation • 16 Jul 2022 • Chaofan Zheng, Lianli Gao, Xinyu Lyu, Pengpeng Zeng, Abdulmotaleb El Saddik, Heng Tao Shen

Experiments show that our approach achieves a new state-of-the-art performance on VG and GQA datasets and makes a trade-off between the performance of tail predicates and head ones.

Graph Generation Image Captioning +3

Adaptive Fine-Grained Predicates Learning for Scene Graph Generation

no code implementations • 11 Jul 2022 • Xinyu Lyu, Lianli Gao, Pengpeng Zeng, Heng Tao Shen, Jingkuan Song

The performance of current Scene Graph Generation (SGG) models is severely hampered by hard-to-distinguish predicates, e.g., woman-on/standing on/walking on-beach.

Fine-Grained Image Classification Graph Generation +4

Learning To Generate Scene Graph from Head to Tail

no code implementations • 23 Jun 2022 • Chaofan Zheng, Xinyu Lyu, Yuyu Guo, Pengpeng Zeng, Jingkuan Song, Lianli Gao

SCM is proposed to relieve semantic deviation by ensuring the semantic consistency between the generated scene graph and the ground truth in global and local representations.

Graph Generation Scene Graph Generation

From Pixels to Objects: Cubic Visual Attention for Visual Question Answering

no code implementations • 4 Jun 2022 • Jingkuan Song, Pengpeng Zeng, Lianli Gao, Heng Tao Shen

Existing visual attention models are generally planar, i.e., different channels of the last conv-layer feature map of an image share the same weight.

Object Question Answering +1

Structured Two-stream Attention Network for Video Question Answering

no code implementations • 2 Jun 2022 • Lianli Gao, Pengpeng Zeng, Jingkuan Song, Yuan-Fang Li, Wu Liu, Tao Mei, Heng Tao Shen

To date, visual question answering (VQA) (i.e., image QA and video QA) is still a holy grail in vision and language understanding, especially for video QA.

Question Answering Video Question Answering +2

Support-set based Multi-modal Representation Enhancement for Video Captioning

1 code implementation • 19 May 2022 • Xiaoya Chen, Jingkuan Song, Pengpeng Zeng, Lianli Gao, Heng Tao Shen

Video captioning is a challenging task that necessitates a thorough comprehension of visual scenes.

Video Captioning

