Search Results for author: Zengchang Qin

Found 27 papers, 9 papers with code

Align Attention Heads Before Merging Them: An Effective Way for Converting MHA to GQA

no code implementations • 30 Dec 2024 • Qingyun Jin, Xiaohui Song, Feng Zhou, Zengchang Qin

In this work, we propose a low-cost method for pruning MHA models into GQA models with any compression ratio of key-value heads.
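
As a rough sketch of the merging step only (the alignment of attention heads that the title refers to is omitted here, and all shapes are invented for illustration), key-value heads can be converted to GQA form by mean-pooling grouped projection weights:

    import numpy as np

    # Hypothetical dimensions: 8 KV heads of size 64, merged 4:1 into 2 GQA groups.
    n_heads, head_dim, d_model = 8, 64, 512
    group_size = 4  # any compression ratio of key-value heads

    rng = np.random.default_rng(0)
    # Stacked per-head key projection weights: (n_heads, d_model, head_dim).
    W_k = rng.normal(size=(n_heads, d_model, head_dim))

    # Merge each group of KV heads into one shared head by averaging. The paper's
    # point is to first align (permute) heads so that grouped heads are similar,
    # which makes this averaging far less lossy.
    W_k_gqa = W_k.reshape(n_heads // group_size, group_size, d_model, head_dim).mean(axis=1)
    print(W_k_gqa.shape)  # (2, 512, 64): two shared key heads serve all query heads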

Integrating MedCLIP and Cross-Modal Fusion for Automatic Radiology Report Generation

1 code implementation • 10 Dec 2024 • Qianhao Han, Junyi Liu, Zengchang Qin, Zheng Zheng

Automating radiology report generation can significantly reduce the workload of radiologists and enhance the accuracy, consistency, and efficiency of clinical documentation. We propose a novel cross-modal framework that uses MedCLIP as both a vision extractor and a retrieval mechanism to improve the process of medical report generation. By extracting retrieved report features and image features through an attention-based extract module, and integrating them with a fusion module, our method improves the coherence and clinical relevance of generated reports. Experimental results on the widely used IU-Xray dataset demonstrate the effectiveness of our approach, showing improvements over commonly used methods in both report quality and relevance. Additionally, ablation studies provide further validation of the framework, highlighting the importance of accurate report retrieval and feature integration in generating comprehensive medical reports.

Retrieval
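
A minimal sketch of the attention-based extraction and fusion described above, assuming invented dimensions and a generic cross-attention layer (not the paper's exact modules):

    import torch
    import torch.nn as nn

    d = 256
    image_feats = torch.randn(1, 49, d)   # patch features from the vision extractor
    report_feats = torch.randn(1, 60, d)  # token features of a retrieved report

    # Extract step: image patches attend to the retrieved report's tokens.
    cross_attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)
    attended, _ = cross_attn(query=image_feats, key=report_feats, value=report_feats)

    # Fusion step: a learned gate mixes visual and retrieved evidence before decoding.
    gate = torch.sigmoid(nn.Linear(2 * d, d)(torch.cat([image_feats, attended], dim=-1)))
    fused = gate * image_feats + (1 - gate) * attended
    print(fused.shape)  # torch.Size([1, 49, 256]), fed to the report decoder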

SyntheT2C: Generating Synthetic Data for Fine-Tuning Large Language Models on the Text2Cypher Task

1 code implementation • 15 Jun 2024 • Zijie Zhong, Linqing Zhong, Zhaoze Sun, Qingyun Jin, Zengchang Qin, Xiaofan Zhang

Given that most KGs reside in graph databases accessible solely through specialized query languages (e.g., Cypher), it is critical to connect LLMs with KG databases by automating the translation of natural language into Cypher queries (termed the "Text2Cypher" task).
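
For illustration, a synthetic fine-tuning pair of the kind such a pipeline would generate might look like the following (the graph schema and names here are invented, not taken from the paper):

    # One synthetic (question, Cypher) training example for supervised fine-tuning.
    example = {
        "instruction": "Translate the question into a Cypher query.",
        "input": "Which drugs treat hypertension?",
        "output": (
            "MATCH (d:Drug)-[:TREATS]->(c:Disease {name: 'hypertension'}) "
            "RETURN d.name"
        ),
    }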

Mix-of-Granularity: Optimize the Chunking Granularity for Retrieval-Augmented Generation

1 code implementation • 1 Jun 2024 • Zijie Zhong, Hanwen Liu, Xiaoya Cui, Xiaofan Zhang, Zengchang Qin

Integrating information from various reference databases is a major challenge for Retrieval-Augmented Generation (RAG) systems because each knowledge source adopts a unique data structure and follows different conventions.

Chunking RAG +2
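
A minimal sketch of multi-granularity chunking (the paper's router that selects among granularities is not reproduced here; the chunk sizes are arbitrary):

    # Index the same source text at several chunk sizes so retrieval can match
    # whichever granularity suits a given knowledge source and query.
    def chunk(text: str, size: int, overlap: int = 0) -> list[str]:
        step = size - overlap
        return [text[i:i + size] for i in range(0, len(text), step)]

    document = "Retrieval-Augmented Generation grounds answers in external text. " * 20
    granularities = {"fine": 64, "medium": 256, "coarse": 1024}
    index = {name: chunk(document, size) for name, size in granularities.items()}
    print({name: len(chunks) for name, chunks in index.items()})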

From Image to Video, what do we need in multimodal LLMs?

no code implementations • 18 Apr 2024 • Suyuan Huang, Haoxin Zhang, Linqing Zhong, Honggu Chen, Yan Gao, Yao Hu, Zengchang Qin

In this paper, we introduce RED-VILLM, a Resource-Efficient Development pipeline that builds robust Video LLMs by leveraging the prior knowledge of Image LLMs.

Video Understanding

LogicalDefender: Discovering, Extracting, and Utilizing Common-Sense Knowledge

no code implementations • 18 Mar 2024 • Yuhe Liu, Mengxue Kang, Zengchang Qin, Xiangxiang Chu

Experiments show that our model achieves better logical performance, and that the extracted logical knowledge can be effectively applied to other scenarios.

Common Sense Reasoning

Boosting Semantic Segmentation from the Perspective of Explicit Class Embeddings

no code implementations • ICCV 2023 • Yuhe Liu, Chuanjian Liu, Kai Han, Quan Tang, Zengchang Qin

Following this observation, we propose ECENet, a new segmentation paradigm in which class embeddings are obtained and explicitly enhanced through interaction with multi-stage image features.

Diversity Segmentation +1
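
A toy version of the idea, assuming generic components (learnable class embeddings refined by cross-attention over image features; all sizes are invented):

    import torch
    import torch.nn as nn

    num_classes, d = 19, 256
    class_embed = nn.Parameter(torch.randn(1, num_classes, d))  # explicit class embeddings
    feats = torch.randn(2, 64 * 64, d)                          # flattened image features

    # Enhance the class embeddings by letting them attend to image features.
    attn = nn.MultiheadAttention(embed_dim=d, num_heads=8, batch_first=True)
    enhanced, _ = attn(query=class_embed.expand(2, -1, -1), key=feats, value=feats)

    # Segmentation logits as pixel/class-embedding similarity.
    logits = torch.einsum("bnd,bkd->bnk", feats, enhanced)
    print(logits.shape)  # torch.Size([2, 4096, 19])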

Sparse Double Descent: Where Network Pruning Aggravates Overfitting

1 code implementation • 17 Jun 2022 • Zheng He, Zeke Xie, Quanzhi Zhu, Zengchang Qin

People usually believe that network pruning not only reduces the computational cost of deep networks, but also prevents overfitting by decreasing model capacity.

Network Pruning
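
A toy magnitude-pruning sweep of the kind behind this observation (no training loop is shown; in the paper, test error traced over such sparsity levels follows a double-descent shape rather than a monotone curve):

    import numpy as np

    rng = np.random.default_rng(0)
    weights = rng.normal(size=10_000)  # stand-in for a trained layer's weights

    for sparsity in (0.0, 0.5, 0.9, 0.99):
        # Zero out the smallest-magnitude weights at the target sparsity.
        k = int(sparsity * weights.size)
        threshold = np.sort(np.abs(weights))[k] if k > 0 else 0.0
        pruned = np.where(np.abs(weights) >= threshold, weights, 0.0)
        # One would retrain and evaluate here; error need not change monotonically.
        print(sparsity, float(np.mean(pruned == 0.0)))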

Reasoning with Multi-Structure Commonsense Knowledge in Visual Dialog

no code implementations • 10 Apr 2022 • Shunyu Zhang, Xiaoze Jiang, Zequn Yang, Tao Wan, Zengchang Qin

In our model, the external knowledge is represented with sentence-level facts and graph-level facts, to properly suit the composite scenario of dialog history and image.

Logical Reasoning Sentence +1

Can network pruning benefit deep learning under label noise?

no code implementations • 29 Sep 2021 • Zheng He, Quanzhi Zhu, Zengchang Qin

Network pruning is a widely-used technique to reduce the computational cost of over-parameterized neural networks.

Deep Learning Network Pruning

KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue

no code implementations • 11 Aug 2020 • Xiaoze Jiang, Siyi Du, Zengchang Qin, Yajing Sun, Jing Yu

Visual dialogue is a challenging task that needs to extract implicit information from both visual (image) and textual (dialogue history) contexts.

Information Retrieval Retrieval

DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses in Visual Dialogue

4 code implementations • 7 Jul 2020 • Xiaoze Jiang, Jing Yu, Yajing Sun, Zengchang Qin, Zihao Zhu, Yue Hu, Qi Wu

The ability to generate detailed and non-repetitive responses is crucial for the agent to achieve human-like conversation.

Multi-Level Network for High-Speed Multi-Person Pose Estimation

no code implementations • 26 Nov 2019 • Ying Huang, Jiankai Zhuang, Zengchang Qin

In multi-person pose estimation, discriminating left/right joint types is a persistently hard problem because of their similar appearance.

Multi-Person Pose Estimation

DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue

1 code implementation • 17 Nov 2019 • Xiaoze Jiang, Jing Yu, Zengchang Qin, Yingying Zhuang, Xingxing Zhang, Yue Hu, Qi Wu

More importantly, we can tell which modality (visual or semantic) has more contribution in answering the current question by visualizing the gate values.

feature selection Question Answering +2
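
Schematically, such a modality gate could look like the following (sizes are invented; the real model gates richer visual and semantic encodings conditioned on the dialogue state):

    import torch
    import torch.nn as nn

    d = 128
    visual, semantic, question = torch.randn(1, d), torch.randn(1, d), torch.randn(1, d)

    # Question-conditioned gate over the two modalities.
    gate = torch.sigmoid(nn.Linear(3 * d, d)(torch.cat([visual, semantic, question], dim=-1)))
    fused = gate * visual + (1 - gate) * semantic
    # Inspecting the gate reveals which modality dominated the answer.
    print(gate.mean().item())  # near 1: visual evidence; near 0: semantic evidence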

Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering

no code implementations • 23 Dec 2018 • Zhuoqian Yang, Zengchang Qin, Jing Yu, Yue Hu

On the constructed graph, we propose a Scene Graph Convolutional Network (SceneGCN) to jointly reason about object properties and relational semantics for the correct answer.

Cross-Modal Information Retrieval Information Retrieval +2
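
A generic graph-convolution update over scene-graph nodes, as a simplified stand-in for SceneGCN (relation-specific details are omitted and the graph is random):

    import torch

    n_objects, d = 5, 64
    adj = ((torch.rand(n_objects, n_objects) > 0.5).float()
           + torch.eye(n_objects)).clamp(max=1.0)   # toy edges plus self-loops
    deg = adj.sum(dim=1, keepdim=True)
    feats = torch.randn(n_objects, d)                # object features
    W = torch.randn(d, d)

    # Aggregate each object's neighbors, then transform: relational context flows in.
    updated = torch.relu((adj / deg) @ feats @ W)
    print(updated.shape)  # torch.Size([5, 64])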

A sequential guiding network with attention for image captioning

no code implementations • 1 Nov 2018 • Daouda Sow, Zengchang Qin, Mouhamed Niasse, Tao Wan

The recent advances of deep learning in both computer vision (CV) and natural language processing (NLP) provide us with a new way of understanding semantics, by which we can tackle more challenging tasks such as automatic description generation from natural images.

Decoder Image Captioning

Textual Relationship Modeling for Cross-Modal Information Retrieval

1 code implementation • 31 Oct 2018 • Jing Yu, Chenghao Yang, Zengchang Qin, Zhuoqian Yang, Yue Hu, Yanbing Liu

A joint neural model is proposed to learn feature representations individually in each modality.

Multimedia

Text Generation Based on Generative Adversarial Nets with Latent Variable

1 code implementation • 1 Dec 2017 • Heng Wang, Zengchang Qin, Tao Wan

We propose the VGAN model, in which the generative model is composed of a recurrent neural network and a VAE.

Language Modeling Language Modelling +1
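
The generator's overall shape, sketched under invented dimensions (a recurrent decoder whose initial state comes from a VAE-style latent; the adversarial training loop is omitted):

    import torch
    import torch.nn as nn

    vocab, d, z_dim = 1000, 128, 32
    z = torch.randn(1, z_dim)  # latent variable, sampled as in a VAE

    # The latent initializes the recurrent decoder's hidden state.
    h0 = torch.tanh(nn.Linear(z_dim, d)(z)).unsqueeze(0)
    rnn = nn.GRU(input_size=d, hidden_size=d, batch_first=True)

    tokens = torch.randn(1, 10, d)          # embedded partial sequence
    out, _ = rnn(tokens, h0)
    logits = nn.Linear(d, vocab)(out)       # next-token scores from the generator
    print(logits.shape)                     # torch.Size([1, 10, 1000])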

Generative Cooperative Net for Image Generation and Data Augmentation

no code implementations • 8 May 2017 • Qiangeng Xu, Zengchang Qin, Tao Wan

In this paper, we explore a generative model for the task of generating unseen images with desired features.

Data Augmentation Facial expression generation +1

Auto-painter: Cartoon Image Generation from Sketch by Using Conditional Generative Adversarial Networks

4 code implementations • 4 May 2017 • Yifan Liu, Zengchang Qin, Zhenbo Luo, Hua Wang

Learning to generate colorful cartoon images from black-and-white sketches is not only an interesting research problem, but also a potential application in digital entertainment.

Image Generation
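
The underlying conditional-GAN generator objective, sketched in the pix2pix style that such sketch-to-image work typically builds on (a scalar stands in for the discriminator, and the weight 100 is pix2pix's choice, not necessarily this paper's):

    import torch
    import torch.nn.functional as F

    real_cartoon = torch.rand(1, 3, 256, 256)
    fake_cartoon = torch.rand(1, 3, 256, 256, requires_grad=True)  # generator output
    d_score_fake = torch.sigmoid(fake_cartoon.mean())  # stand-in discriminator score

    # Adversarial term pushes outputs toward "real"; the L1 term keeps them
    # faithful to the ground-truth colorization of the input sketch.
    g_loss = -torch.log(d_score_fake + 1e-8) + 100.0 * F.l1_loss(fake_cartoon, real_cartoon)
    g_loss.backward()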
