Search Results for author: Songxin Zhang

Found 4 papers, 0 papers with code

Astrea: A MOE-based Visual Understanding Model with Progressive Alignment

no code implementations • 12 Mar 2025 • Xiaoda Yang, Junyu Lu, Hongshun Qiu, Sijing Li, Hao Li, Shengpeng Ji, Xudong Tang, Jiayang Xu, Jiaqi Duan, Ziyue Jiang, Cong Lin, Sihang Cai, Zejian Xie, Zhuoyang Song, Songxin Zhang

Vision-Language Models (VLMs) based on Mixture-of-Experts (MoE) architectures have emerged as a pivotal paradigm in multimodal understanding, offering a powerful framework for integrating visual and linguistic information.
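The MoE routing the snippet refers to can be illustrated in miniature. This is a generic sketch of top-k expert routing, not Astrea's actual architecture; all shapes, the expert count, and the gating scheme here are illustrative assumptions.

```python
import numpy as np

def moe_layer(x, expert_weights, gate_weights, top_k=2):
    """Minimal Mixture-of-Experts forward pass (generic sketch, not
    Astrea's design): route each token to its top-k experts and mix
    their outputs with renormalized softmax gate scores."""
    logits = x @ gate_weights                      # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]  # indices of top-k experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        scores = logits[t, top[t]]
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()                       # gate probs over the chosen experts
        for p, e in zip(probs, top[t]):
            out[t] += p * (x[t] @ expert_weights[e])
    return out

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))     # 4 tokens, hidden dim 8 (toy sizes)
experts = rng.normal(size=(6, 8, 8)) # 6 experts, each a linear map
gates = rng.normal(size=(8, 6))      # gating projection
y = moe_layer(tokens, experts, gates)
print(y.shape)  # (4, 8)
```

Only the top-k experts run per token, which is what makes MoE layers cheaper than a dense layer of the same total parameter count.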

Tasks: Contrastive Learning · Cross-Modal Retrieval · +3

Fine-tuning can Help Detect Pretraining Data from Large Language Models

no code implementations • 9 Oct 2024 • Hengxiang Zhang, Songxin Zhang, BingYi Jing, Hongxin Wei

In light of this, we introduce a novel and effective method termed Fine-tuned Score Deviation (FSD), which improves the performance of current scoring functions for pretraining data detection.
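Only the method's name and goal are given in the snippet; the sketch below assumes one plausible reading of "score deviation": compare a per-example membership score (e.g. perplexity) before and after fine-tuning, on the intuition that fine-tuning shifts scores of unseen examples more than those of pretraining members. The numbers are fabricated for illustration.

```python
import numpy as np

def fsd(score_before, score_after):
    """Fine-tuned Score Deviation (assumed form, not the paper's exact
    definition): the drop in a per-example score, such as perplexity,
    after fine-tuning. Non-members are assumed to shift more than
    members, widening the gap a detector can threshold on."""
    return np.asarray(score_before) - np.asarray(score_after)

# Hypothetical perplexities: members barely move, non-members drop a lot.
members_before, members_after = np.array([12.0, 11.5]), np.array([11.8, 11.4])
nonmem_before, nonmem_after = np.array([14.0, 13.2]), np.array([9.1, 8.7])

member_dev = fsd(members_before, members_after)   # small deviations
nonmem_dev = fsd(nonmem_before, nonmem_after)     # large deviations
print(member_dev.max() < nonmem_dev.min())  # True
```

Any score that separates members from non-members at all (perplexity, Min-K% and similar) could plug into this deviation scheme, which is presumably why the abstract frames FSD as improving "current scoring functions" rather than replacing them.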

Exploring Learning Complexity for Efficient Downstream Dataset Pruning

no code implementations • 8 Feb 2024 • Wenyu Jiang, Zhenlong Liu, Zejian Xie, Songxin Zhang, BingYi Jing, Hongxin Wei

In this paper, we propose a straightforward, novel, and training-free hardness score named Distorting-based Learning Complexity (DLC), to identify informative images and instructions from the downstream dataset efficiently.
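The snippet names the score but not its formula. As a loose illustration of a distortion-based, training-free hardness measure (an assumed mechanism, not necessarily the paper's DLC), one can average an example's loss over several randomly perturbed copies of a pretrained model: examples whose loss stays low under distortion behave as easy, and high average loss flags hard or informative ones.

```python
import numpy as np

def distortion_hardness(logits_fn, x, y, weights, n_distort=5, sigma=0.1, seed=0):
    """Sketch of a distortion-based hardness score (assumption, not the
    paper's exact DLC): mean cross-entropy of an example over several
    noise-distorted copies of the model weights. No training involved."""
    rng = np.random.default_rng(seed)
    losses = []
    for _ in range(n_distort):
        noisy = [w + sigma * rng.normal(size=w.shape) for w in weights]
        z = logits_fn(x, noisy)
        z = z - z.max()                       # numerically stable softmax
        p = np.exp(z) / np.exp(z).sum()
        losses.append(-np.log(p[y] + 1e-12))  # cross-entropy for label y
    return float(np.mean(losses))

# Toy linear "model": logits = W @ x (hypothetical, for illustration only)
def linear_logits(x, weights):
    return weights[0] @ x

W = np.eye(3)                                 # 3-class toy classifier
x = np.array([3.0, 0.0, 0.0])
easy = distortion_hardness(linear_logits, x, 0, [W])  # confident prediction, low loss
hard = distortion_hardness(linear_logits, x, 2, [W])  # wrong label, high loss
print(easy < hard)  # True
```

Ranking a downstream dataset by such a score and keeping only the most informative fraction is the pruning use case the abstract describes.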

Tasks: Informativeness

Lyrics: Boosting Fine-grained Language-Vision Alignment and Comprehension via Semantic-aware Visual Objects

no code implementations • 8 Dec 2023 • Junyu Lu, Dixiang Zhang, Songxin Zhang, Zejian Xie, Zhuoyang Song, Cong Lin, Jiaxing Zhang, BingYi Jing, Pingjian Zhang

During the instruction fine-tuning stage, we introduce semantic-aware visual feature extraction, a crucial method that enables the model to extract informative features from concrete visual objects.

Tasks: Image Captioning · object-detection · +5
