Search Results for author: Guoxing Yang

Found 10 papers, 5 papers with code

Visual Prompt Tuning for Few-Shot Text Classification

no code implementations COLING 2022 Jingyuan Wen, Yutian Luo, Nanyi Fei, Guoxing Yang, Zhiwu Lu, Hao Jiang, Jie Jiang, Zhao Cao

In few-shot text classification, a feasible paradigm for deploying VL-PTMs is to align the input samples and their category names via the text encoders.

Few-Shot Learning Few-Shot Text Classification +3

Expert-level vision-language foundation model for real-world radiology and comprehensive evaluation

no code implementations 24 Sep 2024 Xiaohong Liu, Guoxing Yang, Yulin Luo, Jiaji Mao, Xiang Zhang, Ming Gao, Shanghang Zhang, Jun Shen, Guangyu Wang

When evaluated on the real-world benchmark involving three representative modalities, 2D images (chest X-rays), multi-view images (mammograms), and 3D images (thyroid CT scans), RadFound significantly outperforms other VL foundation models on both quantitative metrics and human evaluation.

Question Answering Text Generation

TCM-GPT: Efficient Pre-training of Large Language Models for Domain Adaptation in Traditional Chinese Medicine

no code implementations 3 Nov 2023 Guoxing Yang, JianYu Shi, Zan Wang, Xiaohong Liu, Guangyu Wang

To the best of our knowledge, our study represents the pioneering validation of domain adaptation of a large language model with 7 billion parameters in the TCM domain.

Computational Efficiency Domain Adaptation +2

ClinicalGPT: Large Language Models Finetuned with Diverse Medical Data and Comprehensive Evaluation

no code implementations 16 Jun 2023 Guangyu Wang, Guoxing Yang, Zongxin Du, Longjun Fan, Xiaohu Li

Large language models have exhibited exceptional performance on various Natural Language Processing (NLP) tasks, leveraging techniques such as pre-training and instruction fine-tuning.

Language Modelling Question Answering

VDT: General-purpose Video Diffusion Transformers via Mask Modeling

1 code implementation 22 May 2023 Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding

We also propose a unified spatial-temporal mask modeling mechanism, seamlessly integrated with the model, to cater to diverse video generation scenarios.

Autonomous Driving Video Generation +1

UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling

2 code implementations 13 Feb 2023 Haoyu Lu, Yuqi Huo, Guoxing Yang, Zhiwu Lu, Wei Zhan, Masayoshi Tomizuka, Mingyu Ding

Particularly, on the MSRVTT retrieval task, UniAdapter achieves 49.7% recall@1 with 2.2% model parameters, outperforming the latest competitors by 2.0%.

Image-text Retrieval Text Retrieval +3

Towards artificial general intelligence via a multimodal foundation model

1 code implementation 27 Oct 2021 Nanyi Fei, Zhiwu Lu, Yizhao Gao, Guoxing Yang, Yuqi Huo, Jingyuan Wen, Haoyu Lu, Ruihua Song, Xin Gao, Tao Xiang, Hao Sun, Ji-Rong Wen

To overcome this limitation and take a solid step towards artificial general intelligence (AGI), we develop a foundation model pre-trained with huge multimodal data, which can be quickly adapted for various downstream cognitive tasks.

Image Classification Reading Comprehension +2

L2M-GAN: Learning To Manipulate Latent Space Semantics for Facial Attribute Editing

2 code implementations CVPR 2021 Guoxing Yang, Nanyi Fei, Mingyu Ding, Guangzhen Liu, Zhiwu Lu, Tao Xiang

To overcome these limitations, we propose a novel latent space factorization model, called L2M-GAN, which is learned end-to-end and effective for editing both local and global attributes.

Attribute Disentanglement
