Search Results for author: Zhihong Chen

Found 43 papers, 28 papers with code

Simplify RLHF as Reward-Weighted SFT: A Variational Method

no code implementations 16 Feb 2025 Yuhao Du, Zhuo Li, Pengyu Cheng, Zhihong Chen, Yuejiao Xie, Xiang Wan, Anningzhe Gao

More specifically, by directly minimizing the distribution gap between the learning LLM policy and the optimal solution of RLHF, we transform the alignment objective into a reward-driven re-weighted supervised fine-tuning (SFT) form, which requires only a minor adjustment to the SFT loss to obtain a noticeable improvement in training stability and effectiveness.
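
As a rough illustration of the reward-weighted SFT idea in the entry above, the following minimal Python sketch weights each sample's negative log-likelihood by a softmax over its reward. The function name, the `temperature` parameter, and the softmax weighting are assumptions made for illustration only; the paper's exact loss may differ.

```python
import math

def reward_weighted_sft_loss(log_probs, rewards, temperature=1.0):
    """Hypothetical reward-weighted SFT loss over a batch of sampled responses.

    log_probs: log-likelihood of each response under the policy being trained.
    rewards:   scalar reward for each response.
    """
    # Softmax over scaled rewards (assumed weighting), shifted by the max for stability.
    scaled = [r / temperature for r in rewards]
    shift = max(scaled)
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted negative log-likelihood: higher-reward responses count more.
    return -sum(w * lp for w, lp in zip(weights, log_probs))
```

With equal rewards the weights are uniform and the expression reduces to the ordinary mean SFT loss; as the reward gap grows, the loss concentrates on the higher-reward responses.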

Foundation Models in Radiology: What, How, When, Why and Why Not

no code implementations 27 Nov 2024 Magdalini Paschali, Zhihong Chen, Louis Blankemeier, Maya Varma, Alaa Youssef, Christian Bluethgen, Curtis Langlotz, Sergios Gatidis, Akshay Chaudhari

Given the potentially transformative impact that foundation models can have on the field of radiology, this review aims to establish a standardized terminology concerning foundation models, with a specific focus on the requirements of training data, model training paradigms, model capabilities, and evaluation strategies.

RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models

1 code implementation 6 Nov 2024 Maya Varma, Jean-Benoit Delbrouck, Zhihong Chen, Akshay Chaudhari, Curtis Langlotz

Fine-tuned vision-language models (VLMs) often capture spurious correlations between image features and textual attributes, resulting in degraded zero-shot performance at test time.

Image Classification Zero-Shot Learning

Overview of the First Shared Task on Clinical Text Generation: RRG24 and "Discharge Me!"

no code implementations 25 Sep 2024 Justin Xu, Zhihong Chen, Andrew Johnston, Louis Blankemeier, Maya Varma, Jason Hom, William J. Collins, Ankit Modi, Robert Lloyd, Benjamin Hopkins, Curtis Langlotz, Jean-Benoit Delbrouck

For instance, state-of-the-art systems could automate the generation of sections in clinical reports to alleviate physician workload and streamline hospital documentation.

Text Generation

CheXpert Plus: Augmenting a Large Chest X-ray Dataset with Text Radiology Reports, Patient Demographics and Additional Image Formats

1 code implementation 29 May 2024 Pierre Chambon, Jean-Benoit Delbrouck, Thomas Sounack, Shih-Cheng Huang, Zhihong Chen, Maya Varma, Steven QH Truong, Chu The Chuong, Curtis P. Langlotz

To address this, CheXpert Plus serves as a new collection of radiology data sources, made publicly available to enhance the scaling, performance, robustness, and fairness of models for all subsequent machine learning tasks in the field of radiology.

De-identification Fairness

GREEN: Generative Radiology Report Evaluation and Error Notation

no code implementations 6 May 2024 Sophie Ostmeier, Justin Xu, Zhihong Chen, Maya Varma, Louis Blankemeier, Christian Bluethgen, Arne Edward Michalson, Michael Moseley, Curtis Langlotz, Akshay S Chaudhari, Jean-Benoit Delbrouck

Evaluating radiology reports is a challenging problem as factual correctness is extremely important due to the need for accurate medical communication about medical images.

Natural Language Understanding

Large Multimodal Agents: A Survey

no code implementations 23 Feb 2024 Junlin Xie, Zhihong Chen, Ruifei Zhang, Xiang Wan, Guanbin Li

In this paper, we conduct a systematic review of LLM-driven multimodal agents, which we refer to as large multimodal agents (LMAs for short).

Decision Making Survey

ALLaVA: Harnessing GPT4V-Synthesized Data for Lite Vision-Language Models

1 code implementation 18 Feb 2024 Guiming Hardy Chen, Shunian Chen, Ruifei Zhang, Junying Chen, Xiangbo Wu, Zhiyi Zhang, Zhihong Chen, Jianquan Li, Xiang Wan, Benyou Wang

Large vision-language models (LVLMs) have shown promise in a broad range of vision-language tasks with their strong reasoning and generalization capabilities.

Language Modelling Question Answering +1

MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria

1 code implementation 23 Nov 2023 Wentao Ge, Shunian Chen, Guiming Hardy Chen, Junying Chen, Zhihong Chen, Nuo Chen, Wenya Xie, Shuo Yan, Chenghao Zhu, Ziyue Lin, Song Dingjie, Xidong Wang, Anningzhe Gao, Zhang Zhiyi, Jianquan Li, Xiang Wan, Benyou Wang

To this end, in our paper, we propose a new evaluation paradigm for MLLMs, which evaluates MLLMs with per-sample criteria using a potent MLLM as the judge.

Exploiting Low-confidence Pseudo-labels for Source-free Object Detection

no code implementations 19 Oct 2023 Zhihong Chen, Zilei Wang, Yixin Zhang

The LPU module consists of Proposal Soft Training (PST) and Local Spatial Contrastive Learning (LSCL).

Contrastive Learning object-detection +2

AceGPT, Localizing Large Language Models in Arabic

1 code implementation 21 Sep 2023 Huang Huang, Fei Yu, Jianqing Zhu, Xuening Sun, Hao Cheng, Dingjie Song, Zhihong Chen, Abdulmohsen Alharthi, Bang An, Juncai He, Ziche Liu, Zhiyi Zhang, Junying Chen, Jianquan Li, Benyou Wang, Lian Zhang, Ruoyu Sun, Xiang Wan, Haizhou Li, Jinchao Xu

This paper is devoted to the development of a localized Large Language Model (LLM) specifically for Arabic, a language imbued with unique cultural characteristics inadequately addressed by current mainstream models.

Instruction Following Language Modeling +3

CMB: A Comprehensive Medical Benchmark in Chinese

2 code implementations 17 Aug 2023 Xidong Wang, Guiming Hardy Chen, Dingjie Song, Zhiyi Zhang, Zhihong Chen, Qingying Xiao, Feng Jiang, Jianquan Li, Xiang Wan, Benyou Wang, Haizhou Li

We hope this benchmark provides first-hand experience with existing LLMs for medicine and also facilitates the widespread adoption and enhancement of medical LLMs within China.

Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation

1 code implementation ICCV 2023 Zunnan Xu, Zhihong Chen, Yong Zhang, Yibing Song, Xiang Wan, Guanbin Li

Parameter Efficient Tuning (PET) has gained attention for reducing the number of parameters while maintaining performance and providing better hardware resource savings, but few studies investigate dense prediction tasks and interaction between modalities.

Decoder Image Segmentation +3

Advancing Visual Grounding with Scene Knowledge: Benchmark and Method

1 code implementation CVPR 2023 Zhihong Chen, Ruifei Zhang, Yibing Song, Xiang Wan, Guanbin Li

Therefore, in this paper, we propose a novel benchmark of Scene Knowledge-guided Visual Grounding (SK-VG), where the image content and referring expressions are not sufficient to ground the target objects, forcing the models to have a reasoning ability on the long-form scene knowledge.

Image-text matching Text Matching +1

On the Difference of BERT-style and CLIP-style Text Encoders

1 code implementation 6 Jun 2023 Zhihong Chen, Guiming Hardy Chen, Shizhe Diao, Xiang Wan, Benyou Wang

Masked language modeling (MLM) has been one of the most popular pretraining recipes in natural language processing, e.g., BERT, one of the representative models.

Language Modeling Language Modelling +2

HuatuoGPT, towards Taming Language Model to Be a Doctor

2 code implementations 24 May 2023 Hongbo Zhang, Junying Chen, Feng Jiang, Fei Yu, Zhihong Chen, Jianquan Li, Guiming Chen, Xiangbo Wu, Zhiyi Zhang, Qingying Xiao, Xiang Wan, Benyou Wang, Haizhou Li

Experimental results demonstrate that HuatuoGPT achieves state-of-the-art results in performing medical consultation among open-source LLMs in GPT-4 evaluation, human evaluation, and medical benchmark datasets.

Language Modeling Language Modelling +1

Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts

1 code implementation ICCV 2023 Zhihong Chen, Shizhe Diao, Benyou Wang, Guanbin Li, Xiang Wan

Medical vision-and-language pre-training (Med-VLP) has shown promising improvements on many downstream medical tasks owing to its applicability to extracting generic representations from medical images and texts.

Image Retrieval Image-text Classification +8

GIPA: A General Information Propagation Algorithm for Graph Learning

1 code implementation 19 Jan 2023 Houyi Li, Zhihong Chen, Zhao Li, Qinkai Zheng, Peng Zhang, Shuigeng Zhou

Specifically, the bit-wise correlation calculates the element-wise attention weight through a multi-layer perceptron (MLP) based on the dense representations of two nodes and their edge; the feature-wise correlation is based on the one-hot representations of node attribute features for feature selection.

Attribute feature selection +3

Generalizing Multimodal Variational Methods to Sets

no code implementations 19 Dec 2022 Jinzhao Zhou, Yiqun Duan, Zhihong Chen, Yu-Cheng Chang, Chin-Teng Lin

Making sense of multiple modalities can yield a more comprehensive description of real-world phenomena.

Toward expanding the scope of radiology report summarization to multiple anatomies and modalities

1 code implementation 15 Nov 2022 Zhihong Chen, Maya Varma, Xiang Wan, Curtis Langlotz, Jean-Benoit Delbrouck

We then conduct extensive experiments to evaluate the performance of models both within and across modality-anatomy pairs in MIMIC-RRS.

Anatomy

Improving Radiology Summarization with Radiograph and Anatomy Prompts

no code implementations 15 Oct 2022 Jinpeng Hu, Zhihong Chen, Yang Liu, Xiang Wan, Tsung-Hui Chang

The impression is crucial for the referring physicians to grasp key information since it is concluded from the findings and reasoning of radiologists.

Anatomy Contrastive Learning +1

Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge

1 code implementation 15 Sep 2022 Zhihong Chen, Guanbin Li, Xiang Wan

Most existing methods mainly contain three elements: uni-modal encoders (i.e., a vision encoder and a language encoder), a multi-modal fusion module, and pretext tasks, with few studies considering the importance of medical domain expert knowledge and explicitly exploiting such knowledge to facilitate Med-VLP.

Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training

1 code implementation 15 Sep 2022 Zhihong Chen, Yuhao Du, Jinpeng Hu, Yang Liu, Guanbin Li, Xiang Wan, Tsung-Hui Chang

Besides, we conduct further analysis to better verify the effectiveness of different components of our approach and various settings of pre-training.

Self-Supervised Learning

Cross-modal Memory Networks for Radiology Report Generation

1 code implementation ACL 2021 Zhihong Chen, Yaling Shen, Yan Song, Xiang Wan

Medical imaging plays a significant role in clinical practice of medical diagnosis, where the text reports of the images are essential in understanding them and facilitating later treatments.

Decoder Medical Diagnosis +1

Graph Enhanced Contrastive Learning for Radiology Findings Summarization

1 code implementation ACL 2022 Jinpeng Hu, Zhuo Li, Zhihong Chen, Zhen Li, Xiang Wan, Tsung-Hui Chang

To address the limitation, we propose a unified framework for exploiting both extra knowledge and the original findings in an integrated way so that the critical information (i.e., key words and their relations) can be extracted in an appropriate way to facilitate impression generation.

Contrastive Learning

Word Graph Guided Summarization for Radiology Findings

1 code implementation Findings (ACL) 2021 Jinpeng Hu, Jianling Li, Zhihong Chen, Yaling Shen, Yan Song, Xiang Wan, Tsung-Hui Chang

In this paper, we propose a novel method for automatic impression generation, where a word graph is constructed from the findings to record the critical words and their relations, then a Word Graph guided Summarization model (WGSum) is designed to generate impressions with the help of the word graph.

Text Summarization

Pre-trained Language Models in Biomedical Domain: A Systematic Survey

1 code implementation 11 Oct 2021 Benyou Wang, Qianqian Xie, Jiahuan Pei, Zhihong Chen, Prayag Tiwari, Zhao Li, Jie Fu

In this paper, we summarize the recent progress of pre-trained language models in the biomedical domain and their applications in biomedical downstream tasks.

Survey

Path-based Deep Network for Candidate Item Matching in Recommenders

no code implementations 18 May 2021 Houyi Li, Zhihong Chen, Chenliang Li, Rong Xiao, Hongbo Deng, Peng Zhang, Yongchao Liu, Haihong Tang

PDN utilizes Trigger Net to capture the user's interest in each of his/her interacted items, and Similarity Net to evaluate the similarity between each interacted item and the target item based on these items' profiles and CF information.

Diversity Recommendation Systems +1

Generalizable Representation Learning for Mixture Domain Face Anti-Spoofing

no code implementations 6 May 2021 Zhihong Chen, Taiping Yao, Kekai Sheng, Shouhong Ding, Ying Tai, Jilin Li, Feiyue Huang, Xinyu Jin

Face anti-spoofing based on domain generalization (DG) has drawn growing attention due to its robustness for unseen scenarios.

Domain Generalization Face Anti-Spoofing +2

Generating Radiology Reports via Memory-driven Transformer

2 code implementations EMNLP 2020 Zhihong Chen, Yan Song, Tsung-Hui Chang, Xiang Wan

Particularly, this is the first work reporting the generation results on MIMIC-CXR to the best of our knowledge.

Decoder Text Generation

Attention-Guided Discriminative Region Localization and Label Distribution Learning for Bone Age Assessment

1 code implementation 30 May 2020 Chao Chen, Zhihong Chen, Xinyu Jin, Lanjuan Li, William Speier, Corey W. Arnold

However, training with the global image underutilizes discriminative local information, while providing extra annotations is expensive and subjective.

Age Estimation regression

ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance

1 code implementation 21 May 2020 Zhihong Chen, Rong Xiao, Chenliang Li, Gangfeng Ye, Haochuan Sun, Hongbo Deng

Most ranking models are trained only with displayed items (most of which are hot items), but they are used to retrieve items from the entire space, which consists of both displayed and non-displayed items (most of which are long-tail items).

Attribute Clustering +2

HoMM: Higher-order Moment Matching for Unsupervised Domain Adaptation

2 code implementations 27 Dec 2019 Chao Chen, Zhihang Fu, Zhihong Chen, Sheng Jin, Zhaowei Cheng, Xinyu Jin, Xian-Sheng Hua

In particular, our proposed HoMM can perform arbitrary-order moment tensor matching; we show that the first-order HoMM is equivalent to Maximum Mean Discrepancy (MMD) and the second-order HoMM is equivalent to Correlation Alignment (CORAL).

Unsupervised Domain Adaptation
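
To make the moment-matching idea in the HoMM entry above concrete, here is a minimal pure-Python sketch that compares order-p moment tensors of two feature sets. The function names, the use of uncentered moments, and the unnormalized squared-difference loss are assumptions made for illustration; the paper's normalization and implementation may differ.

```python
from itertools import product
from math import prod

def moment_tensor(features, order):
    """Average of the order-fold outer product of each feature vector.

    Returned as a dict mapping an index tuple (i1, ..., ip) to the mean of
    x[i1] * ... * x[ip] over all samples (uncentered moments, an assumption).
    """
    n, dim = len(features), len(features[0])
    return {
        idx: sum(prod(x[i] for i in idx) for x in features) / n
        for idx in product(range(dim), repeat=order)
    }

def moment_matching_loss(source, target, order):
    """Squared difference between source and target moment tensors."""
    m_src = moment_tensor(source, order)
    m_tgt = moment_tensor(target, order)
    return sum((m_src[idx] - m_tgt[idx]) ** 2 for idx in m_src)

source = [[1.0, 2.0], [3.0, 4.0]]
target = [[0.0, 1.0], [2.0, 1.0]]
first_order = moment_matching_loss(source, target, 1)
second_order = moment_matching_loss(source, target, 2)
```

In this sketch, `order=1` gives the squared distance between the feature means, which is MMD with a linear kernel, and `order=2` compares second-moment matrices, which matches CORAL's covariance comparison up to centering and scaling. The tensor has dim**order entries, so this direct form is only practical for small dimensions and orders.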

Towards Self-similarity Consistency and Feature Discrimination for Unsupervised Domain Adaptation

no code implementations 13 Apr 2019 Chao Chen, Zhihang Fu, Zhihong Chen, Zhaowei Cheng, Xinyu Jin, Xian-Sheng Hua

Recent advances in unsupervised domain adaptation mainly focus on learning shared representations by global distribution alignment without considering class information across domains.

Unsupervised Domain Adaptation

Joint Domain Alignment and Discriminative Feature Learning for Unsupervised Deep Domain Adaptation

1 code implementation 28 Aug 2018 Chao Chen, Zhihong Chen, Boyuan Jiang, Xinyu Jin

Recently, considerable effort has been devoted to deep domain adaptation in the computer vision and machine learning communities.

Domain Adaptation

Ro-SOS: Metric Expression Network (MEnet) for Robust Salient Object Segmentation

1 code implementation 15 May 2018 Delu Zeng, Yixuan He, Li Liu, Zhihong Chen, Jiabin Huang, Jie Chen, John Paisley

In this paper, we propose an end-to-end generic salient object segmentation model called Metric Expression Network (MEnet) to deal with saliency detection with the tolerance of distortion.

Saliency Detection Semantic Segmentation
