Search Results for author: Yifan Peng

Found 147 papers, 53 papers with code

Automatic recognition of abdominal lymph nodes from clinical text

1 code implementation EMNLP (ClinicalNLP) 2020 Yifan Peng, SungWon Lee, Daniel C. Elton, Thomas Shen, Yu-Xing Tang, Qingyu Chen, Shuai Wang, Yingying Zhu, Ronald Summers, Zhiyong Lu

We then introduce an end-to-end approach based on the combination of rules and transformer-based methods to detect these abdominal lymph node mentions and classify their types from the MRI radiology reports.

EchoGen: Generating Conclusions from Echocardiogram Notes

no code implementations BioNLP (ACL) 2022 Liyan Tang, Shravan Kooragayalu, Yanshan Wang, Ying Ding, Greg Durrett, Justin F. Rousseau, Yifan Peng

Generating a summary from findings has been recently explored (Zhang et al., 2018, 2020) in note types such as radiology reports that typically have short length.

Attribute

CMU’s IWSLT 2022 Dialect Speech Translation System

no code implementations IWSLT (ACL) 2022 Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe

We use additional paired Modern Standard Arabic data (MSA) to directly improve the speech recognition (ASR) and machine translation (MT) components of our cascaded systems.

Decoder Knowledge Distillation +5

OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning

no code implementations 31 May 2025 Yifan Peng, Shakeel Muhammad, Yui Sudo, William Chen, Jinchuan Tian, Chyi-Jiunn Lin, Shinji Watanabe

To address this, we develop a scalable data-cleaning pipeline using public toolkits, yielding a dataset with 166,000 hours of speech across 75 languages.

Improving Multilingual Speech Models on ML-SUPERB 2.0: Fine-tuning with Data Augmentation and LID-Aware CTC

no code implementations 30 May 2025 Qingzheng Wang, Jiancheng Sun, Yifan Peng, Shinji Watanabe

Multilingual speech processing with self-supervised or supervised pre-trained Speech Foundation Models (SFM) has achieved strong performance on tasks like Language Identification (LID) and Automatic Speech Recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Optimization-Free Diffusion Model -- A Perturbation Theory Approach

no code implementations 29 May 2025 Yuehaw Khoo, Mathias Oster, Yifan Peng

Diffusion models have emerged as a powerful framework in generative modeling, typically relying on optimizing neural networks to estimate the score function via forward SDE simulations.

model

Machine Learning Applications Related to Suicide in Military and Veterans: A Scoping Literature Review

no code implementations 18 May 2025 Yuhan Zhang, Yishu Wei, Yanshan Wang, Yunyu Xiao, COL Ronald K. Poropatich, Gretchen L. Haas, Yiye Zhang, Chunhua Weng, Jinze Liu, Lisa A. Brenner, James M. Bjork, Yifan Peng

This study aims to assess and summarize current research and provides a comprehensive review regarding the application of machine learning techniques in assessing and predicting suicidal ideation, attempts, and mortality among members of military and veteran populations.

Articles

On The Landscape of Spoken Language Models: A Comprehensive Survey

no code implementations 11 Apr 2025 Siddhant Arora, Kai-Wei Chang, Chung-Ming Chien, Yifan Peng, Haibin Wu, Yossi Adi, Emmanuel Dupoux, Hung-Yi Lee, Karen Livescu, Shinji Watanabe

The field of spoken language processing is undergoing a shift from training custom-built, task-specific models toward using and optimizing spoken language models (SLMs) which act as universal speech processing systems.

Survey

Glossy Object Reconstruction with Cost-effective Polarized Acquisition

no code implementations CVPR 2025 Bojian Wu, Yifan Peng, Ruizhen Hu, Xiaowei Zhou

The challenge of image-based 3D reconstruction for glossy objects lies in separating diffuse and specular components on glossy surfaces from captured images, a task complicated by the ambiguity in discerning lighting conditions and material properties using RGB data alone.

3D Reconstruction Novel View Synthesis +2

MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs

1 code implementation 1 Apr 2025 Juncheng Wu, Wenlong Deng, Xingxuan Li, Sheng Liu, Taomian Mi, Yifan Peng, Ziyang Xu, Yi Liu, Hyunjin Cho, Chang-In Choi, Yihan Cao, Hui Ren, Xiang Li, Xiaoxiao Li, Yuyin Zhou

Our pipeline generates detailed reasoning for various medical questions from 7 medical datasets, resulting in a dataset of 32,682 question-answer pairs, each with detailed, step-by-step explanations.

Knowledge Graphs Mathematical Reasoning

ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems

no code implementations 11 Mar 2025 Siddhant Arora, Yifan Peng, Jiatong Shi, Jinchuan Tian, William Chen, Shikhar Bharadwaj, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Shuichiro Shimizu, Vaibhav Srivastav, Shinji Watanabe

Motivated by this, we introduce an open-source, user-friendly toolkit designed to build unified web interfaces for various cascaded and E2E spoken dialogue systems.

Diversity Spoken Dialogue Systems

Learned Binocular-Encoding Optics for RGBD Imaging Using Joint Stereo and Focus Cues

no code implementations CVPR 2025 Yuhui Liu, Liangxun Ou, Qiang Fu, Hadi Amata, Wolfgang Heidrich, Yifan Peng

However, existing stereo depth estimation algorithms struggle to perceive high-frequency information and resolve high-resolution depth maps in realistic camera settings with large depth variations.

Image Reconstruction Stereo Depth Estimation +1

Semi-Supervised Learning from Small Annotated Data and Large Unlabeled Data for Fine-grained PICO Entity Recognition

no code implementations 26 Dec 2024 Fangyi Chen, Gongbo Zhang, Yilu Fang, Yifan Peng, Chunhua Weng

Materials and Methods: Using a corpus of 2,511 abstracts with PICO mentions from 4 public datasets, we developed a semi-supervised method to facilitate the training of a NER model, FinePICO, by combining limited annotated data of PICO entities and abundant unlabeled data.

named-entity-recognition Named Entity Recognition +2

Enhancing Audiovisual Speech Recognition through Bifocal Preference Optimization

1 code implementation 26 Dec 2024 Yihan Wu, Yichen Lu, Yifan Peng, Xihua Wang, Ruihua Song, Shinji Watanabe

Audiovisual Automatic Speech Recognition (AV-ASR) aims to improve speech recognition accuracy by leveraging visual signals.

Automatic Speech Recognition speech-recognition +1

A MapReduce Approach to Effectively Utilize Long Context Information in Retrieval Augmented Language Models

no code implementations 17 Dec 2024 Gongbo Zhang, Zihan Xu, Qiao Jin, Fangyi Chen, Yilu Fang, Yi Liu, Justin F. Rousseau, Ziyang Xu, Zhiyong Lu, Chunhua Weng, Yifan Peng

While holding great promise for improving and facilitating healthcare, large language models (LLMs) struggle to produce up-to-date responses on evolving topics due to outdated knowledge or hallucination.

Hallucination RAG +2

Deciphering genomic codes using advanced NLP techniques: a scoping review

no code implementations 25 Nov 2024 Shuyan Cheng, Yishu Wei, Yiliang Zhou, Zihan Xu, Drew N Wright, Jinze Liu, Yifan Peng

The goal of this review is to assess data and model accessibility in the most recent literature, gaining a better understanding of the existing capabilities and constraints of these tools in processing genomic sequencing data.

Suicide Risk Assessment on Social Media with Semi-Supervised Learning

no code implementations 18 Nov 2024 Max Lovitt, Haotian Ma, Song Wang, Yifan Peng

With social media communities increasingly becoming places where suicidal individuals post and congregate, natural language processing presents an exciting avenue for the development of automated suicide risk assessment systems.

Pseudo Label

Enhancing disease detection in radiology reports through fine-tuning lightweight LLM on weak labels

no code implementations 25 Sep 2024 Yishu Wei, Xindi Wang, Hanley Ong, Yiliang Zhou, Adam Flanders, George Shih, Yifan Peng

These findings demonstrate the potential of fine-tuning LLMs with synthetic labels, offering a promising direction for future research on LLM specialization in the medical domain.

Robust Audiovisual Speech Recognition Models with Mixture-of-Experts

no code implementations 19 Sep 2024 Yihan Wu, Yifan Peng, Yichen Lu, Xuankai Chang, Ruihua Song, Shinji Watanabe

Moreover, to incorporate visual information effectively, we inject visual information into the ASR model through a mixture-of-experts module.

Mixture-of-Experts Robust Speech Recognition +1

ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration

no code implementations 14 Sep 2024 Masao Someki, Kwanghee Choi, Siddhant Arora, William Chen, Samuele Cornell, Jionghao Han, Yifan Peng, Jiatong Shi, Vaibhav Srivastav, Shinji Watanabe

We introduce ESPnet-EZ, an extension of the open-source speech processing toolkit ESPnet, aimed at quick and easy development of speech models.

SDoH-GPT: Using Large Language Models to Extract Social Determinants of Health (SDoH)

no code implementations 24 Jul 2024 Bernardo Consoli, Xizhi Wu, Song Wang, Xinyu Zhao, Yanshan Wang, Justin Rousseau, Tom Hartvigsen, Li Shen, Huanmei Wu, Yifan Peng, Qi Long, Tianlong Chen, Ying Ding

Extracting social determinants of health (SDoH) from unstructured medical notes depends heavily on labor-intensive annotations, which are typically task-specific, hampering reusability and limiting sharing.

Computational Efficiency Language Modeling +2

Multi-Convformer: Extending Conformer with Multiple Convolution Kernels

1 code implementation 4 Jul 2024 Darshan Prabhu, Yifan Peng, Preethi Jyothi, Shinji Watanabe

Convolutions have become essential in state-of-the-art end-to-end Automatic Speech Recognition (ASR) systems due to their efficient modelling of local context.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Towards Robust Speech Representation Learning for Thousands of Languages

no code implementations 30 Jun 2024 William Chen, Wangyou Zhang, Yifan Peng, Xinjian Li, Jinchuan Tian, Jiatong Shi, Xuankai Chang, Soumi Maiti, Karen Livescu, Shinji Watanabe

We propose XEUS, a Cross-lingual Encoder for Universal Speech, trained on over 1 million hours of data across 4057 languages, extending the language coverage of SSL models 4-fold.

Representation Learning Self-Supervised Learning +1

Contextualized End-to-end Automatic Speech Recognition with Intermediate Biasing Loss

no code implementations 23 Jun 2024 Muhammad Shakeel, Yui Sudo, Yifan Peng, Shinji Watanabe

Our method outperforms a conventional contextual biasing baseline on the LibriSpeech corpus, achieving a relative improvement of 22.5% in biased word error rate (B-WER) and up to 44% compared to the non-contextual baseline with a biasing list size of 100.

Automatic Speech Recognition speech-recognition +1

On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models

no code implementations 13 Jun 2024 Jinchuan Tian, Yifan Peng, William Chen, Kwanghee Choi, Karen Livescu, Shinji Watanabe

The Open Whisper-style Speech Model (OWSM) series was introduced to achieve full transparency in building advanced speech-to-text (S2T) foundation models.

Language Modeling Language Modelling +2

Joint Beam Search Integrating CTC, Attention, and Transducer Decoders

no code implementations 5 Jun 2024 Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Brian Yan, Jiatong Shi, Yifan Peng, Shinji Watanabe

In addition, we propose three novel joint beam search algorithms by combining three decoders (CTC, RNN-T, and attention) to further improve performance.

Automatic Speech Recognition Decoder +2

Point Resampling and Ray Transformation Aid to Editable NeRF Models

no code implementations 12 May 2024 Zhenyang Li, Zilong Chen, Feifan Qu, Mingqing Wang, Yizhou Zhao, Kai Zhang, Yifan Peng

In NeRF-aided editing tasks, object movement presents difficulties in supervision generation due to the introduction of variability in object positions.

NeRF Object

Characterizing the Dilemma of Performance and Index Size in Billion-Scale Vector Search and Breaking It with Second-Tier Memory

no code implementations 6 May 2024 Rongxin Cheng, Yifan Peng, Xingda Wei, Hongrui Xie, Rong Chen, Sijie Shen, Haibo Chen

In this paper, we are the first to characterize the trade-off of performance and index size in existing SSD-based graph and cluster indexes: to improve throughput by 5.7$\times$ and 1.7$\times$, these indexes have to pay a 5.8$\times$ storage amplification and 7.7$\times$ with respect to the dataset size, respectively.

RAG

Evaluating GPT-4 with Vision on Detection of Radiological Findings on Chest Radiographs

no code implementations 22 Mar 2024 Yiliang Zhou, Hanley Ong, Patrick Kennedy, Carol Wu, Jacob Kazam, Keith Hentel, Adam Flanders, George Shih, Yifan Peng

The study examines the application of GPT-4V, a multi-modal large language model equipped with visual recognition, in detecting radiological findings from a set of 100 chest radiographs and suggests that GPT-4V is currently not ready for real-world diagnostic usage in interpreting chest radiographs.

Diagnostic Language Modeling +2

MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation

no code implementations 19 Mar 2024 Yifan Peng, Ilia Kulikov, Yilin Yang, Sravya Popuri, Hui Lu, Changhan Wang, Hongyu Gong

There have been emerging research interest and advances in speech-to-speech translation (S2ST), translating utterances from one language to another.

Decoder Language Modeling +3

Deep learning with noisy labels in medical prediction problems: a scoping review

no code implementations 19 Mar 2024 Yishu Wei, Yu Deng, Cong Sun, Mingquan Lin, Hongmei Jiang, Yifan Peng

This scoping review aims to comprehensively review label noise management in deep learning-based medical prediction problems, which includes label noise detection, label noise handling, and evaluation.

Learning with noisy labels Management

A survey of recent methods for addressing AI fairness and bias in biomedicine

no code implementations 13 Feb 2024 Yifan Yang, Mingquan Lin, Han Zhao, Yifan Peng, Furong Huang, Zhiyong Lu

Such biases can occur before, during, or after the development of AI models, making it critical to understand and address potential biases to enable the accurate and reliable application of AI models in clinical settings.

Articles Diagnostic +1

SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition

no code implementations 31 Jan 2024 Yihan Wu, Soumi Maiti, Yifan Peng, Wangyou Zhang, Chenda Li, Yuyue Wang, Xihua Wang, Shinji Watanabe, Ruihua Song

Existing speech language models typically utilize task-dependent prompt tokens to unify various speech tasks in a single model.

Decoder Language Modeling +6

Multivariate Density Estimation via Variance-Reduced Sketching

1 code implementation 22 Jan 2024 Yifan Peng, Yuehaw Khoo, Daren Wang

In this work, we introduce a new framework called Variance-Reduced Sketching (VRS), specifically designed to estimate multivariate density functions with a reduced curse of dimensionality.

Density Estimation regression

Contextualized Automatic Speech Recognition with Attention-Based Bias Phrase Boosted Beam Search

no code implementations 19 Jan 2024 Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Yifan Peng, Shinji Watanabe

The proposed method can be trained effectively by combining a bias phrase index loss and special tokens to detect the bias phrases in the input speech data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

A Span-based Model for Extracting Overlapping PICO Entities from RCT Publications

no code implementations 8 Jan 2024 Gongbo Zhang, Yiliang Zhou, Yan Hu, Hua Xu, Chunhua Weng, Yifan Peng

On the PICO-Corpus, PICOX obtained higher recall and F1 scores than the baseline and improved the micro recall score from 56.66 to 67.33.

Data Augmentation PICO

Leveraging Generative AI for Clinical Evidence Summarization Needs to Ensure Trustworthiness

no code implementations 19 Nov 2023 Gongbo Zhang, Qiao Jin, Denis Jered McInerney, Yong Chen, Fei Wang, Curtis L. Cole, Qian Yang, Yanshan Wang, Bradley A. Malin, Mor Peleg, Byron C. Wallace, Zhiyong Lu, Chunhua Weng, Yifan Peng

Evidence-based medicine promises to improve the quality of healthcare by empowering medical decisions and practices with the best available evidence.

UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions

no code implementations 4 Oct 2023 Siddhant Arora, Hayato Futami, Jee-weon Jung, Yifan Peng, Roshan Sharma, Yosuke Kashiwagi, Emiru Tsunoo, Karen Livescu, Shinji Watanabe

Recent studies leverage large language models with multi-tasking capabilities, using natural language prompts to guide the model's behavior and surpassing performance of task-specific models.

Ranked #1 on Spoken Language Understanding on Fluent Speech Commands (using extra training data)

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning

no code implementations 26 Sep 2023 William Chen, Jiatong Shi, Brian Yan, Dan Berrebbi, Wangyou Zhang, Yifan Peng, Xuankai Chang, Soumi Maiti, Shinji Watanabe

We show that further efficiency can be achieved with a vanilla HuBERT Base model, which can maintain 94% of XLS-R's performance with only 3% of the data, 4 GPUs, and limited trials.

Denoising Self-Supervised Learning

Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech

1 code implementation 18 Sep 2023 Chien-yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-Yi Lee

To achieve comprehensive coverage of diverse speech tasks and harness instruction tuning, we invite the community to collaborate and contribute, facilitating the dynamic growth of the benchmark.

Voxtlm: unified decoder-only models for consolidating speech recognition/synthesis and speech/text continuation tasks

no code implementations 14 Sep 2023 Soumi Maiti, Yifan Peng, Shukjae Choi, Jee-weon Jung, Xuankai Chang, Shinji Watanabe

We propose a decoder-only language model, VoxtLM, that can perform four tasks: speech recognition, speech synthesis, text generation, and speech continuation.

Decoder Language Modeling +5

Demonstration-based learning for few-shot biomedical named entity recognition under machine reading comprehension

1 code implementation 12 Aug 2023 Leilei Su, Jian Chen, Yifan Peng, Cong Sun

The objective of this study is to devise a strategy that can improve the model's capability to recognize biomedical entities in scenarios of few-shot learning.

Few-Shot Learning Machine Reading Comprehension +2

High-performance Data Management for Whole Slide Image Analysis in Digital Pathology

1 code implementation 10 Aug 2023 Haoju Leng, Ruining Deng, Shunxing Bao, Dazheng Fang, Bryan A. Millis, Yucheng Tang, Haichun Yang, Xiao Wang, Yifan Peng, Lipeng Wan, Yuankai Huo

The performance evaluation encompasses two key scenarios: (1) a pure CPU-based image analysis scenario ("CPU scenario"), and (2) a GPU-based deep learning framework scenario ("GPU scenario").

Management whole slide images

From Military to Healthcare: Adopting and Expanding Ethical Principles for Generative Artificial Intelligence

no code implementations 4 Aug 2023 David Oniani, Jordan Hilsman, Yifan Peng, COL Ronald K. Poropatich, COL Jeremy C. Pamplin, LTC Gary L. Legault, Yanshan Wang

In 2020, the U.S. Department of Defense officially disclosed a set of ethical principles to guide the use of Artificial Intelligence (AI) technologies on future battlefields.

Decision Making

A scoping review on multimodal deep learning in biomedical images and texts

no code implementations 14 Jul 2023 Zhaoyi Sun, Mingquan Lin, Qingqing Zhu, Qianqian Xie, Fei Wang, Zhiyong Lu, Yifan Peng

In this scoping review, we aim to provide a comprehensive overview of the current state of the field and identify key concepts, types of studies, and research gaps with a focus on biomedical images and texts joint learning, mainly because these two were the most commonly available data types in MDL research.

Cross-Modal Retrieval Decision Making +6

Classifying Crime Types using Judgment Documents from Social Media

no code implementations 29 Jun 2023 Haoxuan Xu, Zeyu He, Mengfan Shen, Songning Lai, Ziqiang Han, Yifan Peng

Experiments show that the proposed method achieves state-of-the-art results on the present dataset.

An empirical study of using radiology reports and images to improve ICU mortality prediction

no code implementations 20 Jun 2023 Mingquan Lin, Song Wang, Ying Ding, Lihui Zhao, Fei Wang, Yifan Peng

Background: The predictive Intensive Care Unit (ICU) scoring system plays an important role in ICU management because it predicts important outcomes, especially mortality.

ICU Mortality Management +1

Utilizing Longitudinal Chest X-Rays and Reports to Pre-Fill Radiology Reports

1 code implementation 14 Jun 2023 Qingqing Zhu, Tejas Sudharshan Mathai, Pritam Mukherjee, Yifan Peng, Ronald M. Summers, Zhiyong Lu

Pre-filling a radiology report holds promise in mitigating reporting errors, and despite efforts in the literature to generate medical reports, there exists a lack of approaches that exploit the longitudinal nature of patient visit records in the MIMIC-CXR dataset.

Decoder speech-recognition +1

Less Likely Brainstorming: Using Language Models to Generate Alternative Hypotheses

no code implementations 30 May 2023 Liyan Tang, Yifan Peng, Yanshan Wang, Ying Ding, Greg Durrett, Justin F. Rousseau

To tackle this problem, we propose a controlled text generation method that uses a novel contrastive learning strategy to encourage models to differentiate between generating likely and less likely outputs according to humans.

Contrastive Learning Decision Making +1

A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks

2 code implementations 18 May 2023 Yifan Peng, Kwangyoun Kim, Felix Wu, Brian Yan, Siddhant Arora, William Chen, Jiyang Tang, Suwon Shon, Prashant Sridhar, Shinji Watanabe

Conformer, a convolution-augmented Transformer variant, has become the de facto encoder architecture for speech processing due to its superior performance in various tasks, including automatic speech recognition (ASR), speech translation (ST) and spoken language understanding (SLU).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Generative Modeling via Hierarchical Tensor Sketching

no code implementations 11 Apr 2023 Yifan Peng, Yian Chen, E. Miles Stoudenmire, Yuehaw Khoo

We propose a hierarchical tensor-network approach for approximating high-dimensional probability density via empirical distribution.

Learning a Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation

1 code implementation ICCV 2023 Xiaoyang Lyu, Peng Dai, Zizhang Li, Dongyu Yan, Yi Lin, Yifan Peng, Xiaojuan Qi

We found that the color rendering loss results in optimization bias against low-intensity areas, causing gradient vanishing and leaving these areas unoptimized.

Neural Rendering Surface Reconstruction

FactReranker: Fact-guided Reranker for Faithful Radiology Report Summarization

no code implementations 15 Mar 2023 Qianqian Xie, Jiayu Zhou, Yifan Peng, Fei Wang

We propose to extract medical facts of the input medical report, its gold summary, and candidate summaries based on the RadGraph schema and design the fact-guided reranker to efficiently incorporate the extracted medical facts for selecting the optimal summary.

Graph Generation

Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding

1 code implementation 27 Feb 2023 Yifan Peng, Kwangyoun Kim, Felix Wu, Prashant Sridhar, Shinji Watanabe

Self-supervised speech representation learning (SSL) has been shown to be effective in various downstream tasks, but SSL models are usually large and slow.

Model Compression Representation Learning +3

Improving Massively Multilingual ASR With Auxiliary CTC Objectives

1 code implementation 24 Feb 2023 William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe

In this paper, we introduce our work on improving performance on FLEURS, a 102-language open ASR benchmark, by conditioning the entire model on language identity (LID).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

SpeechLMScore: Evaluating speech generation using speech language model

2 code implementations 8 Dec 2022 Soumi Maiti, Yifan Peng, Takaaki Saeki, Shinji Watanabe

While human evaluation is the most reliable metric for evaluating speech generation systems, it is generally costly and time-consuming.

Language Modeling Language Modelling +5

SODA: A Natural Language Processing Package to Extract Social Determinants of Health for Cancer Studies

no code implementations 6 Dec 2022 Zehao Yu, Xi Yang, Chong Dang, Prakash Adekkanattu, Braja Gopal Patra, Yifan Peng, Jyotishman Pathak, Debbie L. Wilson, Ching-Yuan Chang, Wei-Hsuan Lo-Ciganic, Thomas J. George, William R. Hogan, Yi Guo, Jiang Bian, Yonghui Wu

Objective: We aim to develop an open-source natural language processing (NLP) package, SODA (i.e., SOcial DeterminAnts), with pre-trained transformer models to extract social determinants of health (SDoH) for cancer patients, examine the generalizability of SODA to a new disease domain (i.e., opioid use), and evaluate the extraction rate of SDoH using cancer populations.

RoS-KD: A Robust Stochastic Knowledge Distillation Approach for Noisy Medical Imaging

no code implementations 15 Oct 2022 Ajay Jaiswal, Kumar Ashutosh, Justin F Rousseau, Yifan Peng, Zhangyang Wang, Ying Ding

Our extensive experiments on popular medical imaging classification tasks (cardiopulmonary disease and lesion classification) using real-world datasets show the performance benefit of RoS-KD, its ability to distill knowledge from many popular large networks (ResNet-50, DenseNet-121, MobileNet-V2) in a comparatively small network, and its robustness to adversarial attacks (PGD, FGSM).

Classification Knowledge Distillation +1

E-Branchformer: Branchformer with Enhanced merging for speech recognition

1 code implementation 30 Sep 2022 Kwangyoun Kim, Felix Wu, Yifan Peng, Jing Pan, Prashant Sridhar, Kyu J. Han, Shinji Watanabe

Conformer, combining convolution and self-attention sequentially to capture both local and global information, has shown remarkable performance and is currently regarded as the state-of-the-art for automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Radiomics-Guided Global-Local Transformer for Weakly Supervised Pathology Localization in Chest X-Rays

1 code implementation 10 Jul 2022 Yan Han, Gregory Holste, Ying Ding, Ahmed Tewfik, Yifan Peng, Zhangyang Wang

Using the learned self-attention of its image branch, RGT extracts a bounding box for which to compute radiomic features, which are further processed by the radiomics branch; learned image and radiomic features are then fused and mutually interact via cross-attention layers.

Medical Image Analysis

Radiology Text Analysis System (RadText): Architecture and Evaluation

1 code implementation 19 Mar 2022 Song Wang, Mingquan Lin, Ying Ding, George Shih, Zhiyong Lu, Yifan Peng

Analyzing radiology reports is a time-consuming and error-prone task, which raises the need for an efficient automated radiology report analysis system to alleviate the workloads of radiologists and encourage precise diagnosis.

De-identification named-entity-recognition +5

Prior Knowledge Enhances Radiology Report Generation

no code implementations 11 Jan 2022 Song Wang, Liyan Tang, Mingquan Lin, George Shih, Ying Ding, Yifan Peng

In this work, we propose to mine and represent the associations among medical findings in an informative knowledge graph and incorporate this prior knowledge with radiology report generation to help improve the quality of generated reports.

CU-UD: text-mining drug and chemical-protein interactions with ensembles of BERT-based models

1 code implementation 11 Nov 2021 Mehmet Efruz Karabulut, K. Vijay-Shanker, Yifan Peng

Our system obtained 0.7708 in precision and 0.7770 in recall, for an F1 score of 0.7739, demonstrating the effectiveness of using ensembles of BERT-based language models for automatically detecting relations between chemicals and proteins.

DrugProt

Lymph Node Detection in T2 MRI with Transformers

no code implementations 9 Nov 2021 Tejas Sudharshan Mathai, SungWon Lee, Daniel C. Elton, Thomas C. Shen, Yifan Peng, Zhiyong Lu, Ronald M. Summers

Identification of lymph nodes (LN) in T2 Magnetic Resonance Imaging (MRI) is an important step performed by radiologists during the assessment of lymphoproliferative diseases.

RadBERT-CL: Factually-Aware Contrastive Learning For Radiology Report Classification

no code implementations 28 Oct 2021 Ajay Jaiswal, Liyan Tang, Meheli Ghosh, Justin Rousseau, Yifan Peng, Ying Ding

Radiology reports are unstructured and contain the imaging findings and corresponding diagnoses transcribed by radiologists which include clinical facts and negated and/or uncertain statements.

Classification Contrastive Learning

SCALP -- Supervised Contrastive Learning for Cardiopulmonary Disease Classification and Localization in Chest X-rays using Patient Metadata

no code implementations 27 Oct 2021 Ajay Jaiswal, TianHao Li, Cyprian Zander, Yan Han, Justin F. Rousseau, Yifan Peng, Ying Ding

In this paper, we proposed a novel and simple data augmentation method based on patient metadata and supervised knowledge to create clinically accurate positive and negative augmentations for chest X-rays.

Contrastive Learning Data Augmentation +1

CheXT: Knowledge-Guided Cross-Attention Transformer for Abnormality Classification and Localization in Chest X-rays

no code implementations 29 Sep 2021 Yan Han, Ying Ding, Ahmed Tewfik, Yifan Peng, Zhangyang Wang

During training, the image branch leverages its learned attention to estimate pathology localization, which is then utilized to extract radiomic features from images in the radiomics branch.

Improving Joint Learning of Chest X-Ray and Radiology Report by Word Region Alignment

1 code implementation 4 Sep 2021 Zhanghexuan Ji, Mohammad Abuzar Shaikh, Dana Moukheiber, Sargur Srihari, Yifan Peng, Mingchen Gao

Self-supervised learning provides an opportunity to explore unlabeled chest X-rays and their associated free-text reports accumulated in clinical routine without manual supervision.

Representation Learning Self-Supervised Learning +2

A framework for massive scale personalized promotion

no code implementations 27 Aug 2021 Yitao Shen, Yue Wang, Xingyu Lu, Feng Qi, Jia Yan, Yixiang Mu, Yao Yang, Yifan Peng, Jinjie Gu

In order to do effective optimization in the second stage, counterfactual prediction and noise-reduction are essential for the first stage.

counterfactual

Improving BERT Model Using Contrastive Learning for Biomedical Relation Extraction

1 code implementation NAACL (BioNLP) 2021 Peng Su, Yifan Peng, K. Vijay-Shanker

In this work, we explore the method of employing contrastive learning to improve the text representation from the BERT model for relation extraction.

Contrastive Learning Data Augmentation +2

Knowledge-Augmented Contrastive Learning for Abnormality Classification and Localization in Chest X-rays with Radiomics using a Feedback Loop

no code implementations 11 Apr 2021 Yan Han, Chongyan Chen, Ahmed Tewfik, Benjamin Glicksberg, Ying Ding, Yifan Peng, Zhangyang Wang

The key knob of our framework is a unique positive sampling approach tailored for the medical images, by seamlessly integrating radiomic features as a knowledge augmentation.

Contrastive Learning

Pneumonia Detection on Chest X-ray using Radiomic Features and Contrastive Learning

no code implementations12 Jan 2021 Yan Han, Chongyan Chen, Ahmed H Tewfik, Ying Ding, Yifan Peng

Traditionally, radiomics, as a subfield of radiology that can extract a large number of quantitative features from medical images, demonstrates its potential to facilitate medical imaging diagnosis before the deep learning era.

Contrastive Learning Deep Learning +1

Deep-Learned Broadband Encoding Stochastic Filters for Computational Spectroscopic Instruments

no code implementations17 Dec 2020 Hongya Song, Yaoguang Ma, Yubing Han, Weidong Shen, Wenyi Zhang, Yanghui Li, Xu Liu, Yifan Peng, Xiang Hao

Computational spectroscopic instruments with Broadband Encoding Stochastic (BEST) filters allow the reconstruction of the spectrum at high precision with only a few filters.

Instrumentation and Detectors

Efficient Long-Range Convolutions for Point Clouds

1 code implementation11 Oct 2020 Yifan Peng, Lin Lin, Lexing Ying, Leonardo Zepeda-Núñez

We showcase this framework by introducing a neural network architecture that combines LRC-layers with short-range convolutional layers to accurately learn the energy and force associated with a $N$-body potential.

Navigating the landscape of COVID-19 research through literature analysis: A bird's eye view

no code implementations7 Aug 2020 Lana Yeganova, Rezarta Islamaj, Qingyu Chen, Robert Leaman, Alexis Allot, Chin-Hsuan Wei, Donald C. Comeau, Won Kim, Yifan Peng, W. John Wilbur, Zhiyong Lu

In this study we analyze the LitCovid collection, 13,369 COVID-19 related articles found in PubMed as of May 15th, 2020 with the purpose of examining the landscape of literature and presenting it in a format that facilitates information navigation and understanding.

Articles Clustering +3

COVID-19-CT-CXR: a freely accessible and weakly labeled chest X-ray and CT image collection on COVID-19 from biomedical literature

1 code implementation11 Jun 2020 Yifan Peng, Yu-Xing Tang, Sung-Won Lee, Yingying Zhu, Ronald M. Summers, Zhiyong Lu

(1) We show that COVID-19-CT-CXR, when used as additional training data, is able to contribute to improved DL performance for the classification of COVID-19 and non-COVID-19 CT. (2) We collected CT images of influenza and trained a DL baseline to distinguish a diagnosis of COVID-19, influenza, or normal or other types of diseases on CT. (3) We trained an unsupervised one-class classifier from non-COVID-19 CXR and performed anomaly detection to detect COVID-19 CXR.

Anomaly Detection Articles +3

MULAN: Multitask Universal Lesion Analysis Network for Joint Lesion Detection, Tagging, and Segmentation

16 code implementations12 Aug 2019 Ke Yan, You-Bao Tang, Yifan Peng, Veit Sandfort, Mohammadhadi Bagheri, Zhiyong Lu, Ronald M. Summers

When reading medical images such as a computed tomography (CT) scan, radiologists generally search across the image to find lesions, characterize and measure them, and then describe them in the radiological report.

Computed Tomography (CT) Lesion Detection +2

A deep learning approach for automated detection of geographic atrophy from color fundus photographs

1 code implementation7 Jun 2019 Tiarnan D. Keenan, Shazia Dharssi, Yifan Peng, Qingyu Chen, Elvira Agrón, Wai T. Wong, Zhiyong Lu, Emily Y. Chew

Results: The deep learning models (GA detection, CGA detection from all eyes, and centrality detection from GA eyes) had AUC of 0.933-0.976, 0.939-0.976, and 0.827-0.888, respectively.

Deep Learning Specificity

A self-attention based deep learning method for lesion attribute detection from CT reports

no code implementations30 Apr 2019 Yifan Peng, Ke Yan, Veit Sandfort, Ronald M. Summers, Zhiyong Lu

In radiology, radiologists not only detect lesions from the medical image, but also describe them with various attributes such as their type, location, size, shape, and intensity.

Attribute Sentence

Fine-grained lesion annotation in CT images with knowledge mined from radiology reports

no code implementations4 Mar 2019 Ke Yan, Yifan Peng, Zhiyong Lu, Ronald M. Summers

To address this problem, we define a set of 145 labels based on RadLex to describe a large variety of lesions in the DeepLesion dataset.

Sentence

MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs

1 code implementation21 Jan 2019 Alistair E. W. Johnson, Tom J. Pollard, Nathaniel R. Greenbaum, Matthew P. Lungren, Chih-ying Deng, Yifan Peng, Zhiyong Lu, Roger G. Mark, Seth J. Berkowitz, Steven Horng

Chest radiography is an extremely powerful imaging modality, allowing for a detailed inspection of a patient's thorax, but requiring specialized training for proper interpretation.

DeepSeeNet: A deep learning model for automated classification of patient-based age-related macular degeneration severity from color fundus photographs

1 code implementation19 Nov 2018 Yifan Peng, Shazia Dharssi, Qingyu Chen, Tiarnan D. Keenan, Elvira Agrón, Wai T. Wong, Emily Y. Chew, Zhiyong Lu

DeepSeeNet simulates the human grading process by first detecting individual AMD risk factors (drusen size, pigmentary abnormalities) for each eye and then calculating a patient-based AMD severity score using the AREDS Simplified Severity Scale.

Decision Making General Classification

ML-Net: multi-label classification of biomedical texts with deep neural networks

4 code implementations13 Nov 2018 Jingcheng Du, Qingyu Chen, Yifan Peng, Yang Xiang, Cui Tao, Zhiyong Lu

Due to this nature, the multi-label text classification task is often considered to be more challenging compared to the binary or multi-class text classification problems.

Benchmarking Feature Engineering +5

BioSentVec: creating sentence embeddings for biomedical texts

4 code implementations22 Oct 2018 Qingyu Chen, Yifan Peng, Zhiyong Lu

Sentence embeddings have become an essential part of today's natural language processing (NLP) systems, especially together with advanced deep learning methods.

 Ranked #1 on Sentence Embeddings For Biomedical Texts on MedSTS (using extra training data)

Articles Benchmarking +3

Depth and Transient Imaging With Compressive SPAD Array Cameras

no code implementations CVPR 2018 Qilin Sun, Xiong Dun, Yifan Peng, Wolfgang Heidrich

Time-of-flight depth imaging and transient imaging are two imaging modalities that have recently received a lot of interest.

Compressive Sensing

Comment Generation for Source Code: State of the Art, Challenges and Opportunities

no code implementations5 Jan 2018 Xiaoran Wang, Yifan Peng, Benwen Zhang

One way to make software development more efficient is to make the program more readable.

Software Engineering

NegBio: a high-performance tool for negation and uncertainty detection in radiology reports

1 code implementation16 Dec 2017 Yifan Peng, Xiaosong Wang, Le Lu, Mohammadhadi Bagheri, Ronald Summers, Zhiyong Lu

Negative and uncertain medical findings are frequent in radiology reports, but discriminating them from positive findings remains challenging for information extraction.

Benchmarking Negation

Revisiting Cross-Channel Information Transfer for Chromatic Aberration Correction

no code implementations ICCV 2017 Tiancheng Sun, Yifan Peng, Wolfgang Heidrich

Image aberrations can cause severe degradation in image quality for consumer-level cameras, especially under the current tendency to reduce the complexity of lens designs in order to shrink the overall size of modules.

BioCreative VI Precision Medicine Track: creating a training corpus for mining protein-protein interactions affected by mutations

no code implementations WS 2017 Rezarta Islamaj Doğan, Andrew Chatr-aryamontri, Sun Kim, Chih-Hsuan Wei, Yifan Peng, Donald Comeau, Zhiyong Lu

The Precision Medicine Track in BioCreative VI aims to bring together the BioNLP community for a novel challenge focused on mining the biomedical literature in search of mutations and protein-protein interactions (PPI).

Articles Relation Extraction

Deep learning for extracting protein-protein interactions from biomedical literature

no code implementations WS 2017 Yifan Peng, Zhiyong Lu

State-of-the-art methods for protein-protein interaction (PPI) extraction are primarily feature-based or kernel-based by leveraging lexical and syntactic information.

Benchmarking Cross-corpus +3

Studying Relationships between Human Gaze, Description, and Computer Vision

no code implementations CVPR 2013 Kiwon Yun, Yifan Peng, Dimitris Samaras, Gregory J. Zelinsky, Tamara L. Berg

We posit that user behavior during natural viewing of images contains an abundance of information about the content of images as well as information related to user intent and user defined content importance.
