Search Results for author: Yuhao Zhang

Found 71 papers, 38 papers with code

Overview of the MEDIQA 2021 Shared Task on Summarization in the Medical Domain

no code implementations NAACL (BioNLP) 2021 Asma Ben Abacha, Yassine Mrabet, Yuhao Zhang, Chaitanya Shivade, Curtis Langlotz, Dina Demner-Fushman

The MEDIQA 2021 shared tasks at the BioNLP 2021 workshop addressed three tasks on summarization for medical text: (i) a question summarization task aimed at exploring new approaches to understanding complex real-world consumer health queries, (ii) a multi-answer summarization task that targeted aggregation of multiple relevant answers to a biomedical question into one concise and relevant answer, and (iii) a radiology report summarization task addressing the development of clinically relevant impressions from radiology report findings.

Text Summarization

Adaptive Deep Reasoning: Triggering Deep Thinking When Needed

no code implementations 26 May 2025 Yunhao Wang, Yuhao Zhang, TingHao Yu, Can Xu, Feng Zhang, Fengzong Lian

More recent approaches have attempted to integrate long-chain and short-chain reasoning abilities into a single model, yet they still rely on manual control to toggle between short and long CoT.

Prompt Engineering reinforcement-learning +1

Anymate: A Dataset and Baselines for Learning 3D Object Rigging

no code implementations 9 May 2025 Yufan Deng, Yuhao Zhang, Chen Geng, Shangzhe Wu, Jiajun Wu

Rigging and skinning are essential steps to create realistic 3D animations, often requiring significant expertise and manual effort.

UNILoc: Unified Localization Combining Model-Based Geometry and Unsupervised Learning

no code implementations 24 Apr 2025 Yuhao Zhang, Guangjin Pan, Musa Furkan Keskin, Ossi Kaltiokallio, Mikko Valkama, Henk Wymeersch

Accurate mobile device localization is critical for emerging 5G/6G applications such as autonomous vehicles and augmented reality.

Autonomous Vehicles

Ga$_2$O$_3$ TCAD Mobility Parameter Calibration using Simulation Augmented Machine Learning with Physics Informed Neural Network

no code implementations 3 Apr 2025 Le Minh Long Nguyen, Edric Ong, Matthew Eng, Yuhao Zhang, Hiu Yung Wong

Schottky Barrier Diode (SBD) fabricated with emerging ultra-wide-bandgap material, Gallium Oxide (Ga$_2$O$_3$), is measured and its current-voltage (IV) is used for Ga$_2$O$_3$ Philips Unified Mobility (PhuMob) model parameters, effective anode workfunction, and ambient temperature extraction (7 parameters).

Model Predictive Control for Tracking Bounded References With Arbitrary Dynamics

no code implementations 26 Mar 2025 Shibo Han, Bonan Hou, Yuhao Zhang, Xiaotong Shi, Xingwei Zhao

Cost function penalizes both artificial state error and reference error, while terminal constraint is imposed on artificial state error and artificial reference.

Model Predictive Control

Efficient Reachability Analysis for Convolutional Neural Networks Using Hybrid Zonotopes

no code implementations 13 Mar 2025 Yuhao Zhang, Xiangru Xu

Feedforward neural networks are widely used in autonomous systems, particularly for control and perception tasks within the system loop.

Efficient Neural Network

Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders

no code implementations 21 Feb 2025 Weiqiao Shan, Yuang Li, Yuhao Zhang, Yingfeng Luo, Chen Xu, Xiaofeng Zhao, Long Meng, Yunfei Lu, Min Zhang, Hao Yang, Tong Xiao, Jingbo Zhu

Connecting audio encoders with large language models (LLMs) allows the LLM to perform various audio understanding tasks, such as automatic speech recognition (ASR) and audio captioning (AC).

Audio captioning Automatic Speech Recognition +2

Soundwave: Less is More for Speech-Text Alignment in LLMs

1 code implementation 18 Feb 2025 Yuhao Zhang, Zhiheng Liu, Fan Bu, Ruiyu Zhang, Benyou Wang, Haizhou Li

Existing end-to-end speech large language models (LLMs) usually rely on large-scale annotated data for training, while data-efficient training has not been discussed in depth.

Optimizing Speech Multi-View Feature Fusion through Conditional Computation

1 code implementation 14 Jan 2025 Weiqiao Shan, Yuhao Zhang, Yuchen Han, Bei Li, Xiaofeng Zhao, Yuang Li, Min Zhang, Hao Yang, Tong Xiao, Jingbo Zhu

Recent advancements have highlighted the efficacy of self-supervised learning (SSL) features in various speech-related tasks, providing lightweight and versatile multi-view speech representations.

Self-Supervised Learning

Robust Model Predictive Control for Constrained Uncertain Systems Based on Concentric Container and Varying Tube

no code implementations 4 Dec 2024 Shibo Han, Yuhao Zhang, Xiaotong Shi, Xingwei Zhao

By restricting states and the corresponding inputs in containers with free sizes and a fixed shape, feasible MDs, which are the products of model uncertainty with states and inputs, are restricted into polytopes with free sizes.

Model Predictive Control

Roadmap towards Superhuman Speech Understanding using Large Language Models

no code implementations 17 Oct 2024 Fan Bu, Yuhao Zhang, Xidong Wang, Benyou Wang, Qun Liu, Haizhou Li

The success of large language models (LLMs) has prompted efforts to integrate speech and audio data, aiming to create general foundation models capable of processing both textual and non-textual inputs.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models

1 code implementation 17 Sep 2024 Orion Weller, Benjamin Van Durme, Dawn Lawrie, Ashwin Paranjape, Yuhao Zhang, Jack Hessel

Instruction-tuned language models (LM) are able to respond to imperative commands, providing a more natural user interface compared to their base counterparts.

Information Retrieval Retrieval

Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models

1 code implementation 31 Jul 2024 Zhengxuan Wu, Yuhao Zhang, Peng Qi, Yumo Xu, Rujun Han, Yian Zhang, Jifan Chen, Bonan Min, Zhiheng Huang

Surprisingly, we find that less is more, as training ReSet with high-quality, yet substantially smaller data (three-fold less) yields superior results.

Instruction Following Multi-Task Learning

RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering

1 code implementation 19 Jul 2024 Rujun Han, Yuhao Zhang, Peng Qi, Yumo Xu, Jenyuan Wang, Lan Liu, William Yang Wang, Bonan Min, Vittorio Castelli

Question answering based on retrieval augmented generation (RAG-QA) is an important research topic in NLP and has a wide range of real-world applications.

Domain Generalization Form +6

A One-Layer Decoder-Only Transformer is a Two-Layer RNN: With an Application to Certified Robustness

no code implementations 27 May 2024 Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni

This paper reveals a key insight that a one-layer decoder-only Transformer is equivalent to a two-layer Recurrent Neural Network (RNN).

ARC Decoder +1

NGD-SLAM: Towards Real-Time Dynamic SLAM without GPU

1 code implementation 12 May 2024 Yuhao Zhang, Mihai Bujanca, Mikel Luján

However, these methods incur significant computational overhead as the camera tracking needs to wait for the deep neural network to generate mask at each frame, and they typically require GPUs for real-time operation, which restricts their practicality in real-world robotic applications.

Deep Learning Optical Flow Estimation +1

CodeFort: Robust Training for Code Generation Models

no code implementations 11 Apr 2024 Yuhao Zhang, Shiqi Wang, Haifeng Qian, Zijian Wang, Mingyue Shang, Linbo Liu, Sanjay Krishna Gouda, Baishakhi Ray, Murali Krishna Ramanathan, Xiaofei Ma, Anoop Deoras

Code generation models are not robust to small perturbations, which often lead to incorrect generations and significantly degrade the performance of these models.

Code Generation Contrastive Learning +1

Ensuring Safe and High-Quality Outputs: A Guideline Library Approach for Language Models

1 code implementation 18 Mar 2024 Yi Luo, Zhenghao Lin, Yuhao Zhang, Jiashuo Sun, Chen Lin, Chengjin Xu, Xiangdong Su, Yelong Shen, Jian Guo, Yeyun Gong

Subsequently, the retrieval model correlates new inputs with relevant guidelines, which guide LLMs in response generation to ensure safe and high-quality outputs, thereby aligning with human values.

Response Generation Retrieval

Verified Training for Counterfactual Explanation Robustness under Data Shift

no code implementations 6 Mar 2024 Anna P. Meyer, Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni

Our empirical evaluation demonstrates that VeriTraCER generates CEs that (1) are verifiably robust to small model updates and (2) display competitive robustness to state-of-the-art approaches in handling empirical model updates including random initialization, leave-one-out, and distribution shifts.

counterfactual Counterfactual Explanation

DragVideo: Interactive Drag-style Video Editing

1 code implementation 3 Dec 2023 Yufan Deng, Ruida Wang, Yuhao Zhang, Yu-Wing Tai, Chi-Keung Tang

The main issues are: 1) how to perform direct and accurate user control in editing; 2) how to execute editings like changing shape, expression, and layout without unsightly distortion and artifacts to the edited content; and 3) how to maintain spatio-temporal consistency of video after editing.

Video Editing Video Generation

Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models

1 code implementation 30 Nov 2023 Dong Li, Jiandong Jin, Yuhao Zhang, Yanlin Zhong, Yaoyang Wu, Lan Chen, Xiao Wang, Bin Luo

Current methods typically employ backbone networks to individually extract the features of RGB frames and event streams, and subsequently fuse these features for pattern recognition.

Language Modelling Prompt Engineering

TrainerAgent: Customizable and Efficient Model Training through LLM-Powered Multi-Agent System

no code implementations 11 Nov 2023 Haoyuan Li, Hao Jiang, Tianke Zhang, Zhelun Yu, Aoxiong Yin, Hao Cheng, Siming Fu, Yuhao Zhang, Wanggui He

We anticipate that our work will contribute to the advancement of research on TrainerAgent in both academic and industry communities, potentially establishing it as a new paradigm for model development in the field of AI.

Decision Making Language Modelling +1

Rethinking and Improving Multi-task Learning for End-to-end Speech Translation

1 code implementation 7 Nov 2023 Yuhao Zhang, Chen Xu, Bei Li, Hao Chen, Tong Xiao, Chunliang Zhang, Jingbo Zhu

Significant improvements in end-to-end speech translation (ST) have been achieved through the application of multi-task learning.

Multi-Task Learning

Bridging the Gaps of Both Modality and Language: Synchronous Bilingual CTC for Speech Translation and Speech Recognition

1 code implementation 21 Sep 2023 Chen Xu, Xiaoqian Liu, Erfeng He, Yuhao Zhang, Qianqian Dong, Tong Xiao, Jingbo Zhu, Dapeng Man, Wu Yang

In this study, we present synchronous bilingual Connectionist Temporal Classification (CTC), an innovative framework that leverages dual CTC to bridge the gaps of both modality and language in the speech translation (ST) task.

speech-recognition Speech Recognition +1

Channel sensing for holographic interference surfaces based on the principle of interferometry

no code implementations 20 Aug 2023 Jindiao Huang, Yuyao Wu, Haifan Yin, Yuhao Zhang, Ruikun Zhang

In this paper, we derive the principles of holographic interference theory for electromagnetic wave reception and transmission, whereby the optical holography is extended to communication holography and a channel sensing architecture for holographic interference surfaces is established.

CTC-based Non-autoregressive Speech Translation

1 code implementation 27 May 2023 Chen Xu, Xiaoqian Liu, Xiaowen Liu, Qingxuan Sun, Yuhao Zhang, Murun Yang, Qianqian Dong, Tom Ko, Mingxuan Wang, Tong Xiao, Anxiang Ma, Jingbo Zhu

Combining end-to-end speech translation (ST) and non-autoregressive (NAR) generation is promising in language and speech processing for their advantages of less error propagation and low latency.

Translation

Bridging the Granularity Gap for Acoustic Modeling

1 code implementation 27 May 2023 Chen Xu, Yuhao Zhang, Chengbo Jiao, Xiaoqian Liu, Chi Hu, Xin Zeng, Tong Xiao, Anxiang Ma, Huizhen Wang, Jingbo Zhu

While Transformer has become the de-facto standard for speech, modeling upon the fine-grained frame-level features remains an open challenge of capturing long-distance dependencies and distributing the attention weights.

speech-recognition Speech Recognition

A multi-functional simulation platform for on-demand ride service operations

1 code implementation 22 Mar 2023 Siyuan Feng, Taijie Chen, Yuhao Zhang, Jintao Ke, Zhengfei Zheng, Hai Yang

In addition, the existing simulators still face many challenges, ranging from their closeness to real environments of ride-sourcing systems, to the completeness of different tasks they can implement.

Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects

1 code implementation 13 Feb 2023 Linyi Li, Yuhao Zhang, Luyao Ren, Yingfei Xiong, Tao Xie

To assure high reliability against numerical defects, in this paper, we propose the RANUM approach including novel techniques for three reliability assurance tasks: detection of potential numerical defects, confirmation of potential-defect feasibility, and suggestion of defect fixes.

PECAN: A Deterministic Certified Defense Against Backdoor Attacks

no code implementations 27 Jan 2023 Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni

Neural networks are vulnerable to backdoor poisoning attacks, where the attackers maliciously poison the training set and insert triggers into the test input to change the prediction of the victim model.

backdoor defense image-classification +2

Tokenization Consistency Matters for Generative Models on Extractive NLP Tasks

1 code implementation 19 Dec 2022 Kaiser Sun, Peng Qi, Yuhao Zhang, Lan Liu, William Yang Wang, Zhiheng Huang

We show that, with consistent tokenization, the model performs better in both in-domain and out-of-domain datasets, with a notable average of +1.7 F2 gain when a BART model is trained on SQuAD and evaluated on 8 QA datasets.

Extractive Question-Answering Hallucination +1

Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations

no code implementations 17 Dec 2022 Jifan Chen, Yuhao Zhang, Lan Liu, Rui Dong, Xinchi Chen, Patrick Ng, William Yang Wang, Zhiheng Huang

There has been great progress in unifying various table-to-text tasks using a single encoder-decoder model trained via multi-task learning (Xie et al., 2022).

Decoder Multi-Task Learning

Overwatch: Learning Patterns in Code Edit Sequences

no code implementations 25 Jul 2022 Yuhao Zhang, Yasharth Bajpai, Priyanshu Gupta, Ameya Ketkar, Miltiadis Allamanis, Titus Barik, Sumit Gulwani, Arjun Radhakrishna, Mohammad Raza, Gustavo Soares, Ashish Tiwari

Our experiments show that Overwatch has 78% precision and that Overwatch not only completed edits when developers missed the opportunity to use the IDE tool support but also predicted new edits that have no tool support in the IDE.

Robustar: Interactive Toolbox Supporting Precise Data Annotation for Robust Vision Learning

1 code implementation 18 Jul 2022 Chonghan Chen, Haohan Wang, Leyang Hu, Yuhao Zhang, Shuguang Lyu, Jingcheng Wu, Xinnuo Li, Linjing Sun, Eric P. Xing

We introduce the initial release of our software Robustar, which aims to improve the robustness of vision classification machine learning models through a data-driven perspective.

BIG-bench Machine Learning image-classification +1

Vertical GaN Diode BV Maximization through Rapid TCAD Simulation and ML-enabled Surrogate Model

no code implementations 18 Jul 2022 Albert Lu, Jordan Marshall, Yifan Wang, Ming Xiao, Yuhao Zhang, Hiu Yung Wong

In this paper, two methodologies are used to speed up the maximization of the breakdown voltage (BV) of a vertical GaN diode that has a theoretical maximum BV of ~2100V.

An Ultra-low Power TinyML System for Real-time Visual Processing at Edge

1 code implementation 11 Jul 2022 Kunran Xu, Huawei Zhang, Yishi Li, Yuhao Zhang, Rui Lai, Yi Liu

Tiny machine learning (TinyML), executing AI workloads on resource and power strictly restricted systems, is an important and challenging topic.

object-detection Object Detection

BagFlip: A Certified Defense against Data Poisoning

1 code implementation 26 May 2022 Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni

Machine learning models are vulnerable to data-poisoning attacks, in which an attacker maliciously modifies the training set to change the prediction of a learned model.

Backdoor Attack Data Poisoning +3

Towards Lossless ANN-SNN Conversion under Ultra-Low Latency with Dual-Phase Optimization

1 code implementation 16 May 2022 ZiMing Wang, Shuang Lian, Yuhao Zhang, Xiaoxin Cui, Rui Yan, Huajin Tang

By evaluating on challenging datasets including CIFAR-10, CIFAR-100 and ImageNet, the proposed method demonstrates the state-of-the-art performance in terms of accuracy, latency and energy preservation.

object-detection Object Detection +1

RadGraph: Extracting Clinical Entities and Relations from Radiology Reports

1 code implementation 28 Jun 2021 Saahil Jain, Ashwin Agrawal, Adriel Saporta, Steven QH Truong, Du Nguyen Duong, Tan Bui, Pierre Chambon, Yuhao Zhang, Matthew P. Lungren, Andrew Y. Ng, Curtis P. Langlotz, Pranav Rajpurkar

We release a development dataset, which contains board-certified radiologist annotations for 500 radiology reports from the MIMIC-CXR dataset (14,579 entities and 10,889 relations), and a test dataset, which contains two independent sets of board-certified radiologist annotations for 100 radiology reports split equally across the MIMIC-CXR and CheXpert datasets.

Relation Extraction

Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders

no code implementations ACL 2021 Chen Xu, Bojie Hu, Yanyang Li, Yuhao Zhang, Shen Huang, Qi Ju, Tong Xiao, Jingbo Zhu

To our knowledge, we are the first to develop an end-to-end ST system that achieves comparable or even better BLEU performance than the cascaded ST counterpart when large-scale ASR and MT data is available.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Brain Tumors Classification for MR images based on Attention Guided Deep Learning Model

no code implementations 6 Apr 2021 Yuhao Zhang, Shuhang Wang, Haoxiang Wu, Kejia Hu, Shufan Ji

In the clinical diagnosis and treatment of brain tumors, manual image reading consumes a lot of energy and time.

General Classification

Certified Robustness to Programmable Transformations in LSTMs

1 code implementation EMNLP 2021 Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni

Deep neural networks for natural language processing are fragile in the face of adversarial examples -- small input perturbations, like synonym substitution or word duplication, which cause a neural network to change its prediction.

Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation

3 code implementations NAACL 2021 Yasuhide Miura, Yuhao Zhang, Emily Bao Tsai, Curtis P. Langlotz, Dan Jurafsky

We further show via a human evaluation and a qualitative analysis that our system leads to generations that are more factually complete and consistent compared to the baselines.

Image to text Natural Language Inference +1

Contrastive Learning of Medical Visual Representations from Paired Images and Text

7 code implementations 2 Oct 2020 Yuhao Zhang, Hang Jiang, Yasuhide Miura, Christopher D. Manning, Curtis P. Langlotz

Existing work commonly relies on fine-tuning weights transferred from ImageNet pretraining, which is suboptimal due to drastically different image characteristics, or rule-based label extraction from the textual report data paired with medical images, which is inaccurate and hard to generalize.

Contrastive Learning Descriptive +4

Do Syntax Trees Help Pre-trained Transformers Extract Information?

1 code implementation EACL 2021 Devendra Singh Sachan, Yuhao Zhang, Peng Qi, William Hamilton

Our empirical analysis demonstrates that these syntax-infused transformers obtain state-of-the-art results on SRL and relation extraction tasks.

Graph Neural Network named-entity-recognition +5

Learning Architectures from an Extended Search Space for Language Modeling

no code implementations ACL 2020 Yinqiao Li, Chi Hu, Yuhao Zhang, Nuo Xu, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, Changliang Li

Neural architecture search (NAS) has advanced significantly in recent years but most NAS systems restrict search to learning architectures of a recurrent or convolutional cell.

Chunking Language Modeling +5

Robustness to Programmable String Transformations via Augmented Abstract Training

1 code implementation ICML 2020 Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni

We then present an approach to adversarially training models that are robust to such user-defined string transformations.

Learning to Summarize Radiology Findings

2 code implementations WS 2018 Yuhao Zhang, Daisy Yi Ding, Tianpei Qian, Christopher D. Manning, Curtis P. Langlotz

The Impression section of a radiology report summarizes crucial radiology findings in natural language and plays a central role in communicating these findings to physicians.

MULDEF: Multi-model-based Defense Against Adversarial Examples for Neural Networks

no code implementations 31 Aug 2018 Siwakorn Srisakaokul, Yuhao Zhang, Zexuan Zhong, Wei Yang, Tao Xie, Bo Li

In particular, given a target model, our framework includes multiple models (constructed from the target model) to form a model family.

Diversity

Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search

3 code implementations 21 May 2018 Jinfeng Rao, Wei Yang, Yuhao Zhang, Ferhan Ture, Jimmy Lin

To our best knowledge, this paper presents the first substantial work tackling search over social media posts using neural ranking models.

Information Retrieval Retrieval

Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning

2 code implementations 30 Jan 2018 Xuan Wang, Yu Zhang, Xiang Ren, Yuhao Zhang, Marinka Zitnik, Jingbo Shang, Curtis Langlotz, Jiawei Han

Motivation: State-of-the-art biomedical named entity recognition (BioNER) systems often require handcrafted features specific to each entity type, such as genes, chemicals and diseases.

Feature Engineering Multi-Task Learning +4

Segmental Convolutional Neural Networks for Detection of Cardiac Abnormality With Noisy Heart Sound Recordings

no code implementations 6 Dec 2016 Yuhao Zhang, Sandeep Ayyar, Long-Huei Chen, Ethan J. Li

Heart diseases constitute a global health burden, and the problem is exacerbated by the error-prone nature of listening to and interpreting heart sounds.

Classification General Classification

Deep Convolutional Network for Handwritten Chinese Character Recognition

1 code implementation standford.edu 2015 Yuhao Zhang

In this project we explored the performance of deep convolutional neural network on recognizing handwritten Chinese characters.
