Search Results for author: Da Ma

Found 24 papers, 6 papers with code

Reducing Tool Hallucination via Reliability Alignment

no code implementations • 5 Dec 2024 • Hongshen Xu, Su Zhu, Zihan Wang, Hang Zheng, Da Ma, Ruisheng Cao, Shuai Fan, Lu Chen, Kai Yu

Large Language Models (LLMs) have extended their capabilities beyond language generation to interact with external systems through tool calling, offering powerful potential for real-world applications.

Hallucination, Text Generation

Compressing KV Cache for Long-Context LLM Inference with Inter-Layer Attention Similarity

no code implementations • 3 Dec 2024 • Da Ma, Lu Chen, Situo Zhang, Yuxun Miao, Su Zhu, Zhi Chen, Hongshen Xu, Hanqi Li, Shuai Fan, Lei Pan, Kai Yu

The increasing context window size in Large Language Models (LLMs), such as the GPT and LLaMA series, has improved their ability to tackle complex, long-text tasks, but at the cost of inference efficiency, particularly regarding memory and computational complexity.

Text Generation
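
The title above names inter-layer attention similarity as the signal for compressing the KV cache. The sketch below only illustrates that general idea, not the paper's algorithm: it compares attention maps of neighbouring layers and lets sufficiently similar layers reuse an earlier layer's KV cache instead of storing their own (the cosine metric, the threshold, and the sharing rule are all assumptions).

    import numpy as np

    def cosine_sim(a, b):
        a, b = a.ravel(), b.ravel()
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    def plan_kv_sharing(attn_maps, threshold=0.9):
        """attn_maps: per-layer attention matrices (seq_len x seq_len).
        Returns, for each layer, the index of the layer whose KV cache it reuses."""
        share_with = [0]
        for i in range(1, len(attn_maps)):
            owner = share_with[i - 1]
            if cosine_sim(attn_maps[i], attn_maps[owner]) >= threshold:
                share_with.append(owner)   # similar enough: reuse the earlier KV cache
            else:
                share_with.append(i)       # keep a dedicated KV cache
        return share_with

    rng = np.random.default_rng(0)
    base = rng.random((8, 8))
    maps = [base + 0.01 * rng.random((8, 8)) for _ in range(4)]   # four similar layers
    print(plan_kv_sharing(maps))  # e.g. [0, 0, 0, 0]: layers 1-3 share layer 0's cache

Under such a scheme, cache memory scales with the number of distinct cache owners rather than with the number of layers.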

SciDFM: A Large Language Model with Mixture-of-Experts for Science

no code implementations • 27 Sep 2024 • Liangtai Sun, Danyu Luo, Da Ma, Zihan Zhao, Baocai Chen, Zhennan Shen, Su Zhu, Lu Chen, Xin Chen, Kai Yu

We further analyze the expert layers and show that the results of expert selection vary with data from different disciplines.

Language Modelling, Large Language Model +1
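
To make the "expert selection varies by discipline" observation concrete, here is a minimal, generic top-k gating sketch that counts how often each expert is chosen for tokens from two hypothetical data splits; SciDFM's actual router, expert count, and analysis procedure are not reproduced here.

    import numpy as np

    def topk_experts(token_embeddings, gate_weights, k=2):
        """Softmax gate over experts; return the top-k expert ids per token."""
        logits = token_embeddings @ gate_weights
        probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
        probs /= probs.sum(axis=-1, keepdims=True)
        return np.argsort(-probs, axis=-1)[:, :k]

    rng = np.random.default_rng(0)
    gate = rng.normal(size=(16, 8))                 # 8 hypothetical experts
    for discipline in ("chemistry", "physics"):     # hypothetical data splits
        tokens = rng.normal(size=(32, 16))
        counts = np.bincount(topk_experts(tokens, gate).ravel(), minlength=8)
        print(discipline, counts)                   # per-expert selection frequency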

Evolving Subnetwork Training for Large Language Models

no code implementations • 11 Jun 2024 • Hanqi Li, Lu Chen, Da Ma, Zijian Wu, Su Zhu, Kai Yu

In this paper, inspired by the redundancy in the parameters of large language models, we propose a novel training paradigm: Evolving Subnetwork Training (EST).

Language Modelling, Large Language Model
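
As a rough illustration of training only a subnetwork at a time (the idea named in the title), the sketch below samples a growing subset of layers to update at each step; the linear growth schedule and uniform sampling are assumptions for illustration, not EST's actual rule.

    import random

    def sample_subnetwork(n_layers, step, total_steps, min_frac=0.5):
        """Pick the layers to train at this step; the active fraction grows
        linearly from min_frac to 1.0 over training (illustrative schedule)."""
        frac = min_frac + (1.0 - min_frac) * step / max(total_steps - 1, 1)
        n_active = max(1, round(frac * n_layers))
        return sorted(random.sample(range(n_layers), n_active))

    for step in (0, 3, 6, 9):
        print(step, sample_subnetwork(n_layers=12, step=step, total_steps=10))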

Sparsity-Accelerated Training for Large Language Models

no code implementations • 3 Jun 2024 • Da Ma, Lu Chen, Pengyu Wang, Hongshen Xu, Hanqi Li, Liangtai Sun, Su Zhu, Shuai Fan, Kai Yu

Large language models (LLMs) have demonstrated proficiency across various natural language processing (NLP) tasks but often require additional training, such as continual pre-training and supervised fine-tuning.
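
One generic way sparsity can speed up such additional training is to restrict updates to the neurons that matter most on a calibration batch; the ranking criterion and keep ratio below are illustrative assumptions, not the paper's method.

    import numpy as np

    def select_active_neurons(activations, keep_ratio=0.25):
        """activations: (tokens, hidden) from a calibration batch.
        Keep the hidden units with the largest mean absolute activation."""
        importance = np.abs(activations).mean(axis=0)
        k = max(1, int(keep_ratio * importance.size))
        return np.argsort(-importance)[:k]

    rng = np.random.default_rng(0)
    acts = rng.normal(size=(64, 16))
    active = select_active_neurons(acts)
    print("update only these hidden units:", sorted(active.tolist()))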

Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback

no code implementations • 27 Mar 2024 • Hongshen Xu, Zichen Zhu, Situo Zhang, Da Ma, Shuai Fan, Lu Chen, Kai Yu

Large Language Models (LLMs) often generate erroneous outputs, known as hallucinations, due to their limitations in discerning questions beyond their knowledge scope.

Hallucination
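
The refusal behaviour this paper targets can be motivated with a toy reward: if a wrong answer is penalized more than a refusal, answering is only worthwhile when the model is likely to be correct. The reward values below are illustrative, not the paper's reward model.

    def reward(answer, gold, refused):
        if refused:
            return 0.0                             # refusal: no credit, no penalty
        return 1.0 if answer == gold else -1.0     # wrong guesses are costly

    def best_policy(p_correct):
        expected_answer_reward = p_correct * 1.0 + (1 - p_correct) * -1.0
        return "answer" if expected_answer_reward > 0.0 else "refuse"

    print(best_policy(0.8))  # answer
    print(best_policy(0.3))  # refuse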

Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding

1 code implementation • 28 Feb 2024 • Hongshen Xu, Lu Chen, Zihan Zhao, Da Ma, Ruisheng Cao, Zichen Zhu, Kai Yu

Additionally, we propose several pre-training tasks to model the interaction among text, structure, and image modalities effectively.

document understanding, Information Retrieval +1

ChemDFM: A Large Language Foundation Model for Chemistry

1 code implementation • 26 Jan 2024 • Zihan Zhao, Da Ma, Lu Chen, Liangtai Sun, Zihao Li, Yi Xia, Bo Chen, Hongshen Xu, Zichen Zhu, Su Zhu, Shuai Fan, Guodong Shen, Kai Yu, Xin Chen

In its most advanced form, such a generalist AI chemist could be referred to as Chemical General Intelligence.

ASTormer: An AST Structure-aware Transformer Decoder for Text-to-SQL

no code implementations • 28 Oct 2023 • Ruisheng Cao, Hanchong Zhang, Hongshen Xu, Jieyu Li, Da Ma, Lu Chen, Kai Yu

Text-to-SQL aims to generate an executable SQL program given the user utterance and the corresponding database schema.

Decoder, Text-To-SQL
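
A minimal input/output example of the text-to-SQL task described above (the schema and question are made up for illustration; ASTormer itself decodes the query as an abstract syntax tree rather than a flat token string):

    example = {
        "question": "How many singers are older than 30?",
        "schema": {"singer": ["singer_id", "name", "age"]},
        "sql": "SELECT COUNT(*) FROM singer WHERE age > 30",
    }
    print(example["sql"])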

Spectral Bandwidth Recovery of Optical Coherence Tomography Images using Deep Learning

no code implementations • 2 Jan 2023 • Timothy T. Yu, Da Ma, Jayden Cole, Myeong Jin Ju, Mirza F. Beg, Marinko V. Sarunic

Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases.

Decision Making, Generative Adversarial Network +1

Differential Diagnosis of Frontotemporal Dementia and Alzheimer's Disease using Generative Adversarial Network

no code implementations • 12 Sep 2021 • Da Ma, Donghuan Lu, Karteek Popuri, Mirza Faisal Beg

Frontotemporal dementia and Alzheimer's disease are two common forms of dementia and are easily misdiagnosed as each other due to their similar pattern of clinical symptoms.

Binary Classification, Data Augmentation +1

Domain Adaptation via CycleGAN for Retina Segmentation in Optical Coherence Tomography

no code implementations • 6 Jul 2021 • Ricky Chen, Timothy T. Yu, Gavin Xu, Da Ma, Marinko V. Sarunic, Mirza Faisal Beg

In this study, we investigated a learning-based approach to adapting the domain of a publicly available dataset, the UK Biobank (UKB).

Decision Making, Domain Adaptation
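
The cycle-consistency constraint at the heart of CycleGAN-style domain adaptation says that translating an image to the other domain and back should reconstruct it. The sketch below shows just that loss term with placeholder generators; it is not the study's training pipeline.

    import numpy as np

    def cycle_consistency_loss(x_source, x_target, G, F):
        """G: source -> target, F: target -> source (both callables)."""
        loss_src = np.abs(F(G(x_source)) - x_source).mean()   # x -> G(x) -> F(G(x))
        loss_tgt = np.abs(G(F(x_target)) - x_target).mean()   # y -> F(y) -> G(F(y))
        return loss_src + loss_tgt

    x, y = np.ones((4, 4)), np.zeros((4, 4))
    print(cycle_consistency_loss(x, y, G=lambda a: a, F=lambda a: a))  # 0.0 for identity maps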

Comprehensive Validation of Automated Whole Body Skeletal Muscle, Adipose Tissue, and Bone Segmentation from 3D CT images for Body Composition Analysis: Towards Extended Body Composition

no code implementations • 1 Jun 2021 • Da Ma, Vincent Chow, Karteek Popuri, Mirza Faisal Beg

The latest advances in computer-assisted precision medicine are making it feasible to move from population-wide models, which are useful for discovering aggregate patterns in group-based analysis, to patient-specific models that can drive individual decisions about treatment choices and predicted treatment outcomes.

Anatomy, Image Segmentation +2

Cascaded Deep Neural Networks for Retinal Layer Segmentation of Optical Coherence Tomography with Fluid Presence

no code implementations • 7 Dec 2019 • Donghuan Lu, Morgan Heisler, Da Ma, Setareh Dabiri, Sieun Lee, Gavin Weiguang Ding, Marinko V. Sarunic, Mirza Faisal Beg

Optical coherence tomography (OCT) is a non-invasive imaging technology which can provide micrometer-resolution cross-sectional images of the inner structures of the eye.

Grey matter sublayer thickness estimation in the mouse cerebellum

1 code implementation • 8 Jan 2019 • Da Ma, Manuel J. Cardoso, Maria A. Zuluaga, Marc Modat, Nick Powell, Frances Wiseman, Victor Tybulewicz, Elizabeth Fisher, Mark F. Lythgoe, Sebastien Ourselin

In this work, we introduce a framework to extract the Purkinje layer within the grey matter, enabling the estimation of the thickness of the cerebellar grey matter, the granular layer and molecular layer from gadolinium-enhanced ex vivo mouse brain MRI.

Automatic structural parcellation of mouse brain MRI using multi-atlas label fusion

1 code implementation • 27 Jan 2014 • Da Ma, Manuel J. Cardoso, Marc Modat, Nick Powell, Jack Wells, Holly Holmes, Frances Wiseman, Victor Tybulewicz, Elizabeth Fisher, Mark F. Lythgoe, Sébastien Ourselin

The segmentation accuracy of the multi-atlas framework was evaluated on publicly available mouse brain atlas databases, using pre-segmented, manually labelled anatomical structures as the gold standard; optimised parameters were obtained for the STEPS label-fusion algorithm to achieve the best segmentation accuracy.

Segmentation
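
For intuition, the simplest form of multi-atlas label fusion is a per-voxel majority vote over the propagated atlas labels; the paper instead uses the STEPS algorithm, which weights atlases by local image similarity, so the baseline below only illustrates the fusion step.

    import numpy as np

    def majority_vote_fusion(atlas_labels):
        """atlas_labels: (n_atlases, *volume_shape) integer labels already
        registered to the target image. Returns the fused label volume."""
        stacked = np.asarray(atlas_labels)
        n_labels = stacked.max() + 1
        votes = np.stack([(stacked == l).sum(axis=0) for l in range(n_labels)])
        return votes.argmax(axis=0)

    labels = np.array([[[1, 1], [2, 0]],
                       [[1, 2], [2, 0]],
                       [[1, 1], [0, 0]]])   # three toy "atlases" of a 2x2 slice
    print(majority_vote_fusion(labels))      # [[1 1] [2 0]]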
