Search Results for author: Xiang Dai

Found 17 papers, 5 papers with code

Identifying Health Risks from Family History: A Survey of Natural Language Processing Techniques

no code implementations · 15 Mar 2024 · Xiang Dai, Sarvnaz Karimi, Nathan O'Callaghan

In addition to the areas where NLP has been successfully utilised, we identify areas where more research is needed to unlock the value of patients' records, in terms of data collection, task formulation and downstream applications.

Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods

no code implementations · 24 Nov 2022 · Xiang Dai, Sarvnaz Karimi

Information Extraction from scientific literature can be challenging due to the highly specialised nature of such text.

An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification

no code implementations · 11 Oct 2022 · Ilias Chalkidis, Xiang Dai, Manos Fergadiotis, Prodromos Malakasiotis, Desmond Elliott

Non-hierarchical sparse attention Transformer-based models, such as Longformer and Big Bird, are popular approaches to working with long documents.

Document Classification
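
To make the mentioned architecture concrete, below is a minimal sketch of long-document classification with Longformer's sparse attention, assuming the Hugging Face transformers library; the checkpoint name, label count, and input text are illustrative placeholders, not details from the paper.

```python
# A minimal sketch of classifying a long document with Longformer's
# sparse attention. Assumes Hugging Face `transformers` and `torch`;
# the checkpoint and num_labels are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")
model = AutoModelForSequenceClassification.from_pretrained(
    "allenai/longformer-base-4096", num_labels=2
)

text = "A very long document ... " * 500  # stand-in for a long input
inputs = tokenizer(text, truncation=True, max_length=4096, return_tensors="pt")

# Longformer uses sliding-window (local) attention by default; tokens
# flagged in global_attention_mask additionally attend to all positions.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1  # give the <s>/[CLS] token global attention

with torch.no_grad():
    logits = model(**inputs, global_attention_mask=global_attention_mask).logits
print(logits.softmax(dim=-1))
```

The sliding-window attention keeps cost roughly linear in sequence length; only the tokens flagged in `global_attention_mask` attend to every position.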

Tensorial tomographic differential phase-contrast microscopy

no code implementations · 25 Apr 2022 · Shiqi Xu, Xiang Dai, Xi Yang, Kevin C. Zhou, Kanghyun Kim, Vinayak Pathak, Carolyn Glass, Roarke Horstmeyer

We report Tensorial Tomographic Differential Phase-Contrast microscopy (T2DPC), a quantitative label-free tomographic imaging method for simultaneous measurement of phase and anisotropy.

Revisiting Transformer-based Models for Long Document Classification

1 code implementation · 14 Apr 2022 · Xiang Dai, Ilias Chalkidis, Sune Darkner, Desmond Elliott

The recent literature in text classification is biased towards short text sequences (e.g., sentences or paragraphs).

Document Classification · Text Classification

MDAPT: Multilingual Domain Adaptive Pretraining in a Single Model

1 code implementation · Findings (EMNLP) 2021 · Rasmus Kær Jørgensen, Mareike Hartmann, Xiang Dai, Desmond Elliott

Domain adaptive pretraining, i.e., the continued unsupervised pretraining of a language model on domain-specific text, improves the modelling of text for downstream tasks within the domain.

Language Modelling · Named Entity Recognition +4
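
As a concrete, hedged illustration of continued unsupervised pretraining, the sketch below runs the standard masked-language-modelling objective over a domain corpus with Hugging Face transformers; the checkpoint, corpus path, and hyperparameters are placeholders rather than the paper's actual setup.

```python
# A minimal sketch of domain adaptive pretraining: continue masked-language
# modelling on domain-specific text. Assumes Hugging Face `transformers` and
# `datasets`; checkpoint, corpus path, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

checkpoint = "bert-base-multilingual-cased"  # illustrative starting point
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# One domain sentence per line in a plain-text file (path is hypothetical).
corpus = load_dataset("text", data_files={"train": "domain_corpus.txt"})
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dapt-checkpoint",
                           num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=tokenized["train"],
    # Mask 15% of tokens: the standard MLM objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer,
                                                  mlm_probability=0.15),
)
trainer.train()
```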

Recognising Biomedical Names: Challenges and Solutions

no code implementations · 23 Jun 2021 · Xiang Dai

However, there are several open challenges in applying these models to recognise biomedical names: 1) biomedical names may have a complex inner structure (discontinuity and overlapping) that cannot be captured by standard sequence tagging techniques; 2) training NER models usually requires large amounts of labelled data, which are difficult to obtain in the biomedical domain; and 3) commonly used language representation models are pre-trained on generic data, so a domain shift exists between these models and the target biomedical data.

Data Augmentation · NER
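
A small illustration of the first challenge, discontinuity (an example constructed for this listing, not taken from the thesis): a flat BIO tag sequence assigns exactly one label per token, so it cannot express a mention whose tokens are separated by a gap.

```python
# Why flat BIO tagging cannot encode a discontinuous mention: in the phrase
# below the intended mention is "muscle pain", but "pain" is separated from
# "muscle" by "aches and". (Constructed toy example.)
tokens = ["muscle", "aches", "and", "pain"]

# The closest flat BIO labelling merges the gap words into the mention:
bio = ["B-ADR", "I-ADR", "I-ADR", "I-ADR"]  # wrongly includes "aches and"

# A span-based representation can state the gold mention exactly:
gold_mention = {"type": "ADR", "token_indices": [0, 3]}  # "muscle" + "pain"
print(list(zip(tokens, bio)))
print(gold_mention)
```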

NLNDE at CANTEMIST: Neural Sequence Labeling and Parsing Approaches for Clinical Concept Extraction

no code implementations · 23 Oct 2020 · Lukas Lange, Xiang Dai, Heike Adel, Jannik Strötgen

The recognition and normalization of clinical information, such as tumor morphology mentions, is an important, but complex process consisting of multiple subtasks.

Clinical Concept Extraction

An Analysis of Simple Data Augmentation for Named Entity Recognition

2 code implementations · COLING 2020 · Xiang Dai, Heike Adel

Simple yet effective data augmentation techniques have been proposed for sentence-level and sentence-pair natural language processing tasks.

Data Augmentation · Named Entity Recognition +3
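
One of the simple techniques analysed in this line of work is label-wise token replacement: swap a token for another training-set token that carries the same label, leaving the label sequence untouched. The sketch below is a minimal, assumed implementation of that idea, not the paper's released code.

```python
# A minimal sketch of label-wise token replacement for NER augmentation:
# each token may be swapped for another training-set token with the same
# label, so the label sequence stays valid. (Assumed implementation.)
import random
from collections import defaultdict

def build_label_vocab(training_data):
    """Map each label to the list of tokens observed with that label."""
    vocab = defaultdict(list)
    for tokens, labels in training_data:
        for tok, lab in zip(tokens, labels):
            vocab[lab].append(tok)
    return vocab

def label_wise_replacement(tokens, labels, vocab, p=0.3, seed=None):
    """Replace each token, with probability p, by a same-labelled token."""
    rng = random.Random(seed)
    out = []
    for tok, lab in zip(tokens, labels):
        if vocab[lab] and rng.random() < p:
            out.append(rng.choice(vocab[lab]))
        else:
            out.append(tok)
    return out, labels

# Toy example; real data would be a labelled NER training set.
train = [(["She", "has", "cancer"], ["O", "O", "B-DISEASE"]),
         (["He", "has", "diabetes"], ["O", "O", "B-DISEASE"])]
vocab = build_label_vocab(train)
print(label_wise_replacement(*train[0], vocab, seed=1))
```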

An Effective Transition-based Model for Discontinuous NER

1 code implementation · ACL 2020 · Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris

Unlike widely used Named Entity Recognition (NER) data sets in generic domains, biomedical NER data sets often contain mentions consisting of discontinuous spans.

Named Entity Recognition +1
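
The transition-based idea can be illustrated with a deliberately simplified stack-and-buffer system (this is not the paper's exact action set, which is richer): actions consume tokens from a buffer, skip non-entity tokens, and assemble possibly non-adjacent tokens into one mention.

```python
# A simplified stack-and-buffer illustration of transition-based
# discontinuous NER (constructed sketch, not the paper's action set).
def run_transitions(tokens, actions):
    buffer, stack, mentions = list(tokens), [], []
    for action in actions:
        if action == "SHIFT":        # move next token into the partial mention
            stack.append(buffer.pop(0))
        elif action == "OUT":        # next token is outside any mention
            buffer.pop(0)
        elif action == "COMPLETE":   # close the partial mention
            mentions.append(tuple(stack))
            stack = []
    return mentions

# "abdominal pain" is a discontinuous mention spanning tokens 0 and 3.
tokens = ["abdominal", "cramping", "and", "pain"]
print(run_transitions(tokens, ["SHIFT", "OUT", "OUT", "SHIFT", "COMPLETE"]))
# -> [('abdominal', 'pain')]
```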

Using Similarity Measures to Select Pretraining Data for NER

1 code implementation · NAACL 2019 · Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris

Word vectors and Language Models (LMs) pretrained on a large amount of unlabelled data can dramatically improve various Natural Language Processing (NLP) tasks.

Named Entity Recognition
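
One plausible similarity measure in this spirit is target-vocabulary overlap between a candidate pretraining corpus and the target corpus. The sketch below is an assumed toy implementation and may differ from the measures actually used in the paper.

```python
# A toy sketch of corpus similarity via target-vocabulary overlap:
# what fraction of the target corpus's frequent word types also appear
# among the candidate source corpus's frequent types. (Assumed measure.)
from collections import Counter

def top_vocab(corpus, k=1000):
    """The k most frequent lowercased word types in a list of sentences."""
    counts = Counter(tok.lower() for sent in corpus for tok in sent.split())
    return {tok for tok, _ in counts.most_common(k)}

def vocab_overlap(source_corpus, target_corpus, k=1000):
    """Fraction of the target's top-k vocabulary found in the source's."""
    src, tgt = top_vocab(source_corpus, k), top_vocab(target_corpus, k)
    return len(src & tgt) / len(tgt)

target = ["the patient reported severe chest pain", "nausea was noted"]
candidate = ["patient care and chest pain management", "stock prices rose"]
print(f"overlap = {vocab_overlap(candidate, target, k=50):.2f}")
```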

Recognizing Complex Entity Mentions: A Review and Future Directions

no code implementations · ACL 2018 · Xiang Dai

Standard named entity recognizers can effectively recognize entity mentions that consist of contiguous tokens and do not overlap with each other.

Entity Linking · Named Entity Recognition (NER) +3
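
A small constructed example of the overlapping case (not from the paper): nested mentions require two labels on the same token, which a single flat tag sequence cannot provide.

```python
# Nested mentions that one flat BIO sequence cannot represent: each token
# gets a single label, but here "North Carolina" is nested inside
# "University of North Carolina". (Constructed toy example.)
tokens = ["University", "of", "North", "Carolina"]
mentions = [
    {"type": "ORG", "start": 0, "end": 4},  # University of North Carolina
    {"type": "LOC", "start": 2, "end": 4},  # North Carolina (nested)
]
# Tokens 2 and 3 would each need two labels at once (I-ORG and B-LOC/I-LOC),
# which flat sequence tagging cannot assign.
for m in mentions:
    print(m["type"], tokens[m["start"]:m["end"]])
```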
