Search Results for author: Hoifung Poon

Found 54 papers, 16 papers with code

Offset Unlearning for Large Language Models

no code implementations 17 Apr 2024 James Y. Huang, Wenxuan Zhou, Fei Wang, Fred Morstatter, Sheng Zhang, Hoifung Poon, Muhao Chen

Despite the strong capabilities of Large Language Models (LLMs) to acquire knowledge from their training corpora, the memorization of sensitive information in those corpora, such as copyrighted, harmful, and private content, has led to ethical and legal concerns.

Memorization

T-Rex: Text-assisted Retrosynthesis Prediction

1 code implementation 26 Jan 2024 Yifeng Liu, Hanwen Xu, Tangqi Fang, Haocheng Xi, Zixuan Liu, Sheng Zhang, Hoifung Poon, Sheng Wang

As a fundamental task in computational chemistry, retrosynthesis prediction aims to identify a set of reactants to synthesize a target molecule.

Re-Ranking Retrosynthesis

BiomedJourney: Counterfactual Biomedical Image Generation by Instruction-Learning from Multimodal Patient Journeys

no code implementations 16 Oct 2023 Yu Gu, Jianwei Yang, Naoto Usuyama, Chunyuan Li, Sheng Zhang, Matthew P. Lungren, Jianfeng Gao, Hoifung Poon

In a comprehensive battery of tests on counterfactual medical image generation, BiomedJourney substantially outperforms prior state-of-the-art methods for instruction-based image editing and medical image generation, such as InstructPix2Pix and RoentGen.

counterfactual Denoising +2

Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events

no code implementations 12 Jul 2023 Yu Gu, Sheng Zhang, Naoto Usuyama, Yonas Woldesenbet, Cliff Wong, Praneeth Sanapathi, Mu Wei, Naveen Valluri, Erika Strandberg, Tristan Naumann, Hoifung Poon

We find that while LLMs already possess decent competency in structuring biomedical text, distilling them into a task-specific student model through self-supervised learning yields substantial gains over out-of-the-box LLMs, along with additional advantages such as lower cost, greater efficiency, and white-box model access.

Self-Supervised Learning
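To make the distillation recipe in this entry concrete, here is a minimal sketch under a generic set of assumptions: an LLM "teacher" annotates unlabeled biomedical sentences, and a small, white-box student model is trained on the resulting pseudo-labels. The `teacher_label` stub, the toy sentences, and the bag-of-words student are illustrative stand-ins, not the paper's pipeline.

```python
# Hypothetical sketch of LLM-to-student distillation for adverse-drug-event (ADE)
# detection. The teacher call is stubbed out; in practice it would prompt an LLM
# and parse its answer, and the student would be a biomedical transformer.
from dataclasses import dataclass

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression


@dataclass
class Example:
    text: str
    label: int  # 1 = sentence reports an adverse drug event, 0 = it does not


def teacher_label(sentence: str) -> int:
    """Stand-in for asking an LLM whether the sentence reports an ADE."""
    return int(any(w in sentence.lower() for w in ("rash", "nausea", "dizziness")))


unlabeled = [
    "Patient developed a rash after starting amoxicillin.",
    "Blood pressure was stable throughout the visit.",
    "She reported nausea two days after the new statin was added.",
    "Follow-up imaging showed no change.",
]

# Step 1: self-supervision; let the teacher annotate the unlabeled text.
pseudo_labeled = [Example(s, teacher_label(s)) for s in unlabeled]

# Step 2: train a compact, white-box student on the pseudo-labels.
vec = CountVectorizer()
X = vec.fit_transform([e.text for e in pseudo_labeled])
student = LogisticRegression().fit(X, [e.label for e in pseudo_labeled])

print(student.predict(vec.transform(["He experienced dizziness after the infusion."])))
```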

Automatic Calibration and Error Correction for Generative Large Language Models via Pareto Optimal Self-Supervision

no code implementations 28 Jun 2023 Theodore Zhao, Mu Wei, J. Samuel Preston, Hoifung Poon

Generative large language models (LLMs) have demonstrated remarkable capabilities for a wide range of applications, but reducing ungrounded or erroneous responses remains a major challenge.

Relation Extraction

LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day

no code implementations NeurIPS 2023 Chunyuan Li, Cliff Wong, Sheng Zhang, Naoto Usuyama, Haotian Liu, Jianwei Yang, Tristan Naumann, Hoifung Poon, Jianfeng Gao

In this paper, we propose a cost-efficient approach for training a vision-language conversational assistant that can answer open-ended research questions of biomedical images.

Instruction Following Language Modelling +2

Self-Verification Improves Few-Shot Clinical Information Extraction

1 code implementation 30 May 2023 Zelalem Gero, Chandan Singh, Hao Cheng, Tristan Naumann, Michel Galley, Jianfeng Gao, Hoifung Poon

Extracting patient information from unstructured text is a critical task in health decision-support and clinical research.

In-Context Learning
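As a rough illustration of the self-verification idea in this entry, the sketch below runs an extract step followed by a verify step that asks the model to ground each candidate in the source note. The `call_llm` placeholder and both prompt templates are assumptions standing in for a real LLM API, not the paper's prompts.

```python
# Hypothetical extract-then-verify loop for clinical information extraction.
# `call_llm` is a placeholder for any chat/completion API; prompts are illustrative.
from typing import List


def call_llm(prompt: str) -> str:
    """Placeholder LLM call; wire this up to a real API or local model."""
    raise NotImplementedError


def extract_medications(note: str) -> List[str]:
    answer = call_llm(
        "List every medication mentioned in the clinical note below, one per line.\n\n"
        f"Note:\n{note}\n\nMedications:"
    )
    return [line.strip() for line in answer.splitlines() if line.strip()]


def verify(note: str, candidate: str) -> bool:
    answer = call_llm(
        f'Does the note explicitly mention the medication "{candidate}"? '
        "Quote the supporting sentence, or answer NO.\n\n"
        f"Note:\n{note}"
    )
    return not answer.strip().upper().startswith("NO")


def extract_with_self_verification(note: str) -> List[str]:
    # Keep only candidates the model can ground in the note itself.
    return [m for m in extract_medications(note) if verify(note, m)]
```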

Context-faithful Prompting for Large Language Models

1 code implementation 20 Mar 2023 Wenxuan Zhou, Sheng Zhang, Hoifung Poon, Muhao Chen

However, their reliance on parametric knowledge may cause them to overlook contextual cues, leading to incorrect predictions in context-sensitive NLP tasks (e.g., knowledge acquisition tasks).

counterfactual Machine Reading Comprehension +1
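One way to make the context-faithful idea in this entry tangible is to reformulate the prompt so the model must answer according to the provided passage rather than its parametric memory. The templates below, including the narrator framing, are illustrative assumptions, not the paper's exact prompts.

```python
# Illustrative prompt templates that push a model to answer from the given
# context rather than from memorized world knowledge.

def base_prompt(context: str, question: str) -> str:
    return f"Context: {context}\nQuestion: {question}\nAnswer:"


def narrator_prompt(context: str, question: str) -> str:
    # Attribute the context to a narrator and ask for the answer according to
    # that narrator, which discourages falling back on memorized facts.
    return (
        f'Bob said, "{context}"\n'
        f"Question: {question.rstrip('?')} according to Bob's statement?\n"
        "Answer:"
    )


print(narrator_prompt(
    "The capital of Atlantis is Poseidonia.",   # a fact that exists only in the context
    "What is the capital of Atlantis?",
))
```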

BLIAM: Literature-based Data Synthesis for Synergistic Drug Combination Prediction

no code implementations 14 Feb 2023 Cai Yang, Addie Woicik, Hoifung Poon, Sheng Wang

Instead of obtaining features from language models, we propose BLIAM, a literature-based data synthesis approach to directly generate training data points that are interpretable and model-agnostic to downstream applications.

Data Augmentation Language Modelling

Continual Contrastive Finetuning Improves Low-Resource Relation Extraction

no code implementations 21 Dec 2022 Wenxuan Zhou, Sheng Zhang, Tristan Naumann, Muhao Chen, Hoifung Poon

In this paper, we aim to bridge the gap and propose to pretrain and finetune the RE model using consistent contrastive learning objectives.

Contrastive Learning Relation +3

BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining

2 code implementations 19 Oct 2022 Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon, Tie-Yan Liu

Pre-trained language models have attracted increasing attention in the biomedical domain, inspired by their great success in the general natural language domain.

 Ranked #1 on Document Classification on HOC (Micro F1 metric)

Document Classification Language Modelling +3

Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning

1 code implementation 30 Aug 2022 Sheng Zhang, Hao Cheng, Jianfeng Gao, Hoifung Poon

We present a bi-encoder framework for named entity recognition (NER), which applies contrastive learning to map candidate text spans and entity types into the same vector representation space.

Contrastive Learning Metric Learning +5
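The following is a minimal sketch of the bi-encoder-with-contrastive-learning idea in this entry: span and entity-type representations are mapped into a shared space and trained so each span lands closest to its gold type. The linear encoders, random features, and temperature value are placeholders, not the paper's model.

```python
# Minimal contrastive bi-encoder sketch: spans and entity types share one
# embedding space; each span is pulled toward its gold type and pushed away
# from the other types.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim = 32

span_encoder = torch.nn.Linear(dim, dim)   # stands in for a span encoder head
type_encoder = torch.nn.Linear(dim, dim)   # stands in for an entity-type encoder

# Toy features for 4 candidate spans and 3 entity types, plus gold type ids.
span_feats = torch.randn(4, dim)
type_feats = torch.randn(3, dim)
gold_types = torch.tensor([0, 2, 1, 0])


def contrastive_loss() -> torch.Tensor:
    s = F.normalize(span_encoder(span_feats), dim=-1)
    t = F.normalize(type_encoder(type_feats), dim=-1)
    logits = s @ t.T / 0.07          # cosine similarities with a temperature
    return F.cross_entropy(logits, gold_types)


params = list(span_encoder.parameters()) + list(type_encoder.parameters())
opt = torch.optim.Adam(params, lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    contrastive_loss().backward()
    opt.step()

# At inference, each span is assigned the nearest entity type in the shared space.
with torch.no_grad():
    sims = F.normalize(span_encoder(span_feats), dim=-1) @ F.normalize(type_encoder(type_feats), dim=-1).T
print(sims.argmax(dim=-1).tolist())   # should recover gold_types after training
```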

Making the Most of Text Semantics to Improve Biomedical Vision–Language Processing

1 code implementation 21 Apr 2022 Benedikt Boecking, Naoto Usuyama, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Stephanie Hyland, Maria Wetscherek, Tristan Naumann, Aditya Nori, Javier Alvarez-Valle, Hoifung Poon, Ozan Oktay

We release a new dataset with locally-aligned phrase grounding annotations by radiologists to facilitate the study of complex semantic modelling in biomedical vision–language processing.

Contrastive Learning Language Modelling +4

Knowledge-Rich Self-Supervision for Biomedical Entity Linking

no code implementations 15 Dec 2021 Sheng Zhang, Hao Cheng, Shikhar Vashishth, Cliff Wong, Jinfeng Xiao, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon

Zero-shot entity linking has emerged as a promising direction for generalizing to new entities, but it still requires example gold entity mentions during training and canonical descriptions for all entities, both of which are rarely available outside of Wikipedia.

Contrastive Learning Entity Linking

Modular Self-Supervision for Document-Level Relation Extraction

no code implementations EMNLP 2021 Sheng Zhang, Cliff Wong, Naoto Usuyama, Sarthak Jain, Tristan Naumann, Hoifung Poon

Extracting relations across large text spans has been relatively underexplored in NLP, but it is particularly important for high-value domains such as biomedicine, where obtaining high recall of the latest findings is crucial for practical applications.

Document-level Relation Extraction Reading Comprehension +1

Combining Probabilistic Logic and Deep Learning for Self-Supervised Learning

no code implementations 27 Jul 2021 Hoifung Poon, Hai Wang, Hunter Lang

We first present deep probabilistic logic (DPL), which offers a unifying framework for task-specific self-supervision by composing probabilistic logic with deep learning.

Active Learning Language Modelling +5

Targeted Adversarial Training for Natural Language Understanding

1 code implementation NAACL 2021 Lis Pereira, Xiaodong Liu, Hao Cheng, Hoifung Poon, Jianfeng Gao, Ichiro Kobayashi

We present a simple yet effective Targeted Adversarial Training (TAT) algorithm to improve adversarial training for natural language understanding.

Natural Language Understanding

Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing

1 code implementation 31 Jul 2020 Yu Gu, Robert Tinn, Hao Cheng, Michael Lucas, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon

In this paper, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models.

Continual Pretraining +11

Adversarial Training for Large Neural Language Models

3 code implementations 20 Apr 2020 Xiaodong Liu, Hao Cheng, Pengcheng He, Weizhu Chen, Yu Wang, Hoifung Poon, Jianfeng Gao

In natural language processing (NLP), pre-training large neural language models such as BERT has demonstrated impressive gains in generalization for a variety of tasks, with further improvement from adversarial fine-tuning.

Ranked #6 on Natural Language Inference on ANLI test (using extra training data)

Natural Language Inference Natural Language Understanding
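In the spirit of the adversarial fine-tuning described in this entry, the sketch below perturbs input embeddings in the loss-increasing direction and trains on both clean and perturbed inputs. The tiny classifier, single perturbation step, and plain cross-entropy regularizer are simplifying assumptions for illustration; the paper's exact formulation differs in detail.

```python
# Sketch of adversarial training in embedding space: find a small perturbation
# that increases the loss, then train on clean and perturbed inputs together.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
emb = torch.nn.Embedding(100, 16)
clf = torch.nn.Linear(16, 2)
opt = torch.optim.Adam(list(emb.parameters()) + list(clf.parameters()), lr=1e-3)

tokens = torch.randint(0, 100, (8, 5))   # a toy batch of 8 "sentences"
labels = torch.randint(0, 2, (8,))
eps = 1e-2                               # perturbation radius

for _ in range(10):
    x = emb(tokens).mean(dim=1)          # crude sentence embeddings
    clean_loss = F.cross_entropy(clf(x), labels)

    # Find an embedding-space perturbation that increases the loss.
    delta = torch.zeros_like(x, requires_grad=True)
    adv_loss = F.cross_entropy(clf(x.detach() + delta), labels)
    grad, = torch.autograd.grad(adv_loss, delta)
    delta = eps * grad / (grad.norm(dim=-1, keepdim=True) + 1e-8)

    # Train on clean + adversarial inputs (an embedding-smoothness regularizer).
    total = clean_loss + F.cross_entropy(clf(x + delta.detach()), labels)
    opt.zero_grad()
    total.backward()
    opt.step()

print(float(total))
```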

Deep Probabilistic Logic: A Unifying Framework for Indirect Supervision

no code implementations EMNLP 2018 Hai Wang, Hoifung Poon

In this paper, we propose deep probabilistic logic (DPL) as a general framework for indirect supervision, by composing probabilistic logic with deep learning.

Reading Comprehension Representation Learning

Neural-Symbolic Learning and Reasoning: A Survey and Interpretation

no code implementations 10 Nov 2017 Tarek R. Besold, Artur d'Avila Garcez, Sebastian Bader, Howard Bowman, Pedro Domingos, Pascal Hitzler, Kai-Uwe Kuehnberger, Luis C. Lamb, Daniel Lowd, Priscila Machado Vieira Lima, Leo de Penning, Gadi Pinkas, Hoifung Poon, Gerson Zaverucha

Recent studies in cognitive science, artificial intelligence, and psychology have produced a number of cognitive models of reasoning, learning, and language that are underpinned by computation.

Philosophy

EZLearn: Exploiting Organic Supervision in Large-Scale Data Annotation

no code implementations 25 Sep 2017 Maxim Grechkin, Hoifung Poon, Bill Howe

In science and other high-value domains, large repositories of data samples are often available, together with two sources of organic supervision: a lexicon for the annotation classes, and text descriptions that accompany some data samples.

NLP for Precision Medicine

no code implementations ACL 2017 Hoifung Poon, Chris Quirk, Kristina Toutanova, Wen-tau Yih

We will introduce precision medicine and showcase the vast opportunities for NLP in this burgeoning field with great societal impact.

Decision Making Entity Linking +2

Distant Supervision for Relation Extraction beyond the Sentence Boundary

no code implementations EACL 2017 Chris Quirk, Hoifung Poon

At the core of our approach is a graph representation that can incorporate both standard dependencies and discourse relations, thus providing a unifying way to model relations within and across sentences.

Relation Relation Extraction +1
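To illustrate the document-graph idea in this entry, the sketch below links token nodes with intra-sentence edges plus one cross-sentence edge, then extracts the shortest path between two entity mentions as a cross-sentence relation candidate. The hand-coded edges stand in for a real dependency parser and discourse links, and the sentences are invented.

```python
# Illustrative document graph combining intra-sentence links (standing in for a
# dependency parse) with a cross-sentence link (standing in for discourse or
# sentence adjacency). The shortest path between two mentions becomes the
# candidate structure for cross-sentence relation extraction.
import networkx as nx

sent1 = ["Gefitinib", "inhibits", "EGFR", "signaling", "."]
sent2 = ["This", "mutation", "confers", "resistance", "."]
tokens = sent1 + sent2

g = nx.Graph()
# Intra-sentence edges (a real system would take these from a dependency parser).
for sent, offset in [(sent1, 0), (sent2, len(sent1))]:
    for i in range(len(sent) - 1):
        g.add_edge(offset + i, offset + i + 1)

# Cross-sentence edge linking the two sentences' main predicates.
g.add_edge(sent1.index("inhibits"), len(sent1) + sent2.index("confers"))

drug, mutation = tokens.index("Gefitinib"), tokens.index("mutation")
path = nx.shortest_path(g, drug, mutation)
print([tokens[i] for i in path])   # ['Gefitinib', 'inhibits', 'confers', 'mutation']
```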

Sum-Product Networks: A New Deep Architecture

1 code implementation 14 Feb 2012 Hoifung Poon, Pedro Domingos

Experiments show that inference and learning with SPNs can be both faster and more accurate than with standard deep networks.
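As a concrete illustration of a sum-product network, the sketch below builds a tiny SPN over two binary variables (indicator leaves, product nodes over disjoint scopes, sum nodes as weighted mixtures) and evaluates it in a single bottom-up pass. The structure and weights are invented for the example, not taken from the paper.

```python
# Tiny sum-product network evaluated bottom-up. Leaves are indicators over
# single binary variables, product nodes combine disjoint variable scopes,
# and sum nodes are weighted mixtures over the same scope.
import math


def leaf(var, value):
    # Indicator leaf: 1.0 if variable `var` takes `value` in assignment x, else 0.0.
    return lambda x: 1.0 if x[var] == value else 0.0


def product(*children):
    return lambda x: math.prod(c(x) for c in children)


def weighted_sum(weighted_children):
    return lambda x: sum(w * c(x) for w, c in weighted_children)


# Two binary variables X0, X1; the root mixes two product components.
comp_a = product(weighted_sum([(0.9, leaf(0, 1)), (0.1, leaf(0, 0))]),
                 weighted_sum([(0.8, leaf(1, 1)), (0.2, leaf(1, 0))]))
comp_b = product(weighted_sum([(0.2, leaf(0, 1)), (0.8, leaf(0, 0))]),
                 weighted_sum([(0.3, leaf(1, 1)), (0.7, leaf(1, 0))]))
root = weighted_sum([(0.5, comp_a), (0.5, comp_b)])

# Exact inference is one bottom-up pass, linear in the network size.
print(root({0: 1, 1: 1}))                                        # P(X0=1, X1=1)
print(sum(root({0: a, 1: b}) for a in (0, 1) for b in (0, 1)))   # normalizes to 1.0
```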
