Search Results for author: Simon Lin

Found 10 papers, 4 papers with code

Phrase2VecGLM: Neural generalized language model-based semantic tagging for complex query reformulation in medical IR

no code implementations WS 2018 Manirupa Das, Eric Fosler-Lussier, Simon Lin, Soheil Moosavinasab, David Chen, Steve Rust, Yungui Huang, Rajiv Ramnath

In this work, we develop a novel, completely unsupervised, neural language model-based document ranking approach to semantic tagging of documents, using the document to be tagged as a query into the GLM to retrieve candidate phrases from top-ranked related documents, thus associating every document with novel related concepts extracted from the text.

Document Ranking Information Retrieval +4

SurfCon: Synonym Discovery on Privacy-Aware Clinical Data

1 code implementation 21 Jun 2019 Zhen Wang, Xiang Yue, Soheil Moosavinasab, Yungui Huang, Simon Lin, Huan Sun

To solve the problem, we propose a new framework, SurfCon, that leverages two important types of information in the privacy-aware clinical data, i.e., the surface form information and the global context information, for synonym discovery.

Distributed representation of patients and its use for medical cost prediction

no code implementations 13 Sep 2019 Xianlong Zeng, Soheil Moosavinasab, En-Ju D Lin, Simon Lin, Razvan Bunescu, Chang Liu

Efficient representation of patients is very important in the healthcare domain and can help with many tasks such as medical risk prediction.

Representation Learning

Sequence-to-Set Semantic Tagging: End-to-End Multi-label Prediction using Neural Attention for Complex Query Reformulation and Automated Text Categorization

no code implementations 11 Nov 2019 Manirupa Das, Juanxi Li, Eric Fosler-Lussier, Simon Lin, Soheil Moosavinasab, Steve Rust, Yungui Huang, Rajiv Ramnath

Our approach to generating document encodings, employing our sequence-to-set models for inference of semantic tags, gives, to the best of our knowledge, the state of the art both for the unsupervised query expansion task on the TREC CDS 2016 challenge dataset, when evaluated on an Okapi BM25-based document retrieval system, and over the MLTM baseline (Soleimani et al., 2016) for supervised and semi-supervised multi-label prediction tasks on the del.icio.us and Ohsumed datasets.

Multi-Label Classification Retrieval +1

Rationalizing Medical Relation Prediction from Corpus-level Statistics

1 code implementation ACL 2020 Zhen Wang, Jennifer Lee, Simon Lin, Huan Sun

Nowadays, the interpretability of machine learning models is becoming increasingly important, especially in the medical domain.

Decision Making Relation

Sequence-to-Set Semantic Tagging for Complex Query Reformulation and Automated Text Categorization in Biomedical IR using Self-Attention

no code implementations WS 2020 Manirupa Das, Juanxi Li, Eric Fosler-Lussier, Simon Lin, Steve Rust, Yungui Huang, Rajiv Ramnath

Novel contexts, comprising a set of terms referring to one or more concepts, may often arise in complex querying scenarios such as in evidence-based medicine (EBM) involving biomedical literature.

Retrieval Text Categorization

COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval

1 code implementation EMNLP 2021 Xinliang Frederick Zhang, Heming Sun, Xiang Yue, Simon Lin, Huan Sun

For evaluation, we introduce Query Bank and Relevance Set, where the former contains 1,236 human-paraphrased queries while the latter contains ~32 human-annotated FAQ items for each query.

16k Retrieval

CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering

2 code implementations 30 Oct 2020 Xiang Yue, Xinliang Frederick Zhang, Ziyu Yao, Simon Lin, Huan Sun

Clinical question answering (QA) aims to automatically answer questions from medical professionals based on clinical texts.

Domain Adaptation Question Answering +2

Transformer-based unsupervised patient representation learning based on medical claims for risk stratification and analysis

no code implementations 23 Jun 2021 Xianlong Zeng, Simon Lin, Chang Liu

The claims data, containing medical codes, services information, and incurred expenditure, can be a good resource for estimating an individual's health condition and medical risk level.

Management Representation Learning

Pre-training transformer-based framework on large-scale pediatric claims data for downstream population-specific tasks

no code implementations 24 Jun 2021 Xianlong Zeng, Simon Lin, Chang Liu

In addition, our framework showed a great generalizability potential to transfer learned knowledge from one institution to another, paving the way for future healthcare model pre-training across institutions.

Transfer Learning
