Compression Network with Transformer for Approximate Nearest Neighbor Search

30 Jul 2021 Haokui Zhang, Wenze Hu, Buzhou Tang, Xiaoyu Wang

Specifically, we propose a new network structure called Compression Network with Transformer (CNT) to compress the feature into a low dimensional space, and an inhomogeneous neighborhood relationship preserving (INRP) loss that aims to maintain high search accuracy.

Information Retrieval Quantization

Decomposing Word Embedding with the Capsule Network

7 Apr 2020 Xin Liu, Qingcai Chen, Yan Liu, Joanna Siebert, Baotian Hu, Xiang-Ping Wu, Buzhou Tang

We propose a Capsule network-based method to Decompose the unsupervised word Embedding of an ambiguous word into context specific Sense embedding, called CapsDecE2S.

Word Embeddings Word Sense Disambiguation

A Deep Learning-Based System for PharmaCoNER

WS 2019 Ying Xiong, Yedan Shen, Yuanhang Huang, Shuai Chen, Buzhou Tang, Xiaolong Wang, Qingcai Chen, Jun Yan, Yi Zhou

The Biological Text Mining Unit at BSC and CNIO organized the first shared task on chemical & drug mention recognition from Spanish medical texts called PharmaCoNER (Pharmacological Substances, Compounds and proteins and Named Entity Recognition track) in 2019, which includes two tracks: one for NER offset and entity classification (track 1) and the other one for concept indexing (track 2).

General Classification Named Entity Recognition

Trigger Word Detection and Thematic Role Identification via BERT and Multitask Learning

WS 2019 Dongfang Li, Ying Xiong, Baotian Hu, Hanyang Du, Buzhou Tang, Qingcai Chen

In this paper, we present our approaches for trigger word detection (task 1) and the identification of its thematic role (task 2) in the AGAC track of the BioNLP Open Shared Task 2019.

Drug Discovery Multi-Task Learning

HITSZ-ICRC: A Report for SMM4H Shared Task 2019-Automatic Classification and Extraction of Adverse Effect Mentions in Tweets

WS 2019 Shuai Chen, Yuanhang Huang, Xiaowei Huang, Haoming Qin, Jun Yan, Buzhou Tang

This is the system description of the Harbin Institute of Technology Shenzhen (HITSZ) team for the first and second subtasks of the fourth Social Media Mining for Health Applications (SMM4H) shared task in 2019.

The BQ Corpus: A Large-scale Domain-specific Chinese Corpus For Sentence Semantic Equivalence Identification

EMNLP 2018 Jing Chen, Qingcai Chen, Xin Liu, Haijun Yang, Daohe Lu, Buzhou Tang

As the largest manually annotated public Chinese SSEI corpus in the bank domain, the BQ corpus is not only useful for Chinese question semantic matching research, but also a significant resource for cross-lingual and cross-domain SSEI research.

Paraphrase Identification Question Answering

LCQMC: A Large-scale Chinese Question Matching Corpus

COLING 2018 Xin Liu, Qingcai Chen, Chong Deng, Huajun Zeng, Jing Chen, Dongfang Li, Buzhou Tang

In this paper, we first use a search engine to collect large-scale question pairs related to high-frequency words from various domains, then filter irrelevant pairs by the Wasserstein distance, and finally recruit three annotators to manually check the remaining pairs.

Information Retrieval Machine Translation

Incorporating Label Dependency for Answer Quality Tagging in Community Question Answering via CNN-LSTM-CRF

COLING 2016 Yang Xiang, Xiaoqiang Zhou, Qingcai Chen, Zhihui Zheng, Buzhou Tang, Xiaolong Wang, Yang Qin

In community question answering (cQA), the quality of answers is determined by the matching degree between question-answer pairs and the correlation among the answers.

Community Question Answering

Answer Sequence Learning with Neural Networks for Answer Selection in Community Question Answering

IJCNLP 2015 Xiaoqiang Zhou, Baotian Hu, Qingcai Chen, Buzhou Tang, Xiaolong Wang

In this paper, the answer selection problem in community question answering (CQA) is regarded as an answer sequence labeling task, and a novel approach is proposed based on the recurrent architecture for this problem.

Answer Selection Community Question Answering

