no code implementations • COLING 2022 • Tao Shen, Xiubo Geng, Daxin Jiang
Besides a norm-grounding knowledge model, we present a novel norm-supported ethical judgment model in line with neural module networks to alleviate dilemma situations and improve norm-level explainability.
no code implementations • COLING 2022 • Jiazhan Feng, Chongyang Tao, Zhen Li, Chang Liu, Tao Shen, Dongyan Zhao
In this paper, we propose a reciprocal learning approach to jointly optimize a knowledge retriever and a response ranker for knowledge-grounded response retrieval without ground-truth knowledge labels.
no code implementations • 12 Sep 2023 • Xiaohan Xu, Chongyang Tao, Tao Shen, Can Xu, Hongbo Xu, Guodong Long, Jian-Guang Lou
Reasoning presents a significant and challenging issue for Large Language Models (LLMs).
1 code implementation • 31 Aug 2023 • Hongtai Jing, Zhengtao Gao, Sheng Xu, Tao Shen, Zhangzhi Peng, Shwai He, Tao You, Shuang Ye, Wei Lin, Siqi Sun
Remarkably, BALMFold outperforms those well-established methods like AlphaFold2, IgFold, ESMFold, and OmegaFold in the antibody benchmark, demonstrating significant potential to advance innovative engineering and streamline therapeutic antibody development by reducing the need for unnecessary trials.
1 code implementation • 2 Jun 2023 • Le Zhang, Jiayang Chen, Tao Shen, Yu Li, Siqi Sun
The field of protein folding research has been greatly advanced by deep learning methods, with AlphaFold2 (AF2) demonstrating exceptional performance and atomic-level precision.
no code implementations • 23 May 2023 • Shengchao Chen, Guodong Long, Tao Shen, Tianyi Zhou, Jing Jiang
Federated weather forecasting is a promising collaborative learning framework for analyzing meteorological data across participants from different countries and regions, thus embodying a global-scale real-time weather data predictive analytics platform to tackle climate change.
no code implementations • 15 May 2023 • Zhongju Yuan, Tao Shen, Sheng Xu, Leiye Yu, Ruobing Ren, Siqi Sun
Deep learning-based approaches, such as AlphaFold2 (AF2), have significantly advanced protein tertiary structure prediction, achieving results comparable to real biological experimental methods.
1 code implementation • 12 May 2023 • Jiazhan Feng, Chongyang Tao, Xiubo Geng, Tao Shen, Can Xu, Guodong Long, Dongyan Zhao, Daxin Jiang
Information retrieval (IR) plays a crucial role in locating relevant resources from vast amounts of data, and its applications have evolved from traditional knowledge bases to modern search engines (SEs).
no code implementations • 27 Apr 2023 • Tao Shen, Guodong Long, Xiubo Geng, Chongyang Tao, Tianyi Zhou, Daxin Jiang
In this work, we propose a simple method that applies a large language model (LLM) to large-scale retrieval in zero-shot scenarios.
no code implementations • 6 Feb 2023 • Ziyang Luo, Pu Zhao, Can Xu, Xiubo Geng, Tao Shen, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang
The conventional dense retrieval paradigm relies on encoding images and texts into dense representations using dual-stream encoders; however, it suffers from low retrieval speed in large-scale retrieval scenarios.
1 code implementation • 22 Jan 2023 • Shengchao Chen, Guodong Long, Tao Shen, Jing Jiang
To relieve the data exposure concern across regions, a novel federated learning approach has been proposed to collaboratively learn a brand-new spatio-temporal Transformer-based foundation model across participants with heterogeneous meteorological data.
no code implementations • CVPR 2023 • Meng Cao, Fangyun Wei, Can Xu, Xiubo Geng, Long Chen, Can Zhang, Yuexian Zou, Tao Shen, Daxin Jiang
Weakly-Supervised Video Grounding (WSVG) aims to localize events of interest in untrimmed videos with only video-level annotations.
1 code implementation • ICCV 2023 • Ziyang Luo, Pu Zhao, Can Xu, Xiubo Geng, Tao Shen, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang
To address this issue, we propose a novel sparse retrieval paradigm for ITR that exploits sparse representations in the vocabulary space for images and texts.
no code implementations • 20 Dec 2022 • Yucheng Zhou, Tao Shen, Xiubo Geng, Chongyang Tao, Guodong Long, Can Xu, Daxin Jiang
Long document retrieval aims to fetch query-relevant documents from a large-scale collection, where knowledge distillation has become the de facto approach to improving a retriever by mimicking a heterogeneous yet powerful cross-encoder.
no code implementations • 20 Dec 2022 • Chang Liu, Chongyang Tao, Xiubo Geng, Tao Shen, Dongyan Zhao, Can Xu, Binxing Jiao, Daxin Jiang
Different from previous works that only rely on one positive and hard negatives as candidate passages, we create dark examples that all have moderate relevance to the query through mixing-up and masking in discrete space.
no code implementations • 19 Dec 2022 • Tao Shen, Yifan Cui
A common concern when a policymaker draws causal inferences from and makes decisions based on observational data is that the measured covariates are insufficiently rich to account for all sources of confounding, i.e., the standard no confoundedness assumption fails to hold.
no code implementations • 11 Nov 2022 • Yang Li, Canran Xu, Tao Shen, Jing Jiang, Guodong Long
The shared task description is unable to stimulate the unique task-related information in each training sample, especially for tasks with a finite label space.
1 code implementation • 13 Sep 2022 • Sen Yang, Tao Shen, Yuqi Fang, Xiyue Wang, Jun Zhang, Wei Yang, Junzhou Huang, Xiao Han
The high-content image-based assay is commonly leveraged for identifying the phenotypic impact of genetic perturbations in the field of biology.
1 code implementation • 12 Sep 2022 • Zheqi Lv, Wenqiao Zhang, Shengyu Zhang, Kun Kuang, Feng Wang, Yongwei Wang, Zhengyu Chen, Tao Shen, Hongxia Yang, Beng Chin Ooi, Fei Wu
DUET is deployed on a powerful cloud server and requires only the low cost of forward propagation and the low latency of data transmission between the device and the cloud.
1 code implementation • 31 Aug 2022 • Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang
In large-scale retrieval, the lexicon-weighting paradigm, learning weighted sparse representations in vocabulary space, has shown promising results with high quality and low latency.
1 code implementation • 29 Aug 2022 • Kai Zhang, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Binxing Jiao, Daxin Jiang
The alignment is achieved by weakened knowledge distillation that enlightens the retriever in two ways: 1) a lexicon-augmented contrastive objective to challenge the dense encoder and 2) a pair-wise rank-consistent regularization to make the dense model's behavior incline toward the other's.
1 code implementation • 4 Jul 2022 • Tao Shen, Zhihang Hu, Zhangzhi Peng, Jiayang Chen, Peng Xiong, Liang Hong, Liangzhen Zheng, YiXuan Wang, Irwin King, Sheng Wang, Siqi Sun, Yu Li
When E2Efold-3D is coupled with the experimental techniques, the RNA structure prediction field can be greatly advanced.
no code implementations • 21 Jun 2022 • YuFei Wang, Jiayi Zheng, Can Xu, Xiubo Geng, Tao Shen, Chongyang Tao, Daxin Jiang
This paper focuses on data augmentation for low-resource NLP tasks where the training set is limited.
no code implementations • 16 Jun 2022 • Yucheng Zhou, Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Guodong Long, Binxing Jiao, Daxin Jiang
A ranker plays an indispensable role in the de facto 'retrieval & rerank' pipeline, but its training still lags behind: it learns from moderate negatives and/or serves as an auxiliary module for a retriever.
no code implementations • 23 May 2022 • Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Guodong Long, Kai Zhang, Daxin Jiang
Large-scale retrieval aims to recall relevant documents from a huge collection given a query.
1 code implementation • 1 Apr 2022 • Jiayang Chen, Zhihang Hu, Siqi Sun, Qingxiong Tan, YiXuan Wang, Qinze Yu, Licheng Zong, Liang Hong, Jin Xiao, Tao Shen, Irwin King, Yu Li
Non-coding RNA structure and function are essential to understanding various biological processes, such as cell signaling, gene expression, and post-transcriptional regulations.
1 code implementation • ACL 2022 • Yucheng Zhou, Tao Shen, Xiubo Geng, Guodong Long, Daxin Jiang
Generating new events given context with correlated ones plays a crucial role in many event-centric reasoning tasks.
1 code implementation • 28 Jan 2022 • Qiyu Wu, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Daxin Jiang
A straightforward solution is to resort to more diverse positives from a multi-augmenting strategy, but an open question remains about how to learn, without supervision, from diverse positives of uneven augmentation quality in the text domain.
no code implementations • 10 Jan 2022 • Zhenyuan Zhang, Tao Shen, Jie Zhang, Chao Wu
This technique mitigates the user heterogeneity problem and better protects user privacy.
1 code implementation • 11 Nov 2021 • Jiangchao Yao, Shengyu Zhang, Yang Yao, Feng Wang, Jianxin Ma, Jianwei Zhang, Yunfei Chu, Luo Ji, Kunyang Jia, Tao Shen, Anpeng Wu, Fengda Zhang, Ziqi Tan, Kun Kuang, Chao Wu, Fei Wu, Jingren Zhou, Hongxia Yang
However, edge computing, especially edge and cloud collaborative computing, is still in its infancy and has yet to prove successful, owing to resource-constrained IoT scenarios in which very few algorithms have been deployed.
no code implementations • 13 Oct 2021 • Yucheng Zhou, Xiubo Geng, Tao Shen, Guodong Long, Daxin Jiang
Event correlation reasoning infers whether a natural language paragraph containing multiple events conforms to human common sense.
no code implementations • Findings (NAACL) 2022 • Yang Li, Guodong Long, Tao Shen, Jing Jiang
It consists of (1) a pairwise type-enriched sentence encoding module injecting both context-free and -related backgrounds to alleviate sentence-level wrong labeling, and (2) a hierarchical type-sentence alignment module enriching a sentence with the triple fact's basic attributes to support long-tail relations.
1 code implementation • 7 Sep 2021 • Xueping Peng, Guodong Long, Tao Shen, Sen Wang, Jing Jiang
Sequential diagnosis prediction on the Electronic Health Record (EHR) has been proven crucial for predictive analytics in the medical domain.
1 code implementation • Findings (EMNLP) 2021 • Bo Wang, Tao Shen, Guodong Long, Tianyi Zhou, Yi Chang
Aspect-level sentiment classification (ALSC) aims at identifying the sentiment polarity of a specified aspect in a sentence.
Aspect-Based Sentiment Analysis (ABSA) • Representation Learning
no code implementations • 24 Aug 2021 • Guodong Long, Tao Shen, Yue Tan, Leah Gerrard, Allison Clarke, Jing Jiang
Implementing an open innovation framework in the healthcare industry, namely open health, aims to enhance the innovation and creative capability of health-related organisations by building a next-generation collaborative framework with partner organisations and the research community.
no code implementations • 24 Aug 2021 • Lin William Cong, Charles M. C. Lee, Yuanyu Qu, Tao Shen
This study reports on the current state-of-affairs in the funding of entrepreneurship and innovations in China and provides a broad survey of academic findings on the subject.
1 code implementation • 19 Aug 2021 • Guodong Long, Ming Xie, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang
By comparison, a mixture of multiple global models could capture the heterogeneity across various clients by assigning each client to a different global model (i.e., center) in FL.
no code implementations • 3 Aug 2021 • Zhenhao Tang, Shikui Wang, Shengxian Cao, Yang Li, Tao Shen
To address the difficulty of determining the delay time and the low prediction accuracy when building a prediction model of an SCR system, a dynamic modeling scheme based on a hybrid of multiple data-driven algorithms was proposed.
no code implementations • NAACL 2021 • Yucheng Zhou, Xiubo Geng, Tao Shen, Wenqiang Zhang, Daxin Jiang
That is, we can only access training data in a high-resource language, yet need to answer multilingual questions without any labeled data in the target languages.
no code implementations • 24 May 2021 • Huanding Zhang, Tao Shen, Fei Wu, Mingyang Yin, Hongxia Yang, Chao Wu
Federated learning (FL) is an emerging technique that can collaboratively train a shared model while keeping the data decentralized, which makes it a rational solution for distributed GNN training.
no code implementations • 11 May 2021 • Jiaxiang Wu, Shitong Luo, Tao Shen, Haidong Lan, Sheng Wang, Junzhou Huang
In this paper, we propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
no code implementations • 10 May 2021 • Liangzhen Zheng, Haidong Lan, Tao Shen, Jiaxiang Wu, Sheng Wang, Wei Liu, Junzhou Huang
Protein structure prediction has been a grand challenge for over 50 years, owing to its broad scientific and application interests.
no code implementations • 18 Oct 2020 • Fengda Zhang, Kun Kuang, Zhaoyang You, Tao Shen, Jun Xiao, Yin Zhang, Chao Wu, Yueting Zhuang, Xiaolin Li
FURL poses two new challenges: (1) data distribution shift (Non-IID distribution) among clients would make local models focus on different categories, leading to the inconsistency of representation spaces.
no code implementations • COLING 2020 • Hao Huang, Guodong Long, Tao Shen, Jing Jiang, Chengqi Zhang
Many graph embedding approaches have been proposed for knowledge graph completion via link prediction.
2 code implementations • COLING 2020 • Yang Li, Tao Shen, Guodong Long, Jing Jiang, Tianyi Zhou, Chengqi Zhang
Then, facilitated by the proposed base model, we introduce collaborating relation features shared among relations in the hierarchies to promote the relation-augmenting process and balance the training data for long-tail relations.
1 code implementation • 24 Sep 2020 • Xueping Peng, Guodong Long, Tao Shen, Sen Wang, Jing Jiang, Chengqi Zhang
Electronic health records (EHRs) are longitudinal records of a patient's interactions with healthcare systems.
3 code implementations • 27 Jun 2020 • Tao Shen, Jie Zhang, Xinkang Jia, Fengda Zhang, Gang Huang, Pan Zhou, Kun Kuang, Fei Wu, Chao Wu
The experiments show that FML can achieve better performance than alternatives in a typical FL setting, and that clients with different models and tasks can benefit from FML.
1 code implementation • 15 Jun 2020 • Xueping Peng, Guodong Long, Tao Shen, Sen Wang, Jing Jiang
The key challenge of patient journey understanding is to design an effective encoding mechanism which can properly tackle the aforementioned multi-level structured patient journey data with temporal sequential visits and a set of medical codes.
3 code implementations • 3 May 2020 • Guodong Long, Ming Xie, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang
However, due to the diverse nature of user behaviors, assigning users' gradients to different global models (i.e., centers) can better capture the heterogeneity of data distributions across users.
1 code implementation • 30 Apr 2020 • Bo Wang, Tao Shen, Guodong Long, Tianyi Zhou, Yi Chang
In experiments, we achieve state-of-the-art performance on three benchmarks and a zero-shot dataset for link prediction, with highlights of inference costs reduced by 1-2 orders of magnitude compared to a textual encoding method.
Ranked #4 on Link Prediction on UMLS
no code implementations • EMNLP 2020 • Tao Shen, Yi Mao, Pengcheng He, Guodong Long, Adam Trischler, Weizhu Chen
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training, to inject language models with structured knowledge via learning from raw text.
no code implementations • 27 Nov 2019 • Yang Li, Guodong Long, Tao Shen, Tianyi Zhou, Lina Yao, Huan Huo, Jing Jiang
Distantly supervised relation extraction intrinsically suffers from noisy labels due to the strong assumption of distant supervision.
1 code implementation • IJCNLP 2019 • Tao Shen, Xiubo Geng, Tao Qin, Daya Guo, Duyu Tang, Nan Duan, Guodong Long, Daxin Jiang
We consider the problem of conversational question answering over a large-scale knowledge base.
1 code implementation • 15 Sep 2019 • Xueping Peng, Guodong Long, Tao Shen, Sen Wang, Jing Jiang, Michael Blumenstein
In this paper, we propose a medical concept embedding method based on applying a self-attention mechanism to represent each medical concept.
no code implementations • 6 Sep 2019 • Tao Shen, Xiubo Geng, Tao Qin, Guodong Long, Jing Jiang, Daxin Jiang
These two problems lead to a poorly-trained semantic parsing model.
2 code implementations • NAACL 2019 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
Neural networks equipped with self-attention have parallelizable computation, light-weight structure, and the ability to capture both long-range and local dependencies.
1 code implementation • ICLR 2018 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
In this paper, we propose a model, called "bi-directional block self-attention network (Bi-BloSAN)", for RNN/CNN-free sequence encoding.
1 code implementation • 31 Jan 2018 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Sen Wang, Chengqi Zhang
In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for the mutual benefit of each other.
Ranked #56 on Natural Language Inference on SNLI
1 code implementation • 14 Sep 2017 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Shirui Pan, Chengqi Zhang
Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used on NLP tasks to capture the long-term and local dependencies, respectively.
Ranked #69 on Natural Language Inference on SNLI