no code implementations • EMNLP 2020 • Lei Sha
Previous works usually apply beam-search-based or stochastic search methods to lexically constrained generation.
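As a rough illustration of the beam-search family these works build on, here is a minimal constrained-beam-search sketch in which hypotheses covering more of the required lexical constraints are preferred; the function and parameter names are illustrative, not the paper's.

```python
# Minimal sketch of lexically-constrained beam search (illustrative, not the
# paper's exact method). Hypotheses are ranked first by how many constraint
# tokens they cover, then by model score; real implementations such as grid
# beam search instead allocate beam slots per coverage count.
from typing import Callable, List, Tuple

def constrained_beam_search(
    step_scores: Callable[[List[int]], List[Tuple[int, float]]],  # prefix -> (token, log-prob)
    constraints: List[int],   # token ids that must appear in the output
    eos_id: int,
    beam_size: int = 4,
    max_len: int = 20,
) -> List[int]:
    beams: List[Tuple[List[int], float]] = [([], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos_id:     # finished hypothesis, keep as-is
                candidates.append((seq, score))
                continue
            for tok, logp in step_scores(seq):
                candidates.append((seq + [tok], score + logp))
        coverage = lambda s: sum(1 for c in constraints if c in s)
        candidates.sort(key=lambda b: (coverage(b[0]), b[1]), reverse=True)
        beams = candidates[:beam_size]
        if all(seq and seq[-1] == eos_id for seq, _ in beams):
            break
    return beams[0][0]
```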
1 code implementation • 4 Sep 2024 • Bofei Gao, Feifan Song, Yibo Miao, Zefan Cai, Zhe Yang, Liang Chen, Helan Hu, Runxin Xu, Qingxiu Dong, Ce Zheng, Wen Xiao, Ge Zhang, Daoguang Zan, Keming Lu, Bowen Yu, Dayiheng Liu, Zeyu Cui, Jian Yang, Lei Sha, Houfeng Wang, Zhifang Sui, Peiyi Wang, Tianyu Liu, Baobao Chang
Finally, based on our unified perspective, we explore the challenges and future research directions for aligning large language models with human preferences.
no code implementations • 31 Aug 2024 • Cheng Qian, Hainan Zhang, Lei Sha, Zhiming Zheng
With the growing deployment of LLMs in daily applications like chatbots and content generation, efforts to ensure outputs align with human values and avoid harmful content have intensified.
no code implementations • 28 May 2024 • Junda Zhu, Lingyong Yan, Haibo Shi, Dawei Yin, Lei Sha
With the help of an auxiliary Attacker agent, ATM steers the Generator toward a robust view of which documents are useful for question answering.
no code implementations • 19 Apr 2024 • Guanhua Chen, Wenhan Yu, Lei Sha
While Retrieval-Augmented Generation (RAG) plays a crucial role in the application of Large Language Models (LLMs), existing retrieval methods in knowledge-dense domains like law and medicine still suffer from a lack of multi-perspective views, which are essential for improving interpretability and reliability.
no code implementations • 28 Mar 2024 • Jingyuan Ma, Damai Dai, Lei Sha, Zhifang Sui
Large language models (LLMs) demonstrate substantial capabilities in solving math problems.
1 code implementation • 26 Feb 2024 • Zhexin Zhang, Yida Lu, Jingyuan Ma, Di Zhang, Rui Li, Pei Ke, Hao Sun, Lei Sha, Zhifang Sui, Hongning Wang, Minlie Huang
The safety of Large Language Models (LLMs) has gained increasing attention in recent years, but a comprehensive approach for detecting safety issues in LLMs' responses in an aligned, customizable, and explainable manner is still lacking.
no code implementations • 25 Feb 2024 • Hao Wang, Hao Li, Minlie Huang, Lei Sha
In addition, our approach can be generalized into a broader method for generating transferable adversarial suffixes that can successfully attack multiple LLMs, even black-box LLMs, such as ChatGPT and Gemini.
no code implementations • 6 Feb 2024 • Hao Wang, Lei Sha
The proposed approach aims to enhance the fluency of generated text by guiding the generation process with PPCs.
no code implementations • 1 Dec 2023 • Lei Sha, Thomas Lukasiewicz
In this approach, we use a semi-supervised contrastive learning method to encourage the disentanglement of attributes in latent spaces.
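For intuition, here is a minimal sketch of a supervised-contrastive objective over the attribute part of the latent code, assuming labeled attribute examples; this is a generic formulation, not necessarily the paper's exact loss.

```python
# Hedged sketch: attribute latents of sentences that share a label are pulled
# together and pushed apart from the rest of the batch, in the style of
# supervised contrastive learning. Names are assumptions.
import torch
import torch.nn.functional as F

def attribute_contrastive_loss(z_attr: torch.Tensor,  # (batch, dim) attribute latents
                               labels: torch.Tensor,  # (batch,) attribute labels
                               temperature: float = 0.1) -> torch.Tensor:
    z = F.normalize(z_attr, dim=-1)
    sim = z @ z.t() / temperature
    n = sim.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=sim.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    # Log-softmax over every other example in the batch.
    log_prob = sim - torch.logsumexp(sim.masked_fill(self_mask, float("-inf")),
                                     dim=1, keepdim=True)
    # Average log-likelihood of same-attribute positives per anchor.
    loss = -(log_prob * pos_mask).sum(1) / pos_mask.sum(1).clamp(min=1)
    return loss.mean()
```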
no code implementations • 15 Jan 2023 • Lei Sha, Oana-Maria Camburu, Thomas Lukasiewicz
One form of explanation for a prediction is an extractive rationale, i.e., a subset of the instance's features that leads the model to its prediction on that instance.
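A minimal sketch of such a select-then-predict setup, with assumed module names (`scorer`, `predictor`) rather than the paper's architecture:

```python
# Illustrative extractive-rationale model: a selector scores each input
# feature, the top-k form the rationale, and the predictor sees only those
# features. The hard top-k is kept for clarity; trained systems typically use
# a differentiable relaxation.
import torch
import torch.nn as nn

class RationaleModel(nn.Module):
    def __init__(self, dim: int, n_classes: int, k: int):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)          # importance score per feature
        self.predictor = nn.Linear(dim, n_classes)
        self.k = k

    def forward(self, x: torch.Tensor):          # x: (batch, n_features, dim)
        scores = self.scorer(x).squeeze(-1)      # (batch, n_features)
        topk = scores.topk(self.k, dim=1).indices
        mask = torch.zeros_like(scores).scatter_(1, topk, 1.0)
        rationale = x * mask.unsqueeze(-1)       # zero out non-selected features
        logits = self.predictor(rationale.mean(dim=1))
        return logits, mask                      # mask marks the rationale
```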
no code implementations • 16 Nov 2022 • Tommaso Salvatori, Yuhang Song, Yordan Yordanov, Beren Millidge, Zhenghua Xu, Lei Sha, Cornelius Emde, Rafal Bogacz, Thomas Lukasiewicz
Predictive coding networks are neuroscience-inspired models with roots in both Bayesian statistics and neuroscience.
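For intuition, here is a minimal sketch of predictive coding inference in an assumed two-layer linear model; the update below is the standard energy-descent view, not necessarily the paper's formulation.

```python
# Each layer predicts the activity below it, and the latent activities are
# relaxed by gradient descent on the total squared prediction error.
import torch

def pc_inference(x, W1, W2, n_steps=20, lr=0.1):
    # x: observations (batch, d0); W1: (d1, d0); W2: (d2, d1)
    z1 = torch.zeros(x.size(0), W1.size(0), requires_grad=True)
    z2 = torch.zeros(x.size(0), W2.size(0), requires_grad=True)
    opt = torch.optim.SGD([z1, z2], lr=lr)
    for _ in range(n_steps):
        e0 = x - z1 @ W1            # prediction error at the input layer
        e1 = z1 - z2 @ W2           # prediction error at the hidden layer
        energy = (e0 ** 2).sum() + (e1 ** 2).sum()
        opt.zero_grad()
        energy.backward()
        opt.step()
    return z1.detach(), z2.detach()
```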
1 code implementation • 8 Oct 2022 • Lei Sha, Yuhang Song, Yordan Yordanov, Tommaso Salvatori, Thomas Lukasiewicz
Transformers have become an indispensable module for text generation models since their great success in machine translation.
no code implementations • 19 Nov 2021 • Yuntao Li, Can Xu, Huang Hu, Lei Sha, Yan Zhang, Daxin Jiang
The sequence representation plays a key role in learning the degree of matching between the dialogue context and the response.
no code implementations • 15 Nov 2021 • Niall Taylor, Lei Sha, Dan W Joyce, Thomas Lukasiewicz, Alejo Nevado-Holgado, Andrey Kormilitzin
In this work, we apply InfoCal, the current state-of-the-art model that produces extractive rationales for its predictions, to the task of predicting hospital readmission using hospital discharge notes.
no code implementations • 14 Oct 2021 • Lingzhi Wang, Huang Hu, Lei Sha, Can Xu, Kam-Fai Wong, Daxin Jiang
Furthermore, we propose to evaluate CRS models in an end-to-end manner, which reflects the overall performance of the entire system rather than that of individual modules, in contrast to the separate evaluations of the two modules used in previous work.
no code implementations • 29 Sep 2021 • Luca Pinchetti, Lei Sha, Thomas Lukasiewicz
By doing so, it is possible to merge multiple datasets based on different categorical models by projecting the data points into a unified latent space.
no code implementations • NeurIPS 2021 • Tommaso Salvatori, Yuhang Song, Yujian Hong, Simon Frieder, Lei Sha, Zhenghua Xu, Rafal Bogacz, Thomas Lukasiewicz
We conclude by discussing the possible impact of this work on the neuroscience community, showing that our model provides a plausible framework to study learning and retrieval of memories in the brain, as it closely mimics the behavior of the hippocampus as a memory index and generative model.
1 code implementation • Findings (ACL) 2021 • Lei Sha, Patrick Hohenecker, Thomas Lukasiewicz
Experimental results on the test set show that our proposed method is a good fit for this novel NLP task.
no code implementations • 16 Dec 2020 • Lei Sha, Thomas Lukasiewicz
After the latent space is disentangled, the style of a sentence can be transformed by tuning the style representation without affecting other features of the sentence.
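A minimal sketch of this transfer step, with `encode`/`decode` as hypothetical stand-ins for the paper's encoder and decoder:

```python
# Hedged sketch of transfer after disentanglement: the latent code is split
# into content and style parts, the style part is replaced by a target style
# vector, and the decoder regenerates the sentence.
def transfer_style(sentence, target_style_vec, encode, decode):
    content_z, _style_z = encode(sentence)       # disentangled latent halves
    return decode(content_z, target_style_vec)   # same content, new style
```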
no code implementations • 16 Dec 2020 • Lei Sha, Oana-Maria Camburu, Thomas Lukasiewicz
We use an adversarial technique to calibrate the information extracted by the two models, such that the difference between them is an indicator of missed or over-selected features.
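A hedged sketch of how such an adversarial calibration loop could look; all module and function names (`disc`, `calibration_step`) are illustrative assumptions, not the paper's method.

```python
# A discriminator tries to tell apart predictions made from the full input
# and from the selected features only; the residual gap between the two is
# the signal for missed or over-selected features.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_classes = 3  # example size
disc = nn.Sequential(nn.Linear(n_classes, 64), nn.ReLU(), nn.Linear(64, 1))

def calibration_step(pred_full, pred_selected, opt_disc):
    # pred_full / pred_selected: (batch, n_classes) logits from the two models
    d_full = disc(pred_full.detach())
    d_sel = disc(pred_selected.detach())
    # Discriminator: label full-input predictions 1, selected-feature ones 0.
    d_loss = (F.binary_cross_entropy_with_logits(d_full, torch.ones_like(d_full))
              + F.binary_cross_entropy_with_logits(d_sel, torch.zeros_like(d_sel)))
    opt_disc.zero_grad()
    d_loss.backward()
    opt_disc.step()
    # Selector side (optimized elsewhere): make the two indistinguishable.
    g_loss = F.binary_cross_entropy_with_logits(disc(pred_selected),
                                                torch.ones_like(d_sel))
    return d_loss.item(), g_loss
```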
no code implementations • EMNLP 2018 • Chen Shi, Qi Chen, Lei Sha, Sujian Li, Xu Sun, Houfeng Wang, Lintao Zhang
The lack of labeled data is one of the main challenges when building a task-oriented dialogue system.
3 code implementations • 27 Nov 2017 • Tianyu Liu, Kexiang Wang, Lei Sha, Baobao Chang, Zhifang Sui
In the decoding phase, a dual attention mechanism, which combines word-level attention and field-level attention, is proposed to model the semantic relevance between the generated description and the table (see the sketch after this entry).
Ranked #1 on Table-to-Text Generation on WikiBio
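A sketch of the dual attention step under assumed tensor shapes; the multiplicative combination below is one standard way to realize word- plus field-level attention:

```python
# Word-level attention over cell word embeddings and field-level attention
# over field embeddings are combined and renormalized, so the decoder favors
# cells that match both the content and the field it needs.
import torch
import torch.nn.functional as F

def dual_attention(dec_state, word_keys, field_keys, values):
    # dec_state: (batch, dim); word_keys/field_keys/values: (batch, n_cells, dim)
    word_attn = F.softmax(torch.einsum("bd,bnd->bn", dec_state, word_keys), dim=1)
    field_attn = F.softmax(torch.einsum("bd,bnd->bn", dec_state, field_keys), dim=1)
    combined = word_attn * field_attn              # must agree on both levels
    combined = combined / combined.sum(dim=1, keepdim=True).clamp(min=1e-8)
    context = torch.einsum("bn,bnd->bd", combined, values)
    return context, combined
```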
1 code implementation • 1 Sep 2017 • Lei Sha, Lili Mou, Tianyu Liu, Pascal Poupart, Sujian Li, Baobao Chang, Zhifang Sui
Generating texts from structured data (e.g., a table) is important for various natural language processing tasks such as question answering and dialog systems.
no code implementations • WS 2017 • Feng Qian, Lei Sha, Baobao Chang, Lu-chen Liu, Ming Zhang
In the Semantic Role Labeling (SRL) task, the tree-structured dependency relation is rich in syntactic information, but it is not well handled by existing models.
no code implementations • ACL 2017 • Qiaolin Xia, Lei Sha, Baobao Chang, Zhifang Sui
However, the training data of a single corpus is often limited.
no code implementations • 3 Apr 2017 • Feng Qian, Lei Sha, Baobao Chang, Lu-chen Liu, Ming Zhang
For the semantic role labeling (SRL) task, both traditional methods and recent recurrent neural network (RNN) based methods rely on feature engineering to utilize parsing information.
no code implementations • COLING 2016 • Tingsong Jiang, Tianyu Liu, Tao Ge, Lei Sha, Baobao Chang, Sujian Li, Zhifang Sui
In this paper, we present a novel time-aware knowledge graph completion model that is able to predict links in a KG using both the existing facts and the temporal information of the facts.
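For intuition, here is an illustrative translation-style scoring function with a timestamp embedding; this is an assumed TransE-family setup, not necessarily the paper's exact model.

```python
# The time embedding shifts the relation, so a triple's plausibility depends
# on when the fact is claimed to hold.
import torch

def time_aware_score(h, r, t, tau):
    # h, r, t, tau: (batch, dim) head, relation, tail, timestamp embeddings
    return -torch.norm(h + r + tau - t, p=1, dim=-1)  # higher = more plausible
```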
no code implementations • COLING 2016 • Lei Sha, Baobao Chang, Zhifang Sui, Sujian Li
After reading the premise again, the model gains a better understanding of it, which in turn improves its understanding of the hypothesis.
Ranked #40 on Natural Language Inference on SNLI
no code implementations • NAACL 2016 • Lei Sha, Sujian Li, Baobao Chang, Zhifang Sui
Automatic event schema induction (AESI) means extracting meta-events from raw text, in other words, finding out what event types (templates) may exist in the raw text and what roles (slots) may exist in each event type.