Despite exhibiting increasingly human-like abilities, large language models (LLMs) often produce factual inaccuracies, i.e., "hallucinations", even when they hold the relevant knowledge.
One critical issue for chat systems is staying consistent about their own preferences, opinions, beliefs, and facts, which has been shown to be a difficult problem.
We propose Contrast Instructions -- a benchmarking strategy for the consistency of reward models (RMs).
Large Language Models (LLMs) have revolutionized natural language processing, yet aligning these models with human values and preferences via reinforcement learning from human feedback (RLHF) remains a significant challenge.
We propose a dialogue model that can access vast and dynamic information from any search engine for response generation.
Current self-training methods, such as standard self-training, co-training, and tri-training, often focus on improving model performance on a single task, exploiting differences in input features, model architectures, and training processes.
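As a minimal sketch of the standard self-training loop referenced above (the scikit-learn-style classifier interface and the confidence threshold are our assumptions, not details from the work itself):

```python
import numpy as np

def self_train(model, X_labeled, y_labeled, X_unlabeled,
               threshold=0.9, max_rounds=5):
    """Standard self-training: repeatedly pseudo-label confident
    unlabeled examples and retrain on the expanded labeled set."""
    X, y = X_labeled.copy(), y_labeled.copy()
    pool = X_unlabeled.copy()
    for _ in range(max_rounds):
        model.fit(X, y)
        if len(pool) == 0:
            break
        probs = model.predict_proba(pool)       # (n_pool, n_classes)
        confident = probs.max(axis=1) >= threshold
        if not confident.any():                 # nothing left to adopt
            break
        X = np.vstack([X, pool[confident]])
        y = np.concatenate([y, probs[confident].argmax(axis=1)])
        pool = pool[~confident]                 # drop adopted examples
    return model
```

Co-training and tri-training follow the same loop but exchange pseudo-labels between two or three models trained on different views or architectures.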
Pretrained natural language processing (NLP) models have achieved high overall performance, but they still make systematic errors.
While previous work focuses on building systems for inducing grammars on text that is well-aligned with video content, we investigate the scenario in which text and video are only in loose correspondence.
Building robust and general dialogue models for spoken conversations is challenging due to the gap between the distributions of spoken and written data.
We propose to use a top-down parser as a model-based pruning method, which also enables parallel encoding during inference.
Our DP-FP employs (1) novel representation clipping followed by noise addition in the forward-propagation stage, and (2) micro-batch construction via subsampling to achieve DP amplification and reduce the noise power to $1/M$, where $M$ is the number of micro-batches in a step.
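A minimal PyTorch-style sketch of ingredient (1), per-example representation clipping followed by noise addition in the forward pass; the clip bound and noise multiplier are assumed hyperparameters rather than values from the paper:

```python
import torch

def dp_forward(hidden, clip_bound=1.0, sigma=0.5):
    """Clip each example's representation to L2 norm <= clip_bound,
    then add Gaussian noise calibrated to that bound."""
    # hidden: (batch, dim) activations from the forward-propagation stage
    norms = hidden.norm(dim=1, keepdim=True)
    clipped = hidden * torch.clamp(clip_bound / (norms + 1e-12), max=1.0)
    noise = torch.randn_like(clipped) * sigma * clip_bound
    return clipped + noise
```

Ingredient (2) then splits each batch into $M$ subsampled micro-batches, whose averaging drives the stated $1/M$ reduction in noise power.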
In the Chinese medical insurance industry, the assessor's role is essential and requires significant effort in conversing with the claimant.
Human language understanding operates at multiple levels of granularity (e.g., words, phrases, and sentences), with increasing levels of abstraction that can be hierarchically combined.
Based on this dataset, we propose a Multi-Perspective Context Matching (MPCM) model, an end-to-end system that directly predicts the beginning and ending points of the answer in a passage.
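The begin/end prediction can be sketched as a generic pointer layer, two softmaxes over passage positions, on top of whatever matched representation the model produces (this is an illustrative layer, not the exact MPCM architecture):

```python
import torch.nn as nn

class SpanPointer(nn.Module):
    """Score every passage position as the answer's beginning and ending point."""
    def __init__(self, dim):
        super().__init__()
        self.begin = nn.Linear(dim, 1)
        self.end = nn.Linear(dim, 1)

    def forward(self, matched):  # matched: (batch, passage_len, dim)
        log_p_begin = self.begin(matched).squeeze(-1).log_softmax(dim=-1)
        log_p_end = self.end(matched).squeeze(-1).log_softmax(dim=-1)
        return log_p_begin, log_p_end  # trained with NLL at the gold begin/end positions
```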
Attention-based Neural Machine Translation (NMT) models suffer from attention-deficiency issues, as observed in recent research.
We simply compute the distance between the model's attention and the "true" alignments, and minimize this cost during training.
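A sketch of that cost, assuming `attn` is the model's attention matrix and `gold` a 0/1 word-alignment matrix of the same shape; the squared distance here is one reasonable choice, not necessarily the one used in the paper:

```python
import torch

def alignment_loss(attn, gold):
    """Distance between model attention and 'true' alignments,
    added as an extra term to the NMT training objective."""
    # attn, gold: (batch, target_len, source_len)
    gold = gold / gold.sum(dim=-1, keepdim=True).clamp(min=1e-8)  # rows sum to 1
    return ((attn - gold) ** 2).sum(dim=-1).mean()
```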
In the training stage, our method induces several sense centroids (embeddings) for each polysemous word.
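As an illustration of the induction step, plain k-means over a word's context embeddings stands in here for the paper's centroid-induction procedure; all names are hypothetical:

```python
import numpy as np

def induce_sense_centroids(context_vecs, num_senses=3, iters=10, seed=0):
    """Cluster one polysemous word's context embeddings into sense centroids."""
    rng = np.random.default_rng(seed)
    X = np.asarray(context_vecs)                       # (n_occurrences, dim)
    centroids = X[rng.choice(len(X), num_senses, replace=False)]
    for _ in range(iters):
        dists = ((X[:, None] - centroids[None]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)                  # nearest sense per occurrence
        for k in range(num_senses):
            if (assign == k).any():
                centroids[k] = X[assign == k].mean(axis=0)
    return centroids                                   # one embedding per induced sense
```

At test time, an occurrence can then be disambiguated by picking its nearest centroid.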
In this paper, we enhance attention-based neural machine translation (NMT) by adding explicit coverage embedding models to alleviate the issues of repeated and dropped translations in NMT.
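One common variant of such coverage models is subtractive: each source word starts with a full coverage embedding that is worn down in proportion to the attention it receives, so fully translated words stop attracting attention. A minimal sketch of that update on PyTorch tensors (the subtraction form is our assumption of one variant, not necessarily the paper's only model):

```python
def update_coverage(coverage, attn_t, word_emb):
    """Subtractive coverage-embedding update at decoding step t."""
    # coverage, word_emb: (batch, src_len, dim); attn_t: (batch, src_len)
    return coverage - attn_t.unsqueeze(-1) * word_emb
```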
Our method simply takes into account the translation options of each word or phrase in the source sentence, and picks a very small target vocabulary for each sentence based on a word-to-word translation model or a bilingual phrase library learned from a traditional machine translation model.
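A sketch of that per-sentence vocabulary selection; the table formats (a dict of translation probabilities per source word and a phrase library mapping source phrases to target candidates) are assumptions for illustration:

```python
def sentence_vocab(source_words, word_table, phrase_table, top_k=10, common=()):
    """Collect a small target vocabulary: per-word translation options,
    matched phrase-library candidates, and a list of frequent target words."""
    vocab = set(common)
    for w in source_words:
        options = sorted(word_table.get(w, {}).items(), key=lambda kv: -kv[1])
        vocab.update(t for t, _ in options[:top_k])    # top-k word translations
    sentence = " ".join(source_words)
    for phrase, targets in phrase_table.items():
        if phrase in sentence:                         # naive phrase lookup
            vocab.update(targets)
    return vocab
```

The decoder's softmax is then restricted to this set, which is typically orders of magnitude smaller than the full target vocabulary.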
Most conventional sentence similarity methods focus only on the similar parts of two input sentences and simply ignore the dissimilar parts, which often provide important clues about sentence semantics.
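One way to keep both kinds of evidence, offered here only as an illustrative sketch, is to decompose each word vector into a component explained by its best match in the other sentence (the similar part) and the orthogonal remainder (the dissimilar part):

```python
import numpy as np

def decompose(x, match):
    """Split x into a 'similar part' parallel to its matched vector
    and a 'dissimilar part' orthogonal to it."""
    alpha = (x @ match) / max(match @ match, 1e-12)
    similar = alpha * match
    return similar, x - similar
```

Both parts can then be fed to a downstream composition layer, so dissimilarity contributes to the final similarity score instead of being discarded.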
In this work, we propose a semi-supervised method for short text clustering, where we represent texts as distributed vectors with neural networks, and use a small amount of labeled data to specify our intention for clustering.
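As a minimal sketch of how a small labeled set can steer clustering, seeded k-means is used here as a stand-in for the paper's method; it assumes every target cluster appears at least once among the labeled texts:

```python
import numpy as np

def seeded_kmeans(X, labeled_idx, labeled_y, num_clusters, iters=10):
    """k-means over text embeddings, seeded and pinned by labeled examples."""
    X, labeled_y = np.asarray(X), np.asarray(labeled_y)
    centroids = np.stack([X[labeled_idx][labeled_y == k].mean(axis=0)
                          for k in range(num_clusters)])
    for _ in range(iters):
        assign = ((X[:, None] - centroids[None]) ** 2).sum(-1).argmin(1)
        assign[labeled_idx] = labeled_y          # labeled texts keep their class
        for k in range(num_clusters):
            members = X[assign == k]
            if len(members):                     # leave empty clusters in place
                centroids[k] = members.mean(axis=0)
    return assign
```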