As we embark on a new era of LLMs, it becomes increasingly crucial to understand their capabilities, limitations, and differences.
1 code implementation • 7 Sep 2023 • Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryściński, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Joty, Caiming Xiong
Most open-source LLMs, on the other hand, are limited in their ability to support longer sequence lengths, a key requirement for many tasks that involve inference over a long input context.
This paper aims to fill this gap by investigating different methods of combining retrieved passages with LLMs to enhance answer generation.
End-to-end task-oriented dialogue (TOD) systems have achieved promising performance by leveraging the sophisticated natural language understanding and generation capabilities of pre-trained models.
Physical-Layer Authentication (PLA) has recently been regarded as an endogenously secure and energy-efficient technique for recognizing IoT terminals.
Despite advancements in conversational AI, language models struggle to handle diverse conversational tasks, and existing dialogue dataset collections often lack diversity and comprehensiveness.
It comprises two central pillars: (1) We parse questions of varying complexity into an intermediate representation, named H-expression, which is composed of simple questions as the primitives and symbolic operations representing the relationships among them; (2) To execute the resulting H-expressions, we design a hybrid executor, which integrates deterministic rules for translating the symbolic operations with a drop-in neural reader network that answers each decomposed simple question.
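The two-pillar design above can be sketched in a few lines. This is an illustrative toy, not the paper's actual API: the operation names, the trivial lookup standing in for the neural reader, and the tuple encoding of H-expressions are all assumptions made for the example.

```python
def neural_reader(question, context):
    """Stand-in for the drop-in neural reader; here a trivial lookup."""
    return context[question]

# Deterministic rules translating symbolic operations (names illustrative).
SYMBOLIC_OPS = {
    "SUM": lambda args: sum(args),
    "DIFF": lambda args: args[0] - args[1],
    "GREATER": lambda args: args[0] > args[1],
}

def execute(h_expr, context):
    """Recursively evaluate an H-expression: a simple question (str) is
    answered by the reader; an (op, sub_expressions) tuple is executed
    by the corresponding deterministic rule."""
    if isinstance(h_expr, str):
        return neural_reader(h_expr, context)
    op, sub_exprs = h_expr
    args = [execute(e, context) for e in sub_exprs]
    return SYMBOLIC_OPS[op](args)

context = {"population of A?": 5, "population of B?": 3}
expr = ("GREATER", ["population of A?", "population of B?"])
print(execute(expr, context))  # True
```

The recursion mirrors the decomposition: primitives go to the reader, while the symbolic layer stays fully deterministic and interpretable.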
Building on the designed DNSC architecture, we further combine adversarial learning, a variational autoencoder, and a diffusion model to propose the Latent Diffusion DNSC (Latent-Diff DNSC) scheme for intelligent online de-noising.
Graphic layout designs play an essential role in visual communication.
Dense retrievers have made significant strides in text retrieval and open-domain question answering, even though most achievements were made possible only with large amounts of human supervision.
Parsing natural language questions into executable logical forms is a useful and interpretable way to perform question answering on structured data such as knowledge bases (KB) or databases (DB).
Finally, we fine-tune the model on a limited amount of data with true labels to fully adapt it to the target domain.
Keyphrase generation is the task of automatically predicting keyphrases for a given piece of long text.
Research Replication Prediction (RRP) is the task of predicting whether a published research result can be replicated or not.
The key insight of our framework is to learn representations by minimizing the compression complexity and maximizing the predictive information in latent space.
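The stated objective resembles an information-bottleneck-style trade-off, which can be sketched as follows. This is a hedged illustration, not the framework's actual loss: the Gaussian KL rate term as "compression complexity" and the negative log-likelihood as a surrogate for predictive information are common modeling choices assumed here, and the function name `ib_loss` is invented for the example.

```python
import numpy as np

def ib_loss(mu, logvar, log_py_given_z, beta=1.0):
    """Information-bottleneck-style objective (illustrative).

    rate: KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior,
          standing in for compression complexity of the latent code.
    distortion: -log p(y|z), a surrogate for (negative) predictive
          information; minimizing it maximizes predictiveness.
    """
    rate = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)
    distortion = -log_py_given_z
    return beta * rate + distortion

# A posterior matching the prior (mu=0, logvar=0) incurs zero rate,
# so the loss reduces to the distortion term alone.
loss = ib_loss(np.zeros(4), np.zeros(4), log_py_given_z=-2.0)
```

The `beta` knob controls the compression/prediction trade-off, analogous to the balance described in the sentence above.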
Stochastic linear mixing models (SLMMs) assume the mixture coefficients depend on the input, making them more flexible and effective at capturing complex output dependence.
This paper presents an efficient variational inference framework for deriving a family of structured Gaussian process regression network (SGPRN) models.
Faceted summarization provides briefings of a document from different perspectives.
Ranked #1 on Unsupervised Extractive Summarization on FacetSum
Therefore, we consider predicting user engagement status as the first and most critical step toward online evaluation of intelligent assistants.
Recent years have seen a flourishing of neural keyphrase generation (KPG) work, including the release of several large-scale datasets and a host of new models to tackle the task.
Accurate interpretation of such prediction outcomes from a machine learning model that explicitly captures temporal correlations can significantly benefit the domain experts.
We propose multivariate nonstationary Gaussian processes for jointly modeling multiple clinical variables, where the key parameters (length-scales, standard deviations, and the correlations between the observed outputs) are all time-dependent.
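A time-dependent length-scale and standard deviation can be realized with a nonstationary covariance of the Gibbs form, sketched below. This is a minimal single-output illustration, assuming the Gibbs kernel construction; the particular `ell` and `sigma` functions are placeholders for whatever time-dependent parameters the model learns, and the multivariate correlation structure is omitted.

```python
import numpy as np

def gibbs_kernel(t1, t2, ell, sigma):
    """Nonstationary kernel with input-dependent length-scale ell(t)
    and standard deviation sigma(t) (Gibbs construction)."""
    l1, l2 = ell(t1), ell(t2)
    prefactor = sigma(t1) * sigma(t2) * np.sqrt(2.0 * l1 * l2 / (l1**2 + l2**2))
    return prefactor * np.exp(-((t1 - t2) ** 2) / (l1**2 + l2**2))

# Example placeholders: length-scale grows over time, constant output scale.
ell = lambda t: 0.5 + 0.1 * t
sigma = lambda t: 1.0

# Covariance matrix over five time points.
K = np.array([[gibbs_kernel(a, b, ell, sigma) for b in range(5)] for a in range(5)])
```

With a constant `sigma` of 1, the diagonal of `K` is exactly 1, and the matrix stays symmetric by construction; replacing the placeholders with learned functions of time yields the nonstationarity described above.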
An issue faced by SGPs, especially in latent variable models, is inefficient learning of the inducing inputs, which leads to poor predictive performance.
Recently, concatenating multiple keyphrases as a target sequence has been proposed as a new learning paradigm for keyphrase generation.
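The concatenation paradigm (often called One2Seq in the keyphrase literature) can be sketched as below. The separator and end tokens are illustrative choices, not a fixed standard.

```python
# Delimiter tokens are an assumption for the example; papers vary.
SEP, EOS = "<sep>", "<eos>"

def build_target(keyphrases):
    """Join all gold keyphrases into a single decoder target sequence."""
    return f" {SEP} ".join(keyphrases) + f" {EOS}"

def parse_prediction(sequence):
    """Split a generated sequence back into individual keyphrases."""
    body = sequence.replace(EOS, "").strip()
    return [p.strip() for p in body.split(SEP) if p.strip()]

target = build_target(["neural networks", "keyphrase generation"])
# "neural networks <sep> keyphrase generation <eos>"
```

Training a sequence-to-sequence model on such targets lets a single decoding pass emit a variable number of keyphrases, which is the appeal of the paradigm.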
Sentence simplification aims to reduce the complexity of a sentence while retaining its original meaning.
Ranked #3 on Text Simplification on ASSET
With both previous and new evaluation metrics, our model outperforms strong baselines on all datasets.
In our work, we propose a system, called crowdsourced subjective knowledge acquisition (CoSKA), for subjective knowledge acquisition powered by crowdsourcing and existing KBs.
Keyphrases provide highly condensed information that can be effectively used for understanding, organizing, and retrieving text content.