Language Modelling

4455 papers with code • 51 benchmarks • 157 datasets

Language modelling is the task of predicting the next word or character in a document. Models trained on this objective can then be applied to a wide range of natural language tasks, such as text generation, text classification, and question answering.
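As a rough illustration of next-word prediction, the sketch below estimates the probability of each next word from counts of adjacent word pairs. The toy corpus and the function names (train_bigram, next_word_distribution) are assumptions made for this example and do not come from any paper or library listed on this page.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_distribution(counts, prev):
    """Estimate P(next | prev) by relative frequency; empty dict if prev is unseen."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {word: c / total for word, c in followers.items()}


# Hypothetical toy corpus, used only to make the example concrete.
toy_corpus = "the cat sat on the mat and the cat slept".split()
counts = train_bigram(toy_corpus)
dist = next_word_distribution(counts, "the")
print(dist)
```

In this toy corpus "the" is followed by "cat" twice and "mat" once, so the estimate puts two thirds of the probability on "cat". Neural language models replace the count table with a learned function, but the prediction target is the same.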

Historically, language modelling relied on n-gram language models, which still have niche uses. Neural language models took over in the 2010s, and since the 2020s state-of-the-art (SOTA) results have been achieved exclusively with large language models (LLMs).

A model's language modelling capability is measured using cross-entropy and perplexity. Datasets used to evaluate language modelling include WikiText-103, One Billion Word, Text8, C4, and The Pile, among others.
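The two metrics are closely related: cross-entropy is the average negative log-probability the model assigns to the true tokens, and perplexity is the exponential of that value. The sketch below computes both from a list of per-token probabilities. The probabilities are made-up numbers and the function names are assumptions for illustration. Natural logarithms are used, so cross-entropy is in nats; benchmarks often report bits per character or per word instead.

```python
import math


def cross_entropy(token_probs):
    """Average negative log-likelihood (in nats) of the true tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def perplexity(token_probs):
    """Perplexity is the exponential of the cross-entropy."""
    return math.exp(cross_entropy(token_probs))


# Hypothetical probabilities a model assigned to each true next token.
probs = [0.25, 0.25, 0.25, 0.25]
ce = cross_entropy(probs)
ppl = perplexity(probs)
print(f"cross-entropy: {ce:.4f} nats, perplexity: {ppl:.2f}")
```

A model that assigns probability 0.25 to every true token has a perplexity of 4, which is the same uncertainty as choosing uniformly among four options at each step. Lower values of both metrics indicate a better fit to the evaluation text.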

(Image credit: Exploring the Limits of Language Modeling)

Future Language Modeling from Temporal Document History

jlab-nlp/future-language-modeling 16 Apr 2024

While there are many automated systems for predicting future numerical data, such as weather, stock prices, and demand for products, there is relatively little work in automatically predicting textual data.

1

Forcing Diffuse Distributions out of Language Models

y0mingzhang/diffuse-probabilities 16 Apr 2024

Despite being trained specifically to follow user instructions, today's language models perform poorly when instructed to produce random outputs.

0

Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training

akreal/bloomzmms 16 Apr 2024

Our zero-shot evaluation results confirm the robustness of our approach across multiple tasks, including speech translation and multilingual spoken language understanding, thereby opening new avenues for applying LLMs in the speech domain.

0

Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models

algolzw/daclip-uir 15 Apr 2024

Though diffusion models have been successfully applied to various image restoration (IR) tasks, their performance is sensitive to the choice of training datasets.

528

Compression Represents Intelligence Linearly

hkust-nlp/llm-compression-intelligence 15 Apr 2024

We open-source our compression datasets as well as our data collection pipelines to facilitate future researchers to assess compression properly.

66

Memory Sharing for Large Language Model based Agents

ghupppp/memorysharingllm 15 Apr 2024

In the realm of artificial intelligence, the adaptation of Large Language Model (LLM)-based agents to execute tasks via natural language prompts represents a significant advancement, notably eliminating the need for explicit retraining or fine tuning for fixed-answer tasks such as common sense questions and yes/no queries.

8

in2IN: Leveraging individual Information to Generate Human INteractions

pabloruizponce/in2IN 15 Apr 2024

For this, we introduce in2IN, a novel diffusion model for human-human motion generation which is conditioned not only on the textual description of the overall interaction but also on the individual descriptions of the actions performed by each person involved in the interaction.

7

Knowledge-enhanced Visual-Language Pretraining for Computational Pathology

magic-ai4med/kep 15 Apr 2024

In this paper, we consider the problem of visual representation learning for computational pathology, by exploiting large-scale image-text pairs gathered from public resources, along with the domain specific knowledge in pathology.

5

A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions

ai-hpc-research-team/slm4crp 15 Apr 2024

The task of chemical reaction predictions (CRPs) plays a pivotal role in advancing drug discovery and material science.

3

LegalPro-BERT: Classification of Legal Provisions by fine-tuning BERT Large Language Model

amit5-ai/LegalPro-BERT 15 Apr 2024

Contract analysis requires the identification and classification of key provisions and paragraphs within an agreement.

1