Formal Algorithms for Transformers

no code yet • 19 Jul 2022

This document aims to be a self-contained, mathematically precise overview of transformer architectures and algorithms (*not* results).
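The centerpiece of any such formal treatment is scaled dot-product attention. A minimal NumPy sketch of single-head attention (dimensions and variable names here are illustrative, not taken from the paper):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)   # (n_queries, n_keys)
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 query positions, head dimension 8
K = rng.normal(size=(6, 8))   # 6 key positions
V = rng.normal(size=(6, 8))   # one value vector per key
out = attention(Q, K, V)
print(out.shape)  # (4, 8): one output vector per query
```

Each output row is a convex combination of the value rows, weighted by the softmax-normalized query-key similarities.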

MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures

google-research/jax3d 30 Jul 2022

Neural Radiance Fields (NeRFs) have demonstrated amazing ability to synthesize images of 3D scenes from novel views.

Novel View Synthesis

Confident Adaptive Language Modeling

no code yet • 14 Jul 2022

Recent advances in Transformer-based large language models (LLMs) have led to significant performance improvements across many tasks.

Language Modelling Text Generation

Language Modelling with Pixels

xplip/pixel 14 Jul 2022

Language models are defined over a finite set of inputs, which creates a vocabulary bottleneck when we attempt to scale the number of supported languages.
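The vocabulary bottleneck can be illustrated with a toy example (not from the paper): under a fixed token vocabulary, words from an unsupported language collapse to an unknown token, losing their content entirely.

```python
# Toy fixed vocabulary; real models use subword vocabularies, but the
# failure mode is the same: anything outside the vocabulary maps to <unk>.
vocab = {"the": 0, "cat": 1, "sat": 2, "<unk>": 3}

def encode(tokens, vocab):
    return [vocab.get(t, vocab["<unk>"]) for t in tokens]

print(encode(["the", "cat", "sat"], vocab))  # [0, 1, 2]
print(encode(["le", "chat", "sat"], vocab))  # [3, 3, 2] -- French words lost
```

Rendering text as pixels, as this paper does, sidesteps the finite vocabulary altogether.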

Language Modelling Named Entity Recognition

Language Models (Mostly) Know What They Know

no code yet • 11 Jul 2022

We study whether language models can evaluate the validity of their own claims and predict which questions they will be able to answer correctly.

Multiple-choice

Language Model Cascades

google-research/cascades 21 Jul 2022

Prompted models have demonstrated impressive few-shot learning abilities.

Few-Shot Learning Language Modelling +1

Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?

no code yet • 21 Jul 2022

There has been a lot of interest in the scaling properties of Transformer models.

Inductive Bias

On the Principles of Parsimony and Self-Consistency for the Emergence of Intelligence

no code yet • 11 Jul 2022

Ten years into the revival of deep networks and artificial intelligence, we propose a theoretical framework that sheds light on deep networks within the bigger picture of intelligence in general.

Lecture Notes on Neural Information Retrieval

no code yet • 27 Jul 2022

These lecture notes focus on the recent advancements in neural information retrieval, with particular emphasis on the systems and models exploiting transformer networks.
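A common pattern in transformer-based retrieval is the bi-encoder: queries and documents are embedded into a shared vector space and ranked by similarity. A minimal sketch with hand-picked vectors standing in for transformer embeddings (the vectors and dimensions are illustrative, not from the lecture notes):

```python
import numpy as np

# Stand-in embeddings; a real system would produce these with an encoder.
doc_embs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])
query_emb = np.array([0.9, 0.1, 0.0])

scores = doc_embs @ query_emb   # dot-product relevance scores: [0.9, 0.1, 0.7]
ranking = np.argsort(-scores)   # indices of documents, best first
print(list(ranking))            # [0, 2, 1]
```

Because scoring reduces to a matrix-vector product, retrieval over large collections can be served with approximate nearest-neighbor indexes rather than scoring every query-document pair with the full model.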

Information Retrieval Natural Language Processing

An Introduction to Lifelong Supervised Learning

no code yet • 10 Jul 2022

Following these different classes of learning algorithms, we discuss the commonly used evaluation benchmarks and metrics for lifelong learning (Chapter 6) and wrap up with a discussion of future challenges and important research directions in Chapter 7.
