Few-shot Learning with Retrieval Augmented Language Models

no code yet • 5 Aug 2022

Retrieval augmented models are known to excel at knowledge intensive tasks without the need for as many parameters, but it is unclear whether they work in few-shot settings.

Fact Checking • Few-Shot Learning • +3

Meaning without reference in large language models

no code yet • 5 Aug 2022

The widespread success of large language models (LLMs) has been met with skepticism that they possess anything like human concepts or meanings.

Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models

hadasah/btm • 5 Aug 2022

New expert language models (ELMs) are learned by branching from (mixtures of) ELMs in the current set, further training the parameters on data for the new domain, and then merging the resulting model back into the set for future use.

BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

facebookresearch/ParlAI • 5 Aug 2022

We present BlenderBot 3, a 175B-parameter dialogue model capable of open-domain conversation, with access to the internet and a long-term memory, trained on a large number of user-defined tasks.

Continual Learning

Why Do Networks Need Negative Weights?

no code yet • 5 Aug 2022

Why do networks have negative weights at all?

Non-Asymptotic Analysis of Ensemble Kalman Updates: Effective Dimension and Localization

no code yet • 5 Aug 2022

Many modern algorithms for inverse problems and data assimilation rely on ensemble Kalman updates to blend prior predictions with observed data.

Tailoring to the Tails: Risk Measures for Fine-Grained Tail Sensitivity

no code yet • 5 Aug 2022

As a concrete example, we focus on divergence risk measures based on f-divergence ambiguity sets, which are a widespread tool used to foster distributional robustness of machine learning systems.

Machine Learning

Seamless Iterative Semi-Supervised Correction of Imperfect Labels in Microscopy Images

marwankefah/SISSI • 5 Aug 2022

We propose Seamless Iterative Semi-Supervised correction of Imperfect labels (SISSI), a new method for training object detection models with noisy and missing annotations in a semi-supervised fashion.

Global Pointer: Novel Efficient Span-based Approach for Named Entity Recognition

no code yet • 5 Aug 2022

The ultimate goal is to enable a global view that considers the beginning and the end positions to predict the entity.

named-entity-recognition • NER
