Misconceptions
47 papers with code • 1 benchmark • 1 dataset
Measures whether a model can distinguish popular misconceptions from the truth.
Examples:

  input: The daddy longlegs spider is the most venomous spider in the world.
  choice: T
  choice: F
  answer: F

  input: Karl Benz is correctly credited with the invention of the first modern automobile.
  choice: T
  choice: F
  answer: T
Source: BIG-bench
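To make the item format concrete, here is a minimal sketch, in Python, of how such true/false items might be scored with a language model. The score_item helper and the log-likelihood call are hypothetical stand-ins of our own, not part of BIG-bench's API; the item mirrors the first example above.

def score_item(statement, model_loglikelihood, choices=("T", "F")):
    """Return the choice the model assigns the higher log-likelihood."""
    prompt = f"input: {statement}\nanswer: "
    return max(choices, key=lambda c: model_loglikelihood(prompt, c))

# Dummy scorer so the sketch runs end to end; a real one would query an LM.
def dummy_loglikelihood(prompt, continuation):
    return 0.0 if continuation == "F" else -1.0

item = {
    "input": "The daddy longlegs spider is the most venomous spider in the world.",
    "answer": "F",
}

prediction = score_item(item["input"], dummy_loglikelihood)
print(prediction, "correct" if prediction == item["answer"] else "incorrect")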
Libraries
Use these libraries to find Misconceptions models and implementations.

Most implemented papers
Community detection in networks: A user guide
Community detection in networks is one of the most popular topics of modern network science.
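As a concrete illustration of the method family the guide surveys, here is a short sketch of modularity-based community detection, assuming the networkx library; the choice of greedy_modularity_communities is ours, not the paper's recommendation.

# Modularity-based community detection on a classic benchmark graph.
import networkx as nx
from networkx.algorithms import community

G = nx.karate_club_graph()
communities = community.greedy_modularity_communities(G)
for i, nodes in enumerate(communities):
    print(f"community {i}: {sorted(nodes)}")
# Modularity measures how much denser the communities are than chance.
print("modularity:", community.modularity(G, communities))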
Laplace Redux -- Effortless Bayesian Deep Learning
Bayesian formulations of deep learning have been shown to have compelling theoretical properties and offer practical functional benefits, such as improved predictive uncertainty quantification and model selection.
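A brief sketch of the post-hoc Laplace approximation the paper advocates, assuming its companion laplace-torch package; the argument names follow our reading of that library and may differ across versions.

# Post-hoc Laplace approximation over a trained classifier, assuming
# the laplace-torch package (pip install laplace-torch).
import torch
from laplace import Laplace

# Toy model and data standing in for an already-trained network.
model = torch.nn.Sequential(
    torch.nn.Linear(20, 50), torch.nn.ReLU(), torch.nn.Linear(50, 2)
)
X, y = torch.randn(100, 20), torch.randint(0, 2, (100,))
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(X, y), batch_size=32
)

# Last-layer Laplace with a Kronecker-factored Hessian: cheap, and often
# enough for better-calibrated predictive uncertainty.
la = Laplace(model, "classification",
             subset_of_weights="last_layer",
             hessian_structure="kron")
la.fit(train_loader)
la.optimize_prior_precision(method="marglik")

probs = la(torch.randn(5, 20))  # Bayesian predictive probabilities
print(probs)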
Factuality Enhanced Language Models for Open-Ended Text Generation
In this work, we measure and improve the factual accuracy of large-scale LMs for open-ended text generation.
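One of the paper's proposals is factual-nucleus sampling, a top-p variant whose threshold decays within a sentence. The sketch below shows the decay schedule as we understand it; the hyperparameter names are our own.

# Decaying top-p threshold in the spirit of factual-nucleus sampling:
# the nucleus shrinks as a sentence grows (later tokens are sampled
# more greedily) and resets at each sentence boundary.
def nucleus_p(t, p=0.9, decay=0.9, floor=0.3):
    """Top-p threshold for the t-th token of the current sentence."""
    return max(floor, p * decay ** t)

for t in range(8):
    print(f"token {t}: p = {nucleus_p(t):.3f}")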
Scaling Language Models: Methods, Analysis & Insights from Training Gopher
Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.
Design Challenges and Misconceptions in Neural Sequence Labeling
We investigate the design challenges of building effective and efficient neural sequence labeling systems by reproducing twelve neural sequence labeling models, covering most state-of-the-art architectures, and conducting a systematic comparison on three benchmarks (i.e., NER, chunking, and POS tagging).
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
Fine-tuning pre-trained transformer-based language models such as BERT has become a common practice dominating leaderboards across various NLP benchmarks.
TruthfulQA: Measuring How Models Mimic Human Falsehoods
We crafted questions that some humans would answer falsely due to a false belief or misconception.
Training Compute-Optimal Large Language Models
We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget.
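The paper's headline finding is that model size and training tokens should be scaled in roughly equal proportion. A widely used rule of thumb derived from it, training FLOPs C ≈ 6·N·D with about 20 tokens per parameter, yields the back-of-the-envelope sizing below; this is our reading, not the paper's own code or exact fitted coefficients.

# Compute-optimal sizing from the rules of thumb C ≈ 6*N*D and
# D/N ≈ 20 (so C ≈ 120*N**2).
def compute_optimal(flops, tokens_per_param=20.0):
    """Return (parameters N, training tokens D) for a FLOP budget."""
    n = (flops / (6.0 * tokens_per_param)) ** 0.5
    return n, tokens_per_param * n

for c in (1e21, 1e23, 1e25):
    n, d = compute_optimal(c)
    print(f"C={c:.0e} FLOPs -> N≈{n:.2e} params, D≈{d:.2e} tokens")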
Reliability Check: An Analysis of GPT-3's Response to Sensitive Topics and Prompt Wording
Large language models (LLMs) have become mainstream technology with their versatile use cases and impressive performance.
Parting with Misconceptions about Learning-based Vehicle Motion Planning
The release of nuPlan marks a new era in vehicle motion planning research, offering the first large-scale real-world dataset and evaluation schemes requiring both precise short-term planning and long-horizon ego-forecasting.