Fill Mask
3 papers with code • 3 benchmarks • 3 datasets
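In the fill-mask task, a model receives a sentence with one token replaced by a [MASK] placeholder and must rank vocabulary words by how plausibly they fill the blank. The sketch below is a toy illustration of that ranking step, assuming hypothetical logits for the masked position (the scores are made up, not from any of the models listed here): a softmax over the vocabulary logits yields a probability distribution, and the top-k words are the predictions.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fill_mask(vocab, logits, top_k=3):
    """Rank vocabulary words by predicted probability for the [MASK] slot."""
    probs = softmax(logits)
    ranked = sorted(zip(vocab, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]

# Toy example: "Paris is the [MASK] of France."
# Hypothetical vocabulary and model scores for the masked position.
vocab = ["capital", "city", "river", "king"]
logits = [4.0, 2.5, 0.5, 0.1]
predictions = fill_mask(vocab, logits)
print(predictions[0][0])  # highest-scoring filler: "capital"
```

In practice the logits come from a pre-trained masked language model rather than being hand-written, but the decoding step (softmax, then top-k over the vocabulary) is the same.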
Most implemented papers
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
Manual annotation of questions and answers for videos, however, is tedious and prohibits scalability.
Prompt Tuning or Fine-Tuning - Investigating Relational Knowledge in Pre-Trained Language Models
In this work, we propose a completely different approach: instead of spending resources on training an additional model, we simply perform adaptive fine-tuning of the pre-trained language model on the standard fill-mask task, using a small training dataset of existing facts from a knowledge graph.
An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling
Masked visual modeling (MVM) has been recently proven effective for visual pre-training.