Multi-Modal Methods
Computer Vision
• 10 methods
Methods
| Method | Source paper | Year | Papers |
|---|---|---|---|
| GLIDE | GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models | 2021 | 19 |
| UNIMO | UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning | 2020 | 4 |
| EmbraceNet | EmbraceNet: A robust deep learning architecture for multimodal classification | 2019 | 4 |
| Vokenization | Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision | 2020 | 3 |
| CTAL | CTAL: Pre-training Cross-modal Transformer for Audio-and-Language Representations | 2021 | 2 |
| VATT | VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text | 2021 | 2 |
| MAVL | Class-agnostic Object Detection with Multi-modal Transformer | 2021 | 2 |
| SyCoCa | SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment | 2024 | 2 |
| AVSlowFast | Audiovisual SlowFast Networks for Video Recognition | 2020 | 1 |
| PO3D-VQA | 3D-Aware Visual Question Answering about Parts, Poses and Occlusions | 2023 | 1 |