Neural graphics primitives, parameterized by fully connected neural networks, can be costly to train and evaluate.
The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model.
We present an efficient method for joint optimization of topology, materials and lighting from multi-view image observations.
For the first time, we train a detector with all twenty-one thousand classes of the ImageNet dataset and show that it generalizes to new datasets without fine-tuning.
Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases.
The second edition of Deep Learning Interviews is home to hundreds of fully solved problems drawn from a wide range of key topics in AI.
Censorship of Internet content in China is understood to operate through a system of intermediary liability whereby service providers are liable for the content on their platforms.
Transformers have attracted increasing interest in computer vision, but they still fall behind state-of-the-art convolutional networks.
The Transformer architecture has improved the performance of deep learning models in domains such as Computer Vision and Natural Language Processing.