Using the proposed CSST framework, we obtain practical self-training methods (for both vision and NLP tasks) for optimizing different non-decomposable metrics using deep neural networks.
The presence of images that flip Oracle predictions and those that do not, makes this a challenging setting for adversarial robustness.
Modeling dynamics of human motion is one of the most challenging sequence modeling problem, with diverse applications in animation industry, human-robot interaction, motion-based surveillance, etc.
Similarly, performance on multi-disciplinary tasks such as Visual Question Answering (VQA) is considered a marker for gauging progress in Computer Vision.
A class of recent approaches for generating images, called Generative Adversarial Networks (GAN), have been used to generate impressively realistic images of objects, bedrooms, handwritten digits and a variety of other image modalities.