Active learning reduces the demand for labeled data by selecting informative unlabeled examples for annotation.
We approach text-to-image generation by combining the power of the pretrained CLIP representation with an off-the-shelf image generator (a GAN), optimizing in the GAN's latent space to find images that achieve the maximum CLIP score for the given input text.
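The core idea above — gradient-based search in a latent space to maximize a similarity score — can be illustrated with a toy sketch. Here a fixed quadratic surrogate stands in for the real CLIP text-image score, and the identity map stands in for the GAN generator; all names and defaults are illustrative, not from the paper.

```python
import numpy as np

def clip_score(latent, target):
    # Stand-in for CLIP: a real system would decode `latent` with a GAN
    # and score the resulting image against the text embedding; here we
    # use negative squared distance to a target vector as a surrogate.
    return -np.sum((latent - target) ** 2)

def optimize_latent(target, steps=200, lr=0.1, dim=8, seed=0):
    # Gradient ascent on the surrogate score, starting from a random latent.
    rng = np.random.default_rng(seed)
    z = rng.normal(size=dim)
    for _ in range(steps):
        grad = -2.0 * (z - target)  # analytic gradient of clip_score
        z = z + lr * grad           # ascend the score
    return z

target = np.full(8, 0.5)
z_star = optimize_latent(target)
print(np.allclose(z_star, target, atol=1e-3))  # converges to the optimum
```

With a real generator and CLIP, the gradient would come from backpropagating through both networks rather than from a closed form.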
The neural attention mechanism has been incorporated into deep neural networks to achieve state-of-the-art performance in various domains.
Existing methods for unsupervised domain adaptation often rely on minimizing some statistical distance between the source and target samples in the latent space.
Crossformer with states sharing not only provides the desired cross-layer guidance and regularization but also reduces the memory requirement.
Introducing such multi-label examples at the cost of annotating fewer examples brings clear gains on the natural language inference and entity typing tasks, even when we simply train first with single-label data and then fine-tune with multi-label examples.
We study calibration in question answering, estimating whether the model correctly predicts the answer for each question.
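A standard way to quantify calibration is the expected calibration error (ECE): bin predictions by confidence and compare each bin's mean confidence to its accuracy. The sketch below is a hedged illustration with made-up data; the bin count and example values are not from the paper.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    # ECE: frequency-weighted |mean confidence - accuracy| over confidence bins.
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of samples in bin
    return ece

conf = np.array([0.95, 0.95, 0.95, 0.95, 0.55, 0.55])
corr = np.array([1, 1, 1, 0, 1, 0])  # 0.95-bin accuracy 0.75, 0.55-bin 0.5
print(round(expected_calibration_error(conf, corr), 3))  # → 0.15
```

A lower ECE means the model's stated confidences better match how often its answers are actually correct.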
However, the quality of uncertainty estimation is highly dependent on the dropout probabilities.
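This dependence on the dropout probability can be seen in a toy Monte Carlo dropout setup: the spread of repeated stochastic forward passes through a single linear layer grows with the drop rate p. Layer sizes and the p values below are made-up assumptions for illustration.

```python
import numpy as np

def mc_dropout_std(p, weights, x, n_samples=2000, seed=0):
    # Std of predictions over repeated forward passes with inverted dropout.
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_samples):
        mask = rng.random(x.shape) >= p               # drop units with prob p
        preds.append(weights @ (x * mask) / (1.0 - p))  # rescale kept units
    return float(np.std(preds))

w = np.ones(16)
x = np.ones(16)
low = mc_dropout_std(0.1, w, x)
high = mc_dropout_std(0.5, w, x)
print(low < high)  # larger p gives a larger predictive spread here
```

This is why methods that tune or learn the dropout probabilities, rather than fixing them by hand, can yield better-calibrated uncertainty estimates.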
We depart from the standard practice of collecting a single reference per training example, and find that collecting multiple references can achieve better accuracy under a fixed annotation budget.
Attention modules, as simple and effective tools, have not only enabled deep neural networks to achieve state-of-the-art results in many domains, but also enhanced their interpretability.