Pre-trained language models have attracted increasing attention in the biomedical domain, inspired by their great success in the general natural language domain.
Ranked #1 on Question Answering on PubMedQA
In this paper, we present TEXTure, a novel method for text-guided generation, editing, and transfer of textures for 3D shapes.
By incorporating vision features in both stages (rationale generation and answer inference), the model is able to generate effective rationales that contribute to answer inference.
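A minimal sketch of such a two-stage pipeline, assuming the two stages are rationale generation and answer inference; the `FusionEncoder` and `TwoStageMultimodalCoT` modules below are toy, hypothetical stand-ins for the paper's actual architecture:

```python
import torch
import torch.nn as nn

class FusionEncoder(nn.Module):
    """Toy module that fuses text and vision features (hypothetical, for illustration)."""
    def __init__(self, d_model=256):
        super().__init__()
        self.text_proj = nn.Linear(d_model, d_model)
        self.vision_proj = nn.Linear(d_model, d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

    def forward(self, text_feats, vision_feats):
        # Cross-attend text tokens to vision patches, then add the result back.
        q = self.text_proj(text_feats)
        kv = self.vision_proj(vision_feats)
        fused, _ = self.attn(q, kv, kv)
        return text_feats + fused

class TwoStageMultimodalCoT(nn.Module):
    """Stage 1 produces a rationale; stage 2 infers the answer; both see vision features."""
    def __init__(self, d_model=256, vocab_size=32000):
        super().__init__()
        self.rationale_stage = FusionEncoder(d_model)
        self.answer_stage = FusionEncoder(d_model)
        self.rationale_head = nn.Linear(d_model, vocab_size)
        self.answer_head = nn.Linear(d_model, vocab_size)

    def forward(self, question_feats, vision_feats):
        # Stage 1: rationale generation conditioned on the question and the image.
        r_hidden = self.rationale_stage(question_feats, vision_feats)
        rationale_logits = self.rationale_head(r_hidden)

        # Stage 2: answer inference conditioned on the question, the stage-1 hidden
        # states (standing in for the generated rationale), and the image again.
        stage2_input = torch.cat([question_feats, r_hidden], dim=1)
        a_hidden = self.answer_stage(stage2_input, vision_feats)
        answer_logits = self.answer_head(a_hidden)
        return rationale_logits, answer_logits

# Random features stand in for text- and image-encoder outputs.
model = TwoStageMultimodalCoT()
question = torch.randn(2, 20, 256)   # batch of 2, 20 text tokens
image = torch.randn(2, 49, 256)      # 7x7 grid of vision patches
rationale_logits, answer_logits = model(question, image)
```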
Diffusion methods have proven to be very effective at generating images conditioned on a text prompt.
In the text-to-image setting, the method creates hard prompts for diffusion models, allowing API users to easily generate, discover, and mix and match image concepts without prior knowledge of how to prompt the model.
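A minimal sketch of how discrete ("hard") prompt optimization can work in principle: a continuous prompt is optimized, projected to the nearest vocabulary embeddings, and the loss gradient flows back to the continuous prompt through a straight-through estimator. The embedding table, target feature, and cosine objective below are stand-ins for illustration, not the paper's actual setup (which optimizes against CLIP features of target images):

```python
import torch
import torch.nn.functional as F

vocab_size, d_embed, prompt_len = 1000, 64, 8
embedding_table = torch.randn(vocab_size, d_embed)   # frozen token embeddings (stand-in)
target_feature = torch.randn(d_embed)                # stand-in for e.g. a CLIP image feature

soft_prompt = torch.randn(prompt_len, d_embed, requires_grad=True)
optimizer = torch.optim.Adam([soft_prompt], lr=0.1)

def project_to_hard(soft, table):
    """Return the nearest token embedding (and its id) for each soft embedding."""
    dists = torch.cdist(soft, table)                 # (prompt_len, vocab_size)
    token_ids = dists.argmin(dim=-1)
    return table[token_ids], token_ids

for step in range(200):
    hard, token_ids = project_to_hard(soft_prompt, embedding_table)
    # Straight-through: the forward pass uses the hard embeddings,
    # while gradients flow to the soft prompt.
    prompt_embeds = soft_prompt + (hard - soft_prompt).detach()
    # Stand-in objective: make the pooled prompt embedding match a target feature.
    loss = 1 - F.cosine_similarity(prompt_embeds.mean(dim=0), target_feature, dim=0)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("discovered hard prompt token ids:", token_ids.tolist())
```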
The cost of vision-and-language pre-training has become increasingly prohibitive due to end-to-end training of large-scale models.
Ranked #1 on Image Retrieval on COCO
The technique works as follows: we first encourage sparse latent representations while training a GNN in a supervised setting, and then apply symbolic regression to components of the learned model to extract explicit physical relations.
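A toy sketch of the two steps under simplifying assumptions: an MLP "message function" is trained with an L1 penalty that drives most latent components toward zero, and a small polynomial fit (standing in for a full symbolic-regression package such as PySR) is then applied to the surviving component. The data, network sizes, and target law below are illustrative only:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n_pairs = 2048
x = torch.randn(n_pairs, 2)                     # two scalar node features per pair
target = (x[:, 0] * x[:, 1]).unsqueeze(1)       # toy "physical" law the messages should capture

message_fn = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 8))  # 8-dim message
readout = nn.Linear(8, 1)
opt = torch.optim.Adam(list(message_fn.parameters()) + list(readout.parameters()), lr=1e-2)

for step in range(2000):
    msg = message_fn(x)
    pred = readout(msg)
    # Supervised loss plus an L1 penalty that pushes most message components toward zero.
    loss = ((pred - target) ** 2).mean() + 1e-2 * msg.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Pick the message component with the largest variance (the surviving "sparse" one)
# and fit a small polynomial basis to it, standing in for symbolic regression.
with torch.no_grad():
    msg = message_fn(x)
    k = int(msg.var(dim=0).argmax())
    basis = torch.stack([x[:, 0], x[:, 1], x[:, 0] * x[:, 1], x[:, 0] ** 2, x[:, 1] ** 2], dim=1)
    coeffs = torch.linalg.lstsq(basis, msg[:, k:k + 1]).solution
    print("recovered coefficients on [x0, x1, x0*x1, x0^2, x1^2]:", coeffs.squeeze().tolist())
```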
Most 3D instance segmentation methods exploit a bottom-up strategy, typically involving resource-intensive post-processing.
Ranked #1 on 3D Instance Segmentation on S3DIS (using extra training data)
Most state-of-the-art approaches for weather and climate modeling are based on physics-informed numerical models of the atmosphere.
We propose Dual PatchNorm: two Layer Normalization layers (LayerNorms), before and after the patch embedding layer in Vision Transformers.
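A minimal sketch of this idea, assuming a standard linear patch embedding; the patch size and dimensions below are arbitrary:

```python
import torch
import torch.nn as nn

class DualPatchNormEmbedding(nn.Module):
    """Patch embedding with a LayerNorm before and after the projection."""
    def __init__(self, patch_size=16, in_channels=3, d_model=768):
        super().__init__()
        patch_dim = in_channels * patch_size * patch_size
        self.patch_size = patch_size
        self.pre_norm = nn.LayerNorm(patch_dim)    # LayerNorm before the embedding layer
        self.proj = nn.Linear(patch_dim, d_model)  # the usual linear patch embedding
        self.post_norm = nn.LayerNorm(d_model)     # LayerNorm after the embedding layer

    def forward(self, images):
        # images: (batch, channels, height, width) -> (batch, num_patches, patch_dim)
        patches = nn.functional.unfold(images, kernel_size=self.patch_size,
                                       stride=self.patch_size).transpose(1, 2)
        return self.post_norm(self.proj(self.pre_norm(patches)))

tokens = DualPatchNormEmbedding()(torch.randn(2, 3, 224, 224))
print(tokens.shape)  # torch.Size([2, 196, 768])
```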