CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition

deeptibhegde/clip-goes-3d 20 Mar 2023

Attempting to train the visual and text encoder to account for this shift results in catastrophic forgetting and a notable decrease in performance.

Retrieval, Scene Understanding

83
0.80 stars / hour

GPT-4 Technical Report

openai/evals Preprint 2023

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs.

6,473
0.79 stars / hour

GPT Understands, Too

THUDM/GLM 18 Mar 2021

On the SuperGLUE benchmark, GPTs achieve performance comparable to, and sometimes better than, that of similar-sized BERTs in supervised learning.

Knowledge Probing, Natural Language Understanding, +1

1,049
0.73 stars / hour

Deep symbolic regression for physics guided by units constraints: toward the automated discovery of physical laws

wassimtenachi/physo 6 Mar 2023

Here we present $\Phi$-SO, a Physical Symbolic Optimization framework for recovering analytical symbolic expressions from physics data using deep reinforcement learning techniques by learning units constraints.

Symbolic Regression

1,229
0.72 stars / hour

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

cloneofsimo/lora 25 Aug 2022

Once the subject is embedded in the output domain of the model, the unique identifier can be used to synthesize novel photorealistic images of the subject contextualized in different scenes.

Image Generation

3,112
0.69 stars / hour

Deep Learning for Camera Calibration and Beyond: A Survey

kangliao929/awesome-deep-camera-calibration 19 Mar 2023

In this paper, we provide a comprehensive survey of learning-based camera calibration techniques, by analyzing their strengths and limitations.

Camera Calibration

91
0.65 stars / hour

Generative Semantic Segmentation

fudan-zvg/gss 20 Mar 2023

To that end, the segmentation mask is expressed with a special type of image (dubbed as maskige).

Semantic Segmentation

70
0.64 stars / hour

Wavelet Diffusion Models are fast and scalable Image Generators

vinairesearch/wavediff 29 Nov 2022

Diffusion models are rising as a powerful solution for high-fidelity image generation, which exceeds GANs in quality in many circumstances.

Image Generation

103
0.64 stars / hour

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

salesforce/CodeGen 25 Mar 2022

To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER.

Code Generation, Language Modelling, +1

2,669
0.63 stars / hour

TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion

winddori2002/TriAAN-VC 16 Mar 2023

The existing methods do not simultaneously satisfy the above two aspects of VC, and their conversion outputs suffer from a trade-off problem between maintaining source contents and target characteristics.

Voice Conversion

38
0.55 stars / hour