QLoRA: Efficient Finetuning of Quantized LLMs

artidoro/qlora 23 May 2023

Our best model family, which we name Guanaco, outperforms all previous openly released models on the Vicuna benchmark, reaching 99.3% of the performance level of ChatGPT while only requiring 24 hours of finetuning on a single GPU.

Chatbot +3

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

kyegomez/tree-of-thoughts 17 May 2023

Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference.

Decision Making, Language Modelling

Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models

shihaozhaozsh/uni-controlnet 25 May 2023

Text-to-Image diffusion models have made tremendous progress over the past two years, enabling the generation of highly realistic images based on open-domain text descriptions.

Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training

Liuhong99/Sophia 23 May 2023

Given the massive cost of language model pre-training, a non-trivial improvement of the optimization algorithm would lead to a material reduction in the time and cost of training.

Language Modelling, Stochastic Optimization

Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold

opengvlab/internchat 18 May 2023

Synthesizing visual content that meets users' needs often requires flexible and precise controllability of the pose, shape, expression, and layout of the generated objects.

Image Manipulation

VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks

opengvlab/interngpt 18 May 2023

We hope this model can set a new baseline for generalist vision and language models.

Language Modelling

EasySpider: A No-Code Visual System for Crawling the Web

NaiboWang/EasySpider ACM The Web Conference 2023

As such, web-crawling is an essential tool for both computational and non-computational scientists to conduct research.


Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach

drboog/profusion 23 May 2023

However, generating images of a novel concept provided by the user's input image is still a challenging task.

Text-to-Image Generation

Any-to-Any Generation via Composable Diffusion

microsoft/i-code 19 May 2023

We present Composable Diffusion (CoDi), a novel generative model capable of generating any combination of output modalities, such as language, image, video, or audio, from any combination of input modalities.

Audio Generation

Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models

shi-labs/prompt-free-diffusion 25 May 2023

Text-to-image (T2I) research has grown explosively in the past year, owing to the large-scale pre-trained diffusion models and many emerging personalization and editing approaches.

Conditional Text-to-Image Synthesis, Image Generation +3

