BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining

microsoft/biogpt 19 Oct 2022

Pre-trained language models have attracted increasing attention in the biomedical domain, inspired by their great success in the general natural language domain.

Document Classification, Language Modelling

Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models

AttendAndExcite/Attend-and-Excite 31 Jan 2023

Recent text-to-image generative models have demonstrated an unparalleled ability to generate diverse and creative imagery guided by a target text prompt.

Generative Semantic Nursing

Multimodal Chain-of-Thought Reasoning in Language Models

amazon-science/mm-cot 2 Feb 2023

By incorporating the vision features in both stages, the model is able to generate effective rationales that contribute to answer inference.

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

salesforce/lavis 30 Jan 2023

The cost of vision-and-language pre-training has become increasingly prohibitive due to end-to-end training of large-scale models.

Image Captioning, Image Retrieval

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

showlab/Tune-A-Video 22 Dec 2022

To reproduce the success of text-to-image (T2I) generation, recent works in text-to-video (T2V) generation employ large-scale text-video datasets for fine-tuning.

Style Transfer, Text-to-Video Generation

InstructPix2Pix: Learning to Follow Image Editing Instructions

timothybrooks/instruct-pix2pix 17 Nov 2022

We propose a method for editing images from human instructions: given an input image and a written instruction that tells the model what to do, our model follows these instructions to edit the image.

Language Modelling, Text-based Image Editing

STEPS: Joint Self-supervised Nighttime Image Enhancement and Depth Estimation

ucaszyp/steps 2 Feb 2023

By fitting a bridge-shaped curve to the illumination map distribution, both regions are suppressed and two tasks are bridged naturally.

Depth Estimation, Image Enhancement

NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

heatz123/naturalspeech 9 May 2022

In this paper, we answer these questions by first defining human-level quality based on the statistical significance of subjective measures and introducing appropriate guidelines to judge it, and then developing a TTS system called NaturalSpeech that achieves human-level quality on a benchmark dataset.

Speech Synthesis, Text-To-Speech Synthesis

ArchiSound: Audio Generation with Diffusion

archinetai/audio-diffusion-pytorch 30 Jan 2023

The recent surge in popularity of diffusion models for image generation has brought new attention to the potential of these models in other areas of media generation.

Audio Generation, Image Generation

Parsel: A (De-)compositional Framework for Algorithmic Reasoning with Language Models

ezelikman/parsel 20 Dec 2022

Despite recent success in large language model (LLM) reasoning, LLMs struggle with hierarchical multi-step reasoning tasks like generating complex programs.

Automated Theorem Proving, Code Generation

