Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

casia-iva-lab/obj2seq 28 Sep 2022

Obj2Seq is able to flexibly determine input categories to satisfy customized requirements, and be easily extended to different visual tasks.

Multi-Label Classification, Object Detection, +1

High-Resolution Image Synthesis with Latent Diffusion Models

compvis/stable-diffusion CVPR 2022

By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond.

Denoising, Image Inpainting, +3

Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning

yiyixuxu/denoising-diffusion-flax 8 Aug 2022

The main idea behind our approach is to first represent the discrete data as binary bits, and then train a continuous diffusion model to model these bits as real numbers which we call analog bits.

Image Captioning, Image Generation
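
The bit representation described in the Analog Bits summary can be illustrated with a minimal sketch. It assumes each binary bit is mapped to a real value of -1 or +1 and that decoding thresholds at zero; the function names `int_to_analog_bits` and `analog_bits_to_int`, the `num_bits` and `scale` parameters, and the most-significant-bit-first ordering are hypothetical choices for illustration, not taken from the listed repository.

```python
def int_to_analog_bits(value, num_bits=8, scale=1.0):
    """Encode a non-negative integer as num_bits real numbers in {-scale, +scale}.

    Assumption for illustration: bit 0 maps to -scale and bit 1 maps to +scale,
    with the most significant bit first.
    """
    if not 0 <= value < 2 ** num_bits:
        raise ValueError("value does not fit in num_bits")
    bits = [(value >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]
    return [(2.0 * b - 1.0) * scale for b in bits]


def analog_bits_to_int(analog_bits):
    """Decode by thresholding each real value at zero and reading the bits back."""
    value = 0
    for a in analog_bits:
        value = (value << 1) | (1 if a > 0 else 0)
    return value


if __name__ == "__main__":
    encoded = int_to_analog_bits(5, num_bits=4)
    print(encoded)
    # A continuous model would output noisy real values; thresholding recovers the bits.
    noisy = [0.4 * a + 0.1 for a in encoded]
    print(analog_bits_to_int(noisy))
```

Because the encoded values are ordinary real numbers, a continuous diffusion model can be trained on them directly, and discrete outputs are recovered only at the final thresholding step.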

Min-Max Similarity: A Contrastive Semi-Supervised Deep Learning Network for Surgical Tools Segmentation

angeloucn/min_max_similarity 29 Mar 2022

To address this issue, we proposed a semi-supervised segmentation network based on contrastive learning.

Contrastive Learning, Video Segmentation, +1

VToonify: Controllable High-Resolution Portrait Video Style Transfer

williamyang1991/vtoonify 22 Sep 2022

Although a series of successful portrait image toonification models built upon the powerful StyleGAN have been proposed, these image-oriented methods have obvious limitations when applied to videos, such as the fixed frame size, the requirement of face alignment, missing non-facial details and temporal inconsistency.

Face Alignment, Style Transfer, +1

NP-Match: When Neural Processes meet Semi-Supervised Learning

jianf-wang/np-match 3 Jul 2022

Semi-supervised learning (SSL) has been widely explored in recent years, and it is an effective way of leveraging unlabeled data to reduce the reliance on labeled data.

Semi-Supervised Image Classification

DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection

IDEA-Research/detrex 7 Mar 2022

Compared to other models on the leaderboard, DINO significantly reduces its model size and pre-training data size while achieving better results.

Ranked #1 on Object Detection on COCO minival (using extra training data)

Real-Time Object Detection

Poisson Flow Generative Models

newbeeer/poisson_flow 22 Sep 2022

We interpret the data points as electrical charges on the $z=0$ hyperplane in a space augmented with an additional dimension $z$, generating a high-dimensional electric field (the gradient of the solution to the Poisson equation).

Image Generation
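
The electric-field picture in the Poisson Flow summary can be sketched numerically. The example below assumes $N$-dimensional data points placed as unit charges at $z=0$ and uses the standard point-charge field in $N+1$ dimensions, which is proportional to the displacement divided by the distance raised to the power $N+1$. The function name `empirical_poisson_field` is hypothetical, the surface-area normalization constant is omitted, and this is not the listed repository's implementation.

```python
import math


def empirical_poisson_field(point, z, data):
    """Unnormalized field at the augmented location (point, z).

    Each data point y is treated as a unit charge at (y, 0). The contribution of
    one charge is diff / ||diff||**(N + 1), where N is the data dimension.
    The constant normalization factor is left out (an assumption of this sketch).
    """
    n = len(point)
    field = [0.0] * (n + 1)
    for y in data:
        diff = [p - q for p, q in zip(point, y)] + [z]
        dist = math.sqrt(sum(d * d for d in diff))
        if dist == 0.0:
            raise ValueError("field is undefined exactly at a charge")
        for k, d in enumerate(diff):
            field[k] += d / dist ** (n + 1)
    # Average over charges so the result does not grow with the dataset size.
    return [f / len(data) for f in field]


if __name__ == "__main__":
    # Two 1-D data points at -1 and +1, queried midway between them at height z = 1.
    print(empirical_poisson_field([0.0], 1.0, [[-1.0], [1.0]]))
```

In this symmetric case the data-space components cancel and only the $z$ component remains, so the field points away from the hyperplane that holds the charges.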

SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation

visual-attention-network/segnext 18 Sep 2022

Notably, SegNeXt outperforms EfficientNet-L2 w/ NAS-FPN and achieves 90.6% mIoU on the Pascal VOC 2012 test leaderboard using only 1/10 of its parameters.

Semantic Segmentation

