Wavelet Diffusion Models are fast and scalable Image Generators

vinairesearch/wavediff 29 Nov 2022

Diffusion models are rising as a powerful solution for high-fidelity image generation, which exceeds GANs in quality in many circumstances.

Image Generation

103
0.59 stars / hour

CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation

KU-CVLAB/CAT-Seg 21 Mar 2023

However, the problem of transferring these capabilities learned from image-level supervision to the pixel-level task of segmentation and addressing arbitrary unseen categories at inference makes this task challenging.

Image Segmentation Open Vocabulary Semantic Segmentation +2

40
0.58 stars / hour

Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection

exiawsh/streampetr 21 Mar 2023

In this paper, we propose a long-sequence modeling framework, named StreamPETR, for multi-view 3D object detection.

3D Object Detection object-detection

53
0.54 stars / hour

Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval

anosorae/irra 22 Mar 2023

To alleviate these issues, we present IRRA: a cross-modal Implicit Relation Reasoning and Aligning framework that learns relations between local visual-textual tokens and enhances global image-text matching without requiring additional prior supervision.

Ranked #1 on Text based Person Retrieval on RSTPReid (using extra training data)

Language Modelling Masked Language Modeling +6

30
0.54 stars / hour

Nerfstudio: A Modular Framework for Neural Radiance Field Development

nerfstudio-project/nerfstudio 8 Feb 2023

Neural Radiance Fields (NeRF) are a rapidly growing area of research with wide-ranging applications in computer vision, graphics, robotics, and more.

4,190
0.53 stars / hour

KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation

eleutherai/gpt-neox 20 May 2022

Relative positional embeddings (RPE) have received considerable attention since RPEs effectively model the relative distance among tokens and enable length extrapolation.

Language Modelling

4,431
0.53 stars / hour

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

salesforce/CodeGen 25 Mar 2022

To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER.

Code Generation Language Modelling +1

2,638
0.52 stars / hour

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection

idea-research/groundingdino 9 Mar 2023

To effectively fuse language and vision modalities, we conceptually divide a closed-set detector into three phases and propose a tight fusion solution, which includes a feature enhancer, a language-guided query selection, and a cross-modality decoder for cross-modality fusion.

object-detection Referring Expression +2

109
0.49 stars / hour

Memorizing Transformers

lucidrains/memorizing-transformers-pytorch ICLR 2022

Language models typically need to be trained or finetuned in order to acquire new knowledge, which involves updating their weights.

Language Modelling

455
0.48 stars / hour

FedBN: Federated Learning on Non-IID Features via Local Batch Normalization

adap/flower ICLR 2021

The emerging paradigm of federated learning (FL) strives to enable collaborative training of deep models on the network edge without centrally aggregating raw data and hence improving data privacy.

Autonomous Driving Federated Learning

2,279
0.48 stars / hour