Probing the 3D Awareness of Visual Foundation Models

mbanani/probe3d 12 Apr 2024

Given that such models can classify, delineate, and localize objects in 2D, we ask whether they also represent their 3D structure.

1.51 stars / hour

LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models

facebookresearch/llm-transparency-tool 10 Apr 2024

We present the LM Transparency Tool (LM-TT), an open-source interactive toolkit for analyzing the internal workings of Transformer-based language models.

Decision Making

1.49 stars / hour

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators

pku-yuangroup/magictime 7 Apr 2024

Recent advances in Text-to-Video generation (T2V) have achieved remarkable success in synthesizing high-quality general videos from textual descriptions.

Text-to-Video Generation, Video Generation

1.29 stars / hour

Solving Data Quality Problems with Desbordante: a Demo

mstrutov/desbordante 27 Jul 2023

However, most existing data profiling systems that focus on complex statistics do not provide proper integration with the tools used by contemporary data scientists.

Anomaly Detection, Descriptive

1.16 stars / hour

Arc2Face: A Foundation Model of Human Faces

Recognito-Vision/Face-SDK-Linux-Demos 18 Mar 2024

This paper presents Arc2Face, an identity-conditioned face foundation model, which, given the ArcFace embedding of a person, can generate diverse photo-realistic images with a greater degree of face similarity than existing models.

Diffusion Personalization Tuning Free Face Generation +1

1.03 stars / hour

Unrecognizable Yet Identifiable: Image Distortion with Preserved Embeddings

Recognito-Vision/NIST-FRVT-Top-1-Face-Recognition 26 Jan 2024

In the realm of security applications, biometric authentication systems play a crucial role, yet developing them often raises challenges concerning privacy and security.

Face Recognition Security Studies

1.02 stars / hour

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Beomi/InfiniTransformer 10 Apr 2024

This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation.

Book summarization, Language Modelling, +1

0.97 stars / hour

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

fudan-generative-vision/champ 21 Mar 2024

In this study, we introduce a methodology for human image animation by leveraging a 3D human parametric model within a latent diffusion framework to enhance shape alignment and motion guidance in current human generative techniques.

Animated GIF Generation Image Animation +1

0.94 stars / hour

ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback

liming-ai/ControlNet_Plus_Plus 11 Apr 2024

To this end, we propose ControlNet++, a novel approach that improves controllable generation by explicitly optimizing pixel-level cycle consistency between generated images and conditional controls.


0.92 stars / hour

Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues

FaceOnLive/Face-Liveness-Detection-SDK-Linux 12 Apr 2024

SPSC and SDSC augment live samples into simulated attack samples by simulating spoofing clues of physical and digital attacks, respectively, which significantly improves the model's ability to detect "unseen" attack types.

Data Augmentation, Face Anti-Spoofing, +1

0.87 stars / hour