Trending Research

Unrecognizable Yet Identifiable: Image Distortion with Preserved Embeddings

Recognito-Vision/Face-SDK-Android-Demo • 26 Jan 2024

In security applications, biometric authentication systems play a crucial role, yet developing them often raises challenges concerning privacy and security.

Face Recognition, Security Studies

213

0.31 stars / hour

GhostFaceNets: Lightweight Face Recognition Model From Cheap Operations

Recognito-Vision/Android-FaceRecognition-FaceLivenessDetection • IEEE Access 2023

The development of deep learning-based biometric models that can be deployed on devices with constrained memory and computational resources has proven to be a significant challenge.

Ranked #1 on Face Recognition on CFP-FF

Face Identification, Face Verification +1

213

0.31 stars / hour

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

scutzzj/aniportrait • 26 Mar 2024

In this study, we propose AniPortrait, a novel framework for generating high-quality animation driven by audio and a reference portrait image.

Face Reenactment

3,721

0.30 stars / hour

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

Leeroo-AI/mergoo • 12 Mar 2024

We investigate efficient methods for training Large Language Models (LLMs) to possess capabilities in multiple specialized domains, such as coding, math reasoning and world knowledge.

Ranked #30 on Question Answering on TriviaQA

Arithmetic Reasoning, Code Generation +6

248

0.30 stars / hour

ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback

liming-ai/ControlNet_Plus_Plus • 11 Apr 2024

We propose ControlNet++, a novel approach that improves controllable generation by explicitly optimizing pixel-level cycle consistency between generated images and conditional controls.

SSIM

175

0.29 stars / hour

Multi-Session SLAM with Differentiable Wide-Baseline Pose Optimization

princeton-vl/multislam_diffpose • 23 Apr 2024

The backbone is trained end-to-end using a novel differentiable solver for wide-baseline two-view pose.

Optical Flow Estimation, Visual Odometry

0.29 stars / hour

Magic Clothing: Controllable Garment-Driven Image Synthesis

shinechen1024/magicclothing • 15 Apr 2024

We propose Magic Clothing, a latent diffusion model (LDM)-based network architecture for an unexplored garment-driven image synthesis task.

Image Generation

980

0.28 stars / hour

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation

instantstyle/instantstyle • 3 Apr 2024

Tuning-free diffusion-based models have demonstrated significant potential in the realm of image personalization and customization.

Text-to-Image Generation

1,181

0.28 stars / hour

UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition

opendatalab/unimernet • 23 Apr 2024

This paper presents the UniMER dataset to provide the first study on Mathematical Expression Recognition (MER) towards complex real-world scenarios.

Image Augmentation

0.27 stars / hour

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

dvlab-research/minigemini • 27 Mar 2024

We try to narrow the gap by mining the potential of VLMs for better performance and any-to-any workflow from three aspects, i.e., high-resolution visual tokens, high-quality data, and VLM-guided generation.

Ranked #9 on Visual Question Answering on MM-Vet

GPT-4, Image Comprehension +2

2,915

0.27 stars / hour
