OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

levihsu/ootdiffusion 4 Mar 2024

We present OOTDiffusion, a novel network architecture for realistic and controllable image-based virtual try-on (VTON).

Denoising, Image Generation, +1 more

3,991
0.48 stars / hour

DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors

Doubiiu/DynamiCrafter 18 Oct 2023

Animating a still image offers an engaging visual experience.

Image Animation

1,401
0.45 stars / hour

A foundation model utilizing chest CT volumes and radiology reports for supervised-level zero-shot detection of abnormalities

ibrahimethemhamamci/ct-clip 26 Mar 2024

A major challenge in computational research in 3D medical imaging is the lack of comprehensive datasets.

Anomaly Detection, Retrieval

33
0.44 stars / hour

Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small

openai/transformer-debugger 1 Nov 2022

Research in mechanistic interpretability seeks to explain behaviors of machine learning models in terms of their internal components.

Language Modelling

3,540
0.43 stars / hour

RGBD GS-ICP SLAM

lab-of-ai-and-robotics/gs_icp_slam 19 Mar 2024

Simultaneous Localization and Mapping (SLAM) with dense representation plays a key role in robotics, Virtual Reality (VR), and Augmented Reality (AR) applications.

Simultaneous Localization and Mapping

62
0.42 stars / hour

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

squeezeailab/llm2llm 22 Mar 2024

LLM2LLM (1) fine-tunes a baseline student LLM on the initial seed data, (2) evaluates and extracts data points that the model gets wrong, and (3) uses a teacher LLM to generate synthetic data based on these incorrect data points, which are then added back into the training data.

Data Augmentation, GSM8K, +1 more

52
0.42 stars / hour
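
The three LLM2LLM steps summarized in the entry above (fine-tune the student, collect the examples it gets wrong, have a teacher generate new data from them) can be sketched as a loop. This is a minimal illustration, not the code from squeezeailab/llm2llm: the names `llm2llm_sketch`, `fine_tune`, `teacher_generate`, and the toy stand-ins are hypothetical, and evaluating the student on the seed data each round is an assumption the summary does not spell out.

```python
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[str, str]       # (prompt, expected answer)
Student = Callable[[str], str]  # maps a prompt to a predicted answer


def llm2llm_sketch(
    seed: List[Example],
    fine_tune: Callable[[List[Example]], Student],         # hypothetical trainer
    teacher_generate: Callable[[Example], List[Example]],  # hypothetical teacher
    rounds: int = 2,
) -> List[Example]:
    """Return the training set after iterative, error-driven augmentation."""
    train = list(seed)
    for _ in range(rounds):
        # (1) Fine-tune the student on the current training data.
        student = fine_tune(train)
        # (2) Collect the examples the student gets wrong.
        #     Assumption: evaluation is done on the seed data.
        wrong = [(q, a) for q, a in seed if student(q) != a]
        if not wrong:
            break
        # (3) Ask the teacher for synthetic examples based on each error
        #     and add them back into the training data.
        for example in wrong:
            train.extend(teacher_generate(example))
    return train


# Toy stand-ins so the sketch runs without any model.
def toy_fine_tune(data: List[Example]) -> Student:
    answers = dict(data)
    seen = Counter(a for _, a in data)

    def student(prompt: str) -> str:
        # The toy student only "learns" an answer it has seen at least twice.
        answer = answers.get(prompt, "")
        return answer if seen[answer] >= 2 else ""

    return student


def toy_teacher(example: Example) -> List[Example]:
    prompt, answer = example
    return [(prompt + " (rephrased)", answer)]


seed_data = [("2+2", "4"), ("3+3", "6")]
augmented = llm2llm_sketch(seed_data, toy_fine_tune, toy_teacher)
print(augmented)
```

In a real setting the two callables would wrap an actual fine-tuning run and a teacher-model prompt; the loop structure is the only part taken from the summary.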

When Do We Not Need Larger Vision Models?

bfshi/scaling_on_scales 19 Mar 2024

Our results show that a multi-scale smaller model has comparable learning capacity to a larger model, and pre-training smaller models with S$^2$ can match or even exceed the advantage of larger models.

Depth Estimation

148
0.41 stars / hour

Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling

kuleshov-group/caduceus 5 Mar 2024

Large-scale sequence modeling has sparked rapid advances that now extend into biology and genomics.

83
0.40 stars / hour

Detecting Machine-Generated Texts by Multi-Population Aware Optimization for Maximum Mean Discrepancy

zshsh98/mmd-mp 25 Feb 2024

Unfortunately, it is challenging to distinguish MGTs and human-written texts because the distributional discrepancy between them is often very subtle due to the remarkable performance of LLMs.

Hallucination, Sentence

35
0.40 stars / hour

TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document

yuliang-liu/monkey 7 Mar 2024

We present TextMonkey, a large multimodal model (LMM) tailored for text-centric tasks.

document understanding, Key Information Extraction, +4 more

1,228
0.38 stars / hour