Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small

openai/transformer-debugger 1 Nov 2022

Research in mechanistic interpretability seeks to explain behaviors of machine learning models in terms of their internal components.

Language Modelling

2,833
6.04 stars / hour

DeepSeek-VL: Towards Real-World Vision-Language Understanding

deepseek-ai/deepseek-vl 8 Mar 2024

The DeepSeek-VL family (both 1.3B and 7B models) showcases superior user experiences as a vision-language chatbot in real-world applications, achieving state-of-the-art or competitive performance across a wide range of visual-language benchmarks at the same model size while maintaining robust performance on language-centric benchmarks.

Chatbot Language Modelling +3

901
5.87 stars / hour

VideoMamba: State Space Model for Efficient Video Understanding

opengvlab/videomamba 11 Mar 2024

Addressing the dual challenges of local redundancy and global dependencies in video understanding, this work innovatively adapts the Mamba to the video domain.

Video Understanding

237
3.87 stars / hour

V3D: Video Diffusion Models are Effective 3D Generators

heheyas/v3d 11 Mar 2024

To fully unleash the potential of video diffusion to perceive the 3D world, we further introduce a geometrical consistency prior and extend the video diffusion model to a multi-view consistent 3D generator.

Novel View Synthesis

245
3.22 stars / hour

Extreme Compression of Large Language Models via Additive Quantization

vahe1994/aqlm 11 Jan 2024

The emergence of accurate open large language models (LLMs) has led to a race towards quantization techniques for such models enabling execution on end-user devices.

Quantization

727
1.94 stars / hour

TripoSR: Fast 3D Object Reconstruction from a Single Image

vast-ai-research/triposr 4 Mar 2024

This technical report introduces TripoSR, a 3D reconstruction model leveraging transformer architecture for fast feed-forward 3D generation, producing 3D mesh from a single image in under 0.5 seconds.

3D Object Reconstruction From A Single Image 3D Reconstruction +1

2,930
1.88 stars / hour

DragAnything: Motion Control for Anything using Entity Representation

showlab/draganything 12 Mar 2024

We introduce DragAnything, which utilizes an entity representation to achieve motion control for any object in controllable video generation.

Object Video Generation

121
1.88 stars / hour

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

jiaweizzhao/galore 6 Mar 2024

Our approach reduces memory usage by up to 65.5% in optimizer states while maintaining both efficiency and performance for pre-training on LLaMA 1B and 7B architectures with C4 dataset with up to 19.7B tokens, and on fine-tuning RoBERTa on GLUE tasks.

732
1.66 stars / hour

SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting

initialneil/splattingavatar 8 Mar 2024

We present SplattingAvatar, a hybrid 3D representation of photorealistic human avatars with Gaussian Splatting embedded on a triangle mesh, which renders over 300 FPS on a modern GPU and 30 FPS on a mobile device.

125
1.53 stars / hour

SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection

zcablii/sardet_100k 11 Mar 2024

To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.

Ranked #1 on 2D Object Detection on SARDet-100K (using extra training data)

Object object-detection +1

91
1.37 stars / hour