The AdEMAMix Optimizer: Better, Faster, Older

nanowell/AdEMAMix-Optimizer-Pytorch 5 Sep 2024

This work questions the use of a single EMA to accumulate past gradients and empirically demonstrates how this choice can be sub-optimal: a single EMA cannot simultaneously give a high weight to the immediate past, and a non-negligible weight to older gradients.

Image Classification, Language Modelling

120
0.73 stars / hour
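
To make the AdEMAMix claim above concrete, here is a minimal sketch (not taken from the paper or the linked repository) of the weight that a plain, bias-uncorrected EMA with decay beta places on the gradient from k steps ago, namely (1 - beta) * beta^k. The function name `ema_weight` and the two beta values are illustrative assumptions, not the paper's settings.

```python
def ema_weight(beta: float, steps_ago: int) -> float:
    """Weight an EMA with decay `beta` assigns to the gradient seen `steps_ago` steps back."""
    return (1.0 - beta) * beta ** steps_ago


# Illustrative decay rates (assumed for this sketch, not the paper's settings).
FAST_BETA = 0.9
SLOW_BETA = 0.9999

for name, beta in [("fast", FAST_BETA), ("slow", SLOW_BETA)]:
    newest = ema_weight(beta, 0)      # weight on the most recent gradient
    old = ema_weight(beta, 1000)      # weight on a gradient from 1000 steps ago
    print(f"{name} EMA (beta={beta}): newest={newest:.2e}, 1000 steps ago={old:.2e}")
```

Under these assumed values, the fast EMA weights the newest gradient heavily but effectively forgets anything 1000 steps old, while the slow EMA still remembers old gradients but gives the newest one only a tiny weight. That is the trade-off the abstract describes for a single EMA.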

Dynamic Gaussian Marbles for Novel View Synthesis of Casual Monocular Videos

coltonstearns/dynamic-gaussian-marbles 26 Jun 2024

We evaluate on the Nvidia Dynamic Scenes dataset and the DyCheck iPhone dataset, and show that Gaussian Marbles significantly outperforms other Gaussian baselines in quality and is on par with non-Gaussian representations, all while maintaining the efficiency, compositionality, editability, and tracking benefits of Gaussians.

Novel View Synthesis, Point Tracking

52
0.67 stars / hour

Text2SQL is Not Enough: Unifying AI and Databases with TAG

tag-research/tag-bench 27 Aug 2024

Such systems would allow users to leverage the powerful reasoning and knowledge capabilities of language models (LMs) alongside the scalable computational power of data management systems.

RAG, Text-To-SQL

345
0.64 stars / hour

Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis

caiyuanhao1998/x-gaussian 7 Mar 2024

X-ray is widely applied for transmission imaging due to its stronger penetration than natural light.

CT Reconstruction, Novel View Synthesis

197
0.64 stars / hour

Agent Workflow Memory

zorazrw/agent-workflow-memory 11 Sep 2024

Despite the potential of language model-based agents to solve real-world tasks such as web navigation, current methods still struggle with long-horizon tasks with complex action trajectories.

Language Modelling

75
0.58 stars / hour

Self-Harmonized Chain of Thought

Xalp/ECHO 6 Sep 2024

Chain-of-Thought (CoT) prompting reveals that large language models are capable of performing complex reasoning via intermediate steps.

43
0.56 stars / hour

MAD-ICP: It Is All About Matching Data -- Robust and Informed LiDAR Odometry

rvp-group/mad-icp 9 May 2024

Most of these systems implicitly rely on assumptions about the operating environment, the sensor used, and motion pattern.

243
0.83 stars / hour

LinFusion: 1 GPU, 1 Minute, 16K Image

huage001/linfusion 3 Sep 2024

We find that the distilled model, termed LinFusion, achieves performance on par with or superior to the original SD after only modest training, while significantly reducing time and memory complexity.

16k, Causal Inference

169
0.49 stars / hour

WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling

jishengpeng/wavtokenizer 29 Aug 2024

Despite the reduced number of tokens, WavTokenizer achieves state-of-the-art reconstruction quality with outstanding UTMOS scores and inherently contains richer semantic information.

Language Modelling

630
0.46 stars / hour

Frequency-aware Feature Fusion for Dense Image Prediction

linwei-chen/freqfusion 23 Aug 2024

The offset generator refines large inconsistent features and thin boundaries by replacing inconsistent features with more consistent ones through resampling, while the AHPF generator enhances high-frequency detailed boundary information lost during downsampling.

41
0.44 stars / hour