Depth Anything V2

DepthAnything/Depth-Anything-V2 13 Jun 2024

This work presents Depth Anything V2.

Monocular Depth Estimation

8.35 stars / hour

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

buaacyw/meshanything 14 Jun 2024

Recently, 3D assets created via reconstruction and generation have matched the quality of manually crafted assets, highlighting their potential for replacement.


3.43 stars / hour

Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image

AiuniAI/Unique3D 30 May 2024

In this work, we introduce Unique3D, a novel image-to-3D framework for efficiently generating high-quality 3D meshes from single-view images, featuring state-of-the-art generation fidelity and strong generalizability.

Image to 3D Single-View 3D Reconstruction +1

3.11 stars / hour

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

deepseek-ai/deepseek-coder-v2 17 Jun 2024

Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2, while maintaining comparable performance in general language tasks.

16k Language Modelling +2

3.11 stars / hour

StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning

ictnlp/streamspeech 5 Jun 2024

Simultaneous speech-to-speech translation (Simul-S2ST, a.k.a. streaming speech translation) outputs target speech while receiving streaming speech inputs, which is critical for real-time communication.

Ranked #1 on de-en on CVSS

Automatic Speech Recognition (ASR) de-en +11

1.51 stars / hour

Meta Learning Text-to-Speech Synthesis in over 7000 Languages

digitalphonetics/ims-toucan 10 Jun 2024

In this work, we take on the challenging task of building a single text-to-speech synthesis system that is capable of generating speech in over 7000 languages, many of which lack sufficient data for traditional TTS development.

Meta-Learning Speech Synthesis +1

1.43 stars / hour

ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code

gersteinlab/ml-bench 16 Nov 2023

Despite Large Language Models (LLMs) like GPT-4 achieving impressive results in function-level code generation, they struggle with repository-scale code understanding (e.g., coming up with the right arguments for calling routines), requiring a deeper comprehension of complex file interactions.

Code Generation Navigate

1.19 stars / hour

Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?

google-deepmind/loft 19 Jun 2024

Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases.


1.00 stars / hour

TextGrad: Automatic "Differentiation" via Text

zou-group/textgrad 11 Jun 2024

Without modifying the framework, TextGrad improves the zero-shot accuracy of GPT-4o in Google-Proof Question Answering from $51\%$ to $55\%$, yields $20\%$ relative performance gain in optimizing LeetCode-Hard coding problem solutions, improves prompts for reasoning, designs new druglike small molecules with desirable in silico binding, and designs radiation oncology treatment plans with high specificity.

Ranked #1 on GPQA

Question Answering Specificity

0.91 stars / hour

Structure-Aware Sparse-View X-ray 3D Reconstruction

caiyuanhao1998/sax-nerf CVPR 2024

In this paper, we propose a framework, Structure-Aware X-ray Neural Radiodensity Fields (SAX-NeRF), for sparse-view X-ray 3D reconstruction.

3D Reconstruction Low-Dose X-Ray Ct Reconstruction +1

0.80 stars / hour