To create rich visualizations, data analysts often need to iterate back and forth between data processing and chart specification to achieve their goals.
Decompilation aims to convert binary code to high-level source code, but traditional tools like Ghidra often produce results that are difficult to read and execute.
We implement a custom kernel that performs the matrix multiplications and the log-sum-exp reduction over the vocabulary in flash memory, making global memory consumption for the cross-entropy computation negligible.
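The idea behind that kernel can be sketched in plain NumPy: rather than materializing the full (tokens × vocabulary) logit matrix, the log-sum-exp over the vocabulary is accumulated chunk by chunk, so only a small slice of logits exists at any time. This is an illustrative sketch of the memory-saving principle only, not the paper's fused GPU implementation; the function name `chunked_cross_entropy` and the chunk size are assumptions for the example.

```python
import numpy as np

def chunked_cross_entropy(h, W, targets, chunk=4):
    """Cross-entropy loss without materializing the full (N, V) logits.

    h: (N, D) hidden states, W: (D, V) classifier weights,
    targets: (N,) target token ids. The logsumexp over the vocabulary
    is accumulated one chunk of columns at a time.
    """
    N, _ = h.shape
    V = W.shape[1]
    # Running logsumexp accumulator; -inf is the log of an empty sum.
    lse = np.full(N, -np.inf)
    for start in range(0, V, chunk):
        logits = h @ W[:, start:start + chunk]  # only (N, chunk) in memory
        m = np.maximum(lse, logits.max(axis=1))  # numerically stable merge
        lse = m + np.log(np.exp(lse - m)
                         + np.exp(logits - m[:, None]).sum(axis=1))
    # Logit of each row's target token: one dot product per row.
    tgt = np.einsum('nd,dn->n', h, W[:, targets])
    # Mean negative log-likelihood: logsumexp(logits) - target logit.
    return float(np.mean(lse - tgt))
```

Because each chunk's logits are discarded after updating the accumulator, peak memory for the loss scales with the chunk size rather than the vocabulary size, which is the property the custom kernel exploits.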
We scale a proof-of-concept model to 3.5 billion parameters and 800 billion tokens.
We present FireRedASR, a family of large-scale automatic speech recognition (ASR) models for Mandarin, designed to meet diverse requirements in superior performance and optimal efficiency across various applications.
Since the release of ChatGPT, large language models (LLMs) have demonstrated remarkable capabilities across various domains.
After supervised finetuning the Qwen2.5-32B-Instruct language model on s1K and equipping it with budget forcing, our model s1-32B exceeds o1-preview on competition math questions by up to 27% (MATH and AIME24).
Ranked #4 on Mathematical Reasoning on AIME24.
We present rStar-Math to demonstrate that small language models (SLMs) can rival or even surpass the math reasoning capability of OpenAI o1, without distillation from superior models.
The quantification of audio aesthetics remains a complex challenge in audio processing, primarily due to its subjective nature, which is influenced by human perception and cultural context.
Chest X-rays (CXRs) play an integral role in driving critical decisions in disease management and patient care.