Qwen Technical Report

qwenlm/qwen 28 Sep 2023

Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans.

Language Modeling, Language Modelling +3

16,608
0.37 stars / hour

Cut Your Losses in Large-Vocabulary Language Models

unslothai/unsloth 13 Nov 2024

We implement a custom kernel that performs the matrix multiplications and the log-sum-exp reduction over the vocabulary in flash memory, making global memory consumption for the cross-entropy computation negligible.

24,298
0.37 stars / hour

DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing

ucbepic/docetl 16 Oct 2024

Our evaluation on four different unstructured document analysis tasks demonstrates that DocETL finds plans with outputs that are 25 to 80% more accurate than well-engineered baselines, addressing a critical gap in unstructured data analysis.

1,648
0.35 stars / hour

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

hjyao00/mulberry 24 Dec 2024

Using CoMCTS, we construct Mulberry-260k, a multimodal dataset with a tree of rich, explicit and well-defined reasoning nodes for each question.

423
0.34 stars / hour

OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer

om-ai-lab/OmAgent 24 Jun 2024

Recent advancements in Large Language Models (LLMs) have expanded their capabilities to multimodal contexts, including comprehensive video understanding.

AI Agent, Video Understanding

1,588
0.33 stars / hour

DeepSeek-VL: Towards Real-World Vision-Language Understanding

deepseek-ai/deepseek-vl 8 Mar 2024

The DeepSeek-VL family (both 1.3B and 7B models) showcases superior user experiences as a vision-language chatbot in real-world applications, achieving state-of-the-art or competitive performance across a wide range of visual-language benchmarks at the same model size while maintaining robust performance on language-centric benchmarks.

Chatbot, Language Modelling +3

3,345
0.33 stars / hour

Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation

git-disl/virus 29 Jan 2025

By designing a new red-teaming method, we show in this paper that relying purely on the moderation guardrail for data filtration is not reliable.

Red Teaming, Safety Alignment

37
0.32 stars / hour

Computational Job Market Analysis with Natural Language Processing

google-research/arxiv-latex-cleaner 29 Apr 2024

[Abridged Abstract] Recent technological advances underscore labor market dynamics, yielding significant consequences for employment prospects and increasing job vacancy data across platforms and languages.

Active Learning, De-identification +1

5,846
0.34 stars / hour

DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

deepseek-ai/dreamcraft3d 25 Oct 2023

The score distillation from this 3D-aware diffusion prior provides view-consistent guidance for the scene.

3D Generation

2,759
0.30 stars / hour

VPIT: Real-time Embedded Single Object 3D Tracking Using Voxel Pseudo Images

opendr-eu/opendr 6 Jun 2022

In this paper, we propose a novel voxel-based 3D single object tracking (3D SOT) method called Voxel Pseudo Image Tracking (VPIT).

3D Single Object Tracking, Object +1

684
0.30 stars / hour