Data Formulator 2: Iteratively Creating Rich Visualizations with AI

microsoft/data-formulator 28 Aug 2024

To create rich visualizations, data analysts often need to iterate back and forth between data processing and chart specification to achieve their goals.

Code Generation Navigate

1,282
0.39 stars / hour

A Scalable Communication Protocol for Networks of Large Language Models

agora-protocol/paper-demo 14 Oct 2024

These requisites, which we refer to as the Agent Communication Trilemma, are hard to achieve in large networks of agents.

108
0.37 stars / hour

AutoGen Studio: A No-Code Developer Tool for Building and Debugging Multi-Agent Systems

microsoft/autogen 9 Aug 2024

Multi-agent systems, where multiple agents (generative AI models + tools) collaborate, are emerging as an effective pattern for solving long-running, complex tasks in numerous domains.

33,740
0.37 stars / hour

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

mark12ding/sam2long 21 Oct 2024

Benefiting from its heuristic search design, SAM2Long is robust toward occlusions and object reappearances, and can effectively segment and track objects for complex long-term videos.

Object Segmentation +4

279
0.35 stars / hour

D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement

Peterande/D-FINE 17 Oct 2024

When pretrained on Objects365, D-FINE-L / X attains 57.1% / 59.3% AP, surpassing all existing real-time detectors.

Ranked #1 on Real-Time Object Detection on MS COCO (using extra training data)

Real-Time Object Detection regression

721
0.35 stars / hour

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

THUDM/WebRL 4 Nov 2024

Specifically, WebRL incorporates 1) a self-evolving curriculum that generates new tasks from unsuccessful attempts, 2) a robust outcome-supervised reward model (ORM), and 3) adaptive reinforcement learning strategies to ensure consistent improvements.

188
0.34 stars / hour

OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer

om-ai-lab/OmAgent 24 Jun 2024

Recent advancements in Large Language Models (LLMs) have expanded their capabilities to multimodal contexts, including comprehensive video understanding.

AI Agent Video Understanding

1,217
0.34 stars / hour

LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation

microsoft/LLM2CLIP 7 Nov 2024

In this paper, we propose LLM2CLIP, a novel approach that embraces the power of LLMs to unlock CLIP's potential.

Contrastive Learning Image Captioning +3

64
0.34 stars / hour

Geometry-Informed Neural Networks

ml-jku/ginns-geometry-informed-neural-networks 21 Feb 2024

Geometry is a ubiquitous tool in computer graphics, design, and engineering.

Diversity

91
0.33 stars / hour