garak: A Framework for Security Probing Large Language Models

leondz/garak 16 Jun 2024

As Large Language Models (LLMs) are deployed and integrated into thousands of applications, the need for scalable evaluation of how models respond to adversarial attacks grows rapidly.

2,919
0.45 stars / hour

The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use

showlab/computer_use_ootb 15 Nov 2024

The recently released model, Claude 3.5 Computer Use, stands out as the first frontier AI model to offer computer use in public beta as a graphical user interface (GUI) agent.

767
0.45 stars / hour

MureObjectStitch: Multi-reference Image Composition

bcmi/mureobjectstitch-image-composition 12 Nov 2024

Generative image composition aims to regenerate the given foreground object in the background image to produce a realistic composite image.

Object

80
0.41 stars / hour

Accelerating Vision Diffusion Transformers with Skip Branches

opensparsellms/skip-dit 26 Nov 2024

Diffusion Transformers (DiT), an emerging image and video generation model architecture, have demonstrated great potential because of their high generation quality and scalability properties.

Denoising, Image Generation

36
0.40 stars / hour

Adaptive Blind All-in-One Image Restoration

davidserra9/abair 27 Nov 2024

Blind all-in-one image restoration models aim to recover a high-quality image from an input degraded with unknown distortions.

5-Degradation Blind All-in-One Image Restoration

18
0.40 stars / hour

Treat Visual Tokens as Text? But Your MLLM Only Needs Fewer Efforts to See

ZhangAIPI/YOPO_MLLM_Pruning 8 Oct 2024

In this study, we investigate the redundancy in visual computation at both the parameter and computational pattern levels within LLaVA, a representative MLLM, and introduce a suite of streamlined strategies to enhance efficiency.

25
0.39 stars / hour

Learning to Fly in Seconds

rl-tools/rl-tools 22 Nov 2023

Our framework enables Simulation-to-Reality (Sim2Real) transfer for direct RPM control after only 18 seconds of training on a consumer-grade laptop as well as its deployment on microcontrollers to control a multirotor under real-time guarantees.

Reinforcement Learning (RL)

628
0.38 stars / hour

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

dongyh20/insight-v 21 Nov 2024

In this paper, we present Insight-V, an early effort to 1) scalably produce long and robust reasoning data for complex multi-modal tasks, and 2) build an effective training pipeline to enhance the reasoning capabilities of multi-modal large language models (MLLMs).

Visual Reasoning

81
0.37 stars / hour

A Survey on LLM-as-a-Judge

idea-finai/llm-as-evaluator 23 Nov 2024

Accurate and consistent evaluation is crucial for decision-making across numerous fields, yet it remains a challenging task due to inherent subjectivity, variability, and scale.

Survey

33
0.37 stars / hour

From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge

llm-as-a-judge/awesome-llm-as-a-judge 25 Nov 2024

Assessment and evaluation have long been critical challenges in artificial intelligence (AI) and natural language processing (NLP).

50
0.37 stars / hour