Transparent Image Layer Diffusion using Latent Transparency

layerdiffusion/layerdiffusion 27 Feb 2024

We show that latent transparency can be applied to different open source image generators, or be adapted to various conditional control systems to achieve applications like foreground/background-conditioned layer generation, joint layer generation, structural control of layer contents, etc.

Image Matting

848
9.49 stars / hour

YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

wongkinyiu/yolov9 21 Feb 2024

It can be used to obtain complete information, so that train-from-scratch models can achieve better results than state-of-the-art models pre-trained using large datasets; the comparison results are shown in Figure 1.

object-detection, Object Detection

5,860
4.98 stars / hour

Intent-based Prompt Calibration: Enhancing prompt optimization with synthetic boundary cases

eladlev/autoprompt 5 Feb 2024

Recent studies have demonstrated the capabilities of LLMs to automatically conduct prompt engineering by employing a meta-prompt that incorporates the outcomes of the last trials and proposes an improved prompt.

Prompt Engineering

1,095
3.97 stars / hour
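
The meta-prompt loop described in the autoprompt entry above can be illustrated with a minimal sketch. It is not the repository's implementation: `build_meta_prompt`, `optimize_prompt`, and the `llm` and `score` callables are hypothetical names introduced here, and the meta-prompt wording is an assumption. The sketch only shows the general pattern of feeding the outcomes of the last trials back to a model that proposes the next prompt.

```python
from typing import Callable, List, Tuple

# Hypothetical helper: the layout of the meta-prompt is an assumption,
# not the format used by any particular tool.
def build_meta_prompt(task: str, history: List[Tuple[str, float]], last_k: int = 3) -> str:
    """Summarize the last `last_k` trials and ask for an improved prompt."""
    lines = [f"Task: {task}", "Recent prompts and their scores:"]
    for prompt, score in history[-last_k:]:
        lines.append(f"- score={score:.2f} prompt={prompt!r}")
    lines.append("Propose one improved prompt.")
    return "\n".join(lines)


def optimize_prompt(
    task: str,
    initial_prompt: str,
    llm: Callable[[str], str],      # assumed: maps a meta-prompt to a new candidate prompt
    score: Callable[[str], float],  # assumed: evaluates a prompt, higher is better
    rounds: int = 5,
) -> Tuple[str, float]:
    """Run `rounds` propose-and-evaluate iterations and return the best trial."""
    history = [(initial_prompt, score(initial_prompt))]
    for _ in range(rounds):
        candidate = llm(build_meta_prompt(task, history))
        history.append((candidate, score(candidate)))
    # Return the (prompt, score) pair with the highest score seen so far.
    return max(history, key=lambda item: item[1])
```

Any callable can stand in for `llm`, so the loop can be exercised with a deterministic stub before a real model is connected.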

Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

lichao-sun/sorareview 27 Feb 2024

Sora is a text-to-video generative AI model, released by OpenAI in February 2024.

Marketing, Video Generation

230
3.02 stars / hour

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

mit-han-lab/distrifuser 29 Feb 2024

To overcome this dilemma, we observe the high similarity between the input from adjacent diffusion steps and propose displaced patch parallelism, which takes advantage of the sequential nature of the diffusion process by reusing the pre-computed feature maps from the previous timestep to provide context for the current step.

73
2.79 stars / hour
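
The idea in the DistriFusion entry above, reusing feature maps from the previous timestep as context for the current step, can be illustrated with a toy sketch. It is not the distrifuser implementation: the 1-D "image", the 3-point smoothing that stands in for a network layer, and the names `smooth`, `run_exact`, and `run_displaced` are all assumptions made for illustration. Each simulated device updates its own patch with fresh values but reads its neighbours' boundary values from one step earlier.

```python
from typing import List

def smooth(values: List[float], left: float, right: float) -> List[float]:
    """3-point average over `values`, with `left`/`right` as halo context."""
    padded = [left] + values + [right]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3
            for i in range(1, len(padded) - 1)]


def run_exact(xs: List[float], steps: int) -> List[float]:
    """Reference: every step sees the full, up-to-date signal."""
    for _ in range(steps):
        xs = smooth(xs, xs[0], xs[-1])
    return xs


def run_displaced(xs: List[float], steps: int, n_devices: int) -> List[float]:
    """Toy patch parallelism: neighbour context is one step stale.

    Assumes len(xs) is divisible by n_devices.
    """
    size = len(xs) // n_devices
    patches = [xs[i * size:(i + 1) * size] for i in range(n_devices)]
    # The first step uses synchronized context, so it matches the exact result.
    stale = [p[:] for p in patches]
    for _ in range(steps):
        new = []
        for d, patch in enumerate(patches):
            # Halo values come from the neighbours' previous-step patches.
            left = stale[d - 1][-1] if d > 0 else patch[0]
            right = stale[d + 1][0] if d < n_devices - 1 else patch[-1]
            new.append(smooth(patch, left, right))
        stale = patches   # this step's inputs become next step's context
        patches = new
    return [v for p in patches for v in p]
```

In this toy the first step matches the exact computation, and later steps differ only through the stale boundary values, which is the approximation that the observed similarity between adjacent diffusion steps is meant to justify.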

Datasets for Large Language Models: A Comprehensive Survey

lmmlzn/awesome-llms-datasets 28 Feb 2024

Additionally, a comprehensive review of the existing available dataset resources is also provided, including statistics from 444 datasets, covering 8 language categories and spanning 32 domains.

Language Modelling, Large Language Model

69
2.17 stars / hour

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

mbzuai-oryx/mobillama 26 Feb 2024

"Bigger the better" has been the predominant trend in recent Large Language Models (LLMs) development.

278
1.92 stars / hour

Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing

sally-sh/vsp-llm 23 Feb 2024

In visual speech processing, context modeling capability is one of the most important requirements due to the ambiguous nature of lip movements.

speech-recognition, Translation

219
1.65 stars / hour

Training-Free Long-Context Scaling of Large Language Models

hkunlp/chunkllama 27 Feb 2024

The ability of Large Language Models (LLMs) to process and generate coherent text is markedly weakened when the number of input tokens exceeds their pretraining length.

98
1.55 stars / hour

The First Place Solution of WSDM Cup 2024: Leveraging Large Language Models for Conversational Multi-Doc QA

zhangzhao219/wsdm-cup-2024 28 Feb 2024

Conversational multi-doc question answering aims to answer specific questions based on the retrieved documents as well as the contextual conversations.

Natural Language Understanding, Question Answering

48
1.48 stars / hour