YUAN 2.0: A Large Language Model with Localized Filtering-based Attention

ieit-yuan/yuan-2.0 27 Nov 2023

In this work, the Localized Filtering-based Attention (LFA) is introduced to incorporate prior knowledge of local dependencies of natural language into Attention.

Code Generation, Language Modelling

Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models

pku-yuangroup/video-bench 27 Nov 2023

Video-based large language models (Video-LLMs) have been recently introduced, targeting both fundamental improvements in perception and comprehension, and a diverse range of user inquiries.

Decision Making, Question Answering

Scalable AI Safety via Doubly-Efficient Debate

google-deepmind/debate 23 Nov 2023

The emergence of pre-trained AI systems with powerful capabilities across a diverse and ever-increasing set of complex domains has raised a critical challenge for AI safety as tasks can become too complicated for humans to judge directly.

LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

luosiallen/latent-consistency-model 9 Nov 2023

Latent Consistency Models (LCMs) have achieved impressive performance in accelerating text-to-image generative tasks, producing high-quality images with minimal inference steps.

Image Generation

Adversarial Diffusion Distillation

stability-ai/generative-models 28 Nov 2023

We introduce Adversarial Diffusion Distillation (ADD), a novel training approach that efficiently samples large-scale foundational image diffusion models in just 1-4 steps while maintaining high image quality.

Image Generation

Instruction Tuning with Human Curriculum

imoneoi/openchat 14 Oct 2023

The dominant paradigm for instruction tuning is the random-shuffled training of maximally diverse instruction-response pairs.

OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving

wzzheng/occworld 27 Nov 2023

In this paper, we explore a new framework of learning a world model, OccWorld, in the 3D Occupancy space to simultaneously predict the movement of the ego car and the evolution of the surrounding scenes.

Autonomous Driving

Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling

lizhe00/animatablegaussians 27 Nov 2023

Overall, our method can create lifelike avatars with dynamic, realistic and generalized appearances.

UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition

ailab-cvc/unireplknet 27 Nov 2023

We propose four architectural guidelines for designing large-kernel ConvNets, the core of which is to exploit the essential characteristic of large kernels that distinguishes them from small kernels: they can see wide without going deep.

Ranked #1 on Object Detection on COCO 2017 (mAP metric)

Image Classification, Object Detection

Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models

rohitgandikota/sliders 20 Nov 2023

We present a method to create interpretable concept sliders that enable precise control over attributes in image generations from diffusion models.

Image Generation

