MEDITRON-70B: Scaling Medical Pretraining for Large Language Models

epfllm/meditron 27 Nov 2023

Large language models (LLMs) can potentially democratize access to medical knowledge.

Ranked #1 on Multiple Choice Question Answering (MCQA) on MedMCQA (Dev Set (Acc-%) metric)

Conditional Text Generation, Multiple Choice Question Answering (MCQA)

616
5.06 stars / hour

YUAN 2.0: A Large Language Model with Localized Filtering-based Attention

ieit-yuan/yuan-2.0 27 Nov 2023

In this work, the Localized Filtering-based Attention (LFA) is introduced to incorporate prior knowledge of local dependencies of natural language into Attention.

Code Generation, Language Modelling, +2

364
2.31 stars / hour

On Bringing Robots Home

notmahi/dobb-e 27 Nov 2023

We use the Stick to collect 13 hours of data in 22 homes of New York City, and train Home Pretrained Representations (HPR).

179
2.15 stars / hour

LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS

VITA-Group/LightGaussian 28 Nov 2023

Recent advancements in real-time neural rendering using point-based techniques have paved the way for the widespread adoption of 3D representations.

Network Pruning, Neural Rendering, +2

76
2.04 stars / hour

GS-IR: 3D Gaussian Splatting for Inverse Rendering

lzhnb/gs-ir 26 Nov 2023

We propose GS-IR, a novel inverse rendering approach based on 3D Gaussian Splatting (GS) that leverages forward mapping volume rendering to achieve photorealistic novel view synthesis and relighting results.

Inverse Rendering, Novel View Synthesis

85
1.87 stars / hour

Improving Sample Quality of Diffusion Models Using Self-Attention Guidance

lllyasviel/fooocus ICCV 2023

Denoising diffusion models (DDMs) have attracted attention for their exceptional generation quality and diversity.

Denoising, Image Generation

19,676
1.57 stars / hour

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models

dvlab-research/llama-vid 28 Nov 2023

Current VLMs, while proficient in tasks like image captioning and visual question answering, face computational burdens when processing long videos due to the excessive visual tokens.

Image Captioning, Question Answering, +1

73
1.24 stars / hour

OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving

wzzheng/occworld 27 Nov 2023

In this paper, we explore a new framework of learning a world model, OccWorld, in the 3D Occupancy space to simultaneously predict the movement of the ego car and the evolution of the surrounding scenes.

Autonomous Driving

104
1.19 stars / hour

Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling

lizhe00/animatablegaussians 27 Nov 2023

Overall, our method can create lifelike avatars with dynamic, realistic and generalized appearances.

159
1.07 stars / hour

UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition

ailab-cvc/unireplknet 27 Nov 2023

We propose four architectural guidelines for designing large-kernel ConvNets, the core of which is to exploit the essential characteristic of large kernels that distinguishes them from small kernels: they can see wide without going deep.

Ranked #1 on Object Detection on COCO 2017 (mAP metric)

Image Classification, Object Detection, +3

88
1.06 stars / hour