The First Place Solution of WSDM Cup 2024: Leveraging Large Language Models for Conversational Multi-Doc QA

zhangzhao219/wsdm-cup-2024 28 Feb 2024

Conversational multi-doc question answering aims to answer specific questions based on the retrieved documents as well as the contextual conversations.

Natural Language Understanding Question Answering

65
0.95 stars / hour

UFO: A UI-Focused Agent for Windows OS Interaction

microsoft/UFO 8 Feb 2024

We introduce UFO, an innovative UI-Focused agent to fulfill user requests tailored to applications on Windows OS, harnessing the capabilities of GPT-Vision.

Navigate

2,448
0.94 stars / hour

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

yformer/EfficientSAM 1 Dec 2023

On segment anything tasks such as zero-shot instance segmentation, our EfficientSAMs with SAMI-pretrained lightweight image encoders perform favorably with a significant gain (e.g., ~4 AP on COCO/LVIS) over other fast SAM models.

Image Classification Instance Segmentation +5

1,434
0.86 stars / hour

Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing

sally-sh/vsp-llm 23 Feb 2024

In visual speech processing, context modeling capability is one of the most important requirements due to the ambiguous nature of lip movements.

speech-recognition Translation +1

239
0.82 stars / hour

Language Agents as Optimizable Graphs

metauto-ai/gptswarm 26 Feb 2024

Various human-designed prompt engineering techniques have been proposed to improve problem solvers based on Large Language Models (LLMs), yielding many disparate code bases.

Prompt Engineering

79
0.77 stars / hour

Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution

louaaron/score-entropy-discrete-diffusion 25 Oct 2023

Experimentally, we test our Score Entropy Discrete Diffusion models (SEDD) on standard language modeling tasks.

Denoising Language Modelling

87
0.76 stars / hour

Diffusion Model-Based Image Editing: A Survey

siatmmlab/awesome-diffusion-model-based-image-editing-methods 27 Feb 2024

In this survey, we provide an exhaustive overview of existing methods using diffusion models for image editing, covering both theoretical and practical aspects in the field.

Denoising Image Inpainting +1

74
0.75 stars / hour

Neural Network Diffusion

nus-hpc-ai-lab/neural-network-diffusion 20 Feb 2024

The autoencoder extracts latent representations of a subset of the trained network parameters.

538
0.74 stars / hour

Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning

michaeltmatthews/craftax 26 Feb 2024

Either they are too slow for meaningful research to be performed without enormous computational resources, like Crafter, NetHack and Minecraft, or they are not complex enough to pose a significant challenge, like Minigrid and Procgen.

NetHack reinforcement-learning +1

67
0.71 stars / hour

Scalable Diffusion Models with Transformers

facebookresearch/DiT ICCV 2023

We explore a new class of diffusion models based on the transformer architecture.

Image Generation

4,188
0.68 stars / hour