In this survey, we provide a comprehensive overview of the current research progress on LLMs within the context of continual learning (CL).
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.
We introduce Groma, a Multimodal Large Language Model (MLLM) with grounded and fine-grained visual perception ability.
Multimodal Large Language Models (MLLMs) have demonstrated notable capabilities in general visual understanding and reasoning tasks.
MultiBooth divides the multi-concept generation process into two phases: a single-concept learning phase and a multi-concept integration phase.
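To make the two-phase split concrete, here is a minimal, hypothetical sketch: a toy PyTorch loop in which each concept embedding is first optimized in isolation (single-concept learning), and the frozen results are then composed into one conditioning sequence (multi-concept integration). The function names, the random stand-in features, and the MSE objective are illustrative assumptions, not MultiBooth's actual implementation.

```python
# Hypothetical sketch of a two-phase recipe; names, targets, and the
# MSE objective are illustrative, not MultiBooth's actual implementation.
import torch
import torch.nn as nn

torch.manual_seed(0)
EMB_DIM = 32

def learn_single_concept(reference_feature, steps=200):
    """Phase 1: optimize one new embedding per concept, in isolation,
    to match features extracted from that concept's reference images."""
    concept_emb = nn.Parameter(torch.randn(EMB_DIM))
    opt = torch.optim.Adam([concept_emb], lr=1e-2)
    for _ in range(steps):
        loss = (concept_emb - reference_feature).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return concept_emb.detach()

# Phase 1: each concept is learned independently from its own references
# (random vectors stand in for real image features here).
ref_dog = torch.randn(EMB_DIM)
ref_mug = torch.randn(EMB_DIM)
emb_dog = learn_single_concept(ref_dog)
emb_mug = learn_single_concept(ref_mug)

# Phase 2: integration -- the frozen per-concept embeddings are composed
# into a single conditioning sequence for one generation pass; the real
# method additionally constrains where each concept may appear.
prompt_condition = torch.stack([emb_dog, emb_mug])  # shape (2, EMB_DIM)
print(prompt_condition.shape)
```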
Despite recent progress in long-context language models, it remains unclear how transformer-based models manage to retrieve relevant information from arbitrary locations within a long context.
We study how to apply large language models to write grounded and organized long-form articles from scratch, with breadth and depth comparable to Wikipedia pages.
Molecular Relational Learning (MRL), which aims to understand interactions between pairs of molecules, plays a pivotal role in advancing biochemical research.
We propose AutoCrawler, a two-stage framework that leverages the hierarchical structure of HTML for progressive understanding.
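The snippet below is a minimal sketch of what such a two-stage, hierarchy-guided loop could look like (assuming lxml is installed): stage one descends the DOM one level at a time, so the model only ever inspects a single subtree, and stage two turns the located node into a reusable XPath rule. The llm_pick_subtree stand-in and the exact stage split are assumptions for illustration, not the paper's code.

```python
# Hypothetical two-stage, hierarchy-guided extraction loop in the spirit
# of AutoCrawler; the LLM is mocked with simple text matching.
from lxml import html

PAGE = """
<html><body>
  <div id="header">Site nav</div>
  <div id="product">
    <h1 class="title">Acme Widget</h1>
    <span class="price">$19.99</span>
  </div>
</body></html>
"""

def llm_pick_subtree(children, target):
    """Stand-in for an LLM call: pick the child element most likely to
    contain `target`. Here we simply match on text content."""
    for child in children:
        if target.lower() in child.text_content().lower():
            return child
    return None

def stage1_progressive_descent(root, target):
    """Stage 1: walk the DOM top-down, pruning siblings at each level,
    so the model only ever sees one subtree at a time."""
    node = root
    while True:
        child = llm_pick_subtree(list(node), target)
        if child is None:
            return node
        node = child

def stage2_generate_rule(node):
    """Stage 2: turn the located node into a reusable extraction rule
    (an XPath) that can be replayed on unseen pages of the same site."""
    return node.getroottree().getpath(node)

tree = html.fromstring(PAGE)
node = stage1_progressive_descent(tree, "$19.99")
print(stage2_generate_rule(node), "->", node.text_content().strip())
```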
Our method benefits various applications, including in-the-wild metrology, monocular SLAM, and 3D reconstruction, highlighting the versatility of the Metric3D v2 models as geometric foundation models.
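For context, the released Metric3D v2 checkpoints are advertised as loadable through torch.hub; the snippet below is a minimal sketch of that usage. The hub entry-point name and the inference signature are assumptions based on the project's published interface and should be verified against the repository's README.

```python
# Sketch of loading a Metric3D v2 checkpoint through torch.hub. The hub
# entry point and the model.inference(...) signature are assumptions
# based on the project's published interface; verify against the README.
import torch

model = torch.hub.load('yvanyin/metric3d', 'metric3d_vit_small', pretrain=True)
model.eval()

# Dummy RGB batch (B, 3, H, W); a real pipeline would resize and
# normalize the image to the resolution the checkpoint expects.
rgb = torch.zeros(1, 3, 616, 1064)

with torch.no_grad():
    pred_depth, confidence, output_dict = model.inference({'input': rgb})

print(pred_depth.shape)  # per-pixel metric depth map
```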