Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

yule-buaa/mergelm 6 Nov 2023

Based on this observation, we further sparsify delta parameters of multiple SFT homologous models with DARE and subsequently merge them into a single model by parameter averaging.

GSM8K, Instruction Following
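The sparsify-then-average step described in this entry can be illustrated with a minimal sketch. It assumes DARE means randomly dropping each delta parameter (fine-tuned weight minus base weight) with probability `drop_rate` and rescaling the survivors by `1 / (1 - drop_rate)`, and that "parameter averaging" means adding the mean of the sparsified deltas back onto the shared base. The function names `dare` and `merge_by_averaging`, the flat-list weight representation, and the `seed` argument are hypothetical choices for illustration, not the repository's API.

```python
import random


def dare(base, finetuned, drop_rate, rng):
    """Sparsify delta parameters: drop each with probability drop_rate,
    then rescale the kept ones by 1 / (1 - drop_rate)."""
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    scale = 1.0 / (1.0 - drop_rate)
    sparse_delta = []
    for b, f in zip(base, finetuned):
        delta = f - b  # what fine-tuning changed relative to the base model
        keep = rng.random() >= drop_rate
        sparse_delta.append(delta * scale if keep else 0.0)
    return sparse_delta


def merge_by_averaging(base, finetuned_models, drop_rate, seed=0):
    """Merge several fine-tuned models that share one base model
    (assumed layout: each model is a flat list of floats)."""
    rng = random.Random(seed)
    deltas = [dare(base, ft, drop_rate, rng) for ft in finetuned_models]
    n = len(deltas)
    # Add the mean of the sparsified deltas back onto the base weights.
    return [b + sum(d[i] for d in deltas) / n for i, b in enumerate(base)]


if __name__ == "__main__":
    base = [1.0, 2.0]
    math_model = [2.0, 2.0]   # toy stand-ins for two SFT models
    chat_model = [0.0, 4.0]
    print(merge_by_averaging(base, [math_model, chat_model], drop_rate=0.5))
```

With `drop_rate=0.0` nothing is dropped and the result reduces to a plain average of the fine-tuned weights, which is a convenient sanity check for the sketch.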

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

PKU-YuanGroup/Video-LLaVA 16 Nov 2023

In this work, we unify visual representation into the language feature space to advance the foundational LLM towards a unified LVLM.

Language Modelling, Large Language Model (+2 more)

Emergence of Segmentation with Minimalistic White-Box Transformers

ma-lab-berkeley/crate 30 Aug 2023

Transformer-like models for vision tasks have recently proven effective for a wide range of downstream applications such as segmentation and detection.

Segmentation, Self-Supervised Learning

GraphCast: Learning skillful medium-range global weather forecasting

deepmind/graphcast 24 Dec 2022

Global medium-range weather forecasting is critical to decision-making across many social and economic domains.

Decision Making, Weather Forecasting

CogVLM: Visual Expert for Pretrained Language Models

thudm/cogvlm 6 Nov 2023

We introduce CogVLM, a powerful open-source visual language foundation model.

Language Modelling, Visual Question Answering

Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents

zoeyyao27/cot-igniting-agent 20 Nov 2023

Large language models (LLMs) have dramatically enhanced the field of language intelligence, as demonstrably evidenced by their formidable empirical performance across a spectrum of complex reasoning tasks.

InRank: Incremental Low-Rank Learning

jiaweizzhao/inrank 20 Jun 2023

To remedy this, we design a new training algorithm Incremental Low-Rank Learning (InRank), which explicitly expresses cumulative weight updates as low-rank matrices while incrementally augmenting their ranks during training.

OneFormer3D: One Transformer for Unified Point Cloud Segmentation

filapro/oneformer3d 24 Nov 2023

Semantic, instance, and panoptic segmentation of 3D point clouds have been addressed using task-specific models of distinct design.

3D Instance Segmentation, 3D Object Detection (+4 more)

HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis

sh-lee-prml/hierspeechpp 21 Nov 2023

Furthermore, we significantly improve the naturalness and speaker similarity of synthetic speech even in zero-shot speech synthesis scenarios.

Speech Synthesis, Super-Resolution (+2 more)

SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation

jiehonglin/sam-6d 27 Nov 2023

Zero-shot 6D object pose estimation involves the detection of novel objects with their 6D poses in cluttered scenes, presenting significant challenges for model generalizability.

6D Pose Estimation using RGB, Instance Segmentation (+3 more)

