InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition

internlm/internlm-xcomposer 26 Sep 2023

We propose InternLM-XComposer, a vision-language large model that enables advanced image-text comprehension and composition.

Image Comprehension Reading Comprehension

Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models

hpcaitech/colossalai 6 Feb 2023

To address these challenges, we introduce a system that can jointly optimize distributed execution and gradient checkpointing plans.


MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation

jiahao000/mosaicfusion 22 Sep 2023

We present MosaicFusion, a simple yet effective diffusion-based data augmentation approach for large vocabulary instance segmentation.

Data Augmentation Instance Segmentation

Gold-YOLO: Efficient Object Detector via Gather-and-Distribute Mechanism

huawei-noah/Efficient-Computing 20 Sep 2023

In recent years, YOLO-series models have emerged as the leading approaches to real-time object detection.

Object Detection Real-Time Object Detection

Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP

stanfordnlp/dsp 28 Dec 2022

Retrieval-augmented in-context learning has emerged as a powerful approach for addressing knowledge-intensive tasks using frozen language models (LM) and retrieval models (RM).

Language Modelling Question Answering

AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets

huawei-noah/Efficient-Computing 17 Aug 2022

Because modern deep neural networks use sophisticated designs and complex architectures to achieve high accuracy, the distributions of their weights and activations are highly diverse.

Classification with Binary Neural Network Quantization

GPT4Image: Can Large Pre-trained Models Help Vision Models on Perception Tasks?

huawei-noah/Efficient-Computing 1 Jun 2023

We present a new learning paradigm in which the knowledge extracted from large pre-trained models is used to help models like CNN and ViT learn enhanced representations and achieve better performance.

Descriptive Image Classification

Positive-Unlabeled Compression on the Cloud

huawei-noah/DAFL NeurIPS 2019

In practice, only a small portion of the original training set is required as positive examples, and more useful training examples can be obtained from the massive unlabeled data on the cloud through a PU classifier with an attention-based multi-scale feature extractor.

Knowledge Distillation

RenderIH: A Large-scale Synthetic Dataset for 3D Interacting Hand Pose Estimation

adwardlee/renderih ICCV 2023

Current interacting hand (IH) datasets are relatively simplistic in background and texture, their hand joints are annotated by a machine annotator, which may introduce inaccuracies, and the diversity of their pose distribution is limited.

3D Interacting Hand Pose Estimation Hand Pose Estimation

