We introduce a simple framework that operates on 3D points of single objects or whole scenes coupled with category-agnostic large-scale training from diverse RGB-D videos.
We propose SmoothQuant, a training-free, accuracy-preserving, and general-purpose post-training quantization (PTQ) solution to enable 8-bit weight, 8-bit activation (W8A8) quantization for LLMs that can be implemented efficiently.
We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters.
Ranked #1 on Language Modelling on CLUE (CMRC2018)
In this work, we present a conceptually simple and effective method to train a strong bilingual/multilingual multimodal representation model.
The validation and deployment of novel research ideas in the field of Deep Learning is often limited by the availability of efficient compute kernels for certain basic primitives.
In order to better understand how ICL works, this paper explains language models as meta-optimizers and understands ICL as a kind of implicit finetuning.
In this work, we propose DeepMatcher, a deep Transformer-based network built upon our investigation of local feature matching in detector-free methods.
Pre-trained language models have attracted increasing attention in the biomedical domain, inspired by their great success in the general natural language domain.
Ranked #1 on Question Answering on PubMedQA
We present a simple, efficient, and scalable unfolding network, SAUNet, to simplify the network design with an adaptive alternate optimization framework for hyperspectral image (HSI) reconstruction.
To reproduce the success of text-to-image (T2I) generation, recent works in text-to-video (T2V) generation employ large-scale text-video dataset for fine-tuning.