In this work, we propose a new conversational framework that comprehensively integrates these information sources, collect data to train our models, and evaluate their performance.
We hope that our study can help the research community and LLM vendors promote safer and better-regulated LLMs.
Training several popular base models with this corpus significantly improves their mathematical abilities, leading to the creation of the MathCoder2 family of models.
Through the combined effect of these measures, our network acquires robust NTK properties: the NTK matrix converges and remains stable, and the NTK-related generalization loss is minimized, substantially strengthening the network's theoretical generalization.
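To make the NTK matrix referenced above concrete, the following minimal sketch computes an empirical NTK Gram matrix for a small network in PyTorch; the architecture, inputs, and scalar-output assumption are illustrative choices of ours, not the paper's setup.

```python
# Illustrative sketch (not the paper's code): empirical NTK Gram matrix for a
# small scalar-output MLP. Stability of this matrix during training is one way
# NTK behaviour is typically assessed.
import torch
import torch.nn as nn

def empirical_ntk(model, xs):
    """K[i, j] = <grad_theta f(x_i), grad_theta f(x_j)> over all parameters."""
    params = tuple(model.parameters())
    grads = []
    for x in xs:
        out = model(x.unsqueeze(0)).squeeze()             # scalar output assumed
        g = torch.autograd.grad(out, params)
        grads.append(torch.cat([t.reshape(-1) for t in g]))
    G = torch.stack(grads)                                # (n, num_params)
    return G @ G.T                                        # (n, n) NTK Gram matrix

model = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
xs = torch.randn(5, 8)
K = empirical_ntk(model, xs)
print(K.shape)  # torch.Size([5, 5])
```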
Alongside PhyGenBench, we propose a novel evaluation framework called PhyGenEval.
As large language models (LLMs) advance, their inability to autonomously execute tasks by directly interacting with external tools remains a critical limitation.
In this work, we propose Quantization-aware Training for Domain Generalization (QT-DoG) and demonstrate that weight quantization effectively leads to flatter minima in the loss landscape, thereby enhancing domain generalization.
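For readers unfamiliar with quantization-aware training, the sketch below shows the standard mechanism it builds on: fake-quantizing weights in the forward pass and passing gradients through the rounding with a straight-through estimator. The bit-width, layer, and class names are assumptions for illustration, not the QT-DoG implementation.

```python
# Illustrative sketch (not QT-DoG's code): quantization-aware training via
# fake-quantized weights with a straight-through estimator (STE).
import torch
import torch.nn as nn

class FakeQuant(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w, n_bits):
        qmax = 2 ** (n_bits - 1) - 1
        scale = w.abs().max() / qmax + 1e-8
        return torch.clamp(torch.round(w / scale), -qmax, qmax) * scale

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out, None   # STE: pass gradients straight through the rounding

class QuantLinear(nn.Linear):
    def forward(self, x):
        return nn.functional.linear(x, FakeQuant.apply(self.weight, 8), self.bias)

# Training proceeds as usual; the rounding noise injected into the weights is
# what is argued to bias optimization toward flatter minima.
layer = QuantLinear(16, 4)
loss = layer(torch.randn(2, 16)).sum()
loss.backward()
```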
In this survey, we categorize existing works based on how conditions are integrated into the two fundamental components of diffusion-based modeling, i.e., the denoising network and the sampling process.
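To make these two integration points concrete, the toy sketch below places the condition (1) inside the denoising network as an extra input and (2) inside the sampling process via classifier-free guidance; all interfaces and dimensions are assumptions for illustration, not drawn from any particular surveyed model.

```python
# Illustrative sketch of the two places a condition can enter diffusion-based
# modeling: the denoising network itself, and the sampling step.
import torch
import torch.nn as nn

class CondDenoiser(nn.Module):
    """Integration point 1: the denoising network consumes the condition directly."""
    def __init__(self, dim=16, cond_dim=8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + cond_dim, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, x_t, cond):
        return self.net(torch.cat([x_t, cond], dim=-1))   # predicted noise

def guided_noise(model, x_t, cond, guidance_scale=3.0):
    """Integration point 2: the sampling process mixes conditional and
    unconditional predictions (classifier-free guidance) at each step."""
    eps_uncond = model(x_t, torch.zeros_like(cond))
    eps_cond = model(x_t, cond)
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

model = CondDenoiser()
x_t, cond = torch.randn(2, 16), torch.randn(2, 8)
eps = guided_noise(model, x_t, cond)
```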
We present the Qwen2-VL Series, an advanced upgrade of the previous Qwen-VL models that redefines the conventional predetermined-resolution approach in visual processing.
It can be used to obtain complete information, enabling train-from-scratch models to achieve better results than state-of-the-art models pre-trained on large datasets; the comparison results are shown in Figure 1.