1 code implementation • 17 Oct 2024 • Yiming Shi, Jiwei Wei, Yujia Wu, Ran Ran, ChengWei Sun, Shiyuan He, Yang Yang
However, LoRA relies on randomly initialized low-rank matrices that are optimized to approximate the weight updates, which can result in suboptimal convergence and an accuracy gap compared to full fine-tuning.
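A minimal sketch of the low-rank update this abstract refers to, written as a generic LoRA-style layer in PyTorch; the class name, rank, and scaling factor are illustrative assumptions, not the paper's implementation.

```python
# Generic LoRA-style linear layer (illustrative sketch, not the paper's code).
# The frozen pretrained weight is adapted by a low-rank product B @ A:
# A is randomly initialized and B starts at zero, so training begins from the pretrained weights.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False                      # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)   # random initialization
        self.B = nn.Parameter(torch.zeros(out_features, r))         # zero initialization
        self.scale = alpha / r

    def forward(self, x):
        # y = x W^T + scale * x (B A)^T
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```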
no code implementations • 14 Oct 2024 • Hongjian Yu, Yiming Shi, Zherui Zhou, Christopher Haberland
We introduce a FLORES+ dataset as an evaluation benchmark for modern Wu Chinese machine translation models and showcase its compatibility with existing Wu data.
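As a rough illustration of how a FLORES-style benchmark is typically consumed, the sketch below scores system output against a reference file with sacrebleu; the file names, language code, and choice of metrics are assumptions, not the paper's evaluation pipeline.

```python
# Generic FLORES-style evaluation sketch (file names are hypothetical; requires `pip install sacrebleu`).
import sacrebleu

# One sentence per line, references aligned with hypotheses.
with open("wuu.devtest", encoding="utf-8") as f:
    refs = [line.strip() for line in f]
with open("system_output.txt", encoding="utf-8") as f:
    hyps = [line.strip() for line in f]

# chrF is often reported alongside BLEU for languages without whitespace tokenization.
bleu = sacrebleu.corpus_bleu(hyps, [refs])
chrf = sacrebleu.corpus_chrf(hyps, [refs])
print(f"BLEU={bleu.score:.2f}  chrF={chrf.score:.2f}")
```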
no code implementations • 9 Sep 2024 • ChengWei Sun, Jiwei Wei, Yujia Wu, Yiming Shi, Shiyuan He, Zeyu Ma, Ning Xie, Yang Yang
Large pre-trained models (LPMs) have demonstrated exceptional performance in diverse natural language processing and computer vision tasks.
no code implementations • 19 Aug 2024 • Yixiao Yuan, Yangchen Huang, Yu Ma, Xinjin Li, Zhenglin Li, Yiming Shi, Huapeng Zhou
Neural language representation models such as GPT, pre-trained on large-scale corpora, can effectively capture rich semantic patterns from plain text and be fine-tuned to consistently improve natural language generation performance.
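A minimal sketch of the pretrain-then-fine-tune recipe mentioned here, using Hugging Face transformers with GPT-2 as a stand-in; the dataset, model checkpoint, and hyperparameters are placeholders rather than the paper's setup.

```python
# Minimal causal-LM fine-tuning sketch (model, data file, and hyperparameters are placeholders).
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

def tokenize(batch):
    out = tokenizer(batch["text"], truncation=True, max_length=512, padding="max_length")
    out["labels"] = out["input_ids"].copy()   # language-modeling objective: predict the input itself
    return out

train = load_dataset("text", data_files={"train": "train.txt"})["train"].map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-gpt2", per_device_train_batch_size=2, num_train_epochs=1),
    train_dataset=train,
)
trainer.train()
```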
no code implementations • 13 Aug 2024 • Yujia Wu, Yiming Shi, Jiwei Wei, ChengWei Sun, Yuyang Zhou, Yang Yang, Heng Tao Shen
Personalized text-to-image generation has gained significant attention for its capability to generate high-fidelity portraits of specific identities conditioned on user-defined prompts.
1 code implementation • 25 May 2024 • Hong Liu, Xiuxiu Qiu, Yiming Shi, Zelin Zang
Unsupervised fault detection in multivariate time series is critical for maintaining the integrity and efficiency of complex systems, with current methodologies largely focusing on statistical and machine learning techniques.
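As a point of reference for the statistical techniques mentioned, the sketch below flags anomalous time steps in a multivariate series with a rolling z-score threshold; it is a generic baseline, not the paper's detector, and the window size and threshold are arbitrary.

```python
# Generic statistical baseline for unsupervised fault detection (not the paper's method).
# Flags time steps whose rolling z-score exceeds a threshold in any variable.
import numpy as np

def rolling_zscore_flags(x, window=50, threshold=4.0):
    """x: array of shape (T, D) with T time steps and D variables."""
    T, _ = x.shape
    flags = np.zeros(T, dtype=bool)
    for t in range(window, T):
        ref = x[t - window:t]                         # trailing context window
        mu, sigma = ref.mean(axis=0), ref.std(axis=0) + 1e-8
        z = np.abs((x[t] - mu) / sigma)               # per-variable deviation
        flags[t] = np.any(z > threshold)              # fault if any variable deviates strongly
    return flags

# Example: inject a spike into synthetic data and detect it.
rng = np.random.default_rng(0)
series = rng.normal(size=(1000, 5))
series[700, 2] += 10.0
print(np.nonzero(rolling_zscore_flags(series))[0])
```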
2 code implementations • 20 May 2024 • Junlong Jia, Ying Hu, Xi Weng, Yiming Shi, Miao Li, Xingjian Zhang, Baichuan Zhou, Ziyu Liu, Jie Luo, Lei Huang, Ji Wu
We present TinyLLaVA Factory, an open-source modular codebase for small-scale large multimodal models (LMMs) with a focus on simplicity of code implementations, extensibility of new features, and reproducibility of training results.