no code implementations • 22 Apr 2024 • Mingjie Ma, zhihuan yu, Yichao Ma, GuoHui Li
First, by emulating the cognitive process of human reasoning, an Event-Aware Pretraining auxiliary task is introduced to better activate the LLM's global comprehension of intricate scenarios.
1 code implementation • 13 Dec 2023 • Zhiyuan Ma, zhihuan yu, Jianjun Li, BoWen Zhou
Then, we combine the advantages of MAEs and DPMs to design a progressive masking diffusion model, which gradually increases the masking proportion under three different schedulers and reconstructs the latent features from simple to difficult. Unlike DPMs, it does not perform denoising diffusion sequentially, and unlike MAEs, it does not rely on a fixed high masking ratio, which alleviates the burden of long training times.
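The snippet does not name the three schedulers, so the sketch below is only a rough illustration of how a masking proportion might be increased over training under three hypothetical schedules (linear, cosine, and stepwise); the function names, ratio ranges, and step counts are assumptions, not the paper's implementation.

```python
import math

def linear_schedule(step, total_steps, start=0.15, end=0.75):
    """Hypothetical scheduler: masking ratio grows linearly over training."""
    t = step / max(total_steps, 1)
    return start + (end - start) * t

def cosine_schedule(step, total_steps, start=0.15, end=0.75):
    """Hypothetical scheduler: slow ramp early, faster growth later."""
    t = step / max(total_steps, 1)
    return start + (end - start) * (1 - math.cos(math.pi * t)) / 2

def step_schedule(step, total_steps, ratios=(0.15, 0.45, 0.75)):
    """Hypothetical scheduler: piecewise-constant jumps in masking ratio."""
    idx = min(int(len(ratios) * step / max(total_steps, 1)), len(ratios) - 1)
    return ratios[idx]

# Example: the masking proportion rises from easy (few masked latents)
# to hard (most latents masked) as training progresses.
for step in (0, 5000, 10000):
    print(linear_schedule(step, 10000),
          cosine_schedule(step, 10000),
          step_schedule(step, 10000))
```

In this reading, each scheduler controls only how fast the reconstruction task gets harder; the choice of curve trades off early-training stability against how quickly the model is exposed to heavily masked latents.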