In the fast-changing realm of information, the capacity to construct coherent timelines from extensive event-related content has become increasingly significant and challenging.
Reinforcement Learning from Human Feedback (RLHF) has emerged as a critical approach for aligning large language models with human preferences, witnessing rapid algorithmic evolution through methods such as Proximal Policy Optimization (PPO), Direct Preference Optimization (DPO), REINFORCE Leave One-Out (RLOO), ReMax, and Group Relative Policy Optimization (GRPO).
We study the roots of algorithmic progress in deep policy gradient algorithms through a case study on two popular algorithms: Proximal Policy Optimization (PPO) and Trust Region Policy Optimization (TRPO).
We present Infinity, a Bitwise Visual AutoRegressive Modeling capable of generating high-resolution, photorealistic images following language instruction.
Understanding protein-ligand binding affinity is crucial for drug discovery, enabling the identification of promising drug candidates efficiently.
Ranked #1 on Protein-Ligand Affinity Prediction on PDBbind
The retrieval module employs BM25 along with a lightweight LLM model to achieve coarse-to-fine file retrieval.
The integrated system achieves state-of-the-art (SOTA) performance on ImageNet 256x256 generation with an FID score of 1. 35 while demonstrating remarkable training efficiency by reaching an FID score of 2. 11 in just 64 epochs--representing an over 21 times convergence speedup compared to the original DiT.
Ranked #1 on Image Generation on ImageNet 256x256
Foundation models have become popular in forecasting due to their ability to make accurate predictions, even with minimal fine-tuning on specific datasets.
To address these issues, we propose a novel Self-supervised Hierarchical Makeup Transfer (SHMT) method via latent diffusion models.
In this article, a LiDAR-based SLAM method is presented to improve the accuracy of pose estimations for ground vehicles in rough terrains, which is termed Rotation-Optimized LiDAR-Only (ROLO) SLAM.