To cover language, image, and video at the same time for different scenarios, a 3D transformer encoder-decoder framework is designed, which can not only deal with videos as 3D data but also adapt to texts and images as 1D and 2D data, respectively.
Inspired by BERT, we devise a Masked Point Modeling (MPM) task to pre-train point cloud Transformers.
Ranked #1 on 3D Point Cloud Classification on ScanObjectNN
Recent advances show that semi-supervised implicit representation learning can be achieved through physical constraints like Eikonal equations.
We introduce a prototype model and provide an open-source and extensible toolkit called OpenUE for various extraction tasks.
We introduce a robust, real-time, high-resolution human video matting method that achieves new state-of-the-art performance.
Ranked #1 on Video Matting on VideoMatte240K
We propose Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information for online multiple object tracking and segmentation.
Ranked #1 on Multi-Object Tracking and Segmentation on BDD100K
Our experiments indicate that the VQ-Diffusion model with the reparameterization is fifteen times faster than traditional AR methods while achieving a better image quality.
Long-tailed relation classification is a challenging problem as the head classes may dominate the training phase, thereby leading to the deterioration of the tail performance.
大规模推荐算法库，包含推荐系统经典及最新算法LR、Wide&Deep、DSSM、TDM、MIND、Word2Vec、DeepWalk、SSR、GRU4Rec、Youtube_dnn、NCF、GNN、FM、FFM、DeepFM、DCN、DIN、DIEN、DLRM、MMOE、PLE、ESMM、MAML、xDeepFM、DeepFEFM、NFM、AFM、RALM、Deep Crossing、PNN、BST、AutoInt、FGCNN、FLEN、ListWise等，包含经典推荐系统数据集criteo 、movielens等
We hope VaxNeRF -- a careful combination of a classic technique with a deep method (that arguably replaced it) -- can empower and accelerate new NeRF extensions and applications, with its simplicity, portability, and reliable performance gains.