1 code implementation • 3 Oct 2023 • Heejun Lee, Jina Kim, Jeffrey Willette, Sung Ju Hwang
SEA estimates the attention matrix with linear complexity via kernel-based linear attention, then creates a sparse attention matrix with top-k selection to perform a sparse attention operation.
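A minimal sketch of this two-stage idea, assuming single-head attention and toy shapes (the ELU feature map and function names are illustrative choices, not the paper's implementation; for readability the sketch also materializes the full attention estimate, which the linear-complexity method avoids):

```python
import torch
import torch.nn.functional as F

def elu_feature_map(x):
    # Positive kernel feature map commonly used in linear attention.
    return F.elu(x) + 1.0

def sea_style_attention(q, k, v, top_k=4):
    # q, k, v: (seq_len, dim). Stage 1: estimate attention from kernel features.
    qf, kf = elu_feature_map(q), elu_feature_map(k)
    est = qf @ kf.T                            # (seq_len, seq_len), positive
    est = est / est.sum(dim=-1, keepdim=True)  # row-normalized estimate
    # Stage 2: keep only the top-k estimated entries per row.
    idx = est.topk(top_k, dim=-1).indices
    mask = torch.zeros_like(est, dtype=torch.bool).scatter_(-1, idx, True)
    # Sparse attention over the selected positions only.
    scores = (q @ k.T) / (q.shape[-1] ** 0.5)
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

out = sea_style_attention(torch.randn(8, 16), torch.randn(8, 16), torch.randn(8, 16))
```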
1 code implementation • 5 Oct 2022 • Youngwan Lee, Jeffrey Willette, Jonghee Kim, Juho Lee, Sung Ju Hwang
Masked image modeling (MIM) has become a popular strategy for self-supervised learning (SSL) of visual representations with Vision Transformers.
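For context, the core MIM step is hiding a large fraction of image patches and training the model to reconstruct them; a minimal masking sketch (the mask ratio and patch size here are illustrative assumptions, not any specific method's settings):

```python
import torch

def random_patch_mask(images, patch_size=16, mask_ratio=0.75):
    # images: (batch, channels, H, W) -> boolean mask over the patch grid.
    b, _, h, w = images.shape
    num_patches = (h // patch_size) * (w // patch_size)
    num_masked = int(num_patches * mask_ratio)
    # Independently shuffle patch indices per image; mask the lowest ranks.
    noise = torch.rand(b, num_patches)
    ranks = noise.argsort(dim=1).argsort(dim=1)
    return ranks < num_masked  # True = patch is masked (to be reconstructed)

mask = random_patch_mask(torch.randn(2, 3, 224, 224))  # (2, 196) boolean
```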
1 code implementation • 26 Aug 2022 • Jeffrey Willette, Seanie Lee, Bruno Andreis, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang
Recent work on mini-batch consistency (MBC) for set functions has brought attention to the need for sequentially processing and aggregating chunks of a partitioned set while guaranteeing the same output for all partitions.
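Mean pooling is a simple example of an MBC aggregator: streaming over any partition of a set, chunk by chunk, yields the same output as a single full pass. A minimal sketch with toy shapes:

```python
import torch

def streaming_mean(chunks):
    # chunks: iterable of (chunk_size, dim) tensors partitioning one set.
    total, count = 0.0, 0
    for c in chunks:
        total = total + c.sum(dim=0)
        count += c.shape[0]
    return total / count

x = torch.randn(10, 4)
full = x.mean(dim=0)
partitioned = streaming_mean([x[:3], x[3:7], x[7:]])
assert torch.allclose(full, partitioned, atol=1e-6)  # same output, any partition
```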
no code implementations • 12 Oct 2021 • Jeffrey Willette, Hae Beom Lee, Juho Lee, Sung Ju Hwang
Numerous recent works utilize bi-Lipschitz regularization of neural network layers to preserve relative distances between data instances in the feature spaces of each layer.
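Bi-Lipschitz regularization asks for constants 0 < L <= U with L·||x1 − x2|| <= ||f(x1) − f(x2)|| <= U·||x1 − x2||, so feature-space distances neither collapse nor explode. A quick empirical check of these distance ratios, using an illustrative residual layer rather than the paper's architecture:

```python
import torch

def distance_ratios(f, x1, x2):
    # Ratio of feature-space to input-space distances for paired samples.
    num = (f(x1) - f(x2)).norm(dim=-1)
    den = (x1 - x2).norm(dim=-1).clamp_min(1e-12)
    return num / den

# A residual layer with a damped branch, a common way to keep the lower
# Lipschitz bound away from zero (toy stand-in, not the paper's model).
lin = torch.nn.Linear(8, 8)
f = lambda x: x + 0.5 * torch.tanh(lin(x))

x1, x2 = torch.randn(100, 8), torch.randn(100, 8)
r = distance_ratios(f, x1, x2)
print(f"empirical ratio range: [{r.min():.3f}, {r.max():.3f}]")
```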
no code implementations • NeurIPS 2021 • Bruno Andreis, Jeffrey Willette, Juho Lee, Sung Ju Hwang
The proposed method adheres to the required symmetries of invariance and equivariance while maintaining MBC for any partition of the input set.
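Both properties can be sanity-checked directly; the sketch below uses a toy sum-pooling encoder as an illustrative stand-in for the proposed architecture:

```python
import torch

phi = torch.nn.Linear(4, 8)
encode = lambda x: torch.relu(phi(x)).sum(dim=0)  # permutation-invariant pool

x = torch.randn(10, 4)
perm = torch.randperm(10)

# Invariance: reordering set elements must not change the encoding.
assert torch.allclose(encode(x), encode(x[perm]), atol=1e-5)

# MBC: aggregating any partition chunk-by-chunk matches the full-set output.
chunked = encode(x[:4]) + encode(x[4:])
assert torch.allclose(encode(x), chunked, atol=1e-5)
```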
no code implementations • 22 Feb 2021 • Jeffrey Willette, Juho Lee, Sung Ju Hwang
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.