no code implementations • 9 Jan 2025 • Aniruddha Mahapatra, Long Mai, Yitian Zhang, David Bourgin, Feng Liu
Evaluation on video benchmarks shows that our method significantly improves reconstruction quality while increasing temporal compression compared to direct extensions of existing video tokenizers.
1 code implementation • 6 Dec 2024 • Yitian Zhang, Huseyin Coskun, Xu Ma, Huan Wang, Ke Ma, Xi Chen, Derek Hao Hu, Yun Fu
Thus, we propose a general framework, named Scala, to enable a single network to represent multiple smaller ViTs with flexible inference capability, which aligns with the inherent design of ViT to vary in width.
1 code implementation • 30 Oct 2024 • Kun Hu, Qingle Zhang, Maoxun Yuan, Yitian Zhang
Next, to introduce frequency domain information, we construct a Frequency Domain Fusion Module (FDFM) that transforms the spatial domain to the frequency domain through Fast Fourier Transform (FFT) and then integrates frequency domain information.
1 code implementation • 15 Jul 2024 • Yitian Zhang, Xu Ma, Yue Bai, Huan Wang, Yun Fu
Vision foundation models are renowned for their generalization ability due to massive training data.
1 code implementation • 21 Apr 2024 • Liheng Ma, Soumyasundar Pal, Yitian Zhang, Jiaming Zhou, Yingxue Zhang, Mark Coates
In this work, we propose a novel and general graph convolution framework by parameterizing the kernels as continuous functions of pseudo-coordinates derived via graph positional encoding.
Ranked #1 on Graph Classification on CIFAR-10
1 code implementation • 14 Mar 2024 • Yitian Zhang, Yue Bai, Huan Wang, Yizhou Wang, Yun Fu
Current training pipelines in object recognition neglect Hue Jittering during data augmentation because it not only introduces appearance changes that are detrimental to classification but is also inefficient to implement in practice.
2 code implementations • 7 Nov 2023 • Yitian Zhang, Liheng Ma, Soumyasundar Pal, Yingxue Zhang, Mark Coates
Recent architectures learn complex temporal patterns by segmenting a time-series into patches and using the patches as tokens.
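As a rough illustration of the patching idea mentioned above, here is a minimal sketch of splitting a series into fixed-length windows that could then serve as tokens. The function name `segment_into_patches` and the parameters `patch_len` and `stride` are assumptions made for this example; they are not taken from the paper or its code.

```python
from typing import List, Sequence


def segment_into_patches(
    series: Sequence[float], patch_len: int, stride: int
) -> List[List[float]]:
    """Split a 1-D series into fixed-length windows (patches).

    Hypothetical helper for illustration only; trailing values that do not
    fill a complete window are dropped.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    patches = []
    # Slide a window of length patch_len across the series in steps of stride.
    for start in range(0, len(series) - patch_len + 1, stride):
        patches.append(list(series[start:start + patch_len]))
    return patches


series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
# Each inner list is one patch; a model would typically embed each patch as a token.
tokens = segment_into_patches(series, patch_len=4, stride=2)
```

With `stride` smaller than `patch_len` the patches overlap; setting the two equal gives non-overlapping patches.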
2 code implementations • CVPR 2023 • Yitian Zhang, Yue Bai, Chang Liu, Huan Wang, Sheng Li, Yun Fu
To fix this issue, we propose a general framework, named Frame Flexible Network (FFN), which not only enables the model to be evaluated at different frames to adjust its computation, but also reduces the memory costs of storing multiple models significantly.
1 code implementation • 18 Nov 2022 • Yitian Zhang, Yue Bai, Huan Wang, Yi Xu, Yun Fu
To tackle this problem, we propose Ample and Focal Network (AFNet), which is composed of two branches to utilize more frames but with less computation.
1 code implementation • 13 Oct 2022 • Yue Bai, Huan Wang, Xu Ma, Yitian Zhang, Zhiqiang Tao, Yun Fu
We validate the potential of PEMN learning masks on random weights with limited unique values and test its effectiveness for a new compression paradigm based on different network architectures.
no code implementations • 21 Sep 2022 • Yitian Zhang, Florence Regol, Antonios Valkanas, Mark Coates
We propose a framework called GraphTNC for unsupervised learning of joint representations of the graph and the time-series.