1 code implementation • 17 Oct 2024 • Tianyu Guo, Druv Pai, Yu Bai, Jiantao Jiao, Michael I. Jordan, Song Mei
Next, we extend our analysis to pretrained LLMs, including Llama and OLMo, showing that many attention heads exhibit a similar active-dormant mechanism as in the BB task, and that the mutual reinforcement mechanism also governs the emergence of extreme-token phenomena during LLM pretraining.
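As a rough illustration of how such dormant, sink-like heads can be flagged in practice, the heuristic below measures the attention mass each head places on the first token, a common signature of attention sinks (this is our diagnostic sketch, not the paper's exact procedure):

```python
import torch

def first_token_attention_mass(attn: torch.Tensor) -> torch.Tensor:
    """attn: (num_heads, seq_len, seq_len) post-softmax attention weights.
    Returns, per head, the average attention mass placed on position 0."""
    return attn[:, :, 0].mean(dim=-1)

# Stand-in weights; in practice these come from a model's attention outputs.
attn = torch.softmax(torch.randn(12, 128, 128), dim=-1)
mass = first_token_attention_mass(attn)
print((mass > 0.5).nonzero().flatten())  # indices of sink-dominated heads
```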
1 code implementation • 27 Sep 2024 • Jialong Zuo, Ying Nie, Hanyu Zhou, Huaxin Zhang, Haoyu Wang, Tianyu Guo, Nong Sang, Changxin Gao
For example, compared with the previous state-of-the-art (ISR), CION with the same ResNet50-IBN achieves higher mAP of 93.3% and 74.3% on Market1501 and MSMT17, respectively, while using only 8% of the training samples.
no code implementations • 2 Jul 2024 • Ying Nie, Binwei Yan, Tianyu Guo, Hao Liu, Haoyu Wang, Wei He, Binfan Zheng, WeiHao Wang, Qiang Li, Weijian Sun, Yunhe Wang, DaCheng Tao
(3) Financial Practice: whether LLMs can perform practical financial jobs, such as tax consultant, junior accountant, and securities analyst.
1 code implementation • 3 Jun 2024 • Danni Yang, Jiayi Ji, Yiwei Ma, Tianyu Guo, Haowei Wang, Xiaoshuai Sun, Rongrong Ji
These strategies are designed to extract the most accurate masks from SAM's output, thus guiding the training of the student model with enhanced precision.
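A minimal sketch of how one might pick the most reliable mask from SAM-style proposals using their predicted quality scores (the filtering rule and variable names here are illustrative assumptions, not the paper's strategy):

```python
import numpy as np

def select_mask(masks: np.ndarray, scores: np.ndarray, min_area: int = 100):
    """masks: (N, H, W) boolean proposals; scores: (N,) predicted quality.
    Drops tiny proposals, then keeps the highest-scoring remainder."""
    areas = masks.reshape(len(masks), -1).sum(axis=1)
    valid = areas >= min_area
    if not valid.any():
        return None  # nothing trustworthy; fall back to other supervision
    return masks[np.argmax(np.where(valid, scores, -np.inf))]

masks = np.random.rand(3, 64, 64) > 0.5   # fake proposals
scores = np.array([0.7, 0.9, 0.4])        # fake quality scores
best = select_mask(masks, scores)
```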
no code implementations • 24 Apr 2024 • Tianyu Guo, Sai Praneeth Karimireddy, Michael I. Jordan
Instead of adjusting for the distribution shift separately, we use weighted propensity score models to adjust for it collaboratively.
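For intuition, here is a toy inverse-propensity-weighting estimate on synthetic data (a generic IPW sketch; the paper's weighted, collaborative scheme differs in detail):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                 # covariates
t = rng.integers(0, 2, size=1000)              # treatment indicator
y = X[:, 0] + 0.5 * t + rng.normal(size=1000)  # outcome, true effect 0.5

e = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]  # propensity e(x)
ate = np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))  # IPW estimate
print(f"IPW ATE estimate: {ate:.3f}")
```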
1 code implementation • 26 Feb 2024 • Wei He, Kai Han, Yehui Tang, Chengcheng Wang, Yujie Yang, Tianyu Guo, Yunhe Wang
Large language models (LLMs) face a daunting challenge due to the excessive computational and memory requirements of the commonly used Transformer architecture.
no code implementations • 27 Dec 2023 • Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, DaCheng Tao
We then demonstrate, through carefully designed ablations, that the proposed approach is significantly effective for enhancing model nonlinearity; thus, we present a new efficient model architecture for establishing modern LLMs, namely, PanGu-π.
1 code implementation • CVPR 2024 • Jialong Zuo, Hanyu Zhou, Ying Nie, Feng Zhang, Tianyu Guo, Nong Sang, Yunhe Wang, Changxin Gao
Firstly, we construct a new dataset named UFine6926.
no code implementations • 1 Dec 2023 • Ying Nie, Wei He, Kai Han, Yehui Tang, Tianyu Guo, Fanyi Du, Yunhe Wang
Moreover, based on the observation that the accuracy of CLIP model does not increase correspondingly as the parameters of text encoder increase, an extra objective of masked language modeling (MLM) is leveraged for maximizing the potential of the shortened text encoder.
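The MLM objective itself is standard; a bare-bones version looks like the following (generic sketch, not the paper's training code):

```python
import torch
import torch.nn.functional as F

def mlm_loss(logits, labels, mask):
    """logits: (B, T, V) predictions; labels: (B, T) token ids; mask: (B, T)
    bool, True where a token was masked out and must be predicted."""
    return F.cross_entropy(logits[mask], labels[mask])

B, T, V = 4, 64, 1000
logits = torch.randn(B, T, V)
labels = torch.randint(0, V, (B, T))
mask = torch.rand(B, T) < 0.15  # mask ~15% of positions
print(mlm_loss(logits, labels, mask))
```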
1 code implementation • NeurIPS 2023 • Yuchuan Tian, Hanting Chen, Tianyu Guo, Chao Xu, Yunhe Wang
To this end, we propose a Rank-based PruninG (RPG) method to maintain the ranks of sparse weights in an adversarial manner.
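As a loose illustration, the sketch below scores weights by magnitude plus a crude rank-preservation bonus derived from the nuclear-norm gradient before thresholding (our surrogate; RPG's actual adversarial objective differs):

```python
import torch

def rank_aware_prune(weight: torch.Tensor, sparsity: float, lam: float = 0.1):
    """Zeroes the lowest-scoring fraction of weights, where the score is
    magnitude plus a rank bonus (the nuclear-norm gradient U @ V^T rewards
    entries that support large singular values)."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    score = weight.abs() + lam * weight * (U @ Vh)
    k = max(1, int(sparsity * weight.numel()))
    threshold = score.flatten().kthvalue(k).values
    return weight * (score > threshold).float()

pruned = rank_aware_prune(torch.randn(256, 128), sparsity=0.9)
```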
no code implementations • 3 Nov 2023 • Zheyuan Bai, Xinduo Liu, Hailin Hu, Tianyu Guo, Qinghua Zhang, Yunhe Wang
Data-Free Knowledge Distillation (DFKD) plays a vital role in compressing the model when original training data is unavailable.
1 code implementation • 17 Oct 2023 • Haowei Wang, Jiayi Ji, Tianyu Guo, Yilong Yang, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji
To address this, we introduce two cascading modules based on the barycenter of the mask, which are Coordinate Guided Aggregation (CGA) and Barycenter Driven Localization (BDL), responsible for segmentation and detection, respectively.
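Both modules hinge on the mask barycenter; computing it is simple (a generic sketch):

```python
import numpy as np

def mask_barycenter(mask: np.ndarray):
    """mask: (H, W) boolean. Returns the (row, col) center of mass."""
    ys, xs = np.nonzero(mask)
    return ys.mean(), xs.mean()

mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 3:7] = True
print(mask_barycenter(mask))  # (3.0, 4.5)
```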
no code implementations • 16 Oct 2023 • Tianyu Guo, Wei Hu, Song Mei, Huan Wang, Caiming Xiong, Silvio Savarese, Yu Bai
Through extensive probing and a new pasting experiment, we further reveal several mechanisms within the trained transformers, such as concrete copying behaviors on both the inputs and the representations, linear ICL capability of the upper layers alone, and a post-ICL representation selection mechanism in a harder mixture setting.
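For reference, a linear probe of the kind used in such analyses can be as simple as fitting a ridge regression on frozen hidden states (illustrative sketch with synthetic stand-ins, not the paper's probing setup):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
H = rng.normal(size=(512, 64))  # stand-in for layer-l hidden states
y = H @ rng.normal(size=64)     # stand-in target, linearly decodable from H
probe = Ridge(alpha=1.0).fit(H[:400], y[:400])
print("held-out R^2:", probe.score(H[400:], y[400:]))
```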
2 code implementations • 15 Jul 2023 • Mengyuan Liu, Hong Liu, Tianyu Guo
Inspired by SkeletonBYOL, this paper further presents a Cross-Model and Cross-Stream (CMCS) framework.
no code implementations • ICCV 2023 • Jingwen Guo, Hong Liu, Shitong Sun, Tianyu Guo, Min Zhang, Chenyang Si
Existing skeleton-based action recognition methods typically follow a centralized learning paradigm, which can pose privacy concerns when exposing human-related videos.
1 code implementation • 7 Jul 2022 • Zhan Chen, Hong Liu, Tianyu Guo, Zhengyan Chen, Pinhao Song, Hao Tang
First, SkeleMix utilizes the topological information of skeleton data to mix two skeleton sequences by randomly combining the cropped skeleton fragments (the trimmed view) with the remaining skeleton sequences (the truncated view).
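A simplified version of this cropping-and-pasting mix (shapes and index choices are our simplification of the topology-aware scheme):

```python
import numpy as np

def skelemix(seq_a, seq_b, t0, t1, joints):
    """seq_*: (T, J, C) skeleton sequences. Copies frames t0:t1 of the given
    joints from seq_a into seq_b, returning the mixed clip and its mask."""
    mixed = seq_b.copy()
    mixed[t0:t1, joints] = seq_a[t0:t1, joints]
    mask = np.zeros(seq_b.shape[:2], dtype=bool)
    mask[t0:t1, joints] = True
    return mixed, mask

a = np.random.randn(64, 25, 3)  # e.g. NTU-style 25-joint clips
b = np.random.randn(64, 25, 3)
mixed, mask = skelemix(a, b, 16, 48, joints=[4, 5, 6, 7])  # one arm
```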
1 code implementation • 13 Jun 2022 • Wenhao Li, Mengyuan Liu, Hong Liu, Tianyu Guo, Ti Wang, Hao Tang, Nicu Sebe
To the best of our knowledge, this is the first MLP-Like architecture for 3D human pose estimation in a single frame and a video sequence.
Ranked #61 on 3D Human Pose Estimation on Human3.6M
1 code implementation • 7 Dec 2021 • Tianyu Guo, Hong Liu, Zhan Chen, Mengyuan Liu, Tao Wang, Runwei Ding
In this paper, to make better use of the movement patterns introduced by extreme augmentations, a Contrastive Learning framework utilizing Abundant Information Mining for self-supervised action Representation (AimCLR) is proposed.
Tasks: Contrastive Learning, Few-Shot Skeleton-Based Action Recognition (+5 more)
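AimCLR's contrastive backbone is the standard InfoNCE/NT-Xent loss; a generic sketch follows (the paper's full objective adds the extreme-augmentation mining described above):

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    """z1, z2: (B, D) embeddings of two augmented views of the same clips."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau       # (B, B) similarity matrix
    targets = torch.arange(len(z1))  # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

print(info_nce(torch.randn(32, 128), torch.randn(32, 128)))
```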
1 code implementation • 5 Dec 2021 • Tao Wang, Hong Liu, Pinhao Song, Tianyu Guo, Wei Shi
Therefore, we propose a transformer-based Pose-guided Feature Disentangling (PFD) method by utilizing pose information to clearly disentangle semantic components (e.g., human body or joint parts) and selectively match non-occluded parts correspondingly.
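One common way to realize such pose guidance is to pool backbone features under keypoint heatmaps, yielding one descriptor per body part (our generic sketch, not PFD's architecture):

```python
import torch

def pose_guided_pool(feat: torch.Tensor, heatmaps: torch.Tensor) -> torch.Tensor:
    """feat: (C, H, W) feature map; heatmaps: (K, H, W) keypoint heatmaps.
    Returns (K, C) part features, one descriptor per body part."""
    w = heatmaps / heatmaps.sum(dim=(1, 2), keepdim=True).clamp(min=1e-6)
    return torch.einsum('khw,chw->kc', w, feat)

parts = pose_guided_pool(torch.randn(256, 24, 8), torch.rand(17, 24, 8))
```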
1 code implementation • CVPR 2021 • Hanting Chen, Tianyu Guo, Chang Xu, Wenshuo Li, Chunjing Xu, Chao Xu, Yunhe Wang
Experiments on various datasets demonstrate that student networks learned by the proposed method can achieve performance comparable to those trained on the original dataset.
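The distillation step itself is standard temperature-scaled KD (a generic sketch; how training inputs are obtained without the original dataset is the paper's contribution and is omitted here):

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between softened teacher and student distributions."""
    p_t = F.softmax(teacher_logits / T, dim=1)
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * T * T

print(kd_loss(torch.randn(8, 10), torch.randn(8, 10)))
```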
6 code implementations • CVPR 2021 • Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao
To maximally excavate the capability of the transformer, we propose to utilize the well-known ImageNet benchmark for generating a large amount of corrupted image pairs.
Ranked #1 on Single Image Deraining on Rain100L (using extra training data)
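A toy version of building (corrupted, clean) training pairs from clean images, with plain Gaussian noise standing in for the task-specific degradations (e.g., rain streaks) the paper uses:

```python
import numpy as np

def make_pair(clean: np.ndarray, sigma: float = 25.0):
    """clean: (H, W, 3) uint8 image. Returns a (corrupted, clean) pair."""
    noise = np.random.normal(0, sigma, clean.shape)
    corrupted = np.clip(clean.astype(np.float64) + noise, 0, 255).astype(np.uint8)
    return corrupted, clean

clean = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
corrupted, target = make_pair(clean)
```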
1 code implementation • CVPR 2020 • Tianyu Guo, Chang Xu, Jiajun Huang, Yunhe Wang, Boxin Shi, Chao Xu, DaCheng Tao
In contrast, it is more reasonable to treat the generated data as unlabeled, which could be positive or negative according to their quality.
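Treating generated samples as unlabeled naturally suggests a positive-unlabeled risk; below is the standard non-negative PU estimator, with real images as positives and generated images as unlabeled (a generic sketch, not the paper's exact loss):

```python
import torch
import torch.nn.functional as F

def nnpu_loss(scores_pos, scores_unl, prior=0.5):
    """scores_*: raw discriminator logits; prior: assumed positive fraction."""
    bce = F.binary_cross_entropy_with_logits
    r_pos = bce(scores_pos, torch.ones_like(scores_pos))       # positives as +
    r_pos_neg = bce(scores_pos, torch.zeros_like(scores_pos))  # positives as -
    r_unl_neg = bce(scores_unl, torch.zeros_like(scores_unl))  # unlabeled as -
    return prior * r_pos + torch.clamp(r_unl_neg - prior * r_pos_neg, min=0.0)

print(nnpu_loss(torch.randn(16), torch.randn(16)))
```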
no code implementations • NeurIPS 2019 • Tianyu Guo, Chang Xu, Boxin Shi, Chao Xu, DaCheng Tao
A worst-case formulation can be developed over this distribution set, and then be interpreted as a generation task in an adversarial manner.
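In symbols, such a worst-case objective typically takes a min-max form (our notation, not the paper's):

```latex
\min_{\theta} \; \max_{Q \in \mathcal{B}_{\epsilon}(\hat{P})} \; \mathbb{E}_{x \sim Q}\big[ \ell(f_{\theta}(x)) \big]
```

where $\hat{P}$ is the empirical distribution, $\mathcal{B}_{\epsilon}(\hat{P})$ is a neighborhood of plausible distributions around it, and the inner maximization is realized by a generator producing worst-case samples.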
no code implementations • 30 Jul 2018 • Tianyu Guo, Chang Xu, Shiyi He, Boxin Shi, Chao Xu, DaCheng Tao
In this way, a portable student network with significantly fewer parameters can achieve accuracy comparable to that of the teacher network.