Search Results for author: Tianyu Guo

Found 24 papers, 15 papers with code

Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs

1 code implementation • 17 Oct 2024 • Tianyu Guo, Druv Pai, Yu Bai, Jiantao Jiao, Michael I. Jordan, Song Mei

Next, we extend our analysis to pretrained LLMs, including Llama and OLMo, showing that many attention heads exhibit a similar active-dormant mechanism as in the BB task, and that the mutual reinforcement mechanism also governs the emergence of extreme-token phenomena during LLM pretraining.

Quantization
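The active-dormant behavior described above can be illustrated with a minimal NumPy sketch: a head whose logits carry a large additive bias toward one key dumps nearly all of its attention mass on that "sink" token, while an unbiased head spreads mass roughly uniformly. The bias magnitude and the 0.5 dormancy threshold here are illustrative assumptions, not values from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sink_mass(attn, sink_pos=0):
    """Per-head fraction of attention mass on the sink token, averaged over queries."""
    return attn[..., sink_pos].mean(axis=-1)

rng = np.random.default_rng(0)
T = 16
logits = rng.normal(size=(2, T, T))   # two heads: T queries x T keys each
logits[1, :, 0] += 8.0                # head 1 biased hard toward token 0
attn = softmax(logits)

mass = sink_mass(attn)                # head 0 ~ 1/T, head 1 ~ 1
dormant = mass > 0.5                  # illustrative dormancy threshold
```

The same measurement applied to a pretrained model's attention maps would flag the heads that concentrate on the initial token.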

Cross-video Identity Correlating for Person Re-identification Pre-training

1 code implementation • 27 Sep 2024 • Jialong Zuo, Ying Nie, Hanyu Zhou, Huaxin Zhang, Haoyu Wang, Tianyu Guo, Nong Sang, Changxin Gao

For example, compared with the previous state-of-the-art (ISR), CION with the same ResNet50-IBN achieves a higher mAP of 93.3% and 74.3% on Market1501 and MSMT17, respectively, while utilizing only 8% of the training samples.

Denoising Person Re-Identification

CFinBench: A Comprehensive Chinese Financial Benchmark for Large Language Models

no code implementations • 2 Jul 2024 • Ying Nie, Binwei Yan, Tianyu Guo, Hao Liu, Haoyu Wang, Wei He, Binfan Zheng, WeiHao Wang, Qiang Li, Weijian Sun, Yunhe Wang, DaCheng Tao

(3) Financial Practice: whether LLMs can fulfill practical financial jobs, such as tax consultant, junior accountant, and securities analyst.

Multiple-choice

SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation

1 code implementation • 3 Jun 2024 • Danni Yang, Jiayi Ji, Yiwei Ma, Tianyu Guo, Haowei Wang, Xiaoshuai Sun, Rongrong Ji

These strategies are designed to extract the most accurate masks from SAM's output, thus guiding the training of the student model with enhanced precision.

Pseudo Label Referring Expression +1

Collaborative Heterogeneous Causal Inference Beyond Meta-analysis

no code implementations • 24 Apr 2024 • Tianyu Guo, Sai Praneeth Karimireddy, Michael I. Jordan

Instead of adjusting the distribution shift separately, we use weighted propensity score models to collaboratively adjust for the distribution shift.

Causal Inference Density Estimation +1
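The collaborative adjustment described above builds on propensity-score weighting. A minimal sketch of plain inverse-propensity weighting (with the true propensities assumed known, which is a simplification — the paper fits weighted propensity score models across sites) shows how reweighting removes the bias of a confounded comparison:

```python
import numpy as np

def ipw_ate(y, t, e):
    """Horvitz-Thompson inverse-propensity-weighted ATE estimate."""
    return np.mean(t * y / e - (1 - t) * y / (1 - e))

rng = np.random.default_rng(1)
n = 50_000
x = rng.uniform(size=n)                          # confounder
e = 0.2 + 0.6 * x                                # propensity P(t=1 | x)
t = (rng.uniform(size=n) < e).astype(float)      # confounded assignment
y = 2.0 * t + x + rng.normal(scale=0.1, size=n)  # true ATE = 2

naive = y[t == 1].mean() - y[t == 0].mean()      # biased upward by the confounder
ate = ipw_ate(y, t, e)                           # close to the true effect of 2
```

The naive difference in means overestimates the effect because treated units have larger x on average; the reweighted estimate recovers it.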

DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models

1 code implementation • 26 Feb 2024 • Wei He, Kai Han, Yehui Tang, Chengcheng Wang, Yujie Yang, Tianyu Guo, Yunhe Wang

Large language models (LLMs) face a daunting challenge due to the excessive computational and memory requirements of the commonly used Transformer architecture.

Mamba State Space Models

PanGu-$\pi$: Enhancing Language Model Architectures via Nonlinearity Compensation

no code implementations • 27 Dec 2023 • Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, DaCheng Tao

We then demonstrate through carefully designed ablations that the proposed approach is significantly effective for enhancing model nonlinearity; on this basis, we present PanGu-$\pi$, a new efficient architecture for modern language models.

Language Modelling

LightCLIP: Learning Multi-Level Interaction for Lightweight Vision-Language Models

no code implementations • 1 Dec 2023 • Ying Nie, Wei He, Kai Han, Yehui Tang, Tianyu Guo, Fanyi Du, Yunhe Wang

Moreover, based on the observation that the accuracy of the CLIP model does not increase correspondingly as the parameters of the text encoder increase, an extra masked language modeling (MLM) objective is leveraged to maximize the potential of the shortened text encoder.

Image Classification Image-text Retrieval +4
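The auxiliary MLM objective mentioned above relies on token masking. A rough sketch of the standard BERT-style recipe (15% of positions selected; of those, 80% replaced by [MASK], 10% by a random token, 10% left unchanged — these ratios are the usual convention and an assumption here, not figures from the paper):

```python
import numpy as np

def mask_tokens(tokens, mask_id, vocab_size, rng, p=0.15):
    """BERT-style MLM masking; returns corrupted tokens and loss labels."""
    tokens = np.array(tokens)
    labels = np.full_like(tokens, -100)     # -100 = position ignored by the loss
    sel = rng.uniform(size=tokens.shape) < p
    labels[sel] = tokens[sel]               # predict the original token here
    roll = rng.uniform(size=tokens.shape)
    tokens[sel & (roll < 0.8)] = mask_id                     # 80% -> [MASK]
    rand = sel & (roll >= 0.8) & (roll < 0.9)                # 10% -> random token
    tokens[rand] = rng.integers(0, vocab_size, size=int(rand.sum()))
    return tokens, labels                   # remaining 10% stay unchanged

rng = np.random.default_rng(0)
ids = np.arange(4, 1004)                    # toy token ids
masked, labels = mask_tokens(ids, mask_id=3, vocab_size=1004, rng=rng)
```

The text encoder is then trained to reconstruct the original tokens at the labeled positions.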

Towards Higher Ranks via Adversarial Weight Pruning

1 code implementation • NeurIPS 2023 • Yuchuan Tian, Hanting Chen, Tianyu Guo, Chao Xu, Yunhe Wang

To this end, we propose a Rank-based PruninG (RPG) method to maintain the ranks of sparse weights in an adversarial manner.

Model Compression Network Pruning
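The rank perspective above can be made concrete by measuring the numerical rank of a weight matrix before and after baseline unstructured magnitude pruning. This sketches the quantity the paper cares about, not the RPG method itself:

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Baseline unstructured magnitude pruning (not the paper's RPG method)."""
    k = int(w.size * sparsity)
    thresh = np.partition(np.abs(w).ravel(), k)[k]
    return np.where(np.abs(w) >= thresh, w, 0.0)

def numerical_rank(w, tol=1e-6):
    """Rank via singular values above a relative tolerance."""
    s = np.linalg.svd(w, compute_uv=False)
    return int((s > tol * s[0]).sum())

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
sparse_w = magnitude_prune(w, 0.9)       # keep only the largest 10% of weights
r_dense = numerical_rank(w)
r_sparse = numerical_rank(sparse_w)
```

RPG's contribution is to choose which weights to remove so that r_sparse stays high under aggressive sparsity.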

Data-Free Distillation of Language Model by Text-to-Text Transfer

no code implementations • 3 Nov 2023 • Zheyuan Bai, Xinduo Liu, Hailin Hu, Tianyu Guo, Qinghua Zhang, Yunhe Wang

Data-Free Knowledge Distillation (DFKD) plays a vital role in compressing the model when original training data is unavailable.

Data-free Knowledge Distillation Diversity +5

NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning

1 code implementation • 17 Oct 2023 • Haowei Wang, Jiayi Ji, Tianyu Guo, Yilong Yang, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji

To address this, we introduce two cascading modules based on the barycenter of the mask, which are Coordinate Guided Aggregation (CGA) and Barycenter Driven Localization (BDL), responsible for segmentation and detection, respectively.

Segmentation Visual Grounding
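Both cascading modules above are anchored at the barycenter of the mask; computing it is straightforward (a minimal sketch of the anchor point only, independent of the CGA/BDL modules themselves):

```python
import numpy as np

def mask_barycenter(mask):
    """Barycenter (center of mass) of a binary segmentation mask, as (row, col)."""
    ys, xs = np.nonzero(mask)
    return ys.mean(), xs.mean()

m = np.zeros((5, 5), dtype=bool)
m[1:4, 2:4] = True                 # a 3x2 blob of foreground pixels
cy, cx = mask_barycenter(m)        # (2.0, 2.5)
```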

How Do Transformers Learn In-Context Beyond Simple Functions? A Case Study on Learning with Representations

no code implementations • 16 Oct 2023 • Tianyu Guo, Wei Hu, Song Mei, Huan Wang, Caiming Xiong, Silvio Savarese, Yu Bai

Through extensive probing and a new pasting experiment, we further reveal several mechanisms within the trained transformers, such as concrete copying behaviors on both the inputs and the representations, linear ICL capability of the upper layers alone, and a post-ICL representation selection mechanism in a harder mixture setting.

In-Context Learning

Cross-Model Cross-Stream Learning for Self-Supervised Human Action Recognition

2 code implementations • 15 Jul 2023 • Mengyuan Liu, Hong Liu, Tianyu Guo

Inspired by SkeletonBYOL, this paper further presents a Cross-Model and Cross-Stream (CMCS) framework.

Contrastive Learning Ensemble Learning +7

FSAR: Federated Skeleton-based Action Recognition with Adaptive Topology Structure and Knowledge Distillation

no code implementations • ICCV 2023 • Jingwen Guo, Hong Liu, Shitong Sun, Tianyu Guo, Min Zhang, Chenyang Si

Existing skeleton-based action recognition methods typically follow a centralized learning paradigm, which can pose privacy concerns when exposing human-related videos.

Action Recognition Federated Learning +3

Contrastive Learning from Spatio-Temporal Mixed Skeleton Sequences for Self-Supervised Skeleton-Based Action Recognition

1 code implementation • 7 Jul 2022 • Zhan Chen, Hong Liu, Tianyu Guo, Zhengyan Chen, Pinhao Song, Hao Tang

First, SkeleMix utilizes the topological information of skeleton data to mix two skeleton sequences by randomly combining the cropped skeleton fragments (the trimmed view) with the remaining skeleton sequences (the truncated view).

Action Recognition Contrastive Learning +3
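The trimmed/truncated mixing described above can be sketched by pasting a random joint-and-frame crop of one sequence into another. This simplification crops a random joint subset rather than following the skeleton topology as the paper does:

```python
import numpy as np

def skelemix(seq_a, seq_b, rng):
    """Paste a random joint/frame crop of seq_b into seq_a.
    Sequences have shape (T frames, J joints, C coords); returns the mixed
    sequence and a mask marking which (frame, joint) entries came from seq_b."""
    T, J, _ = seq_a.shape
    t0, t1 = sorted(rng.integers(0, T, size=2))          # random frame window
    joints = rng.choice(J, size=J // 2, replace=False)   # random joint subset
    mixed = seq_a.copy()
    mixed[t0:t1, joints] = seq_b[t0:t1, joints]          # trimmed view pasted in
    mask = np.zeros((T, J), dtype=bool)
    mask[t0:t1, joints] = True
    return mixed, mask

rng = np.random.default_rng(0)
a = np.zeros((32, 25, 3))          # e.g. 32 frames, 25 joints, xyz coordinates
b = np.ones((32, 25, 3))
mixed, mask = skelemix(a, b, rng)
```

The returned mask is what lets a contrastive objective treat the two regions differently.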

GraphMLP: A Graph MLP-Like Architecture for 3D Human Pose Estimation

1 code implementation • 13 Jun 2022 • Wenhao Li, Mengyuan Liu, Hong Liu, Tianyu Guo, Ti Wang, Hao Tang, Nicu Sebe

To the best of our knowledge, this is the first MLP-Like architecture for 3D human pose estimation in a single frame and a video sequence.

3D Human Pose Estimation Representation Learning

Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition

1 code implementation • 7 Dec 2021 • Tianyu Guo, Hong Liu, Zhan Chen, Mengyuan Liu, Tao Wang, Runwei Ding

In this paper, to make better use of the movement patterns introduced by extreme augmentations, a Contrastive Learning framework utilizing Abundant Information Mining for self-supervised action Representation (AimCLR) is proposed.

Contrastive Learning Few-Shot Skeleton-Based Action Recognition +5

Pose-guided Feature Disentangling for Occluded Person Re-identification Based on Transformer

1 code implementation • 5 Dec 2021 • Tao Wang, Hong Liu, Pinhao Song, Tianyu Guo, Wei Shi

Therefore, we propose a transformer-based Pose-guided Feature Disentangling (PFD) method that utilizes pose information to clearly disentangle semantic components (e.g., human body or joint parts) and selectively match non-occluded parts correspondingly.

Decoder Occluded Person Re-Identification

Learning Student Networks in the Wild

1 code implementation • CVPR 2021 • Hanting Chen, Tianyu Guo, Chang Xu, Wenshuo Li, Chunjing Xu, Chao Xu, Yunhe Wang

Experiments on various datasets demonstrate that the student networks learned by the proposed method can achieve comparable performance with those using the original dataset.

Knowledge Distillation Model Compression

Pre-Trained Image Processing Transformer

6 code implementations • CVPR 2021 • Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao

To maximally exploit the capability of the transformer, we utilize the well-known ImageNet benchmark to generate a large number of corrupted image pairs.

 Ranked #1 on Single Image Deraining on Rain100L (using extra training data)

Color Image Denoising Contrastive Learning +2
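Generating corrupted image pairs from clean images, as described above, amounts to applying a synthetic degradation and keeping the original as the supervision target. A minimal sketch with additive Gaussian noise as the (assumed) corruption — denoising, deraining, and super-resolution pairs would each use their own degradation:

```python
import numpy as np

def corrupt_pair(img, rng, sigma=25.0):
    """Build a (corrupted, clean) training pair by adding Gaussian noise
    to a uint8 image and clipping back to the valid range."""
    noisy = img.astype(np.float64) + rng.normal(scale=sigma, size=img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8), img

rng = np.random.default_rng(0)
clean = np.full((8, 8), 128, dtype=np.uint8)   # stand-in for an ImageNet crop
noisy, target = corrupt_pair(clean, rng)
```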

On Positive-Unlabeled Classification in GAN

1 code implementation • CVPR 2020 • Tianyu Guo, Chang Xu, Jiajun Huang, Yunhe Wang, Boxin Shi, Chao Xu, DaCheng Tao

In contrast, it is more reasonable to treat the generated data as unlabeled, which could be positive or negative according to their quality.

Classification General Classification
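Treating generated data as unlabeled, as argued above, leads naturally to a positive-unlabeled risk for the discriminator. A sketch of the standard non-negative PU estimator (Kiryo et al., 2017), which may differ from the paper's exact discriminator loss:

```python
import numpy as np

def logistic_loss(s, y):
    """Logistic loss for scores s and labels y in {+1, -1}."""
    return np.log1p(np.exp(-y * s))

def nn_pu_risk(scores_p, scores_u, pi):
    """Non-negative PU risk: scores_p on labeled positives, scores_u on
    unlabeled data, pi = assumed class prior P(y = +1) among unlabeled."""
    r_p_pos = logistic_loss(scores_p, +1).mean()
    r_p_neg = logistic_loss(scores_p, -1).mean()
    r_u_neg = logistic_loss(scores_u, -1).mean()
    neg_part = r_u_neg - pi * r_p_neg     # unlabeled risk minus positive leakage
    return pi * r_p_pos + max(neg_part, 0.0)  # clamp keeps the risk non-negative

# a discriminator that scores real (positive) data high and unlabeled data low
risk = nn_pu_risk(np.full(10, 5.0), np.full(10, -5.0), pi=0.5)
```

The clamp is what prevents the negative-risk term from going below zero when the unlabeled set contains many high-quality (effectively positive) samples.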

Learning from Bad Data via Generation

no code implementations • NeurIPS 2019 • Tianyu Guo, Chang Xu, Boxin Shi, Chao Xu, DaCheng Tao

A worst-case formulation can be developed over this distribution set, and then be interpreted as a generation task in an adversarial manner.

Robust Student Network Learning

no code implementations • 30 Jul 2018 • Tianyu Guo, Chang Xu, Shiyi He, Boxin Shi, Chao Xu, DaCheng Tao

In this way, a portable student network with significantly fewer parameters can achieve accuracy comparable to that of the teacher network.
