Search Results for author: Yunhai Tong

Found 52 papers, 37 papers with code

Enhancing Self-Attention with Knowledge-Assisted Attention Maps

no code implementations • NAACL 2022 • Jiangang Bai, Yujing Wang, Hong Sun, Ruonan Wu, Tianmeng Yang, Pengfei Tang, Defu Cao, Mingliang Zhang, Yunhai Tong, Yaming Yang, Jing Bai, Ruofei Zhang, Hao Sun, Wei Shen

Large-scale pre-trained language models have attracted extensive attention in the research community and shown promising results on various natural language processing tasks.

Multi-Task Learning Natural Language Understanding

SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow

1 code implementation • 30 May 2024 • Chaoyang Wang, Xiangtai Li, Lu Qi, Henghui Ding, Yunhai Tong, Ming-Hsuan Yang

For image synthesis, we propose a finite perturbation approach to enhance the diversity of generated results without changing the semantic categories.

VG4D: Vision-Language Model Goes 4D Video Recognition

1 code implementation • 17 Apr 2024 • Zhichao Deng, Xiangtai Li, Xia Li, Yunhai Tong, Shen Zhao, Mengyuan Liu

By transferring the knowledge of the VLM to the 4D encoder and combining the VLM, our VG4D achieves improved recognition performance.

Action Recognition Autonomous Driving +2

Explore In-Context Segmentation via Latent Diffusion Models

no code implementations • 14 Mar 2024 • Chaoyang Wang, Xiangtai Li, Henghui Ding, Lu Qi, Jiangning Zhang, Yunhai Tong, Chen Change Loy, Shuicheng Yan

In-context segmentation has drawn more attention with the introduction of vision foundation models.

Metric Learning Segmentation

Towards Language-Driven Video Inpainting via Multimodal Large Language Models

no code implementations • 18 Jan 2024 • Jianzong Wu, Xiangtai Li, Chenyang Si, Shangchen Zhou, Jingkang Yang, Jiangning Zhang, Yining Li, Kai Chen, Yunhai Tong, Ziwei Liu, Chen Change Loy

We introduce a new task -- language-driven video inpainting, which uses natural language instructions to guide the inpainting process.

Video Inpainting

RAP-SAM: Towards Real-Time All-Purpose Segment Anything

1 code implementation • 18 Jan 2024 • Shilin Xu, Haobo Yuan, Qingyu Shi, Lu Qi, Jingbo Wang, Yibo Yang, Yining Li, Kai Chen, Yunhai Tong, Bernard Ghanem, Xiangtai Li, Ming-Hsuan Yang

Segment Anything Model (SAM) is one remarkable model that can achieve generalized segmentation.

Decoder Interactive Segmentation +4

196

DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection

1 code implementation • 2 Oct 2023 • Shilin Xu, Xiangtai Li, Size Wu, Wenwei Zhang, Yunhai Tong, Chen Change Loy

We refer to this approach as the self-training strategy, which enhances recall and accuracy for novel classes without requiring extra annotations, datasets, and re-training.

Novel Object Detection Object +5

Mitigating Semantic Confusion from Hostile Neighborhood for Graph Active Learning

1 code implementation • 17 Aug 2023 • Tianmeng Yang, Min Zhou, Yujing Wang, Zhengjie Lin, Lujia Pan, Bin Cui, Yunhai Tong

Graph Active Learning (GAL), which aims to find the most informative nodes in graphs for annotation to maximize the performance of Graph Neural Networks (GNNs), has attracted many research efforts but remains a non-trivial challenge.

Active Learning Node Classification

Towards Open Vocabulary Learning: A Survey

1 code implementation • 28 Jun 2023 • Jianzong Wu, Xiangtai Li, Shilin Xu, Haobo Yuan, Henghui Ding, Yibo Yang, Xia Li, Jiangning Zhang, Yunhai Tong, Xudong Jiang, Bernard Ghanem, DaCheng Tao

To our knowledge, this is the first comprehensive literature review of open vocabulary learning.

Open Set Learning Out-of-Distribution Detection +3

702

PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation

1 code implementation • 3 Jan 2023 • Xiangtai Li, Shilin Xu, Yibo Yang, Haobo Yuan, Guangliang Cheng, Yunhai Tong, Zhouchen Lin, Ming-Hsuan Yang, DaCheng Tao

Third, inspired by Mask2Former, based on our meta-architecture, we propose Panoptic-PartFormer++ and design a new part-whole cross-attention scheme to boost part segmentation qualities further.

Panoptic Segmentation Segmentation

Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation

2 code implementations • ICCV 2023 • Jianzong Wu, Xiangtai Li, Henghui Ding, Xia Li, Guangliang Cheng, Yunhai Tong, Chen Change Loy

Experiments on the COCO dataset with two settings: Open Vocabulary Instance Segmentation (OVIS) and Open Set Panoptic Segmentation (OSPS) demonstrate the superiority of the CGG.

Caption Generation Instance Segmentation +2

Label-Efficient Interactive Time-Series Anomaly Detection

no code implementations • 30 Dec 2022 • Hong Guo, Yujing Wang, Jieyu Zhang, Zhengjie Lin, Yunhai Tong, Lei Yang, Luoxing Xiong, Congrui Huang

Time-series anomaly detection is an important task and has been widely applied in the industry.

Active Learning Time Series +2

Convolution-enhanced Evolving Attention Networks

1 code implementation • 16 Dec 2022 • Yujing Wang, Yaming Yang, Zhuo Li, Jiangang Bai, Mingliang Zhang, Xiangtai Li, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong

To the best of our knowledge, this is the first work that explicitly models the layer-wise evolution of attention maps.

Image Classification Machine Translation +3

Towards Robust Referring Image Segmentation

1 code implementation • 20 Sep 2022 • Jianzong Wu, Xiangtai Li, Xia Li, Henghui Ding, Yunhai Tong, DaCheng Tao

It considers the negative sentence inputs besides the regular positive text inputs.

Image Segmentation Segmentation +2

SFNet: Faster, Accurate, and Domain Agnostic Semantic Segmentation via Semantic Flow

1 code implementation • 10 Jul 2022 • Xiangtai Li, Jiangning Zhang, Yibo Yang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, DaCheng Tao

In this paper, we focus on exploring effective methods for faster, accurate, and domain agnostic semantic segmentation.

Real-Time Semantic Segmentation

356

Panoptic-PartFormer: Learning a Unified Model for Panoptic Part Segmentation

1 code implementation • 10 Apr 2022 • Xiangtai Li, Shilin Xu, Yibo Yang, Guangliang Cheng, Yunhai Tong, DaCheng Tao

To the best of our knowledge, we are the first to solve the PPS problem via a unified and end-to-end transformer model.

Ranked #2 on Part-aware Panoptic Segmentation on Pascal Panoptic Parts

Panoptic Segmentation Part-aware Panoptic Segmentation +1

Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation

1 code implementation • CVPR 2022 • Xiangtai Li, Wenwei Zhang, Jiangmiao Pang, Kai Chen, Guangliang Cheng, Yunhai Tong, Chen Change Loy

We hope this simple, yet effective method can serve as a new, flexible baseline in unified video segmentation design.

Ranked #1 on Video Panoptic Segmentation on KITTI-STEP (using extra training data)

Image Segmentation Instance Segmentation +5

150

Fashionformer: A simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition

1 code implementation • 10 Apr 2022 • Shilin Xu, Xiangtai Li, Jingbo Wang, Guangliang Cheng, Yunhai Tong, DaCheng Tao

This work focuses on joint human fashion segmentation and attribute recognition.

Attribute Fashion Understanding +1

TransVOD: End-to-End Video Object Detection with Spatial-Temporal Transformers

3 code implementations • 13 Jan 2022 • Qianyu Zhou, Xiangtai Li, Lu He, Yibo Yang, Guangliang Cheng, Yunhai Tong, Lizhuang Ma, DaCheng Tao

Detection Transformer (DETR) and Deformable DETR have been proposed to eliminate the need for many hand-designed components in object detection while performing as well as previous complex hand-crafted detectors.

Ranked #4 on Video Object Detection on ImageNet VID (using extra training data)

Object object-detection +2

199

PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation

1 code implementation • 5 Dec 2021 • Haobo Yuan, Xiangtai Li, Yibo Yang, Guangliang Cheng, Jing Zhang, Yunhai Tong, Lefei Zhang, DaCheng Tao

The Depth-aware Video Panoptic Segmentation (DVPS) is a new challenging vision problem that aims to predict panoptic segmentation and depth in a video simultaneously.

Ranked #1 on Depth-aware Video Panoptic Segmentation on SemKITTI-DVPS

Depth-aware Video Panoptic Segmentation Depth Estimation +4

Graph Pointer Neural Networks

no code implementations • 3 Oct 2021 • Tianmeng Yang, Yujing Wang, Zhihan Yue, Yaming Yang, Yunhai Tong, Jing Bai

On the one hand, multi-hop-based approaches do not explicitly distinguish relevant nodes from a large number of multi-hop neighborhoods, leading to a severe over-smoothing problem.

Node Classification

Competence-based Curriculum Learning for Multilingual Machine Translation

no code implementations • Findings (EMNLP) 2021 • Mingliang Zhang, Fandong Meng, Yunhai Tong, Jie Zhou

Therefore, we focus on balancing the learning competencies of different languages and propose Competence-based Curriculum Learning for Multilingual Machine Translation, named CCL-M.

Machine Translation Translation

Improving Video Instance Segmentation via Temporal Pyramid Routing

1 code implementation • 28 Jul 2021 • Xiangtai Li, Hao He, Yibo Yang, Henghui Ding, Kuiyuan Yang, Guangliang Cheng, Yunhai Tong, DaCheng Tao

To incorporate both temporal and scale information, we propose a Temporal Pyramid Routing (TPR) strategy to conditionally align and conduct pixel-level aggregation from a feature pyramid pair of two adjacent frames.

Instance Segmentation Panoptic Segmentation +2

Global Aggregation then Local Distribution for Scene Parsing

1 code implementation • 28 Jul 2021 • Xiangtai Li, Li Zhang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Xiatian Zhu, Tao Xiang

Modelling long-range contextual relationships is critical for pixel-wise prediction tasks such as semantic segmentation.

Scene Parsing Segmentation +1

344

Customizing Graph Neural Networks using Path Reweighting

2 code implementations • 21 Jun 2021 • Jianpeng Chen, Yujing Wang, Ming Zeng, Zongyi Xiang, Bitan Hou, Yunhai Tong, Ole J. Mengshoel, Yazhou Ren

Specifically, the proposed CustomGNN can automatically learn the high-level semantics for specific downstream tasks to highlight semantically relevant paths as well as to filter out task-irrelevant noise in a graph.

Data Augmentation Graph Attention +1

TS2Vec: Towards Universal Representation of Time Series

2 code implementations • 19 Jun 2021 • Zhihan Yue, Yujing Wang, Juanyong Duan, Tianmeng Yang, Congrui Huang, Yunhai Tong, Bixiong Xu

Furthermore, to obtain the representation of an arbitrary sub-sequence in the time series, we can apply a simple aggregation over the representations of corresponding timestamps.

Contrastive Learning Time Series +3

552

BoundarySqueeze: Image Segmentation as Boundary Squeezing

1 code implementation • 25 May 2021 • Hao He, Xiangtai Li, Yibo Yang, Guangliang Cheng, Yunhai Tong, Lubin Weng, Zhouchen Lin, Shiming Xiang

This module is used to squeeze the object boundary from both inner and outer directions, which contributes to precise mask representation.

Image Segmentation Instance Segmentation +2

Fast and Accurate Scene Parsing via Bi-direction Alignment Networks

1 code implementation • 25 May 2021 • Yanran Wu, Xiangtai Li, Chen Shi, Yunhai Tong, Yang Hua, Tao Song, Ruhui Ma, Haibing Guan

Motivated by this, we propose a novel network by aligning two-path information into each other through a learned flow field.

Scene Parsing

Dynamic Dual Sampling Module for Fine-Grained Semantic Segmentation

no code implementations • 25 May 2021 • Chen Shi, Xiangtai Li, Yanran Wu, Yunhai Tong, Yi Xu

Representation of semantic context and local details is the essential issue for building modern semantic segmentation models.

Segmentation Semantic Segmentation

End-to-End Video Object Detection with Spatial-Temporal Transformers

1 code implementation • 23 May 2021 • Lu He, Qianyu Zhou, Xiangtai Li, Li Niu, Guangliang Cheng, Xiao Li, Wenxuan Liu, Yunhai Tong, Lizhuang Ma, Liqing Zhang

Recently, DETR and Deformable DETR have been proposed to eliminate the need for many hand-designed components in object detection while performing as well as previous complex hand-crafted detectors.

Object object-detection +2

199

Enhanced Boundary Learning for Glass-like Object Segmentation

1 code implementation • ICCV 2021 • Hao He, Xiangtai Li, Guangliang Cheng, Jianping Shi, Yunhai Tong, Gaofeng Meng, Véronique Prinet, Lubin Weng

We use these two modules to design a decoder that generates accurate and clean segmentation results, especially on the object contours.

Ranked #20 on Thermal Image Segmentation on RGB-T-Glass-Segmentation

Decoder Object +4

Spectral Temporal Graph Neural Network for Multivariate Time-series Forecasting

3 code implementations • NeurIPS 2020 • Defu Cao, Yujing Wang, Juanyong Duan, Ce Zhang, Xia Zhu, Congrui Huang, Yunhai Tong, Bixiong Xu, Jing Bai, Jie Tong, Qi Zhang

In this paper, we propose Spectral Temporal Graph Neural Network (StemGNN) to further improve the accuracy of multivariate time-series forecasting.

Multivariate Time Series Forecasting Time Series

749

PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation

1 code implementation • CVPR 2021 • Xiangtai Li, Hao He, Xia Li, Duo Li, Guangliang Cheng, Jianping Shi, Lubin Weng, Yunhai Tong, Zhouchen Lin

Experimental results on three different aerial segmentation datasets suggest that the proposed method is more effective and efficient than state-of-the-art general semantic segmentation methods.

Image Segmentation Segmentation +1

121

Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees

1 code implementation • EACL 2021 • Jiangang Bai, Yujing Wang, Yiren Chen, Yaming Yang, Jing Bai, Jing Yu, Yunhai Tong

Pre-trained language models like BERT achieve superior performances in various NLP tasks without explicit consideration of syntactic information.

Natural Language Understanding

Evolving Attention with Residual Convolutions

2 code implementations • 20 Feb 2021 • Yujing Wang, Yaming Yang, Jiangang Bai, Mingliang Zhang, Jing Bai, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong

In this paper, we propose a novel and generic mechanism based on evolving attention to improve the performance of transformers.

Image Classification Machine Translation +2

Predictive Attention Transformer: Improving Transformer with Attention Map Prediction

no code implementations • 1 Jan 2021 • Yujing Wang, Yaming Yang, Jiangang Bai, Mingliang Zhang, Jing Bai, Jing Yu, Ce Zhang, Yunhai Tong

Instead, we model their dependencies via a chain of prediction models that take previous attention maps as input to predict the attention maps of a new layer through convolutional neural networks.

Machine Translation

Towards Efficient Scene Understanding via Squeeze Reasoning

1 code implementation • 6 Nov 2020 • Xiangtai Li, Xia Li, Ansheng You, Li Zhang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Zhouchen Lin

Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector and perform reasoning within the single vector where the computation cost can be significantly reduced.

Instance Segmentation object-detection +4

356

AutoADR: Automatic Model Design for Ad Relevance

no code implementations • 14 Oct 2020 • Yiren Chen, Yaming Yang, Hong Sun, Yujing Wang, Yu Xu, Wei Shen, Rong Zhou, Yunhai Tong, Jing Bai, Ruofei Zhang

We add the model designed by AutoADR as a sub-model into the production Ad Relevance model.

Knowledge Distillation Neural Architecture Search

Multivariate Time-series Anomaly Detection via Graph Attention Network

2 code implementations • 4 Sep 2020 • Hang Zhao, Yujing Wang, Juanyong Duan, Congrui Huang, Defu Cao, Yunhai Tong, Bixiong Xu, Jing Bai, Jie Tong, Qi Zhang

Anomaly detection on multivariate time-series is of great importance in both data mining research and industrial applications.

Anomaly Detection Graph Attention +3

308

Boundary Content Graph Neural Network for Temporal Action Proposal Generation

no code implementations • ECCV 2020 • Yueran Bai, Yingying Wang, Yunhai Tong, Yang Yang, Qiyue Liu, Junhui Liu

To address this issue, we propose a novel Boundary Content Graph Neural Network (BC-GNN) to model the insightful relations between the boundary and action content of temporal proposals by the graph neural networks.

Ranked #25 on Temporal Action Localization on ActivityNet-1.3

Action Detection Action Understanding +1

Improving Semantic Segmentation via Decoupled Body and Edge Supervision

2 code implementations • ECCV 2020 • Xiangtai Li, Xia Li, Li Zhang, Guangliang Cheng, Jianping Shi, Zhouchen Lin, Shaohua Tan, Yunhai Tong

Our insight is that appealing performance of semantic segmentation requires explicitly modeling the object body and edge, which correspond to the high and low frequency of the image.

Object Segmentation +1

8,335

Improving BERT with Self-Supervised Attention

1 code implementation • 8 Apr 2020 • Yiren Chen, Xiaoyu Kou, Jiangang Bai, Yunhai Tong

One of the most popular paradigms of applying large pre-trained NLP models such as BERT is to fine-tune it on a smaller dataset.

Sentence

LadaBERT: Lightweight Adaptation of BERT through Hybrid Model Compression

no code implementations • COLING 2020 • Yihuan Mao, Yujing Wang, Chufan Wu, Chen Zhang, Yang Wang, Yaming Yang, Quanlu Zhang, Yunhai Tong, Jing Bai

BERT is a cutting-edge language representation model pre-trained on a large corpus, which achieves superior performance on various natural language understanding tasks.

Blocking Knowledge Distillation +2

Semantic Flow for Fast and Accurate Scene Parsing

6 code implementations • ECCV 2020 • Xiangtai Li, Ansheng You, Zhen Zhu, Houlong Zhao, Maoke Yang, Kuiyuan Yang, Yunhai Tong

A common practice to improve the performance is to attain high resolution feature maps with strong semantic representation.

Ranked #2 on Real-Time Semantic Segmentation on Cityscapes test

Optical Flow Estimation Real-Time Semantic Segmentation +1

8,335

TextNAS: A Neural Architecture Search Space tailored for Text Representation

no code implementations • 23 Dec 2019 • Yujing Wang, Yaming Yang, Yiren Chen, Jing Bai, Ce Zhang, Guinan Su, Xiaoyu Kou, Yunhai Tong, Mao Yang, Lidong Zhou

Learning text representation is crucial for text classification and other language related tasks.

General Classification Natural Language Inference +3

Customized Graph Embedding: Tailoring Embedding Vectors to different Applications

no code implementations • 21 Nov 2019 • Bitan Hou, Yujing Wang, Ming Zeng, Shan Jiang, Ole J. Mengshoel, Yunhai Tong, Jing Bai

For these applications, graph embedding is crucial as it provides vector representations of the graph.

Graph Embedding Graph Mining +1

Global Aggregation then Local Distribution in Fully Convolutional Networks

2 code implementations • 16 Sep 2019 • Xiangtai Li, Li Zhang, Ansheng You, Maoke Yang, Kuiyuan Yang, Yunhai Tong

GALD is end-to-end trainable and can be easily plugged into existing FCNs with various global aggregation modules for a wide range of vision tasks, and consistently improves the performance of state-of-the-art object detection and instance segmentation approaches.

Ranked #1 on Semantic Segmentation on PASCAL VOC 2007

Instance Segmentation object-detection +4

344

Dual Graph Convolutional Network for Semantic Segmentation

6 code implementations • 13 Sep 2019 • Li Zhang, Xiangtai Li, Anurag Arnab, Kuiyuan Yang, Yunhai Tong, Philip H. S. Torr

Exploiting long-range contextual information is key for pixel-wise prediction tasks such as semantic segmentation.

Ranked #32 on Semantic Segmentation on Cityscapes test

Semantic Segmentation

344

Reprojection R-CNN: A Fast and Accurate Object Detector for 360° Images

no code implementations • 27 Jul 2019 • Pengyu Zhao, Ansheng You, Yuanxing Zhang, Jiaying Liu, Kaigui Bian, Yunhai Tong

Specifically, we adapt the terminologies of the traditional object detection task to the omnidirectional scenarios, and propose a novel two-stage object detector, i.e., Reprojection R-CNN, by combining both ERP and perspective projection.

ERP Object +3