Search Results for author: Ming Tang

Found 92 papers, 27 papers with code

Large Batch Optimization for Object Detection: Training COCO in 12 Minutes

no code implementations ECCV 2020 Tong Wang, Yousong Zhu, Chaoyang Zhao, Wei Zeng, Yao-Wei Wang, Jinqiao Wang, Ming Tang

Most existing object detectors adopt a small training batch size (~16), which severely hinders the whole community from exploring large-scale datasets due to the extremely long training procedure.

object-detection Object Detection

Blended Grammar Network for Human Parsing

no code implementations ECCV 2020 Xiaomei Zhang, Yingying Chen, Bingke Zhu, Jinqiao Wang, Ming Tang

Although human parsing has made great progress, it still faces a challenge, i.e., how to extract the whole foreground from similar or cluttered scenes effectively.

Human Parsing

FedDAA: Dynamic Client Clustering for Concept Drift Adaptation in Federated Learning

no code implementations 26 Jun 2025 Fu Peng, Ming Tang

A key challenge lies in distinguishing different sources of drift, as they require distinct adaptation strategies: real drift calls for discarding outdated data, while virtual or label drift benefits from retaining historical data.

Drift Detection Federated Learning

An Information-Theoretic Analysis for Federated Learning under Concept Drift

no code implementations 26 Jun 2025 Fu Peng, Meng Zhang, Ming Tang

Inspired by this, we propose an algorithm that regularizes the empirical risk minimization approach with KL divergence and mutual information, thereby enhancing long-term performance.

Federated Learning

Referring Expression Instance Retrieval and A Strong End-to-End Baseline

no code implementations 23 Jun 2025 Xiangzhao Hao, Kuan Zhu, Hongyu Guo, Haiyun Guo, Ning Jiang, Quan Lu, Ming Tang, Jinqiao Wang

Text-Image Retrieval (TIR) retrieves a target image from a gallery based on an image-level description, while Referring Expression Comprehension (REC) localizes a target object within a given image using an instance-level description.

Image Retrieval Referring Expression +2

SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought

no code implementations 30 May 2025 Guanghao Li, Wenhao Jiang, Mingfeng Chen, Yan Li, Hao Yu, Shuting Dong, Tao Ren, Ming Tang, Chun Yuan

We address this gap by introducing Flow Chain of Thought (Flow CoT), a reasoning paradigm that models recursive inference as a progressive trajectory of latent cognitive states.

Understand, Think, and Answer: Advancing Visual Reasoning with Large Multimodal Models

1 code implementation 27 May 2025 Yufei Zhan, Hongyin Zhao, Yousong Zhu, Shurong Zheng, Fan Yang, Ming Tang, Jinqiao Wang

Large Multimodal Models (LMMs) have recently demonstrated remarkable visual understanding performance on both vision-language and vision-centric tasks.

Question Answering Visual Reasoning

One-for-All Pruning: A Universal Model for Customized Compression of Large Language Models

no code implementations 18 May 2025 Rongguang Ye, Ming Tang

Since the gradient of the Gaussian process is computable, we can use it to approximate the gradient of the non-differentiable pruning process, thereby enabling StratNet updates.

All

Integrating Single-Cell Foundation Models with Graph Neural Networks for Drug Response Prediction

no code implementations 19 Apr 2025 Till Rossner, Ziteng Li, Jonas Balke, Nikoo Salehfard, Tom Seifert, Ming Tang

Our approach builds on the DeepCDR framework, which encodes drug representations from graph structures and cell representations from multi-omics profiles.

Drug Response Prediction Prediction

MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection

no code implementations 17 Apr 2025 Long Qian, Bingke Zhu, Yingying Chen, Ming Tang, Jinqiao Wang

Anomaly detection is a crucial task in computer vision, yet collecting real-world defect images is inherently difficult due to the rarity and unpredictability of anomalies.

Anomaly Detection Data Augmentation +1

Vision-R1: Evolving Human-Free Alignment in Large Vision-Language Models via Vision-Guided Reinforcement Learning

1 code implementation 23 Mar 2025 Yufei Zhan, Yousong Zhu, Shurong Zheng, Hongyin Zhao, Fan Yang, Ming Tang, Jinqiao Wang

Large Vision-Language Models (LVLMs) typically follow a two-stage training paradigm: pretraining and supervised fine-tuning.

PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability

1 code implementation CVPR 2025 Weijie Zhou, Manli Tao, Chaoyang Zhao, Haiyun Guo, Honghui Dong, Ming Tang, Jinqiao Wang

Specifically, the S-P Map abstracts a robot's physical reachability into a generalized spatial representation, independent of specific robot configurations, allowing the model to focus on reachability features rather than robot-specific parameters.

Visual Reasoning

MIGA: Mutual Information-Guided Attack on Denoising Models for Semantic Manipulation

no code implementations 10 Mar 2025 Guanghao Li, Mingzhi Chen, Hao Yu, Shuting Dong, Wenhao Jiang, Ming Tang, Chun Yuan

In this paper, we propose Mutual Information-Guided Attack (MIGA), the first method designed to directly attack deep denoising models by strategically disrupting their ability to preserve semantic content via adversarial perturbations.

Denoising Semantic Similarity +1

FLARE: A Framework for Stellar Flare Forecasting using Stellar Physical Properties and Historical Records

no code implementations 25 Feb 2025 Bingke Zhu, Xiaoxiao Wang, Minghui Jia, Yihan Tao, Xiao Kong, Ali Luo, Yingying Chen, Ming Tang, Jinqiao Wang

Stellar flare events are critical observational samples for astronomical research; however, recorded flare events remain limited.

Zero Token-Driven Deep Thinking in LLMs: Unlocking the Full Potential of Existing Parameters via Cyclic Refinement

no code implementations 17 Feb 2025 Guanghao Li, Wenhao Jiang, Li Shen, Ming Tang, Chun Yuan

Resource limitations often constrain the parameter counts of Large Language Models (LLMs), hindering their performance.

Systematic Outliers in Large Language Models

1 code implementation 10 Feb 2025 Yongqi An, Xu Zhao, Tao Yu, Ming Tang, Jinqiao Wang

Outliers have been widely observed in Large Language Models (LLMs), significantly impacting model performance and posing challenges for model compression.

Model Compression

Label Anything: An Interpretable, High-Fidelity and Prompt-Free Annotator

no code implementations 5 Feb 2025 Wei-Bin Kou, Guangxu Zhu, Rongguang Ye, Shuai Wang, Ming Tang, Yik-Chung Wu

To mitigate this cost of manual labeling, we propose a Label Anything Model (denoted as LAM), serving as an interpretable, high-fidelity, and prompt-free data annotator.

Autonomous Driving

FiLo++: Zero-/Few-Shot Anomaly Detection by Fused Fine-Grained Descriptions and Deformable Localization

1 code implementation 17 Jan 2025 Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Ming Tang, Jinqiao Wang

However, their handcrafted generic descriptions fail to capture the diverse range of anomalies that may emerge in different objects, and simple patch-level image-text matching often struggles to localize anomalous regions of varying shapes and sizes.

Anomaly Detection Image-text matching +4

Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence

no code implementations 18 Dec 2024 Jinghan He, Kuan Zhu, Haiyun Guo, Junfeng Fang, Zhenglin Hua, Yuheng Jia, Ming Tang, Tat-Seng Chua, Jinqiao Wang

Large vision-language models (LVLMs) have made substantial progress in integrating large language models (LLMs) with visual inputs, enabling advanced multimodal reasoning.

Hallucination Multimodal Reasoning

UniVAD: A Training-free Unified Model for Few-shot Visual Anomaly Detection

no code implementations CVPR 2025 Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Ming Tang, Jinqiao Wang

Visual Anomaly Detection (VAD) aims to identify abnormal samples in images that deviate from normal patterns, covering multiple domains, including industrial, logical, and medical fields.

Anomaly Detection Patch Matching

Friend or Foe? Harnessing Controllable Overfitting for Anomaly Detection

no code implementations 30 Nov 2024 Long Qian, Bingke Zhu, Yingying Chen, Ming Tang, Jinqiao Wang

Overfitting has long been stigmatized as detrimental to model performance, especially in the context of anomaly detection.

Multi-class Anomaly Detection

SEEKR: Selective Attention-Guided Knowledge Retention for Continual Learning of Large Language Models

1 code implementation 9 Nov 2024 Jinghan He, Haiyun Guo, Kuan Zhu, Zihan Zhao, Ming Tang, Jinqiao Wang

In this work, we first explore and emphasize the importance of attention weights in knowledge retention, and then propose a SElective attEntion-guided Knowledge Retention method (SEEKR) for data-efficient replay-based continual learning of large language models (LLMs).

Continual Learning

Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models

1 code implementation 21 Oct 2024 Yufei Zhan, Hongyin Zhao, Yousong Zhu, Fan Yang, Ming Tang, Jinqiao Wang

None of the LMMs have yet comprehensively unified both types of tasks within a single model, as seen in Large Language Models in the natural language processing field.

Instruction Following object-detection +6

Asynchronous Fractional Multi-Agent Deep Reinforcement Learning for Age-Minimal Mobile Edge Computing

no code implementations 25 Sep 2024 Lyudong Jin, Ming Tang, JiaYu Pan, Meng Zhang, Hao Wang

In the realm of emerging real-time networked applications like cyber-physical systems (CPS), the Age of Information (AoI) has emerged as a pivotal metric for evaluating timeliness.

Deep Reinforcement Learning Edge-computing +2

MROVSeg: Breaking the Resolution Curse of Vision-Language Models in Open-Vocabulary Image Segmentation

no code implementations 27 Aug 2024 Yuanbing Zhu, Bingke Zhu, Yingying Chen, Yunfang Niu, Ming Tang, Jinqiao Wang

Pretrained vision-language models (VLMs), e.g., CLIP, are increasingly used to bridge the gap between open- and closed-vocabulary recognition in open-vocabulary image segmentation.

Image Segmentation Open Vocabulary Semantic Segmentation +2

Optical Semantic Communication through Multimode Fiber: From Symbol Transmission to Sentiment Analysis

no code implementations 23 Aug 2024 Zheng Gao, Ting Jiang, Mingming Zhang, Hao Wu, Ming Tang

By encoding semantically similar symbols to adjacent frequencies, the system's noise tolerance is effectively improved, facilitating accurate sentiment analysis.

Semantic Communication Sentiment Analysis

Real-time Event Recognition of Long-distance Distributed Vibration Sensing with Knowledge Distillation and Hardware Acceleration

1 code implementation 7 Aug 2024 Zhongyao Luo, Hao Wu, Zhao Ge, Ming Tang

The proposed method greatly enhances the efficiency of vibration pattern recognition, promoting the use of DVS as a smart IoT system.

GPU Intrusion Detection +1

Fermat Number Transform Based Chromatic Dispersion Compensation and Adaptive Equalization Algorithm

no code implementations 7 May 2024 Siyu Chen, Zheli Liu, Weihao Li, Zihe Hu, Mingming Zhang, Sheng Cui, Ming Tang

By introducing the Fermat number transform into chromatic dispersion compensation and adaptive equalization, the computational complexity has been reduced by 68% compared with the conventional implementation.

FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization

1 code implementation 21 Apr 2024 Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Hao Li, Ming Tang, Jinqiao Wang

Zero-shot anomaly detection (ZSAD) methods entail detecting anomalies directly without access to any known normal or abnormal samples within the target item categories.

Anomaly Detection Position +1

Optimization of Prompt Learning via Multi-Knowledge Representation for Vision-Language Models

no code implementations 16 Apr 2024 Enming Zhang, Bingke Zhu, Yingying Chen, Qinghai Miao, Ming Tang, Jinqiao Wang

This limitation restricts the capabilities of pretrained VLMs and can result in incorrect predictions in downstream tasks.

Diversity Prompt Learning

PraFFL: A Preference-Aware Scheme in Fair Federated Learning

1 code implementation 13 Apr 2024 Rongguang Ye, Wei-Bin Kou, Ming Tang

We theoretically prove that PraFFL can offer the optimal model tailored to an arbitrary preference of each client, and show its linear convergence.

Fairness Federated Learning

Efficient and Generalizable Certified Unlearning: A Hessian-free Recollection Approach

no code implementations 2 Apr 2024 Xinbao Qiao, Meng Zhang, Ming Tang, Ermin Wei

Machine unlearning strives to uphold the data owners' right to be forgotten by enabling models to selectively forget specific data.

Machine Unlearning

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring

1 code implementation 14 Mar 2024 Yufei Zhan, Yousong Zhu, Hongyin Zhao, Fan Yang, Ming Tang, Jinqiao Wang

Large Vision Language Models have achieved fine-grained object perception, but the limitation of image resolution remains a significant obstacle to surpassing the performance of task-specific experts in complex and dense scenarios.

Object Object Counting +3

Convergence Analysis of Split Federated Learning on Heterogeneous Data

no code implementations 23 Feb 2024 Pengchao Han, Chao Huang, Geng Tian, Ming Tang, Xin Liu

We further extend the analysis to non-convex objectives and the scenario where some clients may be unavailable during training.

Federated Learning

Self-Supervised Representation Learning from Arbitrary Scenarios

no code implementations CVPR 2024 Zhaowen Li, Yousong Zhu, Zhiyang Chen, Zongxin Gao, Rui Zhao, Chaoyang Zhao, Ming Tang, Jinqiao Wang

To address this conflict this work abandons the non-generalizable global-level constraints and proposes explicit patch-level contrastive learning as a solution.

Contrastive Learning Data Augmentation +2

Fractional Deep Reinforcement Learning for Age-Minimal Mobile Edge Computing

no code implementations 16 Dec 2023 Lyudong Jin, Ming Tang, Meng Zhang, Hao Wang

The uncertain edge load dynamics, the nature of the fractional objective, and hybrid continuous-discrete action space (due to the joint optimization) make this problem challenging and existing approaches not directly applicable.

Autonomous Driving Deep Reinforcement Learning +4

Mitigating Hallucination in Visual Language Models with Visual Supervision

no code implementations 27 Nov 2023 Zhiyang Chen, Yousong Zhu, Yufei Zhan, Zhaowen Li, Chaoyang Zhao, Jinqiao Wang, Ming Tang

Large vision-language models (LVLMs) suffer heavily from hallucination, occasionally generating responses that clearly contradict the image content.

Hallucination

Continual Instruction Tuning for Large Multimodal Models

no code implementations 27 Nov 2023 Jinghan He, Haiyun Guo, Ming Tang, Jinqiao Wang

2) Are the existing three classes of continual learning methods still applicable to the continual instruction tuning of LMMs?

Continual Learning

Price of Stability in Quality-Aware Federated Learning

no code implementations 13 Oct 2023 Yizhou Yan, Xinyu Tang, Chao Huang, Ming Tang

The presence of label noise can severely degrade the FL performance, and some existing studies have focused on algorithm design for label denoising.

Denoising Federated Learning

AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models

1 code implementation 29 Aug 2023 Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Ming Tang, Jinqiao Wang

Large Vision-Language Models (LVLMs) such as MiniGPT-4 and LLaVA have demonstrated the capability of understanding images and achieved remarkable performance in various visual tasks.

Anomaly Detection In-Context Learning

FrFT based estimation of linear and nonlinear impairments using Vision Transformer

no code implementations 25 Aug 2023 Ting Jiang, Zheng Gao, Yizhao Chen, Zihe Hu, Ming Tang

To comprehensively assess optical fiber communication system conditions, it is essential to implement joint estimation of the following four critical impairments: nonlinear signal-to-noise ratio (SNRNL), optical signal-to-noise ratio (OSNR), chromatic dispersion (CD) and differential group delay (DGD).

When MiniBatch SGD Meets SplitFed Learning: Convergence Analysis and Performance Evaluation

no code implementations 23 Aug 2023 Chao Huang, Geng Tian, Ming Tang

SplitFed learning (SFL) is a recent distributed approach that alleviates computation workload at the client device by splitting the model at a cut layer into two parts, where clients only need to train part of the model.

Federated Learning

Real-time FPGA Implementation of CNN-based Distributed Fiber Optic Vibration Event Recognition Method

no code implementations 9 Aug 2023 Zhongyao Luo, Zhao Ge, Hao Wu, Ming Tang

Utilizing optical fibers to detect and pinpoint vibrations, Distributed Optical Fiber Vibration Sensing (DVS) technology provides real-time monitoring and surveillance of wide-reaching areas.

CPU Edge-computing +1

Multi-objective Deep Reinforcement Learning for Mobile Edge Computing

1 code implementation 5 Jul 2023 Ning Yang, Junrui Wen, Meng Zhang, Ming Tang

In this study, we address this issue by formulating a multi-objective offloading problem for MEC with multiple edges to minimize expected long-term energy consumption and transmission delay while considering unknown preferences as parameters.

Deep Reinforcement Learning Edge-computing +2

Fast Segment Anything

1 code implementation 21 Jun 2023 Xu Zhao, Wenchao Ding, Yongqi An, Yinglong Du, Tao Yu, Min Li, Ming Tang, Jinqiao Wang

In this paper, we propose a speed-up alternative method for this fundamental task with comparable performance.

Edge Detection Image Segmentation +6

FreConv: Frequency Branch-and-Integration Convolutional Networks

no code implementations 10 Apr 2023 Zhaowen Li, Xu Zhao, Peigeng Ding, Zongxin Gao, Yuting Yang, Ming Tang, Jinqiao Wang

In the high-frequency branch, a derivative-filter-like architecture is designed to extract the high-frequency information while a light extractor is employed in the low-frequency branch because the low-frequency information is usually redundant.

ZBS: Zero-shot Background Subtraction via Instance-level Background Modeling and Foreground Selection

1 code implementation CVPR 2023 Yongqi An, Xu Zhao, Tao Yu, Haiyun Guo, Chaoyang Zhao, Ming Tang, Jinqiao Wang

However, previous unsupervised deep learning BGS algorithms perform poorly in sophisticated scenarios such as shadows or night lights, and they cannot detect objects outside the pre-defined categories.

Foreground Segmentation Object +2

Efficient Masked Autoencoders with Self-Consistency

no code implementations 28 Feb 2023 Zhaowen Li, Yousong Zhu, Zhiyang Chen, Wei Li, Chaoyang Zhao, Rui Zhao, Ming Tang, Jinqiao Wang

Besides, we design the self-consistency learning to further maintain the consistency of predictions of overlapping masked patches among parts.

image-classification Image Classification +6

Deep Learning for Human Parsing: A Survey

no code implementations 29 Jan 2023 Xiaomei Zhang, Xiangyu Zhu, Ming Tang, Zhen Lei

Human parsing is a key topic in image processing with many applications, such as surveillance analysis, human-robot interaction, person search, and clothing category classification, among many others.

Deep Learning Human Parsing +2

Design of the PID temperature controller for an alkaline electrolysis system with time delays

no code implementations 3 Oct 2022 Ruomei Qi, Jiarong Li, Jin Lin, Yonghua Song, Jiepeng Wang, Qiangqiang Cui, Yiwei Qiu, Ming Tang, Jian Wang

This paper focuses on the design of the PID temperature controller for an alkaline electrolysis system to achieve fast and stable temperature control.

Transfering Low-Frequency Features for Domain Adaptation

no code implementations 31 Aug 2022 Zhaowen Li, Xu Zhao, Chaoyang Zhao, Ming Tang, Jinqiao Wang

Previous unsupervised domain adaptation methods did not handle the cross-domain problem from the perspective of frequency for computer vision.

image-classification Image Classification +3

Plug-and-Play Pseudo Label Correction Network for Unsupervised Person Re-identification

no code implementations 14 Jun 2022 Tianyi Yan, Kuan Zhu, Haiyun Guo, Guibo Zhu, Ming Tang, Jinqiao Wang

Clustering-based methods, which alternate between the generation of pseudo labels and the optimization of the feature extraction network, play a dominant role in both unsupervised learning (USL) and unsupervised domain adaptive (UDA) person re-identification (Re-ID).

Clustering Pseudo Label +1

Beyond the Limitation of Pulse Width in Optical Time-domain Reflectometry

no code implementations 14 Mar 2022 Hao Wu, Ming Tang

Here, we propose and experimentally demonstrate an OTDR deconvolution neural network based on deep convolutional neural networks.

UniVIP: A Unified Framework for Self-Supervised Visual Pre-training

no code implementations CVPR 2022 Zhaowen Li, Yousong Zhu, Fan Yang, Wei Li, Chaoyang Zhao, Yingying Chen, Zhiyang Chen, Jiahao Xie, Liwei Wu, Rui Zhao, Ming Tang, Jinqiao Wang

Furthermore, our method can also exploit single-centric-object datasets such as ImageNet: it outperforms BYOL by 2.5% in linear probing with the same number of pre-training epochs and surpasses current self-supervised object detection methods on the COCO dataset, demonstrating its universality and potential.

image-classification Image Classification +5

Thermal Modelling and Controller Design of an Alkaline Electrolysis System under Dynamic Operating Conditions

no code implementations 27 Feb 2022 Ruomei Qi, Jiarong Li, Jin Lin, Yonghua Song, Jiepeng Wang, Qiangqiang Cui, Yiwei Qiu, Ming Tang, Jian Wang

A control-oriented thermal model is established in the form of a third-order time-delay process, which is used for simulation and controller design.

Management

Pruning-aware Sparse Regularization for Network Pruning

1 code implementation 18 Jan 2022 Nanfei Jiang, Xu Zhao, Chaoyang Zhao, Yongqi An, Ming Tang, Jinqiao Wang

MaskSparsity imposes the fine-grained sparse regularization on the specific filters selected by a pruning mask, rather than all the filters of the model.

Network Pruning

Multi-initialization Optimization Network for Accurate 3D Human Pose and Shape Estimation

no code implementations 24 Dec 2021 Zhiwei Liu, Xiangyu Zhu, Lu Yang, Xiang Yan, Ming Tang, Zhen Lei, Guibo Zhu, Xuetao Feng, Yan Wang, Jinqiao Wang

In the second stage, we design a mesh refinement transformer (MRT) to respectively refine each coarse reconstruction result via a self-attention mechanism.

Ranked #75 on 3D Human Pose Estimation on 3DPW (MPJPE metric)

3D human pose and shape estimation 3D Reconstruction

Enabling variable high spatial resolution retrieval from a long pulse BOTDA sensor

no code implementations 9 Sep 2021 Zhao Ge, Li Shen, Can Zhao, Hao Wu, Zhiyong Zhao, Ming Tang

We propose a convolutional neural network (CNN) to process the data of conventional Brillouin optical time domain analysis (BOTDA) sensors. It achieves an unprecedented performance improvement, allowing higher spatial resolution (SR) to be retrieved directly from a sensing system that uses long pump pulses.

Retrieval

DPT: Deformable Patch-based Transformer for Visual Recognition

1 code implementation 30 Jul 2021 Zhiyang Chen, Yousong Zhu, Chaoyang Zhao, Guosheng Hu, Wei Zeng, Jinqiao Wang, Ming Tang

To address this problem, we propose a new Deformable Patch (DePatch) module which learns to adaptively split the images into patches with different positions and scales in a data-driven way rather than using predefined fixed patches.

image-classification Image Classification +3

Improving Multiple Object Tracking With Single Object Tracking

no code implementations CVPR 2021 Linyu Zheng, Ming Tang, Yingying Chen, Guibo Zhu, Jinqiao Wang, Hanqing Lu

Despite considerable similarities between multiple object tracking (MOT) and single object tracking (SOT) tasks, modern MOT methods have not benefited from the development of SOT ones to achieve satisfactory performance.

Multiple Object Tracking Object +2

MST: Masked Self-Supervised Transformer for Visual Representation

no code implementations NeurIPS 2021 Zhaowen Li, Zhiyang Chen, Fan Yang, Wei Li, Yousong Zhu, Chaoyang Zhao, Rui Deng, Liwei Wu, Rui Zhao, Ming Tang, Jinqiao Wang

More importantly, the masked tokens together with the remaining tokens are further recovered by a global image decoder, which preserves the spatial information of the image and is more friendly to the downstream dense prediction tasks.

Language Modeling Language Modelling +5

AAformer: Auto-Aligned Transformer for Person Re-Identification

no code implementations 2 Apr 2021 Kuan Zhu, Haiyun Guo, Shiliang Zhang, YaoWei Wang, Jing Liu, Jinqiao Wang, Ming Tang

In this article, we introduce an alignment scheme in transformer architecture for the first time and propose the auto-aligned transformer (AAformer) to automatically locate both the human parts and nonhuman ones at patch level.

Human Parsing Image Classification +3

Adaptive Class Suppression Loss for Long-Tail Object Detection

1 code implementation CVPR 2021 Tong Wang, Yousong Zhu, Chaoyang Zhao, Wei Zeng, Jinqiao Wang, Ming Tang

To address the problem of long-tail distribution for the large vocabulary object detection task, existing methods usually divide the whole categories into several groups and treat each group with different strategies.

Object object-detection +1

Identify Influential Spreaders in Asymmetrically Interacting Multiplex Networks

no code implementations 5 Jan 2021 Qi Zeng, Ying Liu, Liming Pan, Ming Tang

Our work provides insights on the importance of nodes in the multiplex network and gives a feasible framework to investigate influential spreaders in the asymmetrically coevolving dynamics.

Physics and Society

High-Performance Discriminative Tracking With Transformers

no code implementations ICCV 2021 Bin Yu, Ming Tang, Linyu Zheng, Guibo Zhu, Jinqiao Wang, Hao Feng, Xuetao Feng, Hanqing Lu

End-to-end discriminative trackers improve the state of the art significantly, yet the improvement in robustness and efficiency is restricted by the conventional discriminative model, i.e., least-squares based regression.

Decoder Object +2

Task Decoupled Knowledge Distillation For Lightweight Face Detectors

1 code implementation 14 Oct 2020 Xiaoqing Liang, Xu Zhao, Chaoyang Zhao, Nanfei Jiang, Ming Tang, Jinqiao Wang

This method decouples the distillation task of face detection into two subtasks, i.e., the classification distillation subtask and the regression distillation subtask.

Face Detection Knowledge Distillation +1

Improving the spatial resolution of a BOTDA sensor using deconvolution algorithm

no code implementations 15 Sep 2020 Li Shen, Zhiyong Zhao, Can Zhao, Hao Wu, Chao Lu, Ming Tang

The frequency dependency of Brillouin gain temporal envelope is investigated by simulation, and its impact on the recovered results of deconvolution algorithm is thoroughly analyzed.

Denoising

Identity-Guided Human Semantic Parsing for Person Re-Identification

1 code implementation ECCV 2020 Kuan Zhu, Haiyun Guo, Zhiwei Liu, Ming Tang, Jinqiao Wang

In this paper, we propose the identity-guided human semantic parsing approach (ISP) to locate both the human body parts and personal belongings at pixel-level for aligned person re-ID only with person identity labels.

Clustering Human Parsing +3

A Quantitative Analytical Model for Predicting and Optimizing the Rate Performance of Battery Cells

1 code implementation 20 Apr 2020 Fan Wang, Ming Tang

An important objective of designing lithium-ion rechargeable battery cells is to maximize their rate performance without compromising the energy density, which is mainly achieved through computationally expensive numerical simulations at present.

Materials Science Applied Physics

Repositioning Bikes with Carrier Vehicles and Bike Trailers in Bike Sharing Systems

no code implementations 20 Sep 2019 Xinghua Zheng, Ming Tang, Hankz Hankui Zhuo, Kevin X. Wen

Bike Sharing Systems (BSSs) have been adopted in many major cities of the world due to traffic congestion and carbon emissions.

Learning Feature Embeddings for Discriminant Model based Tracking

no code implementations ECCV 2020 Linyu Zheng, Ming Tang, Yingying Chen, Jinqiao Wang, Hanqing Lu

After observing that the features used in most online discriminatively trained trackers are not optimal, in this paper, we propose a novel and effective architecture to learn optimal feature embeddings for online discriminative tracking.

Visual Tracking

Fast Kernelized Correlation Filters without Boundary Effect

no code implementations 17 Jun 2018 Ming Tang, Linyu Zheng, Bin Yu, Jinqiao Wang

To achieve fast training and detection, a set of cyclic bases is introduced to construct the filter.

Visual Tracking

PCN: Part and Context Information for Pedestrian Detection with CNNs

no code implementations 12 Apr 2018 Shiguang Wang, Jian Cheng, Haijun Liu, Ming Tang

To take advantage of the body parts and context information for pedestrian detection, we propose the part and context network (PCN) in this work.

Occlusion Handling Pedestrian Detection

On the Relations of Correlation Filter Based Trackers and Struck

no code implementations 25 Nov 2017 Jinqiao Wang, Ming Tang, Linyu Zheng, Jiayi Feng

In recent years, two types of trackers, namely the correlation filter based tracker (CF tracker) and the structured output tracker (Struck), have exhibited state-of-the-art performance.

Relation

Fast Deep Matting for Portrait Animation on Mobile Phone

1 code implementation 26 Jul 2017 Bingke Zhu, Yingying Chen, Jinqiao Wang, Si Liu, Bo Zhang, Ming Tang

Finally, an automatic portrait animation system based on fast deep matting is built on mobile devices, which does not need any interaction and can realize real-time matting with 15 fps.

Image Matting Portrait Animation +1

Joint Background Reconstruction and Foreground Segmentation via A Two-stage Convolutional Neural Network

no code implementations 24 Jul 2017 Xu Zhao, Yingying Chen, Ming Tang, Jinqiao Wang

In the first stage, a convolutional encoder-decoder sub-network is employed to reconstruct the background images and encode rich prior knowledge of background scenes.

Decoder Foreground Segmentation +1

Multi-Kernel Correlation Filter for Visual Tracking

no code implementations ICCV 2015 Ming Tang, Jiayi Feng

In this paper, we will derive a multi-kernel correlation filter (MKCF) based tracker which fully takes advantage of the invariance-discriminative power spectrums of various features to further improve the performance.

Visual Tracking
