Search Results for author: Xu Tang

Found 31 papers, 14 papers with code

Unified Video-Language Pre-training with Synchronized Audio

no code implementations 12 May 2024 Shentong Mo, Haofan Wang, Huaxia Li, Xu Tang

Video-language pre-training is a typical and challenging problem that aims at learning visual and textual representations from large-scale data in a self-supervised way.

Multimodal Sense-Informed Prediction of 3D Human Motions

no code implementations 5 May 2024 Zhenyu Lou, Qiongjie Cui, Haofan Wang, Xu Tang, Hong Zhou

Predicting future human pose is a fundamental application for machine intelligence, which drives robots to plan their behavior and paths ahead of time to seamlessly accomplish human-robot collaboration in real-world 3D scenarios.

motion prediction Trajectory Prediction

StableGarment: Garment-Centric Generation via Stable Diffusion

no code implementations 16 Mar 2024 Rui Wang, Hailong Guo, Jiaming Liu, Huaxia Li, Haibo Zhao, Xu Tang, Yao Hu, Hao Tang, Peipei Li

In this paper, we introduce StableGarment, a unified framework to tackle garment-centric (GC) generation tasks, including GC text-to-image, controllable GC text-to-image, stylized GC text-to-image, and robust virtual try-on.

Denoising Image Generation +1

Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model

no code implementations 12 Mar 2024 Yuxuan Zhang, Lifu Wei, Qing Zhang, Yiren Song, Jiaming Liu, Huaxia Li, Xu Tang, Yao Hu, Haibo Zhao

Current makeup transfer methods are limited to simple makeup styles, making them difficult to apply in real-world scenarios.

Text-to-Image Generation

ZONE: Zero-Shot Instruction-Guided Local Editing

1 code implementation 28 Dec 2023 Shanglin Li, Bohan Zeng, Yutang Feng, Sicheng Gao, Xuhui Liu, Jiaming Liu, Li Lin, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang

We then propose a Region-IoU scheme for precise image layer extraction from an off-the-shelf segment model.

Image Generation

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation

1 code implementation 26 Dec 2023 Yuxuan Zhang, Yiren Song, Jiaming Liu, Rui Wang, Jinpeng Yu, Hao Tang, Huaxia Li, Xu Tang, Yao Hu, Han Pan, Zhongliang Jing

Recent advancements in subject-driven image generation have led to zero-shot generation, yet precise selection and focus on crucial subject representations remain challenging.

Image Generation

Remote Sensing Object Detection Meets Deep Learning: A Meta-review of Challenges and Advances

no code implementations 13 Sep 2023 Xiangrong Zhang, Tianyang Zhang, Guanchun Wang, Peng Zhu, Xu Tang, Xiuping Jia, Licheng Jiao

In this era of rapid technical evolution, this article aims to present a comprehensive review of recent achievements in deep learning-based RSOD methods.

Object object-detection +1

Synthesizing Physically Plausible Human Motions in 3D Scenes

1 code implementation 17 Aug 2023 Liang Pan, Jingbo Wang, Buzhen Huang, Junyu Zhang, Haofan Wang, Xu Tang, Yangang Wang

Experimental results demonstrate that our framework can synthesize physically plausible long-term human motions in complex 3D scenes.

Controllable Mind Visual Diffusion Model

1 code implementation 17 May 2023 Bohan Zeng, Shanglin Li, Xuhui Liu, Sicheng Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang

Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.

Attribute Image Generation

PiClick: Picking the desired mask in click-based interactive segmentation

1 code implementation 23 Apr 2023 Cilin Yan, Haochen Wang, Jie Liu, XiaoLong Jiang, Yao Hu, Xu Tang, Guoliang Kang, Efstratios Gavves

Click-based interactive segmentation aims to generate target masks via human clicking, which facilitates efficient pixel-level annotation and image editing.

Interactive Segmentation Segmentation

MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation

no code implementations 14 Apr 2023 Jie Guo, Qimeng Wang, Yan Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Baochang Zhang

CLIP (Contrastive Language-Image Pretraining) is well-developed for open-vocabulary zero-shot image-level recognition, while its applications in pixel-level tasks are less investigated, where most efforts directly adopt CLIP features without deliberative adaptations.

GPR Open Vocabulary Semantic Segmentation +3

Towards Open-Vocabulary Video Instance Segmentation

1 code implementation ICCV 2023 Haochen Wang, Cilin Yan, Shuai Wang, XiaoLong Jiang, Xu Tang, Yao Hu, Weidi Xie, Efstratios Gavves

Video Instance Segmentation (VIS) aims at segmenting and categorizing objects in videos from a closed set of training categories, lacking the generalization ability to handle novel categories in real-world videos.

Instance Segmentation Segmentation +3

SoftMatch Distance: A Novel Distance for Weakly-Supervised Trend Change Detection in Bi-Temporal Images

no code implementations 8 Mar 2023 Yuqun Yang, Xu Tang, Xiangrong Zhang, Jingjing Ma, Licheng Jiao

Therefore, this paper presents a novel solution that intuitively divides changes into three trends ("appear", "disappear", and "transform") instead of semantic categories, named trend change detection (TCD).

Change Detection

OvarNet: Towards Open-vocabulary Object Attribute Recognition

1 code implementation CVPR 2023 Keyan Chen, XiaoLong Jiang, Yao Hu, Xu Tang, Yan Gao, Jianqi Chen, Weidi Xie

In this paper, we consider the problem of simultaneously detecting objects and inferring their visual attributes in an image, even for those with no manual annotations provided at the training stage, resembling an open-vocabulary scenario.

Ranked #1 on Open Vocabulary Attribute Detection on OVAD benchmark (using extra training data)

Attribute Knowledge Distillation +5

Absolute Wrong Makes Better: Boosting Weakly Supervised Object Detection via Negative Deterministic Information

no code implementations 21 Apr 2022 Guanchun Wang, Xiangrong Zhang, Zelin Peng, Xu Tang, Huiyu Zhou, Licheng Jiao

In the exploiting stage, we utilize the extracted NDI to construct a novel negative contrastive learning mechanism and a negative guided instance selection strategy for dealing with the issues of part domination and missing instances, respectively.

Contrastive Learning Multiple Instance Learning +2

Decoupled IoU Regression for Object Detection

no code implementations 2 Feb 2022 Yan Gao, Qimeng Wang, Xu Tang, Haochen Wang, Fei Ding, Jing Li, Yao Hu

Prior works propose to predict Intersection-over-Union (IoU) between bounding boxes and corresponding ground-truths to improve NMS, while accurately predicting IoU is still a challenging problem.

Object object-detection +2

SVIP: Sequence VerIfication for Procedures in Videos

1 code implementation CVPR 2022 Yicheng Qian, Weixin Luo, Dongze Lian, Xu Tang, Peilin Zhao, Shenghua Gao

In this paper, we propose a novel sequence verification task that aims to distinguish positive video pairs performing the same action sequence from negative ones with step-level transformations but still conducting the same task.

Action Detection Action Recognition

Semantic Attention and Scale Complementary Network for Instance Segmentation in Remote Sensing Images

no code implementations 25 Jul 2021 Tianyang Zhang, Xiangrong Zhang, Peng Zhu, Xu Tang, Chen Li, Licheng Jiao, Huiyu Zhou

To address the above problems, we propose an end-to-end multi-category instance segmentation model, namely Semantic Attention and Scale Complementary Network, which mainly consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB).

Instance Segmentation Segmentation +1

End-to-end Temporal Action Detection with Transformer

1 code implementation 18 Jun 2021 Xiaolong Liu, Qimeng Wang, Yao Hu, Xu Tang, Shiwei Zhang, Song Bai, Xiang Bai

Temporal action detection (TAD) aims to determine the semantic label and the temporal interval of every action instance in an untrimmed video.

Action Detection Temporal Action Localization +1

Learning Global Structure Consistency for Robust Object Tracking

no code implementations 26 Aug 2020 Bi Li, Chengquan Zhang, Zhibin Hong, Xu Tang, Jingtuo Liu, Junyu Han, Errui Ding, Wenyu Liu

Unlike many existing trackers that focus on modeling only the target, in this work, we consider the transient variations of the whole scene.

Object Visual Object Tracking

Evolving Metric Learning for Incremental and Decremental Features

no code implementations 27 Jun 2020 Jiahua Dong, Yang Cong, Gan Sun, Tao Zhang, Xu Tang, Xiaowei Xu

Online metric learning has been widely exploited for large-scale data classification due to the low computational cost.

Metric Learning

HAMBox: Delving into Online High-quality Anchors Mining for Detecting Outer Faces

no code implementations 19 Dec 2019 Yang Liu, Xu Tang, Xiang Wu, Junyu Han, Jingtuo Liu, Errui Ding

In this paper, we propose an Online High-quality Anchor Mining Strategy (HAMBox), which explicitly helps outer faces compensate with high-quality anchors.

Face Detection Multi-Task Learning +2

DuBox: No-Prior Box Objection Detection via Residual Dual Scale Detectors

no code implementations 15 Apr 2019 Shuai Chen, Jinpeng Li, Chuanqi Yao, Wenbo Hou, Shuo Qin, Wenyao Jin, Xu Tang

Working with multi-scale features, the designed dual scale residual unit makes dual scale detectors no longer run independently.

object-detection Object Detection

PyramidBox++: High Performance Detector for Finding Tiny Face

4 code implementations 31 Mar 2019 Zhihang Li, Xu Tang, Junyu Han, Jingtuo Liu, Ran He

With the rapid development of deep convolutional neural networks, face detection has made great progress in recent years.

Data Augmentation Face Detection +1

Deep Adaptive Proposal Network for Object Detection in Optical Remote Sensing Images

no code implementations 19 Jul 2018 Lin Cheng, Xu Liu, Lingling Li, Licheng Jiao, Xu Tang

More recently, the two-stage detector Faster R-CNN was proposed and shown to be a promising tool for object detection in optical remote sensing images, while the sparse and dense characteristics of objects in remote sensing images are complex.

Object object-detection +2

Polarimetric Convolutional Network for PolSAR Image Classification

1 code implementation 9 Jul 2018 Xu Liu, Licheng Jiao, Xu Tang, Qigong Sun, Dan Zhang

Based on sparse scattering coding and convolutional neural networks, the polarimetric convolutional network is proposed to classify PolSAR images by making full use of polarimetric information.

Classification General Classification +1

Face Aging With Identity-Preserved Conditional Generative Adversarial Networks

2 code implementations CVPR 2018 Zongwei Wang, Xu Tang, Weixin Luo, Shenghua Gao

By grouping faces with target age together, the objective of face aging is equivalent to transferring aging patterns of faces within the target age group to the face whose aged face is to be synthesized.

PyramidBox: A Context-assisted Single Shot Face Detector

5 code implementations ECCV 2018 Xu Tang, Daniel K. Du, Zeqiang He, Jingtuo Liu

This paper proposes a novel context-assisted single shot face detector, named PyramidBox, to handle the hard face detection problem.

Face Detection
