no code implementations • 26 Dec 2024 • Haitao Meng, Chonghao Zhong, Sheng Tang, Lian JunJia, Wenwei Lin, Zhenshan Bing, Yi Chang, Gang Chen, Alois Knoll
To achieve this, we propose a Focus Cost Discrimination (FCD) module that measures the clarity of edges as an essential indicator of focus level and integrates spatial surroundings to facilitate cost estimation.
no code implementations • 14 Oct 2024 • Zhang Wan, Sheng Tang, Jiawei Wei, Ruize Zhang, Juan Cao
In recent years, diffusion models have achieved tremendous success in the field of video generation, with controllable video generation receiving significant attention.
1 code implementation • 29 Nov 2023 • Xiaoyue Mi, Fan Tang, Yepeng Weng, Danding Wang, Juan Cao, Sheng Tang, Peng Li, Yang Liu
Despite its effectiveness in improving the robustness of neural networks, adversarial training suffers from the natural accuracy degradation problem, i.e., accuracy on natural samples is reduced significantly.
no code implementations • 20 Oct 2023 • Haipeng Fang, Zhihao Sun, Ziyao Huang, Fan Tang, Juan Cao, Sheng Tang
The advancement of generative AI has extended to the realm of Human Dance Generation, demonstrating superior generative capacities.
1 code implementation • CVPR 2023 • Tianyun Yang, Danding Wang, Fan Tang, Xinying Zhao, Juan Cao, Sheng Tang
In this study, we focus on a challenging task, namely Open-Set Model Attribution (OSMA), to simultaneously attribute images to known models and identify those from unknown ones.
no code implementations • 16 Jun 2021 • Tianyun Yang, Juan Cao, Qiang Sheng, Lei LI, Jiaqi Ji, Xirong Li, Sheng Tang
Adopting a multi-task framework, we propose a GAN Fingerprint Disentangling Network (GFD-Net) to simultaneously disentangle the fingerprint from GAN-generated images and produce a content-irrelevant representation for fake image attribution.
1 code implementation • ECCV 2020 • Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Junhao Liew, Sheng Tang, Steven Hoi, Jiashi Feng
Specifically, we systematically investigate the performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset, and unveil that a major cause is the inaccurate classification of object proposals.
1 code implementation • ECCV 2020 • Junbin Xiao, Xindi Shang, Xun Yang, Sheng Tang, Tat-Seng Chua
In this paper, we explore a novel task named visual Relation Grounding in Videos (vRGV).
2 code implementations • CVPR 2020 • Yu Li, Tao Wang, Bingyi Kang, Sheng Tang, Chunfeng Wang, Jintao Li, Jiashi Feng
Solving long-tail large vocabulary object detection with deep learning based models is a challenging and demanding task, which is, however, under-explored. In this work, we provide the first systematic analysis of the underperformance of state-of-the-art models when faced with long-tail distributions.
no code implementations • 25 Dec 2019 • Yu Li, Sheng Tang, Rui Zhang, Yongdong Zhang, Jintao Li, Shuicheng Yan
However, in situations where two domains are asymmetric in complexity, i.e., the amount of information differs between the two domains, these approaches suffer from poor generation quality, mapping ambiguity, and model sensitivity.
1 code implementation • 29 Oct 2019 • Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Jun Hao Liew, Sheng Tang, Steven Hoi, Jiashi Feng
In this report, we investigate the performance drop of state-of-the-art two-stage instance segmentation models when processing extreme long-tail training data based on the LVIS [5] dataset, and find that a major cause is the inaccurate classification of object proposals.
no code implementations • 29 Jul 2019 • Tianyi Wu, Sheng Tang, Rui Zhang, Guodong Guo, Yongdong Zhang
However, classification networks are dominated by the discriminative portion, so directly applying classification networks to scene parsing will result in inconsistent parsing predictions within one instance and among instances of the same category.
no code implementations • 12 Dec 2018 • Tianyi Wu, Sheng Tang, Rui Zhang, Juan Cao, Jintao Li
Therefore, it can capture partial information and enlarge the receptive field of filters simultaneously without introducing extra parameters.
4 code implementations • 20 Nov 2018 • Tianyi Wu, Sheng Tang, Rui Zhang, Yongdong Zhang
To tackle this problem, we propose a novel Context Guided Network (CGNet), which is a light-weight and efficient network for semantic segmentation.
Ranked #8 on Semantic Segmentation on EventScape
2 code implementations • 7 Nov 2018 • Rui Zhang, Sheng Tang, Yu Li, Junbo Guo, Yongdong Zhang, Jintao Li, Shuicheng Yan
The S3-GAN consists of an encoder network, a generator network, and an adversarial network.
no code implementations • ICCV 2017 • Rui Zhang, Sheng Tang, Yongdong Zhang, Jintao Li, Shuicheng Yan
By adding a new scale regression layer, we can dynamically infer the position-adaptive scale coefficients, which are adopted to resize the convolutional patches.