no code implementations • 3 Nov 2023 • Xing Di, Yiyu Zheng, Xiaoming Liu, Yu Cheng
This paper presents a novel approach, called Prototype-based Self-Distillation (ProS), for unsupervised face representation learning.
no code implementations • 6 May 2023 • Daizong Liu, Xiaoye Qu, Jianfeng Dong, Pan Zhou, Zichuan Xu, Haozhao Wang, Xing Di, Weining Lu, Yu Cheng
This paper addresses temporal sentence grounding (TSG).
no code implementations • 5 Jan 2023 • Daizong Liu, Xiang Fang, Pan Zhou, Xing Di, Weining Lu, Yu Cheng
Given an untrimmed video, temporal sentence localization (TSL) aims to localize a specific segment according to a given sentence query.
no code implementations • 2 Jan 2023 • Jiahao Zhu, Daizong Liu, Pan Zhou, Xing Di, Yu Cheng, Song Yang, Wenzheng Xu, Zichuan Xu, Yao Wan, Lichao Sun, Zeyu Xiong
All existing works first utilize a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with the query sentence for reasoning.
1 code implementation • 1 Jan 2023 • Huaizheng Zhang, Yuanming Li, Wencong Xiao, Yizheng Huang, Xing Di, Jianxiong Yin, Simon See, Yong Luo, Chiew Tong Lau, Yang You
The vision of this paper is to provide a more comprehensive and practical benchmark study for MIG (Multi-Instance GPU) in order to eliminate the need for tedious manual benchmarking and tuning efforts.
1 code implementation • 12 Jul 2022 • Yuhua Sun, Tailai Zhang, Xingjun Ma, Pan Zhou, Jian Lou, Zichuan Xu, Xing Di, Yu Cheng, Lichao
In this paper, we propose two novel Density Manipulation Backdoor Attacks (DMBA$^{-}$ and DMBA$^{+}$) to attack the model to produce arbitrarily large or small density estimations.
no code implementations • 14 Jan 2022 • Daizong Liu, Xiaoye Qu, Yinzhen Wang, Xing Di, Kai Zou, Yu Cheng, Zichuan Xu, Pan Zhou
Temporal video grounding (TVG) aims to localize a target segment in a video according to a given sentence query.
no code implementations • 3 Jan 2022 • Daizong Liu, Xiaoye Qu, Xing Di, Yu Cheng, Zichuan Xu, Pan Zhou
To tackle this issue, we propose a memory-augmented network, called Memory-Guided Semantic Learning Network (MGSL-Net), that learns and memorizes the rarely appearing content in TSG tasks.
no code implementations • 20 Sep 2021 • Dazhou Guo, Xianghua Ye, Jia Ge, Xing Di, Le Lu, Lingyun Huang, Guotong Xie, Jing Xiao, Zhongjie Liu, Ling Peng, Senxiang Yan, Dakai Jin
Lymph node station (LNS) delineation from computed tomography (CT) scans is an indispensable step in the radiation oncology workflow.
no code implementations • 17 Jul 2021 • Xing Di, Shuowen Hu, Vishal M. Patel
We propose a domain agnostic learning-based generative adversarial network (DAL-GAN) which can synthesize frontal views in the visible domain from thermal faces with pose variations.
1 code implementation • 9 Apr 2021 • Xing Di, Vishal M. Patel
Extensive experiments and comparisons with several state-of-the-art methods are performed to verify the effectiveness of the proposed attribute-based multimodal synthesis method.
no code implementations • 7 Jan 2021 • Domenick Poster, Matthew Thielke, Robert Nguyen, Srinivasan Rajaraman, Xing Di, Cedric Nimpa Fondje, Vishal M. Patel, Nathaniel J. Short, Benjamin S. Riggan, Nasser M. Nasrabadi, Shuowen Hu
Thermal face imagery, which captures the naturally emitted heat from the face, is limited in availability compared to face imagery in the visible spectrum.
no code implementations • 20 Apr 2020 • Xing Di, Benjamin S. Riggan, Shuowen Hu, Nathaniel J. Short, Vishal M. Patel
Finally, a pre-trained VGG-Face network is leveraged to extract features from the synthesized image and the input visible image for verification.
no code implementations • 17 Dec 2019 • Xing Di, Vishal M. Patel
In this paper, we take a different approach, where we formulate the original problem as a stage-wise learning problem.
no code implementations • 15 Apr 2019 • Xing Di, Benjamin S. Riggan, Shuowen Hu, Nathaniel J. Short, Vishal M. Patel
Polarimetric thermal to visible face verification entails matching two images that contain significant domain differences.
no code implementations • 3 Jan 2019 • Xing Di, He Zhang, Vishal M. Patel
A pre-trained VGG-Face network is used to extract the attributes from the visible image.
1 code implementation • 30 Dec 2017 • Xing Di, Vishal M. Patel
In this paper, we take a different approach, where we formulate the original problem as a stage-wise learning problem.
2 code implementations • 3 Oct 2017 • Xing Di, Vishwanath A. Sindagi, Vishal M. Patel
The primary aim of this work is to demonstrate that information preserved by landmarks (gender in particular) can be further accentuated by leveraging generative models to synthesize corresponding faces.