no code implementations • 8 Oct 2024 • Jiangfan Deng, Zhuang Jia, Zhaoxue Wang, Xiang Long, Daniel K. Du
To achieve accurate parsing of the eye-region, we first leverage the pretrained foundation model Segment Anything (SAM) in an automatic way to refine the eye indications.
no code implementations • 30 Aug 2024 • Zhuang Jia, Jiangfan Deng, Liying Chi, Xiang Long, Daniel K. Du
Parsing of eye components (i.e., pupil, iris and sclera) is fundamental for eye tracking and gaze estimation for AR/VR products.
3 code implementations • 9 Apr 2024 • Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao, Xinrong Zhang, Zheng Leng Thai, Kaihuo Zhang, Chongyi Wang, Yuan YAO, Chenyang Zhao, Jie zhou, Jie Cai, Zhongwu Zhai, Ning Ding, Chao Jia, Guoyang Zeng, Dahai Li, Zhiyuan Liu, Maosong Sun
For data scaling, we introduce a Warmup-Stable-Decay (WSD) learning rate scheduler (LRS), conducive to continuous training and domain adaptation.
no code implementations • 4 Nov 2023 • Haili Sun, Yan Huang, Lansheng Han, Cai Fu, Hongle Liu, Xiang Long
Then, by exploiting the distribution property and modeling the normal patterns of multivariate time series, a variational autoencoder is introduced to force the generative adversarial network (GAN) to generate diverse samples.
2 code implementations • 18 Oct 2022 • Xiangyang Li, Bo Chen, Huifeng Guo, Jingjie Li, Chenxu Zhu, Xiang Long, Sujian Li, Yichao Wang, Wei Guo, Longxia Mao, JinXing Liu, Zhenhua Dong, Ruiming Tang
The FE-Block module performs fine-grained and early feature interactions to explicitly capture the interactive signals between the user and item towers, and the CIR module leverages a contrastive interaction regularization to further enhance the interactions implicitly.
4 code implementations • 13 Oct 2022 • Jian Wang, Xiang Long, Guowei Chen, Zewu Wu, Zeyu Chen, Errui Ding
Therefore, we designed a U-shaped High-Resolution Network (U-HRNet), which adds more stages after the feature map with the strongest semantic representation and relaxes the constraint in HRNet that all resolutions need to be calculated in parallel for a newly added stage.
no code implementations • 26 Sep 2022 • Haili Sun, Yan Huang, Lansheng Han, Xiang Long, Hongle Liu, Chunjie Zhou
The TOR (The Onion Router) network is a widely used open-source anonymous communication tool; the abuse of TOR makes it difficult to monitor the proliferation of online crimes such as accessing criminal websites.
no code implementations • NAACL 2022 • Xiangyang Li, Xiang Long, Yu Xia, Sujian Li
Text style transfer (TST) without parallel data has achieved some practical success.
no code implementations • CVPR 2022 • Xin Dong, Fuwei Zhao, Zhenyu Xie, Xijin Zhang, Daniel K. Du, Min Zheng, Xiang Long, Xiaodan Liang, Jianchao Yang
While significant progress has been made in garment transfer, one of the most applicable directions of human-centric image generation, existing works overlook the in-the-wild imagery, presenting severe garment-person misalignment as well as noticeable degradation in fine texture details.
1 code implementation • 21 Apr 2021 • Xin Huang, Xinxin Wang, Wenyu Lv, Xiaying Bai, Xiang Long, Kaipeng Deng, Qingqing Dang, Shumin Han, Qiwen Liu, Xiaoguang Hu, dianhai yu, Yanjun Ma, Osamu Yoshie
To meet these two concerns, we comprehensively evaluate a collection of existing refinements to improve the performance of PP-YOLO while keeping the inference time almost unchanged.
1 code implementation • 7 Jan 2021 • Xiangyang Li, Yu Xia, Xiang Long, Zheng Li, Sujian Li
In this paper, we describe our system for the AAAI 2021 shared task of COVID-19 Fake News Detection in English, where we achieved the 3rd position with the weighted F1 score of 0.9859 on the test set.
Ranked #1 on Fake News Detection on Grover-Mega
1 code implementation • 27 Oct 2020 • Peihao Chen, Deng Huang, Dongliang He, Xiang Long, Runhao Zeng, Shilei Wen, Mingkui Tan, Chuang Gan
We study unsupervised video representation learning that seeks to learn both motion and appearance features from unlabeled video only, which can be reused for downstream tasks such as action recognition.
Ranked #11 on Self-Supervised Action Recognition on UCF101
5 code implementations • 23 Jul 2020 • Xiang Long, Kaipeng Deng, Guanzhong Wang, Yang Zhang, Qingqing Dang, Yuan Gao, Hui Shen, Jianguo Ren, Shumin Han, Errui Ding, Shilei Wen
We mainly try to combine various existing tricks that barely increase the number of model parameters and FLOPs, with the goal of improving the accuracy of the detector as much as possible while keeping its speed almost unchanged.
no code implementations • ECCV 2020 • Jian Wang, Xiang Long, Yuan Gao, Errui Ding, Shilei Wen
In the first stage, a heatmap regression network is applied to obtain a rough localization result, and a set of proposal keypoints, called guided points, is sampled.
no code implementations • 14 Jun 2020 • Pu Li, Xiang-Yang Li, Xiang Long
It is based on the 'simulation of object occlusion' strategy, which aims to achieve a balance between object occlusion and information retention of the input data.
no code implementations • 17 Dec 2019 • Renchun You, Zhiyao Guo, Lei Cui, Xiang Long, Yingze Bao, Shilei Wen
To overcome these challenges, we propose to use cross-modality attention with semantic graph embedding for multi-label classification.
Ranked #8 on Multi-Label Classification on NUS-WIDE
2 code implementations • 21 Nov 2019 • Ya Wang, Dongliang He, Fu Li, Xiang Long, Zhichao Zhou, Jinwen Ma, Shilei Wen
In this paper, we propose a label graph superimposing framework to improve the conventional GCN+CNN framework developed for multi-label recognition in the following two aspects.
Ranked #30 on Multi-Label Classification on MS-COCO
2 code implementations • 26 Aug 2019 • Xin Li, Tianwei Lin, Xiao Liu, Chuang Gan, WangMeng Zuo, Chao Li, Xiang Long, Dongliang He, Fu Li, Shilei Wen
In this paper, we empirically find that stacking more conventional temporal convolution layers actually deteriorates action classification performance, possibly because all channels of the 1D feature map, which are generally highly abstract and can be regarded as latent concepts, are excessively recombined in temporal convolution.
no code implementations • 27 Jun 2018 • Dongliang He, Fu Li, Qijie Zhao, Xiang Long, Yi Fu, Shilei Wen
In this challenge, we propose the spatial-temporal network (StNet) for better joint spatial-temporal modelling and comprehensive video understanding.
5 code implementations • CVPR 2018 • Xiang Long, Chuang Gan, Gerard de Melo, Jiajun Wu, Xiao Liu, Shilei Wen
In this paper, however, we show that temporal information, especially longer-term patterns, may not be necessary to achieve competitive results on common video classification datasets.
no code implementations • 12 Aug 2017 • Yunlong Bian, Chuang Gan, Xiao Liu, Fu Li, Xiang Long, Yandong Li, Heng Qi, Jie zhou, Shilei Wen, Yuanqing Lin
Experimental results on the challenging Kinetics dataset demonstrate that our proposed temporal modeling approaches can significantly improve on existing approaches in large-scale video recognition tasks.
Ranked #167 on Action Classification on Kinetics-400
1 code implementation • 14 Jul 2017 • Fu Li, Chuang Gan, Xiao Liu, Yunlong Bian, Xiang Long, Yandong Li, Zhichao Li, Jie zhou, Shilei Wen
This paper describes our solution for the video recognition task of the Google Cloud and YouTube-8M Video Understanding Challenge, which ranked 3rd.
no code implementations • 9 May 2017 • Xu Jiang, Nan Guan, Xiang Long, Wang Yi
In this paper we propose the semi-federate scheduling approach, which only grants $x$ dedicated processors to a heavy task with processing capacity requirement $x + \epsilon$, and schedules the remaining $\epsilon$ part together with light tasks on shared processors.
Distributed, Parallel, and Cluster Computing
no code implementations • TACL 2018 • Xiang Long, Chuang Gan, Gerard de Melo
Recently, video captioning has been attracting an increasing amount of interest, due to its potential for improving accessibility and information retrieval.