2 code implementations • 14 Mar 2022 • Qiang Wang, Yanhao Zhang, Yun Zheng, Pan Pan, Xian-Sheng Hua
Cross-modality interaction is a critical component in Text-Video Retrieval (TVR), yet there has been little examination of how different influencing factors for computing interaction affect performance.
Ranked #3 on Video Retrieval on MSR-VTT-1kA (using extra training data)
no code implementations • CVPR 2022 • Qiang Wang, Yanhao Zhang, Yun Zheng, Pan Pan
Temporal representation is the cornerstone of modern action detection techniques.
no code implementations • CVPR 2021 • Qiang Wang, Yun Zheng, Pan Pan, Yinghui Xu
Recent works have shown that convolutional networks have substantially improved the performance of multiple object tracking by simultaneously learning detection and appearance features.
no code implementations • CVPR 2021 • Chi Zhang, Nan Song, Guosheng Lin, Yun Zheng, Pan Pan, Yinghui Xu
First, we adopt a simple but effective decoupled learning strategy for representations and classifiers, in which only the classifiers are updated in each incremental session; this avoids knowledge forgetting in the representations.
no code implementations • 9 Feb 2021 • Yanhao Zhang, Qiang Wang, Pan Pan, Yun Zheng, Cheng Da, Siyang Sun, Yinghui Xu
Nowadays, live-stream and short video shopping in E-commerce have grown exponentially.
no code implementations • 9 Feb 2021 • Xiangzeng Zhou, Pan Pan, Yun Zheng, Yinghui Xu, Rong Jin
In this paper, we present a novel side-information-based large-scale visual recognition co-training (SICoT) system to deal with the long-tail problem by leveraging image-related side information.
no code implementations • 9 Feb 2021 • Kang Zhao, Pan Pan, Yun Zheng, Yanhao Zhang, Changxu Wang, Yingya Zhang, Yinghui Xu, Rong Jin
For a deployed visual search system with several billions of online images in total, building a billion-scale offline graph in hours is essential, which is almost unachievable by most existing methods.
no code implementations • 9 Feb 2021 • Yanhao Zhang, Pan Pan, Yun Zheng, Kang Zhao, Jianmin Wu, Yinghui Xu, Rong Jin
Benefiting from the exploration of user click data, our networks encode richer supervision more effectively and better distinguish real-shot images in terms of category and feature.
no code implementations • 9 Feb 2021 • Yanhao Zhang, Pan Pan, Yun Zheng, Kang Zhao, Yingya Zhang, Xiaofeng Ren, Rong Jin
We hope visual search at Alibaba becomes more widely incorporated into today's commercial applications.
no code implementations • ECCV 2020 • Lele Cheng, Xiangzeng Zhou, Liming Zhao, Dangwei Li, Hong Shang, Yun Zheng, Pan Pan, Yinghui Xu
In many real-world datasets, such as WebVision, the performance of DNN-based classifiers is often limited by noisily labeled data.
no code implementations • 15 Jul 2020 • Hao Cheng, Bao-Hua Sun, Li-Hua Zhu, Tian-Xiao Li, Guang-Shuai Li, Cong-Bo Li, Xiao-Guang Wu, Yun Zheng
The LaBr$_3$(Ce) detector has attracted much attention in recent years because its resolution and efficiency are superior to those of other scintillating materials.
Instrumentation and Detectors • Nuclear Experiment