no code implementations • 17 Apr 2024 • Zhenhua Liu, Zhiwei Hao, Kai Han, Yehui Tang, Yunhe Wang
In this paper, by systematically investigating the impact of different training ingredients, we introduce a strong training strategy for compact models.
no code implementations • 10 Apr 2024 • Jianxiang Xiang, Zhenhua Liu, Haodong Liu, Yin Bai, Jia Cheng, Wenliang Chen
Previous studies attempted to introduce discrete or Gaussian-based continuous latent variables to address the one-to-many problem, but the diversity they achieve is limited.
no code implementations • 30 Mar 2024 • Zhenhua Liu, Tong Zhu, Jianxiang Xiang, Wenliang Chen
To evaluate the efficacy of data augmentation methods for open-domain dialogue, we designed a clustering-based metric to characterize the semantic diversity of the augmented dialogue data.
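The snippet does not specify the metric beyond "clustering-based"; the following is a minimal sketch of one plausible instantiation, assuming TF-IDF sentence features, k-means clustering, and normalized entropy of the cluster assignments as the diversity score (all three choices are assumptions for illustration, not the paper's design).

```python
# Hypothetical sketch of a clustering-based semantic diversity metric.
# Assumptions (not from the paper): TF-IDF features, k-means clustering,
# and entropy of the cluster-assignment distribution as the score.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def semantic_diversity(utterances, n_clusters=4, seed=0):
    """Higher entropy over cluster assignments ~ more semantic diversity."""
    X = TfidfVectorizer().fit_transform(utterances)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    probs = np.bincount(labels, minlength=n_clusters) / len(labels)
    probs = probs[probs > 0]
    entropy = -np.sum(probs * np.log(probs))
    return entropy / np.log(n_clusters)  # normalize to [0, 1]

augmented = ["how are you", "what's the weather", "i love pizza",
             "how do you do", "it may rain later", "pasta is great",
             "tell me a joke", "sing me a song"]
print(semantic_diversity(augmented))
```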
1 code implementation • 25 May 2023 • Zhenhua Liu, Feipeng Ma, Tianyi Wang, Fengyun Rao
We propose a Similarity Alignment Model (SAM) for video copy segment matching.
1 code implementation • 21 May 2023 • Tianyi Wang, Feipeng Ma, Zhenhua Liu, Fengyun Rao
With the development of multimedia technology, Video Copy Detection has become a crucial problem for social media platforms.
1 code implementation • ICCV 2023 • Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Zhao Wang, Kai Han, Shanshe Wang, Siwei Ma, Wen Gao
JPMA is proposed to assemble the multiple hypotheses generated by D3DP into a single 3D pose for practical use.
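As a rough illustration of joint-wise assembly, here is a minimal NumPy sketch that, for each joint, keeps the hypothesis whose 2D reprojection lies closest to the detected 2D keypoint; the pinhole projection and its parameters are assumptions for the example, not the paper's camera model.

```python
# Minimal sketch of joint-wise, reprojection-based hypothesis aggregation.
# Assumptions: a pinhole camera with known intrinsics; per joint we keep the
# hypothesis whose 2D reprojection is closest to the detected 2D keypoint.
import numpy as np

def project(points_3d, f=1000.0, c=(500.0, 500.0)):
    """Pinhole projection of (..., 3) camera-space points to (..., 2) pixels."""
    z = np.clip(points_3d[..., 2:3], 1e-6, None)
    return f * points_3d[..., :2] / z + np.asarray(c)

def aggregate_joint_wise(hypotheses, keypoints_2d):
    """hypotheses: (H, J, 3) 3D poses; keypoints_2d: (J, 2) detections."""
    reproj = project(hypotheses)                          # (H, J, 2)
    err = np.linalg.norm(reproj - keypoints_2d, axis=-1)  # (H, J)
    best = err.argmin(axis=0)                             # best hypothesis per joint
    joints = np.arange(hypotheses.shape[1])
    return hypotheses[best, joints]                       # (J, 3) assembled pose

H, J = 20, 17
hyps = np.random.randn(H, J, 3) + np.array([0, 0, 5.0])
kpts = project(hyps.mean(axis=0))
print(aggregate_joint_wise(hyps, kpts).shape)  # (17, 3)
```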
1 code implementation • 15 Mar 2022 • Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao
In Stage II, the pre-trained encoder is loaded into the STMO model and fine-tuned; a minimal sketch of this two-stage workflow follows below.
Ranked #10 on Monocular 3D Human Pose Estimation on Human3.6M
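A minimal PyTorch sketch of the pre-train-then-fine-tune pattern: Stage I pre-trains the encoder, Stage II loads its weights into the full model and fine-tunes end to end. All module names and shapes here are illustrative placeholders, not from the released P-STMO code.

```python
# Illustrative two-stage workflow; module names and shapes are placeholders.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Stands in for the pre-trained encoder from Stage I."""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(17 * 2, dim), nn.ReLU(), nn.Linear(dim, dim))
    def forward(self, x):
        return self.net(x)

class PoseModel(nn.Module):
    """Encoder plus a regression head that outputs the 3D pose (Stage II)."""
    def __init__(self, dim=256):
        super().__init__()
        self.encoder = Encoder(dim)
        self.head = nn.Linear(dim, 17 * 3)
    def forward(self, x):
        return self.head(self.encoder(x))

model = PoseModel()
# Stage I would save the encoder: torch.save(pretrained.state_dict(), "enc.pt")
# Stage II loads it before fine-tuning:
# model.encoder.load_state_dict(torch.load("enc.pt"))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
x, target = torch.randn(8, 17 * 2), torch.randn(8, 17 * 3)
loss = nn.functional.mse_loss(model(x), target)  # fine-tune end to end
loss.backward()
optimizer.step()
```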
no code implementations • CVPR 2022 • Zhaoyang Zeng, Yongsheng Luo, Zhenhua Liu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen
In this paper, we propose the Tencent-MVSE dataset, which is the first benchmark dataset for the multi-modal video similarity evaluation task.
Automatic Speech Recognition (ASR) +3
4 code implementations • CVPR 2022 • Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao
However, natural images are highly diverse in content, so using a single universal quantization configuration for all samples is not optimal.
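To make the idea concrete, here is a toy sketch of sample-dependent quantization: a tiny controller picks a bit-width per image from cheap statistics, and activations are uniformly quantized to that width. The controller design, the candidate bit-widths, and the statistics used are all assumptions for illustration, not the paper's architecture.

```python
# Toy sketch of per-sample dynamic quantization; not the paper's exact design.
import torch
import torch.nn as nn

def uniform_quantize(x, bits):
    """Uniformly quantize x to 2**bits levels over its own min/max range."""
    levels = 2 ** bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo).clamp(min=1e-8) / levels
    return ((x - lo) / scale).round() * scale + lo

class BitController(nn.Module):
    """Predicts a bit-width in {2, 4, 8} from cheap per-image statistics."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 3))
    def forward(self, images):
        stats = torch.stack([images.mean((1, 2, 3)), images.std((1, 2, 3))], dim=1)
        choice = self.mlp(stats).argmax(dim=1)  # index 0, 1, or 2
        return torch.tensor([2, 4, 8])[choice]  # per-sample bit-width

images = torch.rand(4, 3, 32, 32)
bits = BitController()(images)
quantized = torch.stack([uniform_quantize(img, int(b)) for img, b in zip(images, bits)])
print(bits.tolist(), quantized.shape)
```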
no code implementations • 13 Oct 2021 • Mingkang Tang, Zhanyu Wang, Zhenhua Liu, Fengyun Rao, Dian Li, Xiu Li
Notably, our model is trained only on the MSR-VTT dataset.
1 code implementation • Findings (ACL) 2021 • Weidong Guo, Mingjun Zhao, Lusheng Zhang, Di Niu, Jinwen Luo, Zhenhua Liu, Zhenyang Li, Jianbo Tang
Language model pre-training on large corpora has achieved tremendous success in constructing enriched contextual representations and has led to significant performance gains on a diverse range of Natural Language Understanding (NLU) tasks.
no code implementations • NeurIPS 2021 • Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao
Recently, transformers have achieved remarkable performance on a variety of computer vision applications.
4 code implementations • 21 Jan 2021 • Ying Nie, Kai Han, Zhenhua Liu, Chuanjian Liu, Yunhe Wang
Based on the observation that many features in SISR models are similar to each other, we propose to use the shift operation to generate the redundant features (i.e., ghost features).
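A minimal sketch of the idea: half of a block's output channels are "intrinsic" features from a regular convolution, and the other half are cheap spatially shifted copies of them. The fixed one-pixel shift below is a placeholder; in GhostSR the shift offsets are learned.

```python
# Minimal shift-based ghost-feature block; the fixed (1, 1) shift is
# illustrative only, whereas GhostSR learns the shift offsets.
import torch
import torch.nn as nn

class ShiftGhostBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        assert out_ch % 2 == 0
        self.conv = nn.Conv2d(in_ch, out_ch // 2, 3, padding=1)

    def forward(self, x):
        intrinsic = self.conv(x)  # intrinsic features from a regular conv
        # Ghost features: shift each intrinsic map by one pixel (right/down).
        ghost = torch.roll(intrinsic, shifts=(1, 1), dims=(2, 3))
        return torch.cat([intrinsic, ghost], dim=1)

block = ShiftGhostBlock(3, 64)
print(block(torch.randn(1, 3, 48, 48)).shape)  # torch.Size([1, 64, 48, 48])
```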
no code implementations • 23 Dec 2020 • Kai Han, Yunhe Wang, Hanting Chen, Xinghao Chen, Jianyuan Guo, Zhenhua Liu, Yehui Tang, An Xiao, Chunjing Xu, Yixing Xu, Zhaohui Yang, Yiman Zhang, DaCheng Tao
Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism.
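For reference, the self-attention operation at the core of the transformer, in a minimal form: queries, keys, and values are linear projections of the input, and each position aggregates values weighted by softmax-normalized similarity scores.

```python
# Minimal scaled dot-product self-attention: every position attends to
# every other position in the sequence.
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """x: (batch, seq_len, dim); w_*: (dim, dim) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / (x.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v

x = torch.randn(2, 10, 64)
w = [torch.randn(64, 64) for _ in range(3)]
print(self_attention(x, *w).shape)  # torch.Size([2, 10, 64])
```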
6 code implementations • CVPR 2021 • Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao
To fully exploit the capability of the transformer, we use the well-known ImageNet benchmark to generate a large number of corrupted image pairs; a sketch of this pair synthesis follows below.
Ranked #1 on Single Image Deraining on Rain100L (using extra training data)
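A small sketch of how (corrupted, clean) pairs can be synthesized from clean images: bicubic downsampling yields super-resolution pairs and additive Gaussian noise yields denoising pairs. The specific degradations and parameters are illustrative; the paper also covers further tasks such as deraining.

```python
# Illustrative synthesis of (corrupted, clean) pre-training pairs;
# degradation choices and parameters are assumptions for the example.
import torch
import torch.nn.functional as F

def make_pairs(clean, task="sr", scale=2, sigma=30 / 255):
    """clean: (B, 3, H, W) in [0, 1]; returns a (corrupted, clean) pair."""
    if task == "sr":
        corrupted = F.interpolate(clean, scale_factor=1 / scale,
                                  mode="bicubic", align_corners=False)
    elif task == "denoise":
        corrupted = (clean + sigma * torch.randn_like(clean)).clamp(0, 1)
    else:
        raise ValueError(f"unknown task: {task}")
    return corrupted, clean

clean = torch.rand(4, 3, 48, 48)
lr, _ = make_pairs(clean, task="sr")         # (4, 3, 24, 24) inputs for SR
noisy, _ = make_pairs(clean, task="denoise")
print(lr.shape, noisy.shape)
```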
no code implementations • 25 Sep 2019 • Tianxiao Gao, Ruiqin Xiong, Zhenhua Liu, Siwei Ma, Feng Wu, Tiejun Huang, Wen Gao
One way to compress these heavy models is knowledge transfer (KT), in which a light student network is trained by absorbing the knowledge from a powerful teacher network.
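The standard form of this objective is Hinton-style distillation, where the student matches the teacher's temperature-softened outputs alongside the ground-truth labels; the sketch below shows that generic loss, not this paper's specific variant.

```python
# Generic knowledge-distillation loss (Hinton-style), shown as a sketch.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft target term: KL between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)  # ground-truth term
    return alpha * soft + (1 - alpha) * hard

s, t = torch.randn(8, 10), torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
print(kd_loss(s, t, y).item())
```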
no code implementations • NeurIPS 2018 • Zhenhua Liu, Jizheng Xu, Xiulian Peng, Ruiqin Xiong
Deep convolutional neural networks have demonstrated their power in a variety of applications.