1 code implementation • 12 Apr 2016 • Wenying Ma, Liangliang Cao, Lei Yu, Guoping Long, Yucheng Li
We also applied GPU-FV to real-time video monitoring tasks and found that GPU-FV outperforms a number of previous methods.
no code implementations • ACL 2017 • Xiaoyu Shen, Hui Su, Yan-ran Li, Wenjie Li, Shuzi Niu, Yang Zhao, Akiko Aizawa, Guoping Long
Deep latent variable models have been shown to facilitate the response generation for open-domain dialog systems.
no code implementations • 13 Nov 2018 • Guoping Long, Jun Yang, Kai Zhu, Wei Lin
In recent years, there has been a surge of machine learning applications in industry.
Distributed, Parallel, and Cluster Computing • Mathematical Software
no code implementations • 10 Oct 2019 • Changying Du, Fuzhen Zhuang, Jia He, Qing He, Guoping Long
In real world machine learning applications, testing data may contain some meaningful new categories that have not been seen in labeled training data.
no code implementations • 11 Oct 2019 • Changying Du, Jia He, Changde Du, Fuzhen Zhuang, Qing He, Guoping Long
Existing multi-view learning methods based on kernel functions either require the user to select and tune a single predefined kernel, or must compute and store many Gram matrices to perform multiple kernel learning.
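As background for the storage cost this abstract alludes to, a multiple-kernel setup typically precomputes one n×n Gram matrix per candidate kernel, so memory grows as O(k·n²). A minimal NumPy sketch (the RBF kernel choice, data, and bandwidths here are illustrative, not from the paper):

```python
import numpy as np

def rbf_gram(X, gamma):
    # Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# One Gram matrix per candidate bandwidth -> k matrices of size n x n
gammas = [0.1, 1.0, 10.0]
grams = [rbf_gram(X, g) for g in gammas]
```

Each matrix alone is n² entries; with many views and many candidate kernels the total quickly dominates memory, which motivates kernel methods that avoid materializing all Gram matrices.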
no code implementations • 14 Oct 2019 • Mengdi Wang, Chen Meng, Guoping Long, Chuan Wu, Jun Yang, Wei Lin, Yangqing Jia
One critical issue for efficiently operating practical AI clouds is to characterize the computing and data-transfer demands of these workloads, and more importantly, the training performance given the underlying software framework and hardware configurations.
no code implementations • 8 Jul 2020 • Siyu Wang, Yi Rong, Shiqing Fan, Zhen Zheng, Lansong Diao, Guoping Long, Jun Yang, Xiaoyong Liu, Wei Lin
The last decade has witnessed growth in the computational requirements for training deep neural networks.
no code implementations • 23 Sep 2020 • Zhen Zheng, Pengzhan Zhao, Guoping Long, Feiwen Zhu, Kai Zhu, Wenyi Zhao, Lansong Diao, Jun Yang, Wei Lin
We show in this work that memory-intensive computations can result in severe performance problems due to off-chip memory access and CPU-GPU context-switch overheads in a wide range of deep learning models.