no code implementations • 6 Sep 2024 • Jiangxia Cao, Shen Wang, Gaode Chen, Rui Huang, Shuang Yang, Zhaojie Liu, Guorui Zhou
In addressing the persistent challenges of data-sparsity and cold-start issues in domain-expert recommender systems, Cross-Domain Recommendation (CDR) emerges as a promising methodology.
no code implementations • 11 Aug 2024 • Jiangxia Cao, Shen Wang, Yue Li, ShengHui Wang, Jian Tang, Shiyao Wang, Shuang Yang, Zhaojie Liu, Guorui Zhou
Kuaishou is one of the largest short-video and live-streaming platforms. Compared with short-video recommendation, live-streaming recommendation is more complex because: (1) live streams are only temporarily alive for distribution, (2) users may watch for a long time with delayed feedback, and (3) content is unpredictable and changes over time.
no code implementations • 19 Mar 2024 • Mingqi Shao, Feng Xiong, Hang Zhang, Shuang Yang, Mu Xu, Wei Bian, Xueqian Wang
The global stage obtains a continuous representation of the entire scene while the focal stage decomposes the scene into multiple blocks and further processes them with distinct sub-encoders.
no code implementations • CVPR 2024 • Yuanhang Zhang, Shuang Yang, Shiguang Shan, Xilin Chen
While many recent approaches for this task primarily rely on guiding the learning process using the audio modality alone to capture information shared between audio and video, we reframe the problem as the acquisition of shared, unique (modality-specific), and synergistic speech information to address the inherent asymmetry between the modalities.
no code implementations • 5 Dec 2023 • Bo Ding, Zhenfeng Fan, Shuang Yang, Shihong Xia
We incorporate personalized prior in a monocular video and morphable prior in 3D face morphable space for generating personalized details under novel controllable parameters.
no code implementations • 24 Nov 2023 • Feixiang Wang, Shuang Yang, Shiguang Shan, Xilin Chen
By integrating cooperative dual attention in the visual encoder and audio-visual fusion strategy, our model effectively extracts beneficial speech information from both audio and visual cues for AVSE.
1 code implementation • 8 Oct 2023 • Songtao Luo, Shuang Yang, Shiguang Shan, Xilin Chen
For deep layers, where both the speaker's features and the speech content features are expressed well, we introduce speaker-adaptive features that learn to suppress noise irrelevant to the speech content for robust lip reading.
1 code implementation • ICCV 2023 • Zhenfeng Fan, Zhiheng Zhang, Shuang Yang, Chongyang Zhong, Min Cao, Shihong Xia
We propose a learning framework for 3D facial attribute translation to relieve these limitations.
no code implementations • 24 May 2023 • Zhuokai Zhao, Yang Yang, Wenyu Wang, Chihuang Liu, Yu Shi, Wenjie Hu, Haotian Zhang, Shuang Yang
A key puzzle in search, ads, and recommendation is that the ranking model can only utilize a small portion of the vastly available user interaction data.
no code implementations • 3 Mar 2023 • Shuai Xiao, Zaifan Jiang, Shuang Yang
Finding optimal configurations in a geometric space is a key challenge in many technological disciplines.
no code implementations • 2 Mar 2023 • Shuai Xiao, Le Guo, Zaifan Jiang, Lei Lv, Yuanbo Chen, Jun Zhu, Shuang Yang
Furthermore, we show that the dual problem can be solved by policy learning, with the optimal dual variable being found efficiently via bisection search (i.e., by taking advantage of the monotonicity).
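As a rough illustration of the bisection idea mentioned in this abstract, the sketch below searches for a dual variable under a budget constraint, assuming the constrained cost is monotonically non-increasing in that variable. The names `bisect_dual` and `cost_at`, the search interval, and the toy cost curve are hypothetical and are not taken from the paper; in the paper's setting the cost would come from a learned policy rather than a closed-form function.

```python
from typing import Callable


def bisect_dual(cost_at: Callable[[float], float], budget: float,
                lo: float = 0.0, hi: float = 1e3, tol: float = 1e-8) -> float:
    """Find (approximately) the smallest lam in [lo, hi] with cost_at(lam) <= budget.

    Assumes cost_at is monotonically non-increasing in lam. Here cost_at is a
    hypothetical stand-in for the cost incurred under penalty lam.
    """
    if cost_at(lo) <= budget:
        return lo  # constraint already holds with the smallest penalty
    if cost_at(hi) > budget:
        raise ValueError("budget not reachable within [lo, hi]")
    # Invariant: cost_at(lo) > budget >= cost_at(hi).
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cost_at(mid) > budget:
            lo = mid  # penalty too small: cost still exceeds the budget
        else:
            hi = mid  # feasible: try a smaller penalty
    return hi


def toy_cost(lam: float) -> float:
    """Toy monotone cost curve (an assumption for illustration only)."""
    return 10.0 / (1.0 + lam)


lam_star = bisect_dual(toy_cost, budget=2.0)
```

Monotonicity is what makes the halving step valid: every evaluation rules out one half of the interval, so the number of evaluations grows only logarithmically with the required precision.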
no code implementations • 22 Jun 2022 • Yuanhang Zhang, Susan Liang, Shuang Yang, Shiguang Shan
This report presents a brief description of our winning solution to the AVA Active Speaker Detection (ASD) task at ActivityNet Challenge 2022.
Active Speaker Detection Audio-Visual Active Speaker Detection
no code implementations • 5 Aug 2021 • Yuanhang Zhang, Susan Liang, Shuang Yang, Xiao Liu, Zhongqin Wu, Shiguang Shan, Xilin Chen
Our solution is a novel, unified framework that focuses on jointly modeling multiple types of contextual information: spatial context to indicate the position and scale of each candidate's face, relational context to capture the visual relationships among the candidates and contrast audio-visual affinities with each other, and temporal context to aggregate long-term information and smooth out local uncertainties.
Active Speaker Detection Audio-Visual Active Speaker Detection
1 code implementation • 10 Jun 2021 • Jiawei Zhang, Linyi Li, Huichen Li, Xiaolu Zhang, Shuang Yang, Bo Li
In this paper, we show that such efficiency highly depends on the scale at which the attack is applied, and attacking at the optimal scale significantly improves the efficiency.
1 code implementation • NeurIPS 2021 • Runzhong Wang, Zhigang Hua, Gan Liu, Jiayi Zhang, Junchi Yan, Feng Qi, Shuang Yang, Jun Zhou, Xiaokang Yang
Combinatorial Optimization (CO) has been a long-standing challenging research topic featured by its NP-hard nature.
no code implementations • The ActivityNet Large-Scale Activity Recognition Challenge Workshop, CVPR 2021 • Yuanhang Zhang, Susan Liang, Shuang Yang, Xiao Liu, Zhongqin Wu, Shiguang Shan
This report presents a brief description of our method for the AVA Active Speaker Detection (ASD) task at ActivityNet Challenge 2021.
Active Speaker Detection Audio-Visual Active Speaker Detection
no code implementations • 5 Mar 2021 • Zhigang Hua, Feng Qi, Gan Liu, Shuang Yang
Scheduling computational tasks represented by directed acyclic graphs (DAGs) is challenging because of its complexity.
1 code implementation • 25 Feb 2021 • Huichen Li, Linyi Li, Xiaojun Xu, Xiaolu Zhang, Shuang Yang, Bo Li
We aim to bridge the gap between the two by investigating how to efficiently estimate gradient based on a projected low-dimensional space.
1 code implementation • 15 Nov 2020 • Dalu Feng, Shuang Yang, Shiguang Shan, Xilin Chen
Considering the non-negligible effects of these strategies and the current difficulty of training an effective lip reading model, we perform, for the first time, a comprehensive quantitative study and comparative analysis to show the effects of several different choices for lip reading.
Ranked #2 on Lipreading on CAS-VSR-W1k (LRW-1000)
no code implementations • 8 Aug 2020 • Xingwen Zhang, Shuang Yang
A key challenge in solving a combinatorial optimization problem is how to guide the agent (i.e., solver) to efficiently explore the enormous search space.
no code implementations • 10 Jun 2020 • Jian Du, Zhigang Hua, Shuang Yang
We examine the submodular maximum coverage problem (SMCP), which is related to a wide range of applications.
2 code implementations • NeurIPS 2020 • Ziqi Liu, Zhengwei Wu, Zhiqiang Zhang, Jun Zhou, Shuang Yang, Le Song, Yuan Qi
However, due to the intractable computation of optimal sampling distribution, these sampling algorithms are suboptimal for GCNs and are not applicable to more general graph neural networks (GNNs) where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT).
Ranked #1 on Node Property Prediction on ogbn-proteins
no code implementations • CVPR 2020 • Huichen Li, Xiaojun Xu, Xiaolu Zhang, Shuang Yang, Bo Li
Such adversarial attacks can be achieved by adding a small magnitude of perturbation to the input to mislead model prediction.
no code implementations • 14 May 2020 • Weiwei Chen, Ying Wang, Shuang Yang, Chen Liu, Lei Zhang
DNN/Accelerator co-design has shown great potential in improving QoR and performance.
1 code implementation • 8 May 2020 • Mingshuang Luo, Shuang Yang, Xilin Chen, Zitao Liu, Shiguang Shan
Based on this idea, we try to explore the synergized learning of multilingual lip reading in this paper, and further propose a synchronous bidirectional learning (SBL) framework for effective synergy of multilingual lip reading.
1 code implementation • ICLR 2020 • Hao Lu, Xingwen Zhang, Shuang Yang
This paper is concerned with solving combinatorial optimization problems, in particular, the capacitated vehicle routing problems (CVRP).
no code implementations • 10 Apr 2020 • Chaochao Chen, Liang Li, Wenjing Fang, Jun Zhou, Li Wang, Lei Wang, Shuang Yang, Alex Liu, Hao Wang
Nowadays, the utilization of the ever-expanding amount of data has had a huge impact on web technologies while also causing various types of security concerns.
1 code implementation • 13 Mar 2020 • Xing Zhao, Shuang Yang, Shiguang Shan, Xilin Chen
By combining these two advantages together, the proposed method is expected to be both discriminative and robust for effective lip reading.
Ranked #8 on Lipreading on CAS-VSR-W1k (LRW-1000)
1 code implementation • 12 Mar 2020 • Jing-Yun Xiao, Shuang Yang, Yuan-Hang Zhang, Shiguang Shan, Xilin Chen
Observing the continuity between adjacent frames in the speaking process, and the consistency of the motion patterns among different speakers when they pronounce the same phoneme, we model the lip movements in the speaking process as a sequence of apparent deformations in the lip region.
Ranked #7 on Lipreading on CAS-VSR-W1k (LRW-1000)
no code implementations • 11 Mar 2020 • Longfei Zheng, Chaochao Chen, Yingting Liu, Bingzhe Wu, Xibin Wu, Li Wang, Lei Wang, Jun Zhou, Shuang Yang
Deep Neural Networks (DNNs) have shown great potential in many kinds of real-world applications such as fraud detection and distress prediction.
no code implementations • 10 Mar 2020 • Yankun Ren, Jianbin Lin, Siliang Tang, Jun Zhou, Shuang Yang, Yuan Qi, Xiang Ren
It can attack text classification models with a higher success rate than existing methods, and provide acceptable quality for humans in the meantime.
no code implementations • 9 Mar 2020 • Mingshuang Luo, Shuang Yang, Shiguang Shan, Xilin Chen
On the one hand, we introduce the evaluation metric (the character error rate in this paper) as a form of reward to optimize the model together with the original discriminative target.
Ranked #9 on Lipreading on CAS-VSR-W1k (LRW-1000)
1 code implementation • 6 Mar 2020 • Yuan-Hang Zhang, Shuang Yang, Jing-Yun Xiao, Shiguang Shan, Xilin Chen
Recent advances in deep learning have heightened interest among researchers in the field of visual speech recognition (VSR).
Ranked #2 on Lipreading on GRID corpus (mixed-speech)
1 code implementation • 28 Feb 2020 • Daixin Wang, Jianbin Lin, Peng Cui, Quanhui Jia, Zhen Wang, Yanming Fang, Quan Yu, Jun Zhou, Shuang Yang, Yuan Qi
Additionally, only very few of the users in the network are labeled, which makes it very challenging to achieve satisfactory fraud detection performance using labeled data alone.
no code implementations • 27 Feb 2020 • Ziqi Liu, Dong Wang, Qianyu Yu, Zhiqiang Zhang, Yue Shen, Jian Ma, Wenliang Zhong, Jinjie Gu, Jun Zhou, Shuang Yang, Yuan Qi
In this paper, we present a graph representation learning method on top of transaction networks for merchant incentive optimization in mobile payment marketing.
no code implementations • 27 Feb 2020 • Chen Liang, Ziqi Liu, Bin Liu, Jun Zhou, Xiaolong Li, Shuang Yang, Yuan Qi
In order to detect and prevent fraudulent insurance claims, we developed a novel data-driven procedure to identify groups of organized fraudsters, one of the major contributions to financial losses, by learning network information.
no code implementations • 6 Feb 2020 • Yingting Liu, Chaochao Chen, Longfei Zheng, Li Wang, Jun Zhou, Guiquan Liu, Shuang Yang
In this paper, we present a general multiparty modeling paradigm with Privacy Preserving Principal Component Analysis (PPPCA) for horizontally partitioned data.
no code implementations • 2 Feb 2020 • Xingwen Zhang, Feng Qi, Zhigang Hua, Shuang Yang
Knapsack problems (KPs) are common in industry, but solving KPs is known to be NP-hard and has been tractable only at a relatively small scale.
1 code implementation • 21 Jul 2019 • Xinlei Pan, Chaowei Xiao, Warren He, Shuang Yang, Jian Peng, MingJie Sun, JinFeng Yi, Zijiang Yang, Mingyan Liu, Bo Li, Dawn Song
To the best of our knowledge, we are the first to apply adversarial attacks on DRL systems to physical robots.
no code implementations • 9 Jul 2019 • Jiahui Li, Zhiqiang Hu, Shuang Yang
Nuclear segmentation is important and frequently demanded for pathology image analysis, yet is also challenging due to nuclear crowdedness and possible occlusion.
1 code implementation • 9 Jul 2019 • Jiahui Li, Shuang Yang, Xiaodi Huang, Qian Da, Xiaoqun Yang, Zhiqiang Hu, Qi Duan, Chaofu Wang, Hongsheng Li
Our framework achieves accurate signet ring cell detection and can be readily applied in clinical trials.
no code implementations • The ActivityNet Large-Scale Activity Recognition Challenge Workshop, CVPR 2019 • Yuanhang Zhang, Jingyun Xiao, Shuang Yang, Shiguang Shan
This report describes the approach underlying our submission to the active speaker detection task (task B-2) of ActivityNet Challenge 2019.
Ranked #18 on Audio-Visual Active Speaker Detection on AVA-ActiveSpeaker (using extra training data)
Active Speaker Detection Audio-Visual Active Speaker Detection
2 code implementations • 24th International Conference on Pattern Recognition (ICPR) 2018 • Shuang Yang, Bo Yang
TENE learns the representations of nodes under the guidance of both proximity matrix which captures the network structure and text cluster membership matrix derived from clustering for text information.
2 code implementations • 16 Oct 2018 • Shuang Yang, Yuan-Hang Zhang, Dalu Feng, Mingmin Yang, Chenhao Wang, Jing-Yun Xiao, Keyu Long, Shiguang Shan, Xilin Chen
This benchmark shows large variation in several aspects, including the number of samples in each class, video resolution, lighting conditions, and speakers' attributes such as pose, age, gender, and make-up.
Ranked #2 on Lipreading on LRW-1000
no code implementations • CVPR 2015 • Shuang Yang, Chunfeng Yuan, Baoxin Wu, Weiming Hu, Fangshi Wang
In this paper, a multi-feature max-margin hierarchical Bayesian model (M3HBM) is proposed for action recognition.
no code implementations • CVPR 2013 • Chunfeng Yuan, Weiming Hu, Guodong Tian, Shuang Yang, Haoran Wang
In this paper, we formulate human action recognition as a novel Multi-Task Sparse Learning (MTSL) framework which aims to construct a test sample with multiple features from as few bases as possible.