Search Results for author: Sen Yang

Found 51 papers, 20 papers with code

LPFS: Learnable Polarizing Feature Selection for Click-Through Rate Prediction

no code implementations 1 Jun 2022 Yi Guo, Zhaocheng Liu, Jianchao Tan, Chao Liao, Daqing Chang, Qiang Liu, Sen Yang, Ji Liu, Dongying Kong, Zhi Chen, Chengru Song

When training finishes, some gates are exactly zero while others are close to one. This is particularly favorable for the practical hot-start training common in industry, because removing the features that correspond to exact-zero gates does not damage model performance.
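The exact-zero / near-one gate behavior can be illustrated with a minimal NumPy sketch; the smoothed-L0-style gate function below is an illustrative assumption, not the authors' exact LPFS formulation.

```python
import numpy as np

def polarizing_gates(scores, eps=1e-3):
    # Smoothed-L0-style gate: close to 0 for small learnable scores,
    # close to 1 for large ones (illustrative choice of gate function).
    return scores ** 2 / (scores ** 2 + eps)

scores = np.array([0.001, 0.9, 0.002, 1.2])
gates = polarizing_gates(scores)
keep = gates > 0.5
print(keep.tolist())  # [False, True, False, True]
```

Features whose gates land at (near) zero can be dropped without shifting the model's outputs, which is what makes hot-start retraining safe.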

Click-Through Rate Prediction feature selection

Nuclear Norm Maximization Based Curiosity-Driven Learning

no code implementations 21 May 2022 Chao Chen, Zijian Gao, Kele Xu, Sen Yang, Yiying Li, Bo Ding, Dawei Feng, Huaimin Wang

To handle the sparsity of the extrinsic rewards in reinforcement learning, researchers have proposed intrinsic reward which enables the agent to learn the skills that might come in handy for pursuing the rewards in the future, such as encouraging the agent to visit novel states.
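Going only by the paper's title, a nuclear-norm-based novelty bonus could be sketched as follows; the embedding matrix, its normalization, and the reward scaling are all illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def nuclear_norm_bonus(state_embeddings):
    # Intrinsic reward proportional to the nuclear norm (sum of singular
    # values) of a matrix of recent state embeddings: visiting diverse,
    # novel states raises the bonus; revisiting one state does not.
    return np.linalg.norm(state_embeddings, ord="nuc")

repeated = np.full((4, 4), 0.5)  # four unit-norm copies of the same state
diverse = np.eye(4)              # four orthogonal unit-norm states
assert nuclear_norm_bonus(diverse) > nuclear_norm_bonus(repeated)
```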

Atari Games

Pan-cancer computational histopathology reveals tumor mutational burden status through weakly-supervised deep learning

no code implementations 7 Apr 2022 Siteng Chen, Jinxi Xiang, Xiyue Wang, Jun Zhang, Sen Yang, Junzhou Huang, Wei Yang, Junhua Zheng, Xiao Han

In comparison with state-of-the-art TMB prediction models from previous publications, our multiscale model achieved better performance.

whole slide images

Generating Privacy-Preserving Process Data with Deep Generative Models

1 code implementation 15 Mar 2022 Keyi Li, Sen Yang, Travis M. Sullivan, Randall S. Burd, Ivan Marsic

We experimented with different models of representation learning and used the learned model to generate synthetic process data.

Privacy Preserving Representation Learning

Do Prompts Solve NLP Tasks Using Natural Language?

no code implementations 2 Mar 2022 Sen Yang, Yunchen Zhang, Leyang Cui, Yue Zhang

Thanks to advances in large pre-trained language models, prompt-based fine-tuning has been shown to be effective on a variety of downstream tasks.

Adversarial Contrastive Self-Supervised Learning

no code implementations 26 Feb 2022 Wentao Zhu, Hang Shang, Tingxun Lv, Chao Liao, Sen Yang, Ji Liu

Recently, learning from vast unlabeled data, especially self-supervised learning, has emerged and attracted widespread attention.

Self-Supervised Learning

Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation

1 code implementation 21 Feb 2022 Jiawei Liu, Yuxiang Wei, Sen Yang, Yinlin Deng, Lingming Zhang

Our results show that Tzer substantially outperforms existing fuzzing techniques on tensor compiler testing, with 75% higher coverage and 50% more valuable tests than the 2nd-best technique.

Multi-task Pre-training Language Model for Semantic Network Completion

no code implementations 13 Jan 2022 Da Li, Sen Yang, Kele Xu, Ming Yi, Yukai He, Huaimin Wang

To demonstrate the effectiveness of our method, we conduct extensive experiments on three widely-used datasets, WN18RR, FB15k-237, and UMLS.

Ranked #1 on Link Prediction on WN18RR (using extra training data)

Data Augmentation Knowledge Graph Completion +4

Node-Aligned Graph Convolutional Network for Whole-Slide Image Representation and Classification

1 code implementation CVPR 2022 Yonghang Guan, Jun Zhang, Kuan Tian, Sen Yang, Pei Dong, Jinxi Xiang, Wei Yang, Junzhou Huang, Yuyao Zhang, Xiao Han

In this paper, we propose a hierarchical global-to-local clustering strategy to build a Node-Aligned GCN (NAGCN) to represent WSI with rich local structural information as well as global distribution.

graph construction Multiple Instance Learning +1

TNASP: A Transformer-based NAS Predictor with a Self-evolution Framework

no code implementations NeurIPS 2021 Shun Lu, Jixiang Li, Jianchao Tan, Sen Yang, Ji Liu

Predictor-based Neural Architecture Search (NAS) continues to be an important topic because it aims to mitigate the time-consuming search procedure of traditional NAS methods.

Neural Architecture Search

Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters

1 code implementation 10 Nov 2021 Xiangru Lian, Binhang Yuan, XueFeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen Yang, Ce Zhang, Ji Liu

Specifically, in order to ensure both the training efficiency and the training accuracy, we design a novel hybrid training algorithm, where the embedding layer and the dense neural network are handled by different synchronization mechanisms; then we build a system called Persia (short for parallel recommendation training system with hybrid acceleration) to support this hybrid training algorithm.

Recommendation Systems

Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices

1 code implementation 15 Oct 2021 Tianli Zhao, Xi Sheryl Zhang, Wentao Zhu, Jiaxing Wang, Sen Yang, Ji Liu, Jian Cheng

In this paper, we present a unified framework with Joint Channel pruning and Weight pruning (JCW), which achieves a better Pareto frontier between latency and accuracy than previous model compression approaches.

Model Compression

Investigating Non-local Features for Neural Constituency Parsing

1 code implementation ACL 2022 Leyang Cui, Sen Yang, Yue Zhang

Besides, our method achieves state-of-the-art BERT-based performance on PTB (95.92 F1) and strong performance on CTB (92.31 F1).

Constituency Parsing

Deep Learning Model for Demodulation Reference Signal based Channel Estimation

no code implementations 22 Sep 2021 Yu Tian, Chengguang Li, Sen Yang

In this paper, we propose a deep learning model for Demodulation Reference Signal (DMRS) based channel estimation task.

SpeechNAS: Towards Better Trade-off between Latency and Accuracy for Large-Scale Speaker Verification

1 code implementation 18 Sep 2021 Wentao Zhu, Tianlong Kong, Shun Lu, Jixiang Li, Dawei Zhang, Feng Deng, Xiaorui Wang, Sen Yang, Ji Liu

Recently, x-vector has been a successful and popular approach for speaker verification, which employs a time delay neural network (TDNN) and statistics pooling to extract speaker-characterizing embeddings from variable-length utterances.
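The statistics-pooling step mentioned in the abstract can be sketched in a few lines of NumPy: it collapses a variable-length frame sequence into a fixed-size utterance embedding by concatenating per-dimension means and standard deviations (a standard description of x-vector systems; the dimensions below are arbitrary).

```python
import numpy as np

def statistics_pooling(frames):
    # frames: (T, D) sequence of frame-level features with variable T.
    # Returns a fixed 2*D vector: per-dimension mean and std over time.
    mean = frames.mean(axis=0)
    std = frames.std(axis=0)
    return np.concatenate([mean, std])

# Utterances of different lengths map to embeddings of the same size.
e1 = statistics_pooling(np.random.randn(200, 512))
e2 = statistics_pooling(np.random.randn(57, 512))
assert e1.shape == e2.shape == (1024,)
```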

Neural Architecture Search Speaker Recognition +2

GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization

no code implementations ICCV 2021 Yi Guo, Huan Yuan, Jianchao Tan, Zhangyang Wang, Sen Yang, Ji Liu

During the training process, the polarization effect will drive a subset of gates to smoothly decrease to exact zero, while other gates gradually stay away from zero by a large margin.

Model Compression Network Pruning

Sk-Unet Model with Fourier Domain for Mitosis Detection

no code implementations 1 Sep 2021 Sen Yang, Feng Luo, Jun Zhang, Xiyue Wang

Mitotic count is the most important morphological feature of breast cancer grading.

Mitosis Detection

Shifted Chunk Transformer for Spatio-Temporal Representational Learning

no code implementations NeurIPS 2021 Xuefan Zha, Wentao Zhu, Tingxun Lv, Sen Yang, Ji Liu

However, the pure-Transformer based spatio-temporal learning can be prohibitively costly on memory and computation to extract fine-grained features from a tiny patch.

Action Anticipation Action Recognition +5

PASTO: Strategic Parameter Optimization in Recommendation Systems -- Probabilistic is Better than Deterministic

no code implementations 20 Aug 2021 Weicong Ding, Hanlin Tang, Jingshuo Feng, Lei Yuan, Sen Yang, Guangxu Yang, Jie Zheng, Jing Wang, Qiang Su, Dong Zheng, Xuezhong Qiu, Yongqi Liu, Yuxuan Chen, Yang Liu, Chao Song, Dongying Kong, Kai Ren, Peng Jiang, Qiao Lian, Ji Liu

In this setting with multiple and constrained goals, this paper discovers that a probabilistic strategic parameter regime can achieve better value compared to the standard regime of finding a single deterministic parameter.
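Why a probabilistic parameter can beat every deterministic one under a constraint can be seen in a toy two-setting example (all numbers below are made up for illustration, not from the paper):

```python
# Setting A maximizes value but violates a cost budget; B is feasible
# with slack. Randomizing between them meets the budget in expectation
# while achieving more value than the best deterministic choice (B).
settings = {"A": {"value": 10.0, "cost": 8.0},
            "B": {"value": 4.0, "cost": 2.0}}
budget = 5.0  # constraint: expected cost <= budget

# Choose A with probability p so that expected cost exactly hits the budget.
p = (budget - settings["B"]["cost"]) / (settings["A"]["cost"] - settings["B"]["cost"])
mixed_value = p * settings["A"]["value"] + (1 - p) * settings["B"]["value"]
print(p, mixed_value)  # 0.5 7.0 -- beats the deterministic optimum of 4.0
```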

Recommendation Systems

POSO: Personalized Cold Start Modules for Large-scale Recommender Systems

no code implementations 10 Aug 2021 Shangfeng Dai, Haobin Lin, Zhichen Zhao, Jianying Lin, Honghuan Wu, Zhe Wang, Sen Yang, Ji Liu

Moreover, POSO can be further generalized to regular users, inactive users and returning users (+2%-3% on Watch Time), as well as item cold start (+3.8% on Watch Time).

Recommendation Systems

FINT: Field-aware INTeraction Neural Network For CTR Prediction

1 code implementation 5 Jul 2021 Zhishan Zhao, Sen Yang, Guohui Liu, Dawei Feng, Kele Xu

As a critical component of online advertising and marketing, click-through rate (CTR) prediction has drawn much attention from both industry and academia.

Click-Through Rate Prediction

Template-Based Named Entity Recognition Using BART

1 code implementation Findings (ACL) 2021 Leyang Cui, Yu Wu, Jian Liu, Sen Yang, Yue Zhang

To address the issue, we propose a template-based method for NER, treating NER as a language model ranking problem in a sequence-to-sequence framework, where original sentences and statement templates filled by candidate named entity span are regarded as the source sequence and the target sequence, respectively.
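The ranking setup described above can be sketched with a placeholder scorer; in the paper the score comes from BART sequence probabilities, so the templates and the toy scoring function below are illustrative assumptions.

```python
# Enumerate statement templates for a candidate span, fill the span in,
# and return the label whose filled template the scorer ranks highest.
TEMPLATES = {
    "PER": "{span} is a person entity.",
    "ORG": "{span} is an organization entity.",
    "O": "{span} is not a named entity.",
}

def classify_span(span, score_fn):
    filled = {label: tpl.format(span=span) for label, tpl in TEMPLATES.items()}
    return max(filled, key=lambda label: score_fn(filled[label]))

# Toy scorer that favors the "person" template for this span.
toy_scores = {"Barack Obama is a person entity.": 0.9}
label = classify_span("Barack Obama", lambda s: toy_scores.get(s, 0.1))
print(label)  # PER
```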

few-shot-ner Few-shot NER +4

TokenPose: Learning Keypoint Tokens for Human Pose Estimation

1 code implementation ICCV 2021 YanJie Li, Shoukui Zhang, Zhicheng Wang, Sen Yang, Wankou Yang, Shu-Tao Xia, Erjin Zhou

Most existing CNN-based methods do well in visual representation, however, lacking in the ability to explicitly learn the constraint relationships between keypoints.

Pose Estimation

TransPose: Keypoint Localization via Transformer

1 code implementation ICCV 2021 Sen Yang, Zhibin Quan, Mu Nie, Wankou Yang

While CNN-based models have made remarkable progress on human pose estimation, what spatial dependencies they capture to localize keypoints remains unclear.

Ranked #3 on Pose Estimation on MPII Human Pose (using extra training data)

Keypoint Detection

Ensemble Chinese End-to-End Spoken Language Understanding for Abnormal Event Detection from audio stream

no code implementations 19 Oct 2020 Haoran Wei, Fei Tao, Runze Su, Sen Yang, Ji Liu

Previous end-to-end SLU models are primarily used in English environments due to the lack of large-scale Chinese SLU datasets, and use only one ASR model to extract features from speech.

Automatic Speech Recognition Event Detection +2

What Have We Achieved on Text Summarization?

1 code implementation EMNLP 2020 Dandan Huang, Leyang Cui, Sen Yang, Guangsheng Bao, Kun Wang, Jun Xie, Yue Zhang

Deep learning has led to significant improvement in text summarization with various methods investigated and improved ROUGE scores reported over the years.

Text Summarization

Leveraging Adversarial Training in Self-Learning for Cross-Lingual Text Classification

no code implementations 29 Jul 2020 Xin Dong, Yaxin Zhu, Yupeng Zhang, Zuohui Fu, Dongkuan Xu, Sen Yang, Gerard de Melo

The resulting model then serves as a teacher to induce labels for unlabeled target language samples that can be used during further adversarial training, allowing us to gradually adapt our model to the target language.
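The teacher-student self-learning loop described above can be sketched as follows; the confidence threshold and the toy teacher are illustrative assumptions, and the paper additionally interleaves adversarial training, which is omitted here.

```python
# One round of pseudo-labeling: the source-trained teacher labels
# unlabeled target-language samples, and only confident predictions are
# kept for the next round of training.
def self_training_round(teacher_predict, unlabeled, threshold=0.9):
    pseudo_labeled = []
    for x in unlabeled:
        label, confidence = teacher_predict(x)
        if confidence >= threshold:
            pseudo_labeled.append((x, label))
    return pseudo_labeled

# Toy teacher: confident only on the second sample.
preds = {"sample_a": ("pos", 0.5), "sample_b": ("neg", 0.95)}
batch = self_training_round(lambda x: preds[x], ["sample_a", "sample_b"])
print(batch)  # [('sample_b', 'neg')]
```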

Classification General Classification +3

Event Arguments Extraction via Dilate Gated Convolutional Neural Network with Enhanced Local Features

no code implementations 2 Jun 2020 Zhigang Kan, Linbo Qiao, Sen Yang, Feng Liu, Feng Huang

However, the F-score of event arguments extraction is much lower than that of event trigger extraction; i.e., in the most recent work, event trigger extraction achieves 80.7%, while event arguments extraction achieves only 58%.

Event Extraction

Pose Neural Fabrics Search

2 code implementations 16 Sep 2019 Sen Yang, Wankou Yang, Zhen Cui

Neural Architecture Search (NAS) technologies have emerged in many domains to jointly learn the architectures and weights of the neural network.

Image Classification Keypoint Detection +3

Exploring Pre-trained Language Models for Event Extraction and Generation

no code implementations ACL 2019 Sen Yang, Dawei Feng, Linbo Qiao, Zhigang Kan, Dongsheng Li

Traditional approaches to the task of ACE event extraction usually depend on manually annotated data, which is often laborious to create and limited in size.

Event Extraction General Classification

On the Linear Speedup Analysis of Communication Efficient Momentum SGD for Distributed Non-Convex Optimization

no code implementations 9 May 2019 Hao Yu, Rong Jin, Sen Yang

Recent developments on large-scale distributed machine learning applications, e.g., deep neural networks, benefit enormously from the advances in distributed non-convex optimization techniques, e.g., distributed Stochastic Gradient Descent (SGD).

Shrinking the Upper Confidence Bound: A Dynamic Product Selection Problem for Urban Warehouses

no code implementations 19 Mar 2019 Rong Jin, David Simchi-Levi, Li Wang, Xinshang Wang, Sen Yang

In this paper, we study algorithms for dynamically identifying a large number of products (i.e., SKUs) with top customer purchase probabilities on the fly, from an ocean of potential products to offer on retailers' ultra-fast delivery platforms.
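For intuition, the vanilla UCB1 index that the title proposes to shrink can be sketched as follows; the paper's contribution is a tighter confidence bonus, so the standard form below is just the baseline, with made-up numbers.

```python
import math

def ucb_scores(successes, trials, total_trials, c=2.0):
    # UCB1: empirical purchase rate plus an exploration bonus that grows
    # for rarely tried products and shrinks as a product accumulates trials.
    scores = []
    for s, n in zip(successes, trials):
        mean = s / n
        bonus = math.sqrt(c * math.log(total_trials) / n)
        scores.append(mean + bonus)
    return scores

# The rarely tried product gets a large bonus, encouraging exploration.
scores = ucb_scores(successes=[50, 2], trials=[1000, 10], total_trials=1010)
print(scores)
```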

online learning

Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning

no code implementations 17 Jul 2018 Hao Yu, Sen Yang, Shenghuo Zhu

Ideally, parallel mini-batch SGD can achieve a linear speed-up of the training time (with respect to the number of workers) compared with SGD over a single worker.
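The parallel restarted SGD scheme the title refers to (workers run local SGD, then all models are averaged and restarted from the average) can be sketched on a toy quadratic objective; the objective, step size, and noise model are illustrative assumptions.

```python
import numpy as np

def parallel_restarted_sgd(grad_fn, workers=4, rounds=5, local_steps=10, lr=0.1):
    # Each worker runs `local_steps` SGD updates independently, then the
    # models are averaged once per round -- one communication per round
    # instead of one per step.
    rng = np.random.default_rng(0)
    x = np.zeros(3)
    for _ in range(rounds):
        local_models = []
        for _ in range(workers):
            w = x.copy()
            for _ in range(local_steps):
                w -= lr * grad_fn(w, rng)
            local_models.append(w)
        x = np.mean(local_models, axis=0)  # the single communication step
    return x

# Noisy gradients of f(w) = ||w - 1||^2 / 2; averaging converges near 1.
target = np.ones(3)
x = parallel_restarted_sgd(lambda w, rng: (w - target) + 0.01 * rng.standard_normal(3))
print(np.round(x, 2))
```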

Learning with Non-Convex Truncated Losses by SGD

no code implementations 21 May 2018 Yi Xu, Shenghuo Zhu, Sen Yang, Chi Zhang, Rong Jin, Tianbao Yang

Learning with a {\it convex loss} function has been a dominating paradigm for many years.

Process-oriented Iterative Multiple Alignment for Medical Process Mining

no code implementations 16 Sep 2017 Shuhong Chen, Sen Yang, Moliang Zhou, Randall S. Burd, Ivan Marsic

We applied PIMA to analyzing medical workflow data, showing how iterative alignment can better represent the data and facilitate the extraction of insights from data visualization.

Data Visualization

Multi-task Vector Field Learning

no code implementations NeurIPS 2012 Binbin Lin, Sen Yang, Chiyuan Zhang, Jieping Ye, Xiaofei He

MTVFL has the following key properties: (1) the vector fields we learned are close to the gradient fields of the prediction functions; (2) within each task, the vector field is required to be as parallel as possible which is expected to span a low dimensional subspace; (3) the vector fields from all tasks share a low dimensional subspace.

Multi-Task Learning

Fused Multiple Graphical Lasso

no code implementations 10 Sep 2012 Sen Yang, Zhaosong Lu, Xiaotong Shen, Peter Wonka, Jieping Ye

We expect the two brain networks for NC and MCI to share common structures but not to be identical to each other; similarly for the two brain networks for MCI and AD.
