Search Results for author: Ping Yang

Found 14 papers, 6 papers with code

融合自编码器和对抗训练的中文新词发现方法(Finding Chinese New Word By Combining Self-encoder and Adversarial Training)

no code implementations CCL 2021 Wei Pan, Tianyuan Liu, Yuqing Sun, Bin Gong, Yongman Zhang, Ping Yang

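“The continual emergence of new words is a natural law of language. In specialized domains, for instance, new concepts and entity names are abstract generalizations of certain shared feature sets in the domain, and they often play a specific role in sentences as keywords. New word discovery directly affects Chinese word segmentation results and the performance of downstream text semantic understanding tasks, making it an important task in natural language processing research. This paper proposes a Chinese new word discovery model that combines an autoencoder with adversarial training. The model is pre-trained with a character-level autoencoder in an unsupervised self-learning manner, which effectively extracts semantic information, is unaffected by word segmentation results, and applies to text from different domains. To introduce general linguistic knowledge, prior syntactic parsing results are added, and a domain-shared encoder fuses semantic and syntactic information to improve the accuracy of segmenting ambiguous words. An adversarial training mechanism is used to extract domain-independent features and reduce the dependence on manually annotated corpora. The experiments evaluate the new word discovery task on datasets from six different specialized domains, and the results show that the proposed model outperforms other existing methods; ablation experiments verify the effectiveness of each module in detail. Comparative experiments with different types of source-domain data and different amounts of target-domain data verify the robustness of the model. Finally, the encodings that the autoencoder and the shared encoder produce for data from different domains are compared visually, showing that adversarial training can effectively extract information on both the correlations and the differences between them.”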

Ziya2: Data-centric Learning is All LLMs Need

no code implementations 6 Nov 2023 Ruyi Gan, Ziwei Wu, Renliang Sun, Junyu Lu, XiaoJun Wu, Dixiang Zhang, Kunhao Pan, Ping Yang, Qi Yang, Jiaxing Zhang, Yan Song

Although many such issues are addressed along the line of research on LLMs, an important yet practical limitation is that many studies overly pursue enlarging model sizes without comprehensively analyzing and optimizing the use of pre-training data in their learning process, as well as appropriate organization and leveraging of such data in training LLMs under cost-effective settings.

Hawkeye: Change-targeted Testing for Android Apps based on Deep Reinforcement Learning

no code implementations 4 Sep 2023 Chao Peng, Zhengwei Lv, Jiarong Fu, Jiayuan Liang, Zhao Zhang, Ajitha Rajan, Ping Yang

We find that Hawkeye is able to generate GUI event sequences targeting changed functions more reliably than FastBot2 and ARES for both the open-source apps and the large commercial app.

reinforcement-learning

Learning Weakly Supervised Audio-Visual Violence Detection in Hyperbolic Space

1 code implementation 30 May 2023 Xiaogang Peng, Hao Wen, Yikai Luo, Xiao Zhou, Keyang Yu, Ping Yang, Zizhao Wu

To overcome this, we propose HyperVD, a novel framework that learns snippet embeddings in hyperbolic space to improve model discrimination.

Anomaly Detection In Surveillance Videos

UniEX: An Effective and Efficient Framework for Unified Information Extraction via a Span-extractive Perspective

no code implementations 17 May 2023 Ping Yang, Junyu Lu, Ruyi Gan, Junjie Wang, Yuxiang Zhang, Jiaxing Zhang, Pingjian Zhang

We propose a new paradigm for universal information extraction (IE) that is compatible with any schema format and applicable to a list of IE tasks, such as named entity recognition, relation extraction, event extraction and sentiment analysis.

Event Extraction named-entity-recognition +3

Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective

1 code implementation 16 Oct 2022 Ping Yang, Junjie Wang, Ruyi Gan, Xinyu Zhu, Lin Zhang, Ziwei Wu, Xinyu Gao, Jiaxing Zhang, Tetsuya Sakai

We propose a new paradigm for zero-shot learners that is format-agnostic, i.e., it is compatible with any format and applicable to a range of language tasks, such as text classification, commonsense reasoning, coreference resolution, and sentiment analysis.

Multiple-choice Natural Language Inference +4

Towards No.1 in CLUE Semantic Matching Challenge: Pre-trained Language Model Erlangshen with Propensity-Corrected Loss

1 code implementation 5 Aug 2022 Junjie Wang, Yuxiang Zhang, Ping Yang, Ruyi Gan

This report describes Erlangshen, a pre-trained language model with propensity-corrected loss that ranked No. 1 in the CLUE Semantic Matching Challenge.

Language Modelling Masked Language Modeling

Unified BERT for Few-shot Natural Language Understanding

no code implementations 24 Jun 2022 Junyu Lu, Ping Yang, Ruyi Gan, Jing Yang, Jiaxing Zhang

Even as pre-trained language models share a semantic encoder, natural language understanding suffers from a diversity of output schemas.

Natural Language Understanding

Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval

1 code implementation 7 Mar 2022 Dingkun Long, Qiong Gao, Kuan Zou, Guangwei Xu, Pengjun Xie, Ruijie Guo, Jian Xu, Guanjun Jiang, Luxi Xing, Ping Yang

We find that the performance of retrieval models trained on general-domain datasets inevitably decreases on specific domains.

Passage Retrieval Retrieval

Machine Learning Applications in Lung Cancer Diagnosis, Treatment and Prognosis

no code implementations 5 Mar 2022 Yawei Li, Xin Wu, Ping Yang, Guoqian Jiang, Yuan Luo

The recent development of imaging and sequencing technologies enables systematic advances in the clinical study of lung cancer.

BIG-bench Machine Learning Lung Cancer Diagnosis

Image Denoising Using Sparsifying Transform Learning and Weighted Singular Values Minimization

no code implementations 2 Apr 2020 Yanwei Zhao, Ping Yang, Qiu Guan, Jianwei Zheng, Wanliang Wang

By taking advantage of both the image domain and the transform domain in a general framework, we propose a sparsifying transform learning and weighted singular values minimization method (STLWSM) for image denoising (IDN) problems.

Image Denoising
