no code implementations • 10 Apr 2025 • Tingwei Lu, Yangning Li, Liyuan Wang, Binghuai Lin, Jiwei Tang, Wanshi Xu, Hai-Tao Zheng, Yinghui Li, Bingxu An, Zhao Wei, Yong Xu
The emergence of large language models (LLMs) has significantly promoted the development of the code generation task, sparking a surge in pertinent literature.
no code implementations • 9 Apr 2025 • Lv Qingsong, Yangning Li, Zihua Lan, Zishan Xu, Jiwei Tang, Yinghui Li, Wenhao Jiang, Hai-Tao Zheng, Philip S. Yu
We therefore design RAISE (Reinforced Adaptive Instruction SElection), a dynamic, task-objective-driven instruction selection framework that incorporates the entire instruction fine-tuning process into optimization, selecting instructions at each step based on their expected impact on model performance.
no code implementations • 9 Apr 2025 • Yangning Li, Zihua Lan, Lv Qingsong, Yinghui Li, Hai-Tao Zheng
As Large Language Models (LLMs) are increasingly applied across various tasks, instruction tuning has emerged as a critical method for enhancing model performance.
no code implementations • 9 Mar 2025 • Yanyu Zhu, Licheng Bai, Jintao Xu, Jiwei Tang, Hai-Tao Zheng
Recent advances in diffusion-based lip-syncing generative models have demonstrated their ability to produce highly synchronized talking face videos for visual dubbing.
no code implementations • 21 Feb 2025 • Jingheng Ye, Shang Qin, Yinghui Li, Hai-Tao Zheng, Shen Wang, Qingsong Wen
Grammatical Error Correction (GEC) faces a critical challenge concerning explainability, notably when GEC systems are designed for language learners.
no code implementations • 17 Feb 2025 • Deqing Zou, Jingheng Ye, Yulu Liu, Yu Wu, Zishan Xu, Yinghui Li, Hai-Tao Zheng, Bingxu An, Zhao Wei, Yong Xu
Grammatical error classification plays a crucial role in language learning systems, but existing classification taxonomies often lack rigorous validation, leading to inconsistencies and unreliable feedback.
no code implementations • 17 Feb 2025 • Shaoshen Chen, Yangning Li, Zishan Xu, Yinghui Li, Xin Su, Zifei Shan, Hai-Tao Zheng
Large Language Models (LLMs) face computational inefficiencies and redundant processing when handling long context inputs, prompting a focus on compression techniques.
no code implementations • 12 Feb 2025 • Yinghui Li, Jiayi Kuang, Haojing Huang, Zhikun Xu, Xinnian Liang, Yi Yu, Wenlian Lu, Yangning Li, Xiaoyu Tan, Chao Qu, Ying Shen, Hai-Tao Zheng, Philip S. Yu
Inspired by the pedagogical method of "proof by counterexamples" commonly used in human mathematics education, our work aims to enhance LLMs' ability to conduct mathematical reasoning and proof through counterexamples.
no code implementations • 11 Feb 2025 • Yinghui Li, Haojing Huang, Jiayi Kuang, Yangning Li, Shu-Yu Guo, Chao Qu, Xiaoyu Tan, Hai-Tao Zheng, Ying Shen, Philip S. Yu
In our work, by imitating the human learning process, we design an Adaptive Contrastive Learning strategy.
no code implementations • 31 Dec 2024 • Hebin Wang, Yangning Li, Yinghui Li, Hai-Tao Zheng, Wenhao Jiang, Hong-Gee Kim
The rapid development of multimodal large language models (MLLMs) has brought significant improvements to a wide range of tasks in real-world applications.
no code implementations • 31 Dec 2024 • Ding Zhang, Yangning Li, Lichen Bai, Hao Zhang, Yinghui Li, Haiye Lin, Hai-Tao Zheng, Xin Su, Zifei Shan
Chinese grammatical error correction (CGEC) aims to detect and correct errors in the input Chinese sentences.
no code implementations • 7 Nov 2024 • Xingyu Lu, Yuhang Hu, Changyi Liu, Tianke Zhang, Zhenyu Yang, Zhixiang Ding, Shengsheng Qian, Meng Du, Ruiwen Kang, Kaiyu Tang, Fan Yang, Tingting Gao, Di Zhang, Hai-Tao Zheng, Bin Wen
In this work, we define mathematical problem-solving as a process of transiting from an initial unsolved state to the final resolved state, and propose Kwai-STaR framework, which transforms LLMs into State-Transition Reasoners to improve their intuitive reasoning capabilities.
1 code implementation • 5 Nov 2024 • Yangning Li, Yinghui Li, Xinyu Wang, Yong Jiang, Zhen Zhang, Xinran Zheng, Hui Wang, Hai-Tao Zheng, Pengjun Xie, Philip S. Yu, Fei Huang, Jingren Zhou
To bridge the dataset gap, we first construct the Dyn-VQA dataset, consisting of three types of "dynamic" questions, which require complex knowledge retrieval strategies that vary in query, tool, and time: (1) Questions with rapidly changing answers.
no code implementations • 28 Sep 2024 • Jiwei Tang, Jin Xu, Tingwei Lu, Zhicheng Zhang, Yiming Zhao, Lin Hai, Hai-Tao Zheng
Large language models (LLMs) demonstrate exceptional capabilities in various scenarios.
no code implementations • 29 Jul 2024 • Hongming Tan, Shaoxiong Zhan, Hai Lin, Hai-Tao Zheng, Wai Kin Chan
In dense retrieval, embedding long texts into dense vectors can result in information loss, leading to inaccurate query-text matching.
no code implementations • 1 Jul 2024 • Jingheng Ye, Yong Jiang, Xiaobin Wang, Yinghui Li, Yangning Li, Hai-Tao Zheng, Pengjun Xie, Fei Huang
To address this task, we propose ProductAgent, a conversational information seeking agent equipped with abilities of strategic clarification question generation and dynamic product retrieval.
no code implementations • 1 Jul 2024 • Jingheng Ye, Shang Qin, Yinghui Li, Xuxin Cheng, Libo Qin, Hai-Tao Zheng, Peng Xing, Zishan Xu, Guo Cheng, Zhao Wei
Existing studies explore the explainability of Grammatical Error Correction (GEC) in a limited scenario, where they ignore the interaction between corrections and explanations.
no code implementations • 1 Jul 2024 • Jingheng Ye, Zishan Xu, Yinghui Li, Xuxin Cheng, Linlin Song, Qingyu Zhou, Hai-Tao Zheng, Ying Shen, Xin Su
The paper focuses on improving the interpretability of Grammatical Error Correction (GEC) metrics, which has received little attention in previous studies.
1 code implementation • 13 Mar 2024 • Xingyu Lu, He Cao, Zijing Liu, Shengyuan Bai, Leqing Chen, Yuan YAO, Hai-Tao Zheng, Yu Li
Large language models are playing an increasingly significant role in molecular research, yet existing models often generate erroneous information, posing challenges to accurate molecular comprehension.
1 code implementation • 7 Mar 2024 • Yangning Li, Qingsong Lv, Tianyu Yu, Yinghui Li, Shulin Huang, Tingwei Lu, Xuming Hu, Wenhao Jiang, Hai-Tao Zheng, Hui Wang
To solve this issue, we first introduce negative seed entities in the inputs, which belong to the same fine-grained semantic class as the positive seed entities but differ in certain attributes.
1 code implementation • 29 Feb 2024 • Zhikun Xu, Yinghui Li, Ruixue Ding, Xinyu Wang, Boli Chen, Yong Jiang, Hai-Tao Zheng, Wenlian Lu, Pengjun Xie, Fei Huang
To promote the improvement of Chinese LLMs' ability to answer dynamic questions, in this paper, we introduce CDQA, a Chinese Dynamic QA benchmark containing question-answer pairs related to the latest news on the Chinese Internet.
no code implementations • 18 Feb 2024 • Yinghui Li, Shang Qin, Haojing Huang, Yangning Li, Libo Qin, Xuming Hu, Wenhao Jiang, Hai-Tao Zheng, Philip S. Yu
To promote the CGEC field to better adapt to the era of LLMs, we rethink the roles of LLMs in the CGEC task so that they can be better utilized and explored in CGEC.
no code implementations • 18 Feb 2024 • Peng Xing, Yinghui Li, Shirong Ma, Xinnian Liang, Haojing Huang, Yangning Li, Hai-Tao Zheng, Wenhao Jiang, Ying Shen
Chinese Spelling Correction (CSC) aims to detect and correct spelling errors in given sentences.
1 code implementation • 16 Feb 2024 • Yinghui Li, Qingyu Zhou, Yuanzhen Luo, Shirong Ma, Yangning Li, Hai-Tao Zheng, Xuming Hu, Philip S. Yu
In this paper, we challenge the reasoning and understanding abilities of LLMs by proposing a FaLlacy Understanding Benchmark (FLUB) containing cunning texts that are easy for humans to understand but difficult for models to grasp.
no code implementations • 25 Dec 2023 • Shirong Ma, Shen Huang, Shulin Huang, Xiaobin Wang, Yangning Li, Hai-Tao Zheng, Pengjun Xie, Fei Huang, Yong Jiang
Experimental results demonstrate the effectiveness of continual pre-training of E-commerce LLMs and the efficacy of our devised data mixing strategy.
3 code implementations • CVPR 2024 • Tianyu Yu, Yuan YAO, Haoye Zhang, Taiwen He, Yifeng Han, Ganqu Cui, Jinyi Hu, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun, Tat-Seng Chua
Multimodal Large Language Models (MLLMs) have recently demonstrated impressive capabilities in multimodal understanding, reasoning, and interaction.
Ranked #1 on Visual Question Answering on VQA v2
1 code implementation • 19 Nov 2023 • Yinghui Li, Zishan Xu, Shaoshen Chen, Haojing Huang, Yangning Li, Yong Jiang, Zhongli Li, Qingyu Zhou, Hai-Tao Zheng, Ying Shen
To the best of our knowledge, Visual-C$^3$ is the first real-world visual and the largest human-crafted dataset for the Chinese character checking scenario.
1 code implementation • 18 Oct 2023 • Jingheng Ye, Yinghui Li, Yangning Li, Hai-Tao Zheng
In this paper, we aim to clarify how data augmentation improves GEC models.
1 code implementation • 13 Oct 2023 • Haojing Huang, Jingheng Ye, Qingyu Zhou, Yinghui Li, Yangning Li, Feng Zhou, Hai-Tao Zheng
In recent years, Chinese Spelling Check (CSC) has been greatly improved by designing task-specific pre-training methods or introducing auxiliary tasks, which mostly solve this task in an end-to-end fashion.
2 code implementations • 1 Oct 2023 • Tianyu Yu, Jinyi Hu, Yuan YAO, Haoye Zhang, Yue Zhao, Chongyi Wang, Shan Wang, Yinxv Pan, Jiao Xue, Dahai Li, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun
The capabilities of MLLMs depend on two crucial factors: the model architecture that facilitates feature alignment between visual modules and large language models, and the multimodal instruction-tuning datasets for human instruction following.
no code implementations • 15 Sep 2023 • Yulin Chen, Ning Ding, Hai-Tao Zheng, Zhiyuan Liu, Maosong Sun, BoWen Zhou
Artificial intelligence has been applied in various aspects of online education to facilitate teaching and learning.
no code implementations • 10 Sep 2023 • Chaiyut Luoyiching, Yangning Li, Yinghui Li, Rongsheng Li, Hai-Tao Zheng, Nannan Zhou, Hanjing Su
Previous GFSID methods rely on the episodic learning paradigm, which makes it hard to extend to a generalized setup as they do not explicitly learn the classification of seen categories and the knowledge of seen intents.
no code implementations • 10 Sep 2023 • Rongsheng Li, Yangning Li, Yinghui Li, Chaiyut Luoyiching, Hai-Tao Zheng, Nannan Zhou, Hanjing Su
However, due to the limited training data in the meta-learning scenario and the inherent properties of parameterized neural networks, poor generalization performance has become a pressing problem that needs to be addressed.
no code implementations • 7 Sep 2023 • Zilin Yuan, Borun Chen, Yimeng Dai, Yinghui Li, Hai-Tao Zheng, Rui Zhang
CIFAL leverages anchor learning, which is model-agnostic for any pre-trained language model, to help capture citation patterns from data in different citation styles.
1 code implementation • 22 Aug 2023 • Jinpeng Wang, Ziyun Zeng, Yunxiao Wang, Yuting Wang, Xingyu Lu, Tianxiang Li, Jun Yuan, Rui Zhang, Hai-Tao Zheng, Shu-Tao Xia
We propose MISSRec, a multi-modal pre-training and transfer learning framework for SR. On the user side, we design a Transformer-based encoder-decoder model, where the contextual encoder learns to capture the sequence-level multi-modal user interests while a novel interest-aware decoder is developed to grasp item-modality-interest relations for better sequence representation.
1 code implementation • 21 Aug 2023 • Tianyu Yu, Chengyue Jiang, Chao Lou, Shen Huang, Xiaobin Wang, Wei Liu, Jiong Cai, Yangning Li, Yinghui Li, Kewei Tu, Hai-Tao Zheng, Ningyu Zhang, Pengjun Xie, Fei Huang, Yong Jiang
However, LLMs are sometimes too footloose for natural language understanding (NLU) tasks, which typically have restricted input and output formats.
1 code implementation • 21 Aug 2023 • Shulin Huang, Shirong Ma, Yinghui Li, Mengzuo Huang, Wuhe Zou, Weidong Zhang, Hai-Tao Zheng
With the continuous evolution and refinement of LLMs, they are endowed with impressive logical reasoning or vertical thinking capabilities.
1 code implementation • 14 Aug 2023 • Yangning Li, Shirong Ma, Xiaobin Wang, Shen Huang, Chengyue Jiang, Hai-Tao Zheng, Pengjun Xie, Fei Huang, Yong Jiang
EcomInstruct scales up the data size and task diversity by constructing atomic tasks from E-commerce basic data types, such as product information and user reviews.
1 code implementation • 27 Jul 2023 • Yangning Li, Tingwei Lu, Yinghui Li, Tianyu Yu, Shulin Huang, Hai-Tao Zheng, Rui Zhang, Jun Yuan
The Entity Set Expansion (ESE) task aims to expand a handful of seed entities with new entities belonging to the same semantic class.
no code implementations • 18 Jul 2023 • Yinghui Li, Haojing Huang, Shirong Ma, Yong Jiang, Yangning Li, Feng Zhou, Hai-Tao Zheng, Qingyu Zhou
Recently, the development and progress of Large Language Models (LLMs) have amazed the entire Artificial Intelligence community.
no code implementations • 30 Jun 2023 • Yinghui Li, Shirong Ma, Shaoshen Chen, Haojing Huang, Shulin Huang, Yangning Li, Hai-Tao Zheng, Ying Shen
During the training process, ProTEC guides the model to learn text error correction by incorporating these sub-tasks into a progressive paradigm.
no code implementations • Neurocomputing 2023 • Hanqing Liu, Jiacheng Yang, Chia-Hao Chang, Wei Wang, Hai-Tao Zheng, Yong Jiang, Hui Wang, Rui Xie, and Wei Wu
Moreover, the existing method of alleviating error accumulation based on replacing reference words does not take into account the different effects of each word.
Ranked #20 on Visual Storytelling on VIST
1 code implementation • 21 Jun 2023 • Yinghui Li, Yong Jiang, Yangning Li, Xingyu Lu, Pengjun Xie, Ying Shen, Hai-Tao Zheng
Entity Linking (EL) is a fundamental task for Information Extraction and Knowledge Graphs.
no code implementations • 20 Jun 2023 • Kai Ouyang, Xianghong Xu, Miaoxin Chen, Zuotong Xie, Hai-Tao Zheng, Shuangyong Song, Yu Zhao
Session-based Recommendation (SR) aims to predict users' next click based on their behavior within a short period, which is crucial for online platforms.
no code implementations • 31 May 2023 • Yulin Chen, Ning Ding, Xiaobin Wang, Shengding Hu, Hai-Tao Zheng, Zhiyuan Liu, Pengjun Xie
Consistently scaling pre-trained language models (PLMs) imposes substantial burdens on model adaptation, necessitating more efficient alternatives to conventional fine-tuning.
1 code implementation • 18 May 2023 • Jingheng Ye, Yinghui Li, Qingyu Zhou, Yangning Li, Shirong Ma, Hai-Tao Zheng, Ying Shen
Evaluating the performance of Grammatical Error Correction (GEC) systems is a challenging task due to its subjectivity.
2 code implementations • NeurIPS 2023 • Tong Wu, Zhihao Fan, Xiao Liu, Yeyun Gong, Yelong Shen, Jian Jiao, Hai-Tao Zheng, Juntao Li, Zhongyu Wei, Jian Guo, Nan Duan, Weizhu Chen
Diffusion models have gained significant attention in the realm of image generation due to their exceptional performance.
no code implementations • 12 May 2023 • Kai Ouyang, Chen Tang, Wenhao Zheng, Xiangjin Xie, Xuanji Xiao, Jian Dong, Hai-Tao Zheng, Zhi Wang
To address this issue, we propose using knowledge soft integration to balance the utilization of multimodal features and the curse of knowledge problem it brings about.
no code implementations • 7 Apr 2023 • Shulin Huang, Shirong Ma, Yangning Li, Yinghui Li, Hai-Tao Zheng
For efficiency, expansion time consumed by GenExpan is independent of entity vocabulary and corpus size, and GenExpan achieves an average 600% speedup compared to strong baselines.
1 code implementation • 5 Apr 2023 • Jifan Yu, Mengying Lu, Qingyang Zhong, Zijun Yao, Shangqing Tu, Zhengshan Liao, Xiaoya Li, Manli Li, Lei Hou, Hai-Tao Zheng, Juanzi Li, Jie Tang
Student modeling, the task of inferring a student's learning characteristics through their interactions with coursework, is a fundamental issue in intelligent education.
no code implementations • 3 Apr 2023 • Kai Ouyang, Wenhao Zheng, Chen Tang, Xuanji Xiao, Hai-Tao Zheng
To tackle this issue, we argue that a trade-off should be achieved between the introduction of large amounts of auxiliary information and the protection of valuable information related to CVR.
1 code implementation • 16 Mar 2023 • Tong Wu, Hao Wang, Zhongshen Zeng, Wei Wang, Hai-Tao Zheng, Jiaxing Zhang
Recently, there has been a surge in the use of generated data to enhance the performance of downstream models, largely due to the advancements in pre-trained language models.
no code implementations • 9 Mar 2023 • Tianyu Yu, Yangning Li, Jiaoyan Chen, Yinghui Li, Hai-Tao Zheng, Xi Chen, Qingbin Liu, Wenqiang Liu, Dongxiao Huang, Bei Wu, Yexin Wang
Inspired by this, we devise a knowledge-augmented, few-shot VRD framework leveraging both textual knowledge and visual relation knowledge to improve the generalization ability of few-shot VRD.
no code implementations • 17 Feb 2023 • Yangning Li, Jiaoyan Chen, Yinghui Li, Yuejia Xiang, Xi Chen, Hai-Tao Zheng
Entity alignment (EA) for knowledge graphs (KGs) plays a critical role in knowledge engineering.
1 code implementation • 14 Dec 2022 • Wenye Lin, Yifeng Ding, Zhixiong Cao, Hai-Tao Zheng
A common practice to address this problem is to introduce a pretrained contrastive teacher model and train the lightweight networks with distillation signals generated by the teacher.
1 code implementation • 22 Nov 2022 • Yuan YAO, Tianyu Yu, Ao Zhang, Mengdi Li, Ruobing Xie, Cornelius Weber, Zhiyuan Liu, Hai-Tao Zheng, Stefan Wermter, Tat-Seng Chua, Maosong Sun
In this work, we present CLEVER, which formulates CKE as a distantly supervised multi-instance learning problem, where models learn to summarize commonsense relations from a bag of images about an entity pair without any human annotation on image instances.
no code implementations • 20 Nov 2022 • Yangning Li, Jiaoyan Chen, Yinghui Li, Tianyu Yu, Xi Chen, Hai-Tao Zheng
Extensive experiments demonstrate that PICSO can dramatically outperform the original PLMs and the other knowledge and synonym injection models on four different similarity-oriented tasks.
no code implementations • 10 Nov 2022 • Ning Ding, Yulin Chen, Ganqu Cui, Xiaobin Wang, Hai-Tao Zheng, Zhiyuan Liu, Pengjun Xie
Moreover, it is more convenient to perform metric-based classification with hypersphere prototypes than statistical modeling, as we only need to calculate the distance from a data point to the surface of the hypersphere.
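The metric-based classification with hypersphere prototypes described above can be sketched as follows; the centers, radii, and test point are illustrative values, not taken from the paper:

```python
import numpy as np

def hypersphere_distance(x, center, radius):
    """Distance from a data point to the *surface* of a hypersphere prototype.

    Following the abstract, a class prototype is a hypersphere (center + radius),
    and classification only needs the point-to-surface distance: | ||x - c|| - r |.
    """
    return abs(np.linalg.norm(x - center) - radius)

# Two hypothetical class prototypes, each a (center, radius) pair.
proto_a = (np.zeros(4), 1.0)
proto_b = (np.ones(4) * 3.0, 0.5)

# Classify a point by the nearest hypersphere surface.
x = np.array([0.9, 0.0, 0.0, 0.0])
d_a = hypersphere_distance(x, *proto_a)  # |0.9 - 1.0| = 0.1
d_b = hypersphere_distance(x, *proto_b)
pred = "A" if d_a < d_b else "B"
```

The comparison is a single norm and a subtraction per class, which is what makes this cheaper than fitting a statistical model per class.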
no code implementations • 8 Nov 2022 • Yangning Li, Yinghui Li, Xi Chen, Hai-Tao Zheng, Ying Shen, Hong-Gee Kim
Open Relation Extraction (OpenRE) aims to discover novel relations from open domains.
1 code implementation • 29 Oct 2022 • Shulin Huang, Shirong Ma, Yinghui Li, Yangning Li, Shiyang Lin, Hai-Tao Zheng, Ying Shen
Facing this dilemma, we focus on a novel CTG scenario, i.e., blessing generation, which is challenging because high-quality blessing texts require CTG models to comprehensively consider the entanglement between multiple attributes (e.g., objects and occasions).
no code implementations • 27 Oct 2022 • Zilin Yuan, Yinghui Li, Yangning Li, Rui Xie, Wei Wu, Hai-Tao Zheng
We note that domain-specific features differ in distinctness across domains, so in this paper we propose a curriculum learning strategy based on keyword weight ranking to improve the performance of multi-domain text classification models.
no code implementations • 23 Oct 2022 • Jingheng Ye, Yinghui Li, Shirong Ma, Rui Xie, Wei Wu, Hai-Tao Zheng
Chinese Grammatical Error Correction (CGEC) aims to automatically detect and correct grammatical errors contained in Chinese text.
no code implementations • COLING 2022 • Borun Chen, Hongyin Tang, Jiahao Bu, Kai Zhang, Jingang Wang, Qifan Wang, Hai-Tao Zheng, Wei Wu, Liqian Yu
However, most current models use Chinese characters as inputs and are not able to encode semantic information contained in Chinese words.
1 code implementation • 17 Jul 2022 • Yinghui Li, Shulin Huang, Xinwei Zhang, Qingyu Zhou, Yangning Li, Ruiyang Liu, Yunbo Cao, Hai-Tao Zheng, Ying Shen
In addition, we propose the GAPA, a novel ESE framework that leverages the aforementioned GenerAted PAtterns to expand target entities.
no code implementations • 17 Jul 2022 • Ding Zhang, Yinghui Li, Qingyu Zhou, Shirong Ma, Yangning Li, Yunbo Cao, Hai-Tao Zheng
Chinese Spell Checking (CSC) task aims to detect and correct Chinese spelling errors.
1 code implementation • 16 Apr 2022 • Yinghui Li, Yangning Li, Yuxin He, Tianyu Yu, Ying Shen, Hai-Tao Zheng
In addition, we propose the ProbExpan, a novel probabilistic ESE framework utilizing the entity representation obtained by the aforementioned language model to expand entities.
1 code implementation • Findings (ACL) 2022 • Fanchao Qi, Chuancheng Lv, Zhiyuan Liu, Xiaojun Meng, Maosong Sun, Hai-Tao Zheng
In this paper, we utilize the multilingual synonyms, multilingual glosses and images in BabelNet for SPBS.
1 code implementation • 14 Mar 2022 • Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, Jing Yi, Weilin Zhao, Xiaozhi Wang, Zhiyuan Liu, Hai-Tao Zheng, Jianfei Chen, Yang Liu, Jie Tang, Juanzi Li, Maosong Sun
This necessitates a new branch of research focusing on the parameter-efficient adaptation of PLMs, dubbed as delta tuning in this paper.
1 code implementation • 9 Mar 2022 • Wenye Lin, Yangming Li, Lemao Liu, Shuming Shi, Hai-Tao Zheng
Specifically, we transfer the knowledge from a teacher model to its student model by locally matching their predictions on all sub-structures, instead of the whole output space.
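The local-matching idea above — distilling per-sub-structure distributions instead of the exponentially large whole-output space — can be sketched with a per-position KL term; the logits and temperature here are illustrative, not from the paper:

```python
import numpy as np

def local_substructure_kd(teacher_logits, student_logits, T=2.0):
    """Sub-structure-level distillation loss (a sketch).

    Rather than matching teacher and student over whole output sequences,
    match them locally: one KL divergence per position (sub-structure),
    averaged over positions.
    """
    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    p = softmax(teacher_logits / T)  # teacher distribution at each position
    q = softmax(student_logits / T)  # student distribution at each position
    kl = (p * (np.log(p) - np.log(q))).sum(axis=-1)  # KL per position
    return kl.mean()

# Toy sequence of 3 positions with 4 labels each (hypothetical logits).
t = np.array([[2.0, 0.1, 0.1, 0.1],
              [0.1, 2.0, 0.1, 0.1],
              [0.1, 0.1, 2.0, 0.1]])
loss_same = local_substructure_kd(t, t)        # identical predictions -> 0
loss_diff = local_substructure_kd(t, t[::-1])  # mismatched student -> positive
```

Summing KL terms position-by-position keeps the loss tractable for structured outputs, which is the point of matching on sub-structures.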
no code implementations • Findings (ACL) 2022 • Yinghui Li, Qingyu Zhou, Yangning Li, Zhongli Li, Ruiyang Liu, Rongyi Sun, Zizhen Wang, Chao Li, Yunbo Cao, Hai-Tao Zheng
However, there exists a gap between the learned knowledge of PLMs and the goal of CSC task.
1 code implementation • 7 Nov 2021 • Ruiyang Liu, Yinghui Li, Linmi Tao, Dun Liang, Hai-Tao Zheng
In the GPU era, locally and globally weighted summations are the current mainstream operations, represented by convolution, the self-attention mechanism, and MLP.
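The unifying "weighted summation" view named above can be made concrete with a toy sequence of tokens: convolution uses fixed local weights, self-attention uses input-dependent global weights, and a token-mixing MLP uses learned fixed global weights. All shapes and values here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))  # 5 tokens, 8 channels

# Local weighted summation (convolution-like): a fixed kernel over a window,
# with edge replication at the boundaries.
w = np.array([0.25, 0.5, 0.25])
conv_out = np.stack([
    sum(w[j] * X[min(max(i + j - 1, 0), 4)] for j in range(3))
    for i in range(5)
])

# Global weighted summation (self-attention-like): weights computed from
# the input itself, normalized so each row of weights sums to 1.
scores = X @ X.T / np.sqrt(8)
attn = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
attn_out = attn @ X

# Global weighted summation (MLP-like token mixing): a learned but
# input-independent weight matrix over token positions.
M = rng.standard_normal((5, 5))
mlp_out = M @ X
```

All three produce an output of the same shape as the input; they differ only in where the summation weights come from.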
2 code implementations • ACL 2022 • Ning Ding, Shengding Hu, Weilin Zhao, Yulin Chen, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun
Prompt-learning has become a new paradigm in modern natural language processing, which directly adapts pre-trained language models (PLMs) to cloze-style prediction, autoregressive modeling, or sequence-to-sequence generation, resulting in promising performances on various tasks.
no code implementations • 19 Oct 2021 • Rongyi Sun, Borun Chen, Qingyu Zhou, Yinghui Li, Yunbo Cao, Hai-Tao Zheng
Existing text- and image-based multimodal dialogue systems use the traditional Hierarchical Recurrent Encoder-Decoder (HRED) framework, which has an utterance-level encoder to model utterance representation and a context-level encoder to model context representation.
no code implementations • 29 Sep 2021 • Ning Ding, Yulin Chen, Xiaobin Wang, Hai-Tao Zheng, Zhiyuan Liu, Pengjun Xie
A big prototype could be effectively modeled by two sets of learnable parameters: one is the center of the hypersphere, an embedding with the same dimension as the training examples.
no code implementations • 24 Aug 2021 • Ning Ding, Yulin Chen, Xu Han, Guangwei Xu, Pengjun Xie, Hai-Tao Zheng, Zhiyuan Liu, Juanzi Li, Hong-Gee Kim
In this work, we investigate the application of prompt-learning on fine-grained entity typing in fully supervised, few-shot and zero-shot scenarios.
no code implementations • 6 Aug 2021 • Wei Wang, Piji Li, Hai-Tao Zheng
In the phase of surface realization, a mixed-granularity sentence decoder is designed to generate text with better consistency by jointly incorporating the predicted sentence-level main idea as well as the preceding contextual token-level information.
1 code implementation • ACL 2021 • Dong Wang, Ning Ding, Piji Li, Hai-Tao Zheng
Recent works aiming to improve the robustness of pre-trained models mainly focus on adversarial training from perturbed examples with similar semantics, neglecting the utilization of different or even opposite semantics.
7 code implementations • ACL 2021 • Ning Ding, Guangwei Xu, Yulin Chen, Xiaobin Wang, Xu Han, Pengjun Xie, Hai-Tao Zheng, Zhiyuan Liu
In this paper, we present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types.
Ranked #6 on Named Entity Recognition (NER) on Few-NERD (SUP)
1 code implementation • ICLR 2021 • Ning Ding, Xiaobin Wang, Yao Fu, Guangwei Xu, Rui Wang, Pengjun Xie, Ying Shen, Fei Huang, Hai-Tao Zheng, Rui Zhang
This approach allows us to learn meaningful, interpretable prototypes for the final classification.
no code implementations • 22 Feb 2021 • Yinghui Li, Chen Wang, Yangning Li, Hai-Tao Zheng, Ying Shen
Learning an empirically effective model with generalization using limited data is a challenging task for deep neural networks.
no code implementations • 13 Feb 2021 • Wei Wang, Piji Li, Hai-Tao Zheng
Automatic comment generation is a special and challenging task to verify the model ability on news content comprehension and language generation.
no code implementations • 20 Jan 2021 • Lingyun Feng, Minghui Qiu, Yaliang Li, Hai-Tao Zheng, Ying Shen
Although pre-trained language models such as BERT have achieved appealing performance in a wide range of natural language processing tasks, they are computationally expensive to deploy in real-time applications.
no code implementations • 17 Nov 2020 • Yinghui Li, Ruiyang Liu, Zihao Zhang, Ning Ding, Ying Shen, Linmi Tao, Hai-Tao Zheng
We also provide a theoretical explanation of our method.
no code implementations • 17 Oct 2020 • Wei Wang, Piji Li, Hai-Tao Zheng
In terms of consistency, on one hand, GPT2 cannot guarantee the consistency of the plots explicitly.
1 code implementation • ACL 2020 • Ning Ding, Dingkun Long, Guangwei Xu, Muhua Zhu, Pengjun Xie, Xiaobin Wang, Hai-Tao Zheng
In order to simultaneously alleviate these two issues, this paper proposes to couple distant annotation and adversarial training for cross-domain CWS.
1 code implementation • 24 Jun 2020 • Huiying Li, Shawn Shan, Emily Wenger, Jiayun Zhang, Hai-Tao Zheng, Ben Y. Zhao
In particular, query-based black-box attacks do not require knowledge of the deep learning model, but can compute adversarial examples over the network by submitting queries and inspecting returns.
no code implementations • EMNLP 2020 • Zibo Lin, Deng Cai, Yan Wang, Xiaojiang Liu, Hai-Tao Zheng, Shuming Shi
Although response selection is naturally a learning-to-rank problem, most prior works take a point-wise view and train binary classifiers for this task: each response candidate is labeled either relevant (one) or irrelevant (zero).
Ranked #11 on Conversational Response Selection on E-commerce
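The point-wise vs learning-to-rank contrast drawn above can be sketched with two toy losses: a point-wise loss scores each candidate independently against a 0/1 label, while a ranking loss only penalizes the relative order of a relevant and an irrelevant candidate. The scores and margin are illustrative, not from the paper:

```python
import numpy as np

def pointwise_bce(score, label):
    """Point-wise view: each candidate is an independent binary classification."""
    p = 1.0 / (1.0 + np.exp(-score))
    return -(label * np.log(p) + (1 - label) * np.log(1 - p))

def pairwise_ranking(pos_score, neg_score, margin=1.0):
    """Learning-to-rank view: only the relative order of candidates matters."""
    return max(0.0, margin - (pos_score - neg_score))

# A relevant response should outrank an irrelevant one by the margin.
loss_ok = pairwise_ranking(2.5, 0.5)   # order correct, margin satisfied -> 0
loss_bad = pairwise_ranking(0.5, 2.5)  # wrong order -> penalized
```

The ranking loss never asks whether a candidate's absolute score crosses a threshold, which is exactly what distinguishes it from the point-wise binary classifier.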
1 code implementation • 19 Feb 2020 • Shawn Shan, Emily Wenger, Jiayun Zhang, Huiying Li, Hai-Tao Zheng, Ben Y. Zhao
In this paper, we propose Fawkes, a system that helps individuals inoculate their images against unauthorized facial recognition models.
1 code implementation • IJCNLP 2019 • Ning Ding, Ziran Li, Zhiyuan Liu, Hai-Tao Zheng, Zibo Lin
To address the two issues simultaneously, we propose the Trigger-aware Lattice Neural Network (TLNN).
no code implementations • 3 Aug 2019 • Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang, Hai-Tao Zheng
To model and utilize the context information for aggregated search, we propose a model with context attention and representation learning (CARL).
1 code implementation • ACL 2019 • Ziran Li, Ning Ding, Zhiyuan Liu, Hai-Tao Zheng, Ying Shen
Chinese relation extraction is conducted using neural networks with either character-based or word-based inputs, and most existing methods typically suffer from segmentation errors and ambiguity of polysemy.
no code implementations • 24 May 2019 • Yuanshun Yao, Huiying Li, Hai-Tao Zheng, Ben Y. Zhao
Recent work has proposed the concept of backdoor attacks on deep neural networks (DNNs), where misbehaviors are hidden inside "normal" models, only to be triggered by very specific inputs.
1 code implementation • 18 Apr 2019 • Shawn Shan, Emily Wenger, Bolun Wang, Bo Li, Hai-Tao Zheng, Ben Y. Zhao
Attackers' optimization algorithms gravitate towards trapdoors, leading them to produce attacks similar to trapdoors in the feature space.
no code implementations • 22 Sep 2018 • Zhujun Xiao, Yanzi Zhu, Yuxin Chen, Ben Y. Zhao, Junchen Jiang, Hai-Tao Zheng
Building accurate DNN models requires training on large, labeled, context-specific datasets, especially those matching the target scenario.
35 code implementations • ECCV 2018 • Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, Jian Sun
Ranked #950 on Image Classification on ImageNet
no code implementations • 27 Aug 2017 • Yuanshun Yao, Bimal Viswanath, Jenna Cryan, Hai-Tao Zheng, Ben Y. Zhao
Malicious crowdsourcing forums are gaining traction as sources of spreading misinformation online, but are limited by the costs of hiring and managing human workers.
Cryptography and Security • Social and Information Networks