no code implementations • 4 Oct 2024 • Lijie Hu, Liang Liu, Shu Yang, Xin Chen, Zhen Tan, Muhammad Asif Ali, Mengdi Li, Di Wang
Moreover, we propose the Representation-of-Thought (RoT) framework, which exploits the stability of low-dimensional representation spaces to improve the robustness of the reasoning process in CoTs.
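The snippet leaves RoT's mechanics unspecified; as a minimal, purely illustrative sketch of the underlying intuition, the code below projects per-step hidden states onto a low-dimensional subspace (via PCA) and measures how little the projected reasoning trace drifts under a small perturbation. The shapes, the perturbation scale, and `step_states` are hypothetical stand-ins, not the authors' implementation.

```python
# Illustrative sketch only: low-dimensional projections of reasoning-step
# representations tend to be more stable under small perturbations.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the hidden states of 6 reasoning steps (hidden size 768).
step_states = rng.normal(size=(6, 768))

def pca_axes(states: np.ndarray, k: int = 4) -> np.ndarray:
    """Top-k principal directions of the step representations."""
    centered = states - states.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]

# Project a clean and a slightly perturbed trace with the SAME axes,
# then measure how far the low-dimensional trace drifts.
axes = pca_axes(step_states)
mean = step_states.mean(axis=0)
clean = (step_states - mean) @ axes.T
perturbed = step_states + 0.01 * rng.normal(size=step_states.shape)
noisy = (perturbed - mean) @ axes.T

drift = np.linalg.norm(clean - noisy, axis=1).mean()
print(f"mean low-dimensional drift across steps: {drift:.4f}")
```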
no code implementations • 18 Sep 2024 • Muhammad Asif Ali, Nawal Daftardar, Mutayyaba Waheed, Jianbin Qin, Di Wang
Then, for each sub-problem, it iteratively queries the external memory and/or the target LLM to generate the final response.
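A rough sketch of the decompose-then-query loop described above, with `query_llm` and the memory contents as hypothetical stand-ins for a real model call and a real store:

```python
# Hypothetical sketch: answer each sub-problem from external memory when
# possible, fall back to the target LLM, then compose the final response.
external_memory = {
    "Who directed Inception?": "Christopher Nolan",
}

def query_llm(prompt: str) -> str:
    # Placeholder for a real model call (e.g., an API request).
    return f"<LLM answer to: {prompt}>"

def answer(question: str, sub_problems: list[str]) -> str:
    facts = []
    for sub in sub_problems:
        # Prefer the external memory; fall back to the target LLM.
        fact = external_memory.get(sub) or query_llm(sub)
        facts.append(f"{sub} -> {fact}")
    # Final response conditioned on all retrieved/generated facts.
    context = "\n".join(facts)
    return query_llm(f"{context}\nUsing the facts above, answer: {question}")

print(answer(
    "Where was the director of Inception born?",
    ["Who directed Inception?", "Where was Christopher Nolan born?"],
))
```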
no code implementations • 18 Jun 2024 • Lijie Hu, Liang Liu, Shu Yang, Xin Chen, Hongru Xiao, Mengdi Li, Pan Zhou, Muhammad Asif Ali, Di Wang
Chain-of-Thought (CoT) prompting plays a significant role in improving the reasoning performance of large language models (LLMs).
no code implementations • 24 May 2024 • Keyuan Cheng, Muhammad Asif Ali, Shu Yang, Gang Lin, Yuxuan Zhai, Haoyang Fei, Ke Xu, Lu Yu, Lijie Hu, Di Wang
To address these issues, in this paper we propose a novel framework named RULE-KE, i.e., RULE-based Knowledge Editing, which complements and boosts the performance of all existing MQA methods under KE.
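The abstract does not spell out the rules themselves; a minimal sketch of the general idea, rule-based propagation of an edit to correlated facts, might look as follows (the rule and triples are invented for illustration and are not the RULE-KE implementation):

```python
# Hypothetical sketch of rule-based propagation of a knowledge edit.
from itertools import product

# (head, relation, tail) triples after the direct edit "CEO of AcmeCorp".
facts = {
    ("AcmeCorp", "ceo", "Alice"),   # edited fact
    ("Alice", "spouse", "Bob"),
}

# Rule: ceo(x, y) AND spouse(y, z) => spouse_of_ceo(x, z)
def apply_rule(facts):
    derived = set()
    for (h1, r1, t1), (h2, r2, t2) in product(facts, facts):
        if r1 == "ceo" and r2 == "spouse" and t1 == h2:
            derived.add((h1, "spouse_of_ceo", t2))
    return derived

# Propagate until no new facts appear (fixpoint).
while True:
    new = apply_rule(facts) - facts
    if not new:
        break
    facts |= new

print(sorted(facts))
```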
no code implementations • 30 Mar 2024 • Muhammad Asif Ali, ZhengPing Li, Shu Yang, Keyuan Cheng, Yang Cao, Tianhao Huang, Guimin Hu, Weimin Lyu, Lijie Hu, Lu Yu, Di Wang
We also propose GSM8K-aug, an extended version of the existing GSM8K benchmark with task-agnostic prompts, to provide a comprehensive evaluation platform.
no code implementations • 30 Mar 2024 • Keyuan Cheng, Gang Lin, Haoyang Fei, Yuxuan Zhai, Lu Yu, Muhammad Asif Ali, Lijie Hu, Di Wang
Multi-hop question answering (MQA) under knowledge editing (KE) has garnered significant attention in the era of large language models.
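To make the setting concrete, the toy sketch below answers a multi-hop question hop by hop against a fact store that contains an edit; the facts and relation names are illustrative only, not any particular method from the paper:

```python
# Illustrative MQA-under-KE example: the final answer must reflect the edit.
facts = {
    ("UK", "head_of_government"): "Boris Johnson",
    ("Boris Johnson", "spouse"): "Carrie Johnson",
    ("Rishi Sunak", "spouse"): "Akshata Murty",
}

# Knowledge edit: the head of government changes.
facts[("UK", "head_of_government")] = "Rishi Sunak"

def answer_hops(entity, relations):
    """Resolve a chain of relations, e.g. spouse(head_of_government(UK))."""
    for rel in relations:
        entity = facts[(entity, rel)]
    return entity

# "Who is the spouse of the UK's head of government?" after the edit:
print(answer_hops("UK", ["head_of_government", "spouse"]))  # Akshata Murty
```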
no code implementations • 30 Mar 2024 • Shu Yang, Jiayuan Su, Han Jiang, Mengdi Li, Keyuan Cheng, Muhammad Asif Ali, Lijie Hu, Di Wang
With the rise of large language models (LLMs), ensuring that they embody the principles of being helpful, honest, and harmless (3H), known as Human Alignment, has become crucial.
no code implementations • 17 Feb 2024 • Shu Yang, Muhammad Asif Ali, Cheng-Long Wang, Lijie Hu, Di Wang
Adapting large language models (LLMs) to new domains/tasks and enabling them to be efficient lifelong learners is a pivotal challenge.
no code implementations • 17 Feb 2024 • Shu Yang, Muhammad Asif Ali, Lu Yu, Lijie Hu, Di Wang
The increasing significance of large models and their multi-modal variants in societal information processing has ignited debates on social safety and ethics.
no code implementations • 18 Jan 2024 • Muhammad Asif Ali, Yan Hu, Jianbin Qin, Di Wang
In this paper, we propose InterlaCed Encoder NETworks (ICE-NET) for antonym vs. synonym distinction, which aim to capture and model the relation-specific properties of antonym and synonym pairs in order to improve classification performance.
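ICE-NET's architecture is not described in this snippet; the sketch below illustrates only the general notion of a relation-specific property, using pair features that are invariant to swapping the two words (since both synonymy and antonymy are symmetric relations). The embeddings, pairs, and linear probe are toy stand-ins:

```python
# Toy sketch of symmetry-respecting pair classification; not the ICE-NET model.
import numpy as np

rng = np.random.default_rng(0)

def pair_features(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # |a - b| and a * b are unchanged when the pair is swapped, encoding
    # the property that synonymy and antonymy are symmetric relations.
    return np.concatenate([np.abs(a - b), a * b])

# Toy embeddings and labels (1 = antonym, 0 = synonym).
words = ["big", "large", "small", "hot", "cold", "warm"]
emb = {w: rng.normal(size=50) for w in words}
pairs = [("big", "large", 0), ("big", "small", 1),
         ("hot", "warm", 0), ("hot", "cold", 1)]

X = np.stack([pair_features(emb[a], emb[b]) for a, b, _ in pairs])
y = np.array([lbl for _, _, lbl in pairs], dtype=float)

# A least-squares linear probe stands in for the interlaced encoders.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print("predictions:", (X @ w > 0.5).astype(int), "labels:", y.astype(int))
```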
1 code implementation • 19 Oct 2023 • Muhammad Asif Ali, Maha Alshmrani, Jianbin Qin, Yan Hu, Di Wang
Bilingual Lexicon Induction (BLI) is a core challenge in NLP; it relies on the relative isomorphism of the individual embedding spaces.
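For context, here is a minimal sketch of the standard mapping-based BLI pipeline that this line of work builds on: align two approximately (relatively) isomorphic monolingual spaces with orthogonal Procrustes, then induce translations by nearest neighbour. All data here is synthetic:

```python
# Standard mapping-based BLI sketch on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
d, n = 50, 100

# Toy "source" space, plus a rotated/noisy "target" space so the two are
# only approximately isomorphic.
src = rng.normal(size=(n, d))
rotation, _ = np.linalg.qr(rng.normal(size=(d, d)))
tgt = src @ rotation + 0.01 * rng.normal(size=(n, d))

# Seed dictionary: the first 20 rows are known translation pairs.
X, Y = src[:20], tgt[:20]

# Orthogonal Procrustes: W = argmin ||XW - Y||_F s.t. W orthogonal.
u, _, vt = np.linalg.svd(X.T @ Y)
W = u @ vt

# Induce translations for unseen words by nearest neighbour in target space.
mapped = src[20:] @ W
pred = (mapped @ tgt.T).argmax(axis=1)
print("induction accuracy:", (pred == np.arange(20, n)).mean())
```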
1 code implementation • 18 Oct 2023 • Muhammad Asif Ali, Yan Hu, Jianbin Qin, Di Wang
Automated construction of bilingual dictionaries using monolingual embedding spaces is a core challenge in machine translation.
no code implementations • 27 Jan 2021 • Muhammad Asif Ali, Yifang Sun, Bing Li, Wei Wang
Fine-Grained Named Entity Typing (FG-NET) aims to classify entity mentions into a wide range of entity types (usually hundreds), depending on the context.
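As a toy illustration of this task setup, context-dependent multi-label typing, the sketch below assigns every type whose score clears a threshold; the encoder, type inventory, and scorer are hypothetical placeholders, not the authors' model:

```python
# Toy sketch of fine-grained entity typing as multi-label classification.
import zlib
import numpy as np

rng = np.random.default_rng(0)
TYPES = ["/person", "/person/artist", "/person/artist/singer", "/org"]

def encode(mention: str, context: str) -> np.ndarray:
    # Placeholder encoder: a real system would use contextual embeddings,
    # since the correct types depend on the surrounding sentence.
    seed = zlib.crc32((mention + "||" + context).encode())
    return np.random.default_rng(seed).normal(size=64)

# A random linear scorer stands in for a trained multi-label classifier.
W = rng.normal(size=(64, len(TYPES)))

def predict_types(mention: str, context: str, threshold: float = 0.0):
    scores = encode(mention, context) @ W
    # Multi-label: every type whose score clears the threshold is assigned.
    return [t for t, s in zip(TYPES, scores) if s > threshold]

# Same mention, different contexts -> potentially different type sets.
print(predict_types("Lincoln", "Lincoln delivered the Gettysburg Address."))
print(predict_types("Lincoln", "Lincoln reported record quarterly sales."))
```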
no code implementations • 7 Apr 2020 • Muhammad Asif Ali, Yifang Sun, Bing Li, Wei Wang
Fine-Grained Named Entity Typing (FG-NET) is a key component in Natural Language Processing (NLP).
no code implementations • 13 Jun 2019 • Muhammad Asif Ali, Yifang Sun, Xiaoling Zhou, Wei Wang, Xiang Zhao
We hypothesize that pre-trained embeddings encode a blend of lexical-semantic information, and that task-specific information can be distilled from them using Distiller, a model proposed in this paper.
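The paper's objective is not given in this snippet; as a loose sketch of the distillation idea under the stated hypothesis, the code below learns a linear projection of frozen embeddings onto a synthetic task signal. The shapes and the least-squares objective are assumptions made for illustration:

```python
# Loose sketch: distill task-specific directions out of frozen embeddings.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n = 300, 32, 500

# Frozen pre-trained embeddings mixing several kinds of information.
E = rng.normal(size=(n, d_in))
# Toy task signal depending on only a small subspace of the embedding.
task_dirs = rng.normal(size=(d_in, d_out))
Y = E @ task_dirs + 0.1 * rng.normal(size=(n, d_out))

# "Distill" by solving for a linear map from embeddings to task signal:
# P = argmin ||E P - Y||^2  (closed-form least squares).
P, *_ = np.linalg.lstsq(E, Y, rcond=None)
distilled = E @ P  # task-specific low-dimensional representations

residual = np.linalg.norm(distilled - Y) / np.linalg.norm(Y)
print(f"relative residual after distillation: {residual:.3f}")
```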