Search Results for author: Zijun Yao

Found 36 papers, 20 papers with code

RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style

1 code implementation · 21 Oct 2024 · Yantao Liu, Zijun Yao, Rui Min, Yixin Cao, Lei Hou, Juanzi Li

However, this approach fails to assess reward models on subtle but critical content changes and variations in style, resulting in a low correlation with policy model performance.

Benchmarking Language Modelling

Pre-training Distillation for Large Language Models: A Design Space Exploration

no code implementations · 21 Oct 2024 · Hao Peng, Xin Lv, Yushi Bai, Zijun Yao, Jiajie Zhang, Lei Hou, Juanzi Li

Previous work applying KD in the field of large language models (LLMs) typically focused on the post-training phase, where the student LLM learns directly from instructions and corresponding responses generated by the teacher model.

Knowledge Distillation

Comparison of Autoencoder Encodings for ECG Representation in Downstream Prediction Tasks

no code implementations · 3 Oct 2024 · Christopher J. Harvey, Sumaiya Shomaji, Zijun Yao, Amit Noheria

The electrocardiogram (ECG) is an inexpensive and widely available tool for cardiovascular assessment.

Meta-Learning on Augmented Gene Expression Profiles for Enhanced Lung Cancer Detection

1 code implementation · 19 Aug 2024 · Arya Hadizadeh Moghaddam, Mohsen Nayebi Kerdabadi, Cuncong Zhong, Zijun Yao

We apply this framework to well-established deep learning methodologies and employ four distinct datasets for the meta-learning tasks, with one serving as the target dataset and the rest as source datasets.
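The leave-one-dataset-out setup described in the snippet can be sketched as follows (a minimal illustration with assumed function and variable names, not the authors' code):

```python
def leave_one_out_splits(datasets):
    """Enumerate meta-learning splits: each dataset takes one turn as the
    target while the remaining datasets serve as sources."""
    for i, target in enumerate(datasets):
        sources = datasets[:i] + datasets[i + 1:]
        yield target, sources

# With four datasets, this yields four meta-learning tasks,
# one per choice of target.
```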

Meta-Learning Transfer Learning

Contrastive Learning on Medical Intents for Sequential Prescription Recommendation

no code implementations · 13 Aug 2024 · Arya Hadizadeh Moghaddam, Mohsen Nayebi Kerdabadi, Mei Liu, Zijun Yao

To achieve this goal, we introduce Attentive Recommendation with Contrasted Intents (ARCI), a multi-level transformer-based method designed to capture the different but coexisting temporal paths across a shared sequence of visits.

Contrastive Learning Recommendation Systems

LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking

1 code implementation · 4 Jul 2024 · Amy Xin, Yunjia Qi, Zijun Yao, Fangwei Zhu, Kaisheng Zeng, Xu Bin, Lei Hou, Juanzi Li

Entity Linking (EL) models are well-trained at mapping mentions to their corresponding entities according to a given context.

Data Augmentation Entity Linking

Aligning Teacher with Student Preferences for Tailored Training Data Generation

no code implementations · 27 Jun 2024 · Yantao Liu, Zhao Zhang, Zijun Yao, Shulin Cao, Lei Hou, Juanzi Li

Thus, we propose ARTE, dubbed Aligning TeacheR with StudenT PreferencEs, a framework that aligns the teacher model with student preferences to generate tailored training examples for Knowledge Distillation.

In-Context Learning Knowledge Distillation

SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation

1 code implementation · 27 Jun 2024 · Zijun Yao, Weijian Qi, Liangming Pan, Shulin Cao, Linmei Hu, Weichuan Liu, Lei Hou, Juanzi Li

This paper introduces Self-aware Knowledge Retrieval (SeaKR), a novel adaptive RAG model that extracts self-aware uncertainty of LLMs from their internal states.
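As a rough, hypothetical sketch of the general idea of scoring uncertainty from internal states (not SeaKR's actual implementation): sample several generations, take one hidden-state vector per sample, and treat the spread of those vectors as a self-aware uncertainty signal that decides whether to retrieve.

```python
import numpy as np

def hidden_state_uncertainty(hidden_states: np.ndarray) -> float:
    """Toy uncertainty score: log-volume of the spread of hidden states
    across sampled generations (one row per sample). More disagreement
    between samples yields a larger score."""
    centered = hidden_states - hidden_states.mean(axis=0)
    gram = centered @ centered.T
    eigvals = np.linalg.eigvalsh(gram)
    return float(np.sum(np.log(eigvals + 1e-6)))

def should_retrieve(hidden_states: np.ndarray, threshold: float) -> bool:
    """Trigger retrieval only when self-assessed uncertainty is high."""
    return hidden_state_uncertainty(hidden_states) > threshold
```

Identical samples collapse the Gram spectrum and give a very low score, while disagreeing samples push it up past the retrieval threshold.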

Question Answering RAG +1

Finding Safety Neurons in Large Language Models

no code implementations · 20 Jun 2024 · Jianhui Chen, Xiaozhi Wang, Zijun Yao, Yushi Bai, Lei Hou, Juanzi Li

In this paper, we explore the inner mechanisms of safety alignment from the perspective of mechanistic interpretability, focusing on identifying and analyzing safety neurons within LLMs that are responsible for safety behaviors.

Misinformation Safety Alignment

DICE: Detecting In-distribution Contamination in LLM's Fine-tuning Phase for Math Reasoning

1 code implementation · 6 Jun 2024 · Shangqing Tu, Kejian Zhu, Yushi Bai, Zijun Yao, Lei Hou, Juanzi Li

To effectively detect in-distribution contamination, we propose DICE, a novel method that leverages the internal states of LLMs to locate-then-detect the contamination.

Math

A Solution-based LLM API-using Methodology for Academic Information Seeking

1 code implementation · 24 May 2024 · Yuanchun Wang, Jifan Yu, Zijun Yao, Jing Zhang, Yuyang Xie, Shangqing Tu, Yiyang Fu, Youhe Feng, Jinkai Zhang, Jingyao Zhang, Bowen Huang, Yuanyao Li, Huihui Yuan, Lei Hou, Juanzi Li, Jie Tang

Applying large language models (LLMs) for academic API usage shows promise in reducing researchers' academic information seeking efforts.

Transferable and Efficient Non-Factual Content Detection via Probe Training with Offline Consistency Checking

1 code implementation · 10 Apr 2024 · Xiaokang Zhang, Zijun Yao, Jing Zhang, Kaifeng Yun, Jifan Yu, Juanzi Li, Jie Tang

Detecting non-factual content is a longstanding goal to increase the trustworthiness of large language models (LLMs) generations.

Question Answering

A Cause-Effect Look at Alleviating Hallucination of Knowledge-grounded Dialogue Generation

no code implementations · 4 Apr 2024 · Jifan Yu, Xiaohan Zhang, Yifan Xu, Xuanyu Lei, Zijun Yao, Jing Zhang, Lei Hou, Juanzi Li

Recently, knowledge-grounded dialogue generation models, which intentionally invoke external knowledge resources to generate more informative responses, have also proven effective in reducing hallucination.

counterfactual Counterfactual Reasoning +2

Evaluating Generative Language Models in Information Extraction as Subjective Question Correction

1 code implementation · 4 Apr 2024 · Yuchen Fan, Yantao Liu, Zijun Yao, Jifan Yu, Lei Hou, Juanzi Li

(1) The imprecision of existing evaluation metrics that struggle to effectively gauge semantic consistency between model outputs and ground truth, and (2) The inherent incompleteness of evaluation benchmarks, primarily due to restrictive human annotation schemas, resulting in underestimated LLM performances.

Event Extraction Natural Language Inference +1

Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models

1 code implementation · 4 Apr 2024 · Yantao Liu, Zijun Yao, Xin Lv, Yuchen Fan, Shulin Cao, Jifan Yu, Lei Hou, Juanzi Li

However, knowledge in the document may conflict with the memory of LLMs due to outdated or incorrect knowledge in the LLMs' parameters.

Question Answering

TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios

1 code implementation · 28 Mar 2024 · Xiaokang Zhang, Jing Zhang, Zeyao Ma, Yang Li, Bohan Zhang, Guanlin Li, Zijun Yao, Kangli Xu, Jinchang Zhou, Daniel Zhang-li, Jifan Yu, Shu Zhao, Juanzi Li, Jie Tang

We introduce TableLLM, a robust large language model (LLM) with 13 billion parameters, purpose-built for proficiently handling tabular data manipulation tasks, whether they are embedded within documents or spreadsheets, catering to real-world office scenarios.

Language Modelling Large Language Model

Reverse That Number! Decoding Order Matters in Arithmetic Learning

no code implementations · 9 Mar 2024 · Daniel Zhang-li, Nianyi Lin, Jifan Yu, Zheyuan Zhang, Zijun Yao, Xiaokang Zhang, Lei Hou, Jing Zhang, Juanzi Li

Recent advancements in pretraining have demonstrated that modern Large Language Models (LLMs) possess the capability to effectively learn arithmetic operations.
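The title's trick, decoding the least-significant digit first so each output digit depends only on digits already generated, can be illustrated with a small sketch (hypothetical helper names, not the paper's code):

```python
def to_reversed_digits(n: int) -> str:
    """Render a number least-significant digit first, as in
    reversed-order arithmetic training data."""
    return str(n)[::-1]

def add_reversed(a: str, b: str) -> str:
    """Add two reversed-digit strings the way an LSD-first decoder would:
    scan left to right, emitting each digit with the carry already known."""
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        da = int(a[i]) if i < len(a) else 0
        db = int(b[i]) if i < len(b) else 0
        s = da + db + carry
        result.append(str(s % 10))
        carry = s // 10
    if carry:
        result.append(str(carry))
    return "".join(result)

# 123 + 989 = 1112; in reversed form "321" + "989" -> "2111"
```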

Probabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex Questions

1 code implementation · 23 Nov 2023 · Shulin Cao, Jiajie Zhang, Jiaxin Shi, Xin Lv, Zijun Yao, Qi Tian, Juanzi Li, Lei Hou

During reasoning, for leaf nodes, LLMs choose a more confident answer from Closed-book QA that employs parametric knowledge and Open-book QA that employs retrieved external knowledge, thus eliminating the negative retrieval problem.
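Choosing the more confident of the two answers could be sketched as follows (a hypothetical illustration using length-normalized log-probabilities; the names and scoring rule are assumptions, not the paper's method):

```python
import math

def sequence_confidence(token_logprobs):
    """Length-normalized confidence: exp of the mean token log-probability."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))

def choose_answer(closed_book, open_book):
    """closed_book / open_book: (answer_text, per-token log-probs).
    Return the answer the model generated with higher confidence."""
    candidates = [closed_book, open_book]
    return max(candidates, key=lambda c: sequence_confidence(c[1]))[0]
```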

Retrieval

Contrastive Learning of Temporal Distinctiveness for Survival Analysis in Electronic Health Records

no code implementations · 24 Aug 2023 · Mohsen Nayebi Kerdabadi, Arya Hadizadeh Moghaddam, Bin Liu, Mei Liu, Zijun Yao

Therefore, in this paper, we propose a novel Ontology-aware Temporality-based Contrastive Survival (OTCSurv) analysis framework that utilizes survival durations from both censored and observed data to define temporal distinctiveness and construct negative sample pairs with adjustable hardness for contrastive learning.
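The idea of negative pairs with adjustable hardness can be sketched like this (a toy illustration under assumed names, not OTCSurv's implementation): pair patients whose survival durations differ by more than a margin, where shrinking the margin admits harder negatives.

```python
def temporal_negative_pairs(durations, margin):
    """Build negative pairs of patients whose survival durations differ
    by more than `margin`; a smaller margin yields harder negatives.
    `durations`: list of (patient_id, duration) tuples."""
    pairs = []
    for i, (pid_i, d_i) in enumerate(durations):
        for pid_j, d_j in durations[i + 1:]:
            if abs(d_i - d_j) > margin:
                pairs.append((pid_i, pid_j))
    return pairs
```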

Contrastive Learning Survival Analysis

LittleMu: Deploying an Online Virtual Teaching Assistant via Heterogeneous Sources Integration and Chain of Teach Prompts

1 code implementation · 11 Aug 2023 · Shangqing Tu, Zheyuan Zhang, Jifan Yu, Chunyang Li, Siyu Zhang, Zijun Yao, Lei Hou, Juanzi Li

However, few MOOC platforms are providing human or virtual teaching assistants to support learning for massive online students due to the complexity of real-world online education scenarios and the lack of training data.

Language Modelling Question Answering +1

VisKoP: Visual Knowledge oriented Programming for Interactive Knowledge Base Question Answering

no code implementations · 6 Jul 2023 · Zijun Yao, Yuanyong Chen, Xin Lv, Shulin Cao, Amy Xin, Jifan Yu, Hailong Jin, Jianjun Xu, Peng Zhang, Lei Hou, Juanzi Li

We present the Visual Knowledge oriented Programming platform (VisKoP), a knowledge base question answering (KBQA) system that integrates humans into the loop to edit and debug knowledge base (KB) queries.

Knowledge Base Question Answering Program induction +2

On the Detectability of ChatGPT Content: Benchmarking, Methodology, and Evaluation through the Lens of Academic Writing

2 code implementations · 7 Jun 2023 · Zeyan Liu, Zijun Yao, Fengjun Li, Bo Luo

In this paper, we aim to present a comprehensive study of the detectability of ChatGPT-generated content within the academic literature, particularly focusing on the abstracts of scientific papers, to offer holistic support for the future development of LLM applications and policies in academia.

Benchmarking Prompt Engineering

MoocRadar: A Fine-grained and Multi-aspect Knowledge Repository for Improving Cognitive Student Modeling in MOOCs

1 code implementation · 5 Apr 2023 · Jifan Yu, Mengying Lu, Qingyang Zhong, Zijun Yao, Shangqing Tu, Zhengshan Liao, Xiaoya Li, Manli Li, Lei Hou, Hai-Tao Zheng, Juanzi Li, Jie Tang

Student modeling, the task of inferring a student's learning characteristics through their interactions with coursework, is a fundamental issue in intelligent education.

cognitive diagnosis Knowledge Tracing

GLM-Dialog: Noise-tolerant Pre-training for Knowledge-grounded Dialogue Generation

1 code implementation · 28 Feb 2023 · Jing Zhang, Xiaokang Zhang, Daniel Zhang-li, Jifan Yu, Zijun Yao, Zeyao Ma, Yiqi Xu, Haohua Wang, Xiaohan Zhang, Nianyi Lin, Sunrui Lu, Juanzi Li, Jie Tang

We present GLM-Dialog, a large-scale language model (LLM) with 10B parameters capable of knowledge-grounded conversation in Chinese, using a search engine to access Internet knowledge.

Dialogue Evaluation Dialogue Generation +2

Schema-Free Dependency Parsing via Sequence Generation

no code implementations · 28 Jan 2022 · Boda Lin, Zijun Yao, Jiaxin Shi, Shulin Cao, Binghao Tang, Si Li, Yong Luo, Juanzi Li, Lei Hou

To remedy these drawbacks, we propose to achieve universal and schema-free Dependency Parsing (DP) via Sequence Generation (SG), dubbed DPSG, by utilizing only the pre-trained language model (PLM) without any auxiliary structures or parsing algorithms.

Decoder Dependency Parsing +1

Program Transfer for Answering Complex Questions over Knowledge Bases

1 code implementation ACL 2022 Shulin Cao, Jiaxin Shi, Zijun Yao, Xin Lv, Jifan Yu, Lei Hou, Juanzi Li, Zhiyuan Liu, Jinghui Xiao

In this paper, we propose the approach of program transfer, which aims to leverage the valuable program annotations on the rich-resourced KBs as external supervision signals to aid program induction for the low-resourced KBs that lack program annotations.

Program induction Semantic Parsing

AKE-GNN: Effective Graph Learning with Adaptive Knowledge Exchange

no code implementations · 10 Jun 2021 · Liang Zeng, Jin Xu, Zijun Yao, Yanqiao Zhu, Jian Li

In this paper, we propose to substitute these redundant channels with other informative channels to achieve this goal.

Graph Classification Graph Learning +4

Interpretable and Low-Resource Entity Matching via Decoupling Feature Learning from Decision Making

1 code implementation ACL 2021 Zijun Yao, Chengjiang Li, Tiansi Dong, Xin Lv, Jifan Yu, Lei Hou, Juanzi Li, Yichi Zhang, Zelin Dai

Using a set of comparison features and a limited amount of annotated data, KAT Induction learns an efficient decision tree that can be interpreted by generating entity matching rules whose structure is advocated by domain experts.

Attribute Decision Making +2

Phenotypical Ontology Driven Framework for Multi-Task Learning

no code implementations · 4 Sep 2020 · Mohamed Ghalwash, Zijun Yao, Prithwish Chakraborty, James Codella, Daby Sow

Despite the large number of patients in Electronic Health Records (EHRs), the subset of usable data for modeling outcomes of specific phenotypes are often imbalanced and of modest size.

Multi-Task Learning

ODVICE: An Ontology-Driven Visual Analytic Tool for Interactive Cohort Extraction

no code implementations · 13 May 2020 · Mohamed Ghalwash, Zijun Yao, Prithwish Chakraborty, James Codella, Daby Sow

Increased availability of electronic health records (EHR) has enabled researchers to study various medical questions.

Data Augmentation

Dimensional Reweighting Graph Convolution Networks

no code implementations · 25 Sep 2019 · Xu Zou, Qiuye Jia, Jianwei Zhang, Chang Zhou, Zijun Yao, Hongxia Yang, Jie Tang

In this paper, we propose a method named Dimensional reweighting Graph Convolutional Networks (DrGCNs), to tackle the problem of variance between dimensional information in the node representations of GCNs.

Node Classification

Dynamic Word Embeddings for Evolving Semantic Discovery

2 code implementations · 2 Mar 2017 · Zijun Yao, Yifan Sun, Weicong Ding, Nikhil Rao, Hui Xiong

Word evolution refers to the changing meanings and associations of words throughout time, as a byproduct of human language evolution.
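One common ingredient in tracking word evolution is a temporal smoothing term that ties a word's vectors together across adjacent time slices so trajectories through the space stay comparable. The sketch below shows only such a penalty (an assumed, simplified form; the paper's full objective jointly factorizes per-year co-occurrence statistics):

```python
import numpy as np

def temporal_smoothness_penalty(embeddings_by_year, lam=0.1):
    """Sum of squared differences between consecutive time slices'
    embedding matrices, scaled by lam. Large values mean the vocabulary's
    vectors drift sharply between adjacent years."""
    penalty = 0.0
    for prev, curr in zip(embeddings_by_year, embeddings_by_year[1:]):
        penalty += np.sum((curr - prev) ** 2)
    return float(lam * penalty)
```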

Representation Learning Word Embeddings
