Search Results for author: Chengjin Xu

Found 39 papers, 18 papers with code

Synthesize-on-Graph: Knowledgeable Synthetic Data Generation for Continue Pre-training of Large Language Models

no code implementations 2 May 2025 Xuhui Jiang, Shengjie Ma, Chengjin Xu, Cehao Yang, Liyu Zhang, Jian Guo

SoG constructs a context graph by extracting entities and concepts from the original corpus, representing cross-document associations, and employing a graph walk strategy for knowledge-associated sampling.

Diversity Reading Comprehension +1
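
The graph-walk sampling mentioned in the SoG snippet above can be illustrated with a small sketch. This is not the authors' implementation: the toy corpus, the rule that two documents are linked when they share an entity, the uniform random walk, and the names `build_context_graph` and `graph_walk` are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def build_context_graph(doc_entities):
    """Link every pair of documents that share at least one entity (assumed rule)."""
    entity_docs = defaultdict(set)
    for doc, entities in doc_entities.items():
        for entity in entities:
            entity_docs[entity].add(doc)
    graph = defaultdict(set)
    for docs in entity_docs.values():
        for doc in docs:
            graph[doc] |= docs - {doc}
    return graph

def graph_walk(graph, start, steps, rng):
    """Uniform random walk over the document graph, returning the visited path."""
    path = [start]
    current = start
    for _ in range(steps):
        neighbors = sorted(graph.get(current, ()))  # sorted for reproducibility
        if not neighbors:
            break
        current = rng.choice(neighbors)
        path.append(current)
    return path

# Hypothetical toy corpus: document id -> entities extracted from it.
DOC_ENTITIES = {
    "doc_a": {"LLM", "knowledge graph"},
    "doc_b": {"knowledge graph", "retrieval"},
    "doc_c": {"retrieval", "finance"},
    "doc_d": {"finance"},
}

GRAPH = build_context_graph(DOC_ENTITIES)
WALK = graph_walk(GRAPH, "doc_a", 3, random.Random(0))
print(WALK)
```

Each consecutive pair of documents on the walk shares at least one entity, so a synthetic sample assembled along the path would combine knowledge that is associated across documents.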

Financial Wind Tunnel: A Retrieval-Augmented Market Simulator

no code implementations 23 Mar 2025 Bokai Cao, Xueyuan Lin, Yiyan Qi, Chengjin Xu, Cehao Yang, Jian Guo

To address this challenge, we propose Financial Wind Tunnel (FWT), a retrieval-augmented market simulator designed to generate controllable, reasonable, and adaptable market dynamics for model testing.

Retrieval

LongFaith: Enhancing Long-Context Reasoning in LLMs with Faithful Synthetic Data

1 code implementation 18 Feb 2025 Cehao Yang, Xueyuan Lin, Chengjin Xu, Xuhui Jiang, Shengjie Ma, Aofan Liu, Hui Xiong, Jian Guo

Despite the growing development of long-context large language models (LLMs), data-centric approaches relying on synthetic data have been hindered by issues related to faithfulness, which limit their effectiveness in enhancing model performance on tasks such as long-context reasoning and question answering (QA).

Misinformation Question Answering

A Survey on LLM-as-a-Judge

2 code implementations 23 Nov 2024 Jiawei Gu, Xuhui Jiang, Zhichao Shi, Hexiang Tan, Xuehao Zhai, Chengjin Xu, Wei Li, Yinghan Shen, Shengjie Ma, Honghao Liu, Saizhuo Wang, Kun Zhang, Yuanzhuo Wang, Wen Gao, Lionel Ni, Jian Guo

Accurate and consistent evaluation is crucial for decision-making across numerous fields, yet it remains a challenging task due to inherent subjectivity, variability, and scale.

Models Alignment Survey

Retrieval, Reasoning, Re-ranking: A Context-Enriched Framework for Knowledge Graph Completion

no code implementations 12 Nov 2024 Muzhi Li, Cehao Yang, Chengjin Xu, Xuhui Jiang, Yiyan Qi, Jian Guo, Ho-fung Leung, Irwin King

Firstly, the Retrieval module gathers supporting triples from the KG, collects plausible candidate answers from a base embedding model, and retrieves context for each related entity.

Language Modeling Language Modelling +3

Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models

1 code implementation 9 Nov 2024 XiaoJun Wu, Junxi Liu, Huanyi Su, Zhouchi Lin, Yiyan Qi, Chengjin Xu, Jiajun Su, Jiajie Zhong, Fuwei Wang, Saizhuo Wang, Fengrui Hua, Jia Li, Jian Guo

As large language models become increasingly prevalent in the financial sector, there is a pressing need for a standardized method to comprehensively assess their performance.

Context-aware Inductive Knowledge Graph Completion with Latent Type Constraints and Subgraph Reasoning

1 code implementation 22 Oct 2024 Muzhi Li, Cehao Yang, Chengjin Xu, Zixing Song, Xuhui Jiang, Jian Guo, Ho-fung Leung, Irwin King

With sufficient guidance from proper prompts and supervised fine-tuning, CATS activates the strong semantic understanding and reasoning capabilities of large language models to assess the existence of query triples, which consist of two modules.

Inductive knowledge graph completion

RuleRAG: Rule-guided retrieval-augmented generation with language models for question answering

1 code implementation 15 Oct 2024 Zhongwu Chen, Chengjin Xu, Dingmin Wang, Zhen Huang, Yong Dou, Jian Guo

To address these issues, we propose Rule-Guided Retrieval-Augmented Generation with LMs, which explicitly introduces symbolic rules as demonstrations for in-context learning (RuleRAG-ICL) to guide retrievers to retrieve logically related documents in the directions of rules and uniformly guide generators to generate answers attributed by the guidance of the same set of rules.

In-Context Learning Instruction Following +3

ChartMoE: Mixture of Expert Connector for Advanced Chart Understanding

no code implementations 5 Sep 2024 Zhengzhuo Xu, Bowen Qu, Yiyan Qi, Sinan Du, Chengjin Xu, Chun Yuan, Jian Guo

Combined with the vanilla connector, we initialize different experts in four distinct ways and adopt high-quality knowledge learning to further refine the MoE connector and LLM parameters.

Chart Understanding

MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced Reranking and Noise-injected Training

2 code implementations 31 Jul 2024 Zhanpeng Chen, Chengjin Xu, Yiyan Qi, Jian Guo

Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in processing and generating content across multiple data modalities.

RAG Reranking +1

Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation

1 code implementation 15 Jul 2024 Shengjie Ma, Chengjin Xu, Xuhui Jiang, Muzhi Li, Huaren Qu, Cehao Yang, Jiaxin Mao, Jian Guo

We conduct a series of well-designed experiments to highlight the following advantages of ToG-2: 1) ToG-2 tightly couples the processes of context retrieval and graph retrieval, deepening context retrieval via the KG while enabling reliable graph retrieval based on contexts; 2) it achieves deep and faithful reasoning in LLMs through an iterative knowledge retrieval process of collaboration between contexts and the KG; and 3) ToG-2 is training-free and plug-and-play compatible with various LLMs.

Information Retrieval Knowledge Graphs +6

Financial Knowledge Large Language Model

no code implementations 29 Jun 2024 Cehao Yang, Chengjin Xu, Yiyan Qi

Secondly, we propose IDEA-FinKER, a Financial Knowledge Enhancement framework designed to facilitate the rapid adaptation of general LLMs to the financial domain, introducing a retrieval-based few-shot learning method for real-time context-level knowledge injection, and a set of high-quality financial knowledge instructions for fine-tuning any general LLM.

Few-Shot Learning Financial Analysis +5

Context Graph

no code implementations 17 Jun 2024 Chengjin Xu, Muzhi Li, Cehao Yang, Xuhui Jiang, Lumingyuan Tang, Yiyan Qi, Jian Guo

Knowledge Graphs (KGs) are foundational structures in many AI applications, representing entities and their interrelations through triples.

Knowledge Graphs Question Answering

Ensuring Safe and High-Quality Outputs: A Guideline Library Approach for Language Models

1 code implementation 18 Mar 2024 Yi Luo, Zhenghao Lin, Yuhao Zhang, Jiashuo Sun, Chen Lin, Chengjin Xu, Xiangdong Su, Yelong Shen, Jian Guo, Yeyun Gong

Subsequently, the retrieval model correlates new inputs with relevant guidelines, which guide LLMs in response generation to ensure safe and high-quality outputs, thereby aligning with human values.

Response Generation Retrieval

Unlocking the Power of Large Language Models for Entity Alignment

1 code implementation 23 Feb 2024 Xuhui Jiang, Yinghan Shen, Zhichao Shi, Chengjin Xu, Wei Li, Zixuan Li, Jian Guo, HuaWei Shen, Yuanzhuo Wang

To address the constraints of limited input KG data, ChatEA introduces a KG-code translation module that translates KG structures into a format understandable by LLMs, thereby allowing LLMs to utilize their extensive background knowledge to improve EA accuracy.

Code Translation Entity Alignment +2

ChartBench: A Benchmark for Complex Visual Reasoning in Charts

no code implementations 26 Dec 2023 Zhengzhuo Xu, Sinan Du, Yiyan Qi, Chengjin Xu, Chun Yuan, Jian Guo

Multimodal Large Language Models (MLLMs) have shown impressive capabilities in image understanding and generation.

Visual Reasoning

Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph

3 code implementations 15 Jul 2023 Jiashuo Sun, Chengjin Xu, Lumingyuan Tang, Saizhuo Wang, Chen Lin, Yeyun Gong, Lionel M. Ni, Heung-Yeung Shum, Jian Guo

Although large language models (LLMs) have achieved significant success in various tasks, they often struggle with hallucination problems, especially in scenarios requiring deep and responsible reasoning.

Hallucination Knowledge Graphs +3

Unveiling the Potential of Sentiment: Can Large Language Models Predict Chinese Stock Price Movements?

no code implementations 25 Jun 2023 Haohan Zhang, Fengrui Hua, Chengjin Xu, Hao Kong, Ruiting Zuo, Jian Guo

The rapid advancement of Large Language Models (LLMs) has spurred discussions about their potential to enhance quantitative trading strategies.

Incorporating Structured Sentences with Time-enhanced BERT for Fully-inductive Temporal Relation Prediction

no code implementations 10 Apr 2023 Zhongwu Chen, Chengjin Xu, Fenglong Su, Zhen Huang, Yong Dou

In the inductive setting where test TKGs contain emerging entities, the latest methods are based on symbolic rules or pre-trained language models (PLMs).

Relation Relation Prediction +1

Toward Practical Entity Alignment Method Design: Insights from New Highly Heterogeneous Knowledge Graph Datasets

1 code implementation 7 Apr 2023 Xuhui Jiang, Chengjin Xu, Yinghan Shen, Yuanzhuo Wang, Fenglong Su, Fei Sun, Zixuan Li, Zhichao Shi, Jian Guo, HuaWei Shen

Firstly, we address the oversimplified heterogeneity settings of current datasets and propose two new HHKG datasets that closely mimic practical EA scenarios.

Entity Alignment Knowledge Graphs +1

Meta-Learning Based Knowledge Extrapolation for Temporal Knowledge Graph

no code implementations 11 Feb 2023 Zhongwu Chen, Chengjin Xu, Fenglong Su, Zhen Huang, You Dou

Different from KGs and TKGs in the transductive setting, constantly emerging entities and relations in incomplete TKGs create demand to predict missing facts with unseen components, which is the extrapolation setting.

Knowledge Graph Embedding Knowledge Graphs +2

Ultrahyperbolic Knowledge Graph Embeddings

no code implementations 1 Jun 2022 Bo Xiong, Shichao Zhu, Mojtaba Nayyeri, Chengjin Xu, Shirui Pan, Chuan Zhou, Steffen Staab

Recent knowledge graph (KG) embeddings have been advanced by hyperbolic geometry due to its superior capability for representing hierarchies.

Knowledge Graph Embeddings

Geometric Algebra based Embeddings for Static and Temporal Knowledge Graph Completion

no code implementations 18 Feb 2022 Chengjin Xu, Mojtaba Nayyeri, Yung-Yu Chen, Jens Lehmann

In this work, we strive to move beyond the complex or hypercomplex space for KGE and propose a novel geometric algebra based embedding approach, GeomE, which uses multivector representations and the geometric product to model entities and relations.

Knowledge Graph Embeddings Link Prediction +2

Time-aware Relational Graph Attention Network for Temporal Knowledge Graph Embeddings

no code implementations 29 Sep 2021 Chengjin Xu, Fenglong Su, Jens Lehmann

Embedding-based representation learning approaches for knowledge graphs (KGs) have been mostly designed for static data.

Entity Alignment Graph Attention +2

Multiple Run Ensemble Learning with Low-Dimensional Knowledge Graph Embeddings

1 code implementation 11 Apr 2021 Chengjin Xu, Mojtaba Nayyeri, Sahar Vahdati, Jens Lehmann

For example, instead of training a model once with a large embedding size of 1200, we train the model 6 times in parallel with an embedding size of 200 and then combine the 6 separate models for testing, so that the overall number of adjustable parameters (6*200=1200) and the total memory footprint remain the same.

Ensemble Learning Knowledge Graph Completion +3
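
The parameter arithmetic in the snippet above can be checked with a short sketch. It assumes a model that stores one vector per entity and per relation, uses made-up vocabulary sizes, and combines models by averaging their scores; the snippet only says the 6 models are combined, so the averaging rule and the helper names `parameter_count` and `ensemble_scores` are hypothetical.

```python
def parameter_count(num_entities, num_relations, dim, runs=1):
    """Embedding parameters for `runs` models, one dim-sized vector per entity/relation."""
    return runs * (num_entities + num_relations) * dim

def ensemble_scores(per_model_scores):
    """Average each candidate's score across independently trained models (assumed rule)."""
    n_models = len(per_model_scores)
    n_candidates = len(per_model_scores[0])
    return [
        sum(scores[i] for scores in per_model_scores) / n_models
        for i in range(n_candidates)
    ]

# Hypothetical vocabulary sizes, chosen only to make the arithmetic concrete.
NUM_ENTITIES, NUM_RELATIONS = 1000, 50

SINGLE = parameter_count(NUM_ENTITIES, NUM_RELATIONS, dim=1200)
ENSEMBLE = parameter_count(NUM_ENTITIES, NUM_RELATIONS, dim=200, runs=6)
print(SINGLE == ENSEMBLE)  # both budgets use the same number of parameters

# Scores for two candidate answers from two toy models.
COMBINED = ensemble_scores([[1.0, 2.0], [3.0, 4.0]])
print(COMBINED)
```

Because the per-model cost scales linearly with the embedding size, six runs at dimension 200 use exactly the same parameter budget as one run at dimension 1200.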

TeRo: A Time-aware Knowledge Graph Embedding via Temporal Rotation

2 code implementations COLING 2020 Chengjin Xu, Mojtaba Nayyeri, Fouad Alkhoury, Hamed Shariat Yazdi, Jens Lehmann

We show our proposed model overcomes the limitations of the existing KG embedding models and TKG embedding models and has the ability of learning and inferring various relation patterns over time.

Knowledge Graph Embedding Link Prediction +1

Knowledge Graph Embeddings in Geometric Algebras

no code implementations COLING 2020 Chengjin Xu, Mojtaba Nayyeri, Yung-Yu Chen, Jens Lehmann

Knowledge graph (KG) embedding aims at embedding entities and relations in a KG into a low-dimensional latent representation space.

Knowledge Graph Embeddings Knowledge Graphs +1

Temporal Knowledge Graph Embedding Model based on Additive Time Series Decomposition

2 code implementations 18 Nov 2019 Chengjin Xu, Mojtaba Nayyeri, Fouad Alkhoury, Hamed Shariat Yazdi, Jens Lehmann

Moreover, considering the temporal uncertainty during the evolution of entity/relation representations over time, we map the representations of temporal KGs into the space of multi-dimensional Gaussian distributions.

Knowledge Graph Completion Knowledge Graph Embedding +5
