Search Results for author: Qi Liu

Found 364 papers, 163 papers with code

TraGraph-GS: Trajectory Graph-based Gaussian Splatting for Arbitrary Large-Scale Scene Rendering

no code implementations10 Jun 2025 Xiaohan Zhang, Sitong Wang, Yushen Yan, Yi Yang, Mingda Xu, Qi Liu

High-quality novel view synthesis for large-scale scenes presents a challenging dilemma in 3D computer vision.

Novel View Synthesis

Denoising Programming Knowledge Tracing with a Code Graph-based Tuning Adaptor

no code implementations7 Jun 2025 Weibo Gao, Qi Liu, Rui Li, Yuze Zhao, Hao Wang, Linan Yue, Fangzhou Yao, Zheng Zhang

However, current PKT studies primarily focus on the implicit relationship between code content and knowledge assessment, often overlooking two types of noise signals in long-term programming activities: unwanted signals from unrelated submissions and weak signals from minor modifications.

Denoising Knowledge Tracing +2

Are LLMs Reliable Translators of Logical Reasoning Across Lexically Diversified Contexts?

1 code implementation5 Jun 2025 Qingchuan Li, Jiatong Li, Zirui Liu, Mingyue Cheng, Yuting Zeng, Qi Liu, Tongxuan Liu

Building directly on the deficiencies identified through our benchmark, we propose a new method, MenTaL, to address this limitation.

Formal Logic In-Context Learning +1

CogMath: Assessing LLMs' Authentic Mathematical Ability from a Human Cognitive Perspective

no code implementations4 Jun 2025 Jiayu Liu, Zhenya Huang, Wei Dai, Cheng Cheng, Jinze Wu, Jing Sha, Song Li, Qi Liu, Shijin Wang, Enhong Chen

Although large language models (LLMs) show promise in solving complex mathematical tasks, existing evaluation paradigms rely solely on a coarse measure of overall answer accuracy, which is insufficient for assessing their authentic capabilities.

Large Language Models Can Achieve Explainable and Training-Free One-shot HRRP ATR

no code implementations3 Jun 2025 Lingfeng Chen, Panhe Hu, Zhiliang Pan, Qi Liu, Zhen Liu

This letter introduces a pioneering, training-free and explainable framework for High-Resolution Range Profile (HRRP) automatic target recognition (ATR) utilizing large-scale pre-trained Large Language Models (LLMs).

In-Context Learning

MGS3: A Multi-Granularity Self-Supervised Code Search Framework

no code implementations30 May 2025 Rui Li, Junfeng Kang, Qi Liu, Liyang He, Zheng Zhang, Yunhao Sha, Linbo Zhu, Zhenya Huang

Subsequently, we introduce a novel Multi-Granularity Self-Supervised contrastive learning code Search framework (MGS$^{3}$).

Code Search Contrastive Learning +1

Can Slow-thinking LLMs Reason Over Time? Empirical Studies in Time Series Forecasting

1 code implementation30 May 2025 Jiahao Wang, Mingyue Cheng, Qi Liu

Time series forecasting (TSF) is a fundamental and widely studied task, spanning methods from classical statistical approaches to modern deep learning and multimodal language modeling.

Language Modeling Language Modelling +2

Improving Time Series Forecasting via Instance-aware Post-hoc Revision

no code implementations29 May 2025 Zhiding Liu, Mingyue Cheng, Guanhao Zhao, Jiqian Yang, Qi Liu, Enhong Chen

Time series forecasting plays a vital role in various real-world applications and has attracted significant attention in recent decades.

Time Series Time Series Forecasting

WDMIR: Wavelet-Driven Multimodal Intent Recognition

no code implementations27 May 2025 Weiyin Gong, Kai Zhang, Yanghai Zhang, Qi Liu, Xinjie Sun, Junyu Lu, Linbo Zhu

Multimodal intent recognition (MIR) seeks to accurately interpret user intentions by integrating verbal and non-verbal information across video, audio and text modalities.

Multimodal Intent Recognition

Self-Reflective Planning with Knowledge Graphs: Enhancing LLM Reasoning Reliability for Question Answering

no code implementations26 May 2025 Jiajun Zhu, Ye Liu, Meikai Bao, Kai Zhang, Yanghai Zhang, Qi Liu

Recently, large language models (LLMs) have demonstrated remarkable capabilities in natural language processing tasks, yet they remain prone to hallucinations when reasoning with insufficient internal knowledge.

Knowledge Graphs Question Answering

Activation Control for Efficiently Eliciting Long Chain-of-thought Ability of Language Models

no code implementations23 May 2025 Zekai Zhao, Qi Liu, Kun Zhou, Zihan Liu, Yifei Shao, Zhiting Hu, Biwei Huang

Despite the remarkable reasoning performance, eliciting the long chain-of-thought (CoT) ability in large language models (LLMs) typically requires costly reinforcement learning or supervised fine-tuning on high-quality distilled data.

parameter-efficient fine-tuning

Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting

1 code implementation20 May 2025 Hao Feng, Shu Wei, Xiang Fei, Wei Shi, Yingdong Han, Lei Liao, Jinghui Lu, Binghong Wu, Qi Liu, Chunhui Lin, Jingqun Tang, Hao liu, Can Huang

Document image parsing is challenging due to its complexly intertwined elements such as text paragraphs, figures, formulas, and tables.

16k

Know3-RAG: A Knowledge-aware RAG Framework with Adaptive Retrieval, Generation, and Filtering

1 code implementation19 May 2025 Xukai Liu, Ye Liu, Shiwen Wu, Yanghai Zhang, Yihao Yuan, Kai Zhang, Qi Liu

Recent advances in large language models (LLMs) have led to impressive progress in natural language generation, yet their tendency to produce hallucinated or unsubstantiated content remains a critical concern.

Knowledge Graphs RAG +3

Logic Jailbreak: Efficiently Unlocking LLM Safety Restrictions Through Formal Logical Expression

no code implementations18 May 2025 Jingyu Peng, Maolin Wang, Nan Wang, Xiangyu Zhao, Jiatong Li, Kai Zhang, Qi Liu

Despite substantial advancements in aligning large language models (LLMs) with human values, current safety mechanisms remain susceptible to jailbreak attacks.

WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild?

no code implementations16 May 2025 An-Lan Wang, Jingqun Tang, Liao Lei, Hao Feng, Qi Liu, Xiang Fei, Jinghui Lu, Han Wang, Weiwei Liu, Hao liu, Yuliang Liu, Xiang Bai, Can Huang

However, prevailing benchmarks like DocVQA and ChartQA predominantly comprise scanned or digital documents, inadequately reflecting the intricate challenges posed by diverse real-world scenarios, such as variable illumination and physical distortions.

document understanding

Why 1 + 1 < 1 in Visual Token Pruning: Beyond Naive Integration via Multi-Objective Balanced Covering

no code implementations15 May 2025 Yangfu Li, Hongjian Zhan, Tianyi Chen, Qi Liu, Yue Lu

Existing visual token pruning methods target prompt alignment and visual preservation with static strategies, overlooking the varying relative importance of these objectives across tasks, which leads to inconsistent performance.

SAEN-BGS: Energy-Efficient Spiking AutoEncoder Network for Background Subtraction

no code implementations12 May 2025 Zhixuan Zhang, Xiaopeng Li, Qi Liu

Background subtraction (BGS) is utilized to detect moving objects in a video and is commonly employed at the onset of object tracking and human recognition processes.

Object Tracking

Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving

no code implementations7 May 2025 Qi Liu, Xinhao Zheng, Renqiu Xia, Xingzhi Qi, Qinxiang Cao, Junchi Yan

As a seemingly self-explanatory task, problem-solving has been a significant component of science and engineering.

Automated Theorem Proving

am-ELO: A Stable Framework for Arena-based LLM Evaluation

no code implementations6 May 2025 Zirui Liu, Jiatong Li, Yan Zhuang, Qi Liu, Shuanghong Shen, Jie Ouyang, Mingyue Cheng, Shijin Wang

Arena-based evaluation is a fundamental yet significant evaluation paradigm for modern AI models, especially large language models (LLMs).

GraphPrompter: Multi-stage Adaptive Prompt Optimization for Graph In-Context Learning

1 code implementation4 May 2025 Rui Lv, Zaixi Zhang, Kai Zhang, Qi Liu, Weibo Gao, Jiawei Liu, Jiaxia Yan, Linan Yue, Fangzhou Yao

Graph In-Context Learning, with the ability to adapt pre-trained graph models to novel and diverse downstream graphs without updating any parameters, has gained much attention in the community.

In-Context Learning

TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos

2 code implementations24 Apr 2025 Linli Yao, Yicheng Li, Yuancheng Wei, Lei LI, Shuhuai Ren, Yuanxin Liu, Kun Ouyang, Lean Wang, Shicheng Li, Sida Li, Lingpeng Kong, Qi Liu, Yuanxing Zhang, Xu sun

Remarkably, our experiments demonstrate that DTD achieves an 82.8% reduction in video tokens while maintaining 98% performance on StreamingBench, revealing that over 80% of visual content in streaming videos is naturally redundant without requiring language guidance.

MME Video MME +1

Retrieval Augmented Generation Evaluation in the Era of Large Language Models: A Comprehensive Survey

1 code implementation21 Apr 2025 Aoran Gan, Hao Yu, Kai Zhang, Qi Liu, Wenyu Yan, Zhenya Huang, Shiwei Tong, Guoping Hu

Recent advancements in Retrieval-Augmented Generation (RAG) have revolutionized natural language processing by integrating Large Language Models (LLMs) with external information retrieval, enabling accurate, up-to-date, and verifiable text generation across diverse applications.

Computational Efficiency Information Retrieval +5

Metamon-GS: Enhancing Representability with Variance-Guided Densification and Light Encoding

no code implementations20 Apr 2025 Junyan Su, Baozhu Zhao, Xiaohan Zhang, Qi Liu

The introduction of 3D Gaussian Splatting (3DGS) has advanced novel view synthesis by utilizing Gaussians to represent scenes.

3DGS Novel View Synthesis

PSG-MAE: Robust Multitask Sleep Event Monitoring using Multichannel PSG Reconstruction and Inter-channel Contrastive Learning

no code implementations17 Apr 2025 Yifei Wang, Qi Liu, Fuli Min, Honghao Wang

When the encoder pre-trained through PSG-MAE is fine-tuned with downstream feature decomposition networks, it achieves an accuracy of 83.7% for sleep staging and 90.45% for detecting obstructive sleep apnea, which highlights the framework's robustness and broad applicability.

Contrastive Learning Self-Supervised Learning +1

CM3AE: A Unified RGB Frame and Event-Voxel/-Frame Pre-training Framework

1 code implementation17 Apr 2025 Wentao Wu, Xiao Wang, Chenglong Li, Bo Jiang, Jin Tang, Bin Luo, Qi Liu

Event cameras have attracted increasing attention in recent years due to their advantages in high dynamic range, high temporal resolution, low power consumption, and low latency.

Contrastive Learning

MiMu: Mitigating Multiple Shortcut Learning Behavior of Transformers

no code implementations14 Apr 2025 Lili Zhao, Qi Liu, Wei Chen, Liyi Chen, Ruijun Sun, Min Hou, Yang Wang, Shijin Wang

Then, we further design a self-improvement strategy in the target model to reduce the reliance on multiple shortcuts.

The Other Side of the Coin: Exploring Fairness in Retrieval-Augmented Generation

1 code implementation11 Apr 2025 Zheng Zhang, Ning li, Qi Liu, Rui Li, Weibo Gao, Qingyang Mao, Zhenya Huang, Baosheng Yu, DaCheng Tao

By referencing this external knowledge, RAG effectively reduces the generation of factually incorrect content and addresses hallucination issues within LLMs.

Fairness Hallucination +3

How do Large Language Models Understand Relevance? A Mechanistic Interpretability Perspective

1 code implementation10 Apr 2025 Qi Liu, Jiaxin Mao, Ji-Rong Wen

Recent studies have shown that large language models (LLMs) can assess relevance and support information retrieval (IR) tasks such as document ranking and relevance judgment generation.

Document Ranking Information Retrieval

Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning

no code implementations10 Apr 2025 ByteDance Seed, :, Jiaze Chen, Tiantian Fan, Xin Liu, Lingjun Liu, Zhiqi Lin, Mingxuan Wang, Chengyi Wang, Xiangpeng Wei, Wenyuan Xu, Yufeng Yuan, Yu Yue, Lin Yan, Qiying Yu, Xiaochen Zuo, Chi Zhang, Ruofei Zhu, Zhecheng An, Zhihao Bai, Yu Bao, Xingyan Bin, Jiangjie Chen, Feng Chen, Hongmin Chen, Riwei Chen, Liangqiang Chen, Zixin Chen, Jinsong Chen, Siyan Chen, Kaiyuan Chen, Zhi Chen, Jin Chen, Jiecao Chen, Jinxin Chi, Weinan Dai, Ning Dai, Jiahui Dai, Shihan Dou, Yantao Du, Zhengyin Du, Jianhui Duan, Chen Dun, Ting-Han Fan, Jiazhan Feng, Junda Feng, Ziyuan Feng, Yuwei Fu, Wenqi Fu, Hanjie Fu, Hao Ge, Hongyi Guo, Mingji Han, Li Han, Wenhao Hao, Xintong Hao, Qianyu He, Jerry He, Feng He, Wen Heng, Zehua Hong, Qi Hou, Liang Hu, Shengding Hu, Nan Hu, Kai Hua, Qi Huang, Ziyue Huang, Hongzhi Huang, Zihao Huang, Ting Huang, Wenhao Huang, Wei Jia, Bin Jia, Xiaoying Jia, Yuhua Jiang, Haobin Jiang, Ziheng Jiang, Kaihua Jiang, Chengquan Jiang, Jianpeng Jiao, Xiaoran Jin, Xing Jin, Xunhao Lai, Xiang Li, Liyi Li, Hongkai Li, Zheng Li, Shengxian Wan, Ya Wang, Yunshui Li, Chenggang Li, Niuniu Li, Siyu Li, Xi Li, Xiao Li, Aoyan Li, Yuntao Li, Nianning Liang, Xinnian Liang, Haibin Lin, Weijian Lin, Ye Lin, Zhicheng Liu, Guanlin Liu, Chenxiao Liu, Yan Liu, Gaohong Liu, Juncai Liu, Chundian Liu, Deyi Liu, Kaibo Liu, Siyao Liu, Qi Liu, Yongfei Liu, Kang Liu, Gan Liu, Boyi Liu, Rui Long, Weiqiang Lou, Chenwei Lou, Xiang Luo, Yao Luo, Caiping Lv, Heyang Lv, Bole Ma, Qianli Ma, Hongzhi Ma, Yiyuan Ma, Jin Ma, Wenchang Ma, Tingting Ma, Chen Mao, Qiyang Min, Zhe Nan, Guanghan Ning, Jinxiang Ou, Haojie Pan, Renming Pang, Yanghua Peng, Tao Peng, Lihua Qian, Mu Qiao, Meng Qu, Cheng Ren, Hongbin Ren, Yong Shan, Wei Shen, Ke Shen, Kai Shen, Guangming Sheng, Jinlong Shi, Wenlei Shi, Guang Shi, Shuai Shuai Cao, Yuxin Song, Zuquan Song, Jing Su, Yifan Sun, Tao Sun, Zewei Sun, Borui Wan, Xiaohui Wang, Xi Wang, Shuguang Wang, Jun Wang, Qinlong Wang, Chenyuan Wang, 
Shuai Wang, Zihan Wang, Changbao Wang, Jiaqiang Wang, Shihang Wang, Xuwu Wang, Zaiyuan Wang, Yuxuan Wang, Wenqi Wang, Taiqing Wang, Chengzhi Wei, Houmin Wei, Ziyun Wei, Shufa Wei, Zheng Wu, Yonghui Wu, Yangjun Wu, Bohong Wu, Shuang Wu, Jingqiao Wu, Ning Wu, Shuangzhi Wu, Jianmin Wu, Chenguang Xi, Fan Xia, Yuqiao Xian, Liang Xiang, Boren Xiang, Bowen Xiao, Zhen Xiao, Xia Xiao, Yongsheng Xiao, Chao Xin, Shulin Xin, Yuwen Xiong, Jingjing Xu, Ziwen Xu, Chenyin Xu, Jiayi Xu, Yifan Xu, Wei Xu, Yufei Xu, Shikun Xu, Shipeng Yan, Shen Yan, Qingping Yang, Xi Yang, Tianhao Yang, Yuehang Yang, Yuan Yang, Ximing Yang, Zeyu Yang, Guang Yang, Yifan Yang, Xuesong Yao, Bairen Yi, Fan Yin, Jianian Yin, Ziqiang Ying, Xiangyu Yu, Hongli Yu, Song Yu, Menghan Yu, Huan Yu, Siyu Yuan, Jun Yuan, Yutao Zeng, Tianyang Zhan, Zheng Zhang, Yun Zhang, Mofan Zhang, Wang Zhang, Ru Zhang, Zhi Zhang, Tianqi Zhang, Xinyi Zhang, Zhexi Zhang, Sijun Zhang, Wenqiang Zhang, Xiangxiang Zhang, Yongtao Zhang, Yuyu Zhang, Ge Zhang, He Zhang, Yue Zhang, Renjie Zheng, Ningxin Zheng, Zhuolin Zheng, Yaowei Zheng, Chen Zheng, Xiaoyun Zhi, Wanjun Zhong, Cheng Zhong, Zheng Zhong, Baoquan Zhong, Xun Zhou, Na Zhou, Huan Zhou, Hang Zhu, Defa Zhu, Wenjia Zhu, Lei Zuo

We introduce Seed1.5-Thinking, capable of reasoning through thinking before responding, resulting in improved performance on a wide range of benchmarks.

Mixture-of-Experts reinforcement-learning +1

LLM4Ranking: An Easy-to-use Framework of Utilizing Large Language Models for Document Reranking

1 code implementation10 Apr 2025 Qi Liu, Haozhe Duan, Yiqun Chen, Quanfeng Lu, Weiwei Sun, Jiaxin Mao

Utilizing large language models (LLMs) for document reranking has been a popular and promising research direction in recent years, and many studies are dedicated to improving the performance and efficiency of using LLMs for reranking.

Reranking Retrieval-augmented Generation

HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation

no code implementations CVPR 2025 Kun Liu, Qi Liu, Xinchen Liu, Jie Li, Yongdong Zhang, Jiebo Luo, Xiaodong He, Wu Liu

However, human-object interaction (HOI) often cannot be precisely generated by current T2V models due to the lack of large-scale videos with accurate captions for HOI.

Hallucination Human-Object Interaction Detection +2

Enhancing Knowledge Graph Completion with Entity Neighborhood and Relation Context

no code implementations29 Mar 2025 Jianfang Chen, Kai Zhang, Aoran Gan, Shiwei Tong, Shuanghong Shen, Qi Liu

Knowledge Graph Completion (KGC) aims to infer missing information in Knowledge Graphs (KGs) to address their inherent incompleteness.

Relation

TEMPLE: Temporal Preference Learning of Video LLMs via Difficulty Scheduling and Pre-SFT Alignment

1 code implementation21 Mar 2025 Shicheng Li, Lei LI, Kun Ouyang, Shuhuai Ren, Yuanxin Liu, Yuanxing Zhang, Fuzheng Zhang, Lingpeng Kong, Qi Liu, Xu sun

We further analyze the transferability of DPO data across architectures and the role of difficulty scheduling in optimization.

Scheduling

Hierarchical Reinforcement Learning for Safe Mapless Navigation with Congestion Estimation

no code implementations15 Mar 2025 Jianqi Gao, Xizheng Pang, Qi Liu, YanJie Li

Specifically, to enhance the robot's environmental perception, we introduce a new obstacle encoding method that evaluates the impact of obstacles on the robot's motion planning.

Hierarchical Reinforcement Learning Motion Planning +3

OpenVidVRD: Open-Vocabulary Video Visual Relation Detection via Prompt-Driven Semantic Space Alignment

no code implementations12 Mar 2025 Qi Liu, Weiying Xue, Yuxiao Wang, Zhenao Wei

The video visual relation detection (VidVRD) task is to identify objects and their relationships in videos, which is challenging due to the dynamic content, high annotation costs, and long-tailed distribution of relations.

Prompt Learning Relation +1

A Survey on Knowledge-Oriented Retrieval-Augmented Generation

no code implementations11 Mar 2025 Mingyue Cheng, Yucong Luo, Jie Ouyang, Qi Liu, Huijie Liu, Li Li, Shuo Yu, Bohou Zhang, Jiawei Cao, Jie Ma, Daoyu Wang

Retrieval-Augmented Generation (RAG) has gained significant attention in recent years for its potential to enhance natural language understanding and generation by combining large-scale retrieval systems with generative models.

Information Retrieval Natural Language Understanding +5

Large Language Model Guided Progressive Feature Alignment for Multimodal UAV Object Detection

no code implementations10 Mar 2025 Wentao Wu, Chenglong Li, Xiao Wang, Bin Luo, Qi Liu

To address this problem, we propose a Large Language Model (LLM) guided Progressive feature Alignment Network called LPANet, which leverages the semantic features extracted from a large language model to guide the progressive semantic and spatial alignment between modalities for multimodal UAV object detection.

Language Modeling Language Modelling +4

MindBridge: Scalable and Cross-Model Knowledge Editing via Memory-Augmented Modality

1 code implementation4 Mar 2025 Shuaike Li, Kai Zhang, Qi Liu, Enhong Chen

Knowledge editing is a technique for efficiently and accurately updating the knowledge of large language models (LLMs) to alleviate obsolescence and correct errors.

knowledge editing

HoH: A Dynamic Benchmark for Evaluating the Impact of Outdated Information on Retrieval-Augmented Generation

no code implementations3 Mar 2025 Jie Ouyang, Tingyue Pan, Mingyue Cheng, Ruiran Yan, Yucong Luo, Jiaying Lin, Qi Liu

While Retrieval-Augmented Generation (RAG) has emerged as an effective approach for addressing the knowledge outdating problem in Large Language Models (LLMs), it faces a critical challenge: the prevalence of outdated information in knowledge bases.

RAG Retrieval +1

PCE-GAN: A Generative Adversarial Network for Point Cloud Attribute Quality Enhancement based on Optimal Transport

no code implementations26 Feb 2025 Tian Guo, Hui Yuan, Qi Liu, Honglei Su, Raouf Hamzaoui, Sam Kwong

Point cloud compression significantly reduces data volume but sacrifices reconstruction quality, highlighting the need for advanced quality enhancement techniques.

Attribute Generative Adversarial Network +2

Entailment-Preserving First-order Logic Representations in Natural Language Entailment

no code implementations24 Feb 2025 Jinu Lee, Qi Liu, Runzhi Ma, Vincent Han, Ziqi Wang, Heng Ji, Julia Hockenmaier

To this extent, we propose a training method specialized for the task, iterative learning-to-rank, which directly optimizes the model's EPR score through a novel scoring function and a learning-to-rank objective.

Diversity Learning-To-Rank

Geometry-Aware 3D Salient Object Detection Network

no code implementations23 Feb 2025 Chen Wang, Liyuan Zhang, Le Hui, Qi Liu, Yuchao Dai

In this paper, we propose a geometry-aware 3D salient object detection network that explicitly clusters points into superpoints to enhance the geometric boundaries of objects, thereby segmenting complete objects with clear boundaries.

Object object-detection +2

SentiFormer: Metadata Enhanced Transformer for Image Sentiment Analysis

1 code implementation21 Feb 2025 Bin Feng, Shulan Ruan, Mingzheng Yang, Dongxuan Han, Huijie Liu, Kai Zhang, Qi Liu

As more and more internet users post images online to express their daily emotions, image sentiment analysis has attracted increasing attention.

Sentiment Analysis

Chinese Spelling Correction: A Comprehensive Survey of Progress, Challenges, and Opportunities

no code implementations17 Feb 2025 Changchun Liu, Kai Zhang, Junzhe Jiang, Zixiao Kong, Qi Liu, Enhong Chen

Chinese Spelling Correction (CSC) is a critical task in natural language processing, aimed at detecting and correcting spelling errors in Chinese text.

Spelling Correction Survey

Unveiling the Magic of Code Reasoning through Hypothesis Decomposition and Amendment

1 code implementation17 Feb 2025 Yuze Zhao, Tianyun Ji, Wenjun Feng, Zhenya Huang, Qi Liu, Zhiding Liu, Yixiao Ma, Kai Zhang, Enhong Chen

In this paper, we introduce such a novel task, code reasoning, to provide a new perspective for the reasoning abilities of LLMs.

Hallucination Logical Reasoning

Fast Underwater Scene Reconstruction using Multi-View Stereo and Physical Imaging

no code implementations21 Jan 2025 Shuyi Hu, Qi Liu

To address these limitations, we propose a novel method that integrates Multi-View Stereo (MVS) with a physics-based underwater image formation model.

Depth Estimation NeRF

DASKT: A Dynamic Affect Simulation Method for Knowledge Tracing

no code implementations18 Jan 2025 Xinjie Sun, Kai Zhang, Qi Liu, Shuanghong Shen, Fei Wang, Yuxiang Guo, Enhong Chen

Knowledge Tracing (KT) predicts future performance by modeling students' historical interactions, and understanding students' affective states can enhance the effectiveness of KT, thereby improving the quality of education.

Knowledge Tracing Time Series Analysis

Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems

1 code implementation17 Jan 2025 Weibo Gao, Qi Liu, Linan Yue, Fangzhou Yao, Rui Lv, Zheng Zhang, Hao Wang, Zhenya Huang

Personalized learning represents a promising educational strategy within intelligent educational systems, aiming to enhance learners' practice efficiency.

Response Generation

VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models

no code implementations CVPR 2025 Lei LI, Yuancheng Wei, Zhihui Xie, Xuqing Yang, YiFan Song, Peiyi Wang, Chenxin An, Tianyu Liu, Sujian Li, Bill Yuchen Lin, Lingpeng Kong, Qi Liu

Vision-language generative reward models (VL-GenRMs) play a crucial role in aligning and evaluating multimodal AI systems, yet their own evaluation remains under-explored.

Hallucination

Navigating Data Corruption in Machine Learning: Balancing Quality, Quantity, and Imputation Strategies

1 code implementation24 Dec 2024 Qi Liu, Wanjing Ma

Data corruption, including missing and noisy data, poses significant challenges in real-world machine learning.

Deep Reinforcement Learning Imputation +1

TextMatch: Enhancing Image-Text Consistency Through Multimodal Optimization

no code implementations24 Dec 2024 Yucong Luo, Mingyue Cheng, Jie Ouyang, Xiaoyu Tao, Qi Liu

Text-to-image generative models excel in creating images from text but struggle with ensuring alignment and consistency between outputs and prompts.

In-Context Learning Question Answering +1

GIRAFFE: Design Choices for Extending the Context Length of Visual Language Models

1 code implementation17 Dec 2024 Mukai Li, Lei LI, Shansan Gong, Qi Liu

Towards this goal, we make the best design choice through extensive experiment settings from data curation to context window extending and utilizing: (1) we analyze data sources and length distributions to construct ETVLM - a data recipe to balance the performance across scenarios; (2) we examine existing position extending methods, identify their limitations and propose M-RoPE++ as an enhanced approach; we also choose to solely instruction-tune the backbone with mixed-source data; (3) we discuss how to better utilize extended context windows and propose hybrid-resolution training.

Long-range modeling

Stepwise Reasoning Error Disruption Attack of LLMs

no code implementations16 Dec 2024 Jingyu Peng, Maolin Wang, Xiangyu Zhao, Kai Zhang, Wanyu Wang, Pengyue Jia, Qidong Liu, Ruocheng Guo, Qi Liu

Large language models (LLMs) have made remarkable strides in complex reasoning tasks, but their safety and robustness in reasoning processes remain underexplored.

Toy-GS: Assembling Local Gaussians for Precisely Rendering Large-Scale Free Camera Trajectories

no code implementations13 Dec 2024 Xiaohan Zhang, Zhenyu Sun, Yukui Qiu, Junyan Su, Qi Liu

Currently, 3D rendering for large-scale free camera trajectories, namely, arbitrary input camera trajectories, poses significant challenges: 1) The distribution and observation angles of the cameras are irregular, and various types of scenes are included in the free trajectories; 2) Processing the entire point cloud and all images at once for large-scale scenes requires a substantial amount of GPU memory.

Semi-IIN: Semi-supervised Intra-inter modal Interaction Learning Network for Multimodal Sentiment Analysis

1 code implementation13 Dec 2024 Jinhao Lin, Yifei Wang, Yanwu Xu, Qi Liu

Although multimodal sentiment analysis is a fertile research ground that merits further investigation, current approaches incur high annotation costs and suffer from label ambiguity, both of which hinder the acquisition of high-quality labeled data.

Multimodal Sentiment Analysis

Conformal Prediction on Quantifying Uncertainty of Dynamic Systems

no code implementations12 Dec 2024 Aoming Liang, Qi Liu, Lei Xu, Fahad Sohrab, Weicheng Cui, Changhui Song, Moncef Gabbouj

Our motivation is to introduce conformal prediction into the uncertainty assessment of dynamical systems, providing a method supported by theoretical guarantees.

Conformal Prediction Operator learning +2

PoTable: Towards Systematic Thinking via Stage-oriented Plan-then-Execute Reasoning on Tables

no code implementations5 Dec 2024 Qingyang Mao, Qi Liu, Zhi Li, Mingyue Cheng, Zheng Zhang, Rui Li

In recent years, table reasoning has garnered substantial research interest, particularly its integration with Large Language Models (LLMs) which revolutionize natural language applications.

Code Generation Large Language Model

VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models

no code implementations26 Nov 2024 Lei LI, Yuancheng Wei, Zhihui Xie, Xuqing Yang, YiFan Song, Peiyi Wang, Chenxin An, Tianyu Liu, Sujian Li, Bill Yuchen Lin, Lingpeng Kong, Qi Liu

Vision-language generative reward models (VL-GenRMs) play a crucial role in aligning and evaluating multimodal AI systems, yet their own evaluation remains under-explored.

Hallucination

TableTime: Reformulating Time Series Classification as Zero-Shot Table Understanding via Large Language Models

1 code implementation24 Nov 2024 Jiahao Wang, Mingyue Cheng, Qingyang Mao, Qi Liu, Feiyang Xu, Xin Li, Enhong Chen

Despite their effectiveness, we reveal that these methods conceal three inherent bottlenecks: (1) they struggle to encode temporal and channel-specific information in a lossless manner, both of which are critical components of multivariate time series; (2) it is difficult to align the learned representation space with the semantic space of the LLMs; (3) they require task-specific retraining, which is both computationally expensive and labor-intensive.

Problem Decomposition Time Series +2

Optimizing Student Ability Assessment: A Hierarchy Constraint-Aware Cognitive Diagnosis Framework for Educational Contexts

no code implementations21 Nov 2024 Xinjie Sun, Qi Liu, Kai Zhang, Shuanghong Shen, Fei Wang, Yan Zhuang, Zheng Zhang, Weiyin Gong, Shijin Wang, Lina Yang, Xingying Huo

To address this, we propose the Hierarchy Constraint-Aware Cognitive Diagnosis Framework (HCD), designed to more accurately represent student ability performance within real educational contexts.

cognitive diagnosis Diagnostic +1

MDHP-Net: Detecting an Emerging Time-exciting Threat in IVN

no code implementations15 Nov 2024 Qi Liu, Yanchen Liu, Ruifeng Li, Chenhong Cao, Yufeng Li, Xingyu Li, Peng Wang, Runhan Feng, Shiyang Bu

We systematically analyze the characteristics of the threat: dynamism, time-exciting impact, and low prior knowledge dependency.

Diagnostic

3D Focusing-and-Matching Network for Multi-Instance Point Cloud Registration

1 code implementation12 Nov 2024 Liyuan Zhang, Le Hui, Qi Liu, Bo Li, Yuchao Dai

Multi-instance point cloud registration aims to estimate the pose of all instances of a model point cloud in the whole scene.

Object Point Cloud Registration

Collaborative Cognitive Diagnosis with Disentangled Representation Learning for Learner Modeling

1 code implementation4 Nov 2024 Weibo Gao, Qi Liu, Linan Yue, Fangzhou Yao, Hao Wang, Yin Gu, Zheng Zhang

Motivated by the success of collaborative modeling in various domains, such as recommender systems, we aim to investigate how collaborative signals among learners contribute to the diagnosis of human cognitive states (i.e., knowledge proficiency) in the context of intelligent education.

cognitive diagnosis Disentanglement +1

LE-PDE++: Mamba for accelerating PDEs Simulations

no code implementations4 Nov 2024 Aoming Liang, Zhaoyang Mu, Qi Liu, Ruipeng Li, Mingming Ge, Dixia Fan

Partial Differential Equations are foundational in modeling science and natural systems such as fluid dynamics and weather forecasting.

Deep Learning Mamba +1

DisenTS: Disentangled Channel Evolving Pattern Modeling for Multivariate Time Series Forecasting

no code implementations30 Oct 2024 Zhiding Liu, Jiqian Yang, Qingyang Mao, Yuze Zhao, Mingyue Cheng, Zhi Li, Qi Liu, Enhong Chen

To this end, we propose DisenTS, a tailored framework for modeling disentangled channel evolving patterns in general multivariate time series forecasting.

Multivariate Time Series Forecasting Time Series

Leveraging LLMs for Hypothetical Deduction in Logical Inference: A Neuro-Symbolic Approach

1 code implementation29 Oct 2024 Qingchuan Li, Jiatong Li, Tongxuan Liu, Yuting Zeng, Mingyue Cheng, Weizhe Huang, Qi Liu

Large Language Models (LLMs) have exhibited remarkable potential across a wide array of reasoning tasks, including logical reasoning.

Logical Reasoning

RecFlow: An Industrial Full Flow Recommendation Dataset

1 code implementation28 Oct 2024 Qi Liu, Kai Zheng, Rui Huang, Wuchao Li, Kuo Cai, Yuan Chai, Yanan Niu, Yiqun Hui, Bing Han, Na Mou, Hongning Wang, Wentian Bao, Yunen Yu, Guorui Zhou, Han Li, Yang song, Defu Lian, Kun Gai

Industrial recommendation systems (RS) rely on the multi-stage pipeline to balance effectiveness and efficiency when delivering items from a vast corpus to users.

Recommendation Systems Selection bias

Do LLMs Overcome Shortcut Learning? An Evaluation of Shortcut Challenges in Large Language Models

1 code implementation17 Oct 2024 Yu Yuan, Lili Zhao, Kai Zhang, Guangting Zheng, Qi Liu

3) Chain-of-thought prompting notably reduces shortcut reliance and outperforms other prompting strategies, while few-shot prompts generally underperform compared to zero-shot prompts.

In-Context Learning

Understanding the Role of LLMs in Multimodal Evaluation Benchmarks

1 code implementation16 Oct 2024 Botian Jiang, Lei LI, Xiaonan Li, Zhaowei Li, Xiachong Feng, Lingpeng Kong, Qi Liu, Xipeng Qiu

The rapid advancement of Multimodal Large Language Models (MLLMs) has been accompanied by the development of various benchmarks to evaluate their capabilities.

Benchmarking Large Language Model +2

DeltaDock: A Unified Framework for Accurate, Efficient, and Physically Reliable Molecular Docking

1 code implementation15 Oct 2024 Jiaxian Yan, Zaixi Zhang, Jintao Zhu, Kai Zhang, Jianfeng Pei, Qi Liu

Despite these advancements, current methods are often tailored for specific docking settings, and limitations such as the neglect of protein side-chain structures, difficulties in handling large binding pockets, and challenges in predicting physically valid structures exist.

Blind Docking Drug Design +2

The Epochal Sawtooth Effect: Unveiling Training Loss Oscillations in Adam and Other Optimizers

1 code implementation14 Oct 2024 Qi Liu, Wanjing Ma

In this paper, we identify and analyze a recurring training loss pattern, which we term the Epochal Sawtooth Effect (ESE), commonly observed during training with adaptive gradient-based optimizers, particularly the Adam optimizer.

VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment

no code implementations12 Oct 2024 Lei LI, Zhihui Xie, Mukai Li, Shunian Chen, Peiyi Wang, Liang Chen, Yazheng Yang, Benyou Wang, Lingpeng Kong, Qi Liu

As large vision-language models (LVLMs) evolve rapidly, the demand for high-quality and diverse data to align these models becomes increasingly crucial.

Diversity Hallucination +3

Perceptual Quality Assessment of Trisoup-Lifting Encoded 3D Point Clouds

1 code implementation9 Oct 2024 Juncheng Long, Honglei Su, Qi Liu, Hui Yuan, Wei Gao, Jiarun Song, Zhou Wang

In addition, this work establishes a database named WPC6.0, the first and largest PCQA database dedicated to the Trisoup-Lifting encoding mode, encompassing 400 distorted point clouds with 4 geometric multiplied by 5 texture distortion levels.

Point Cloud Quality Assessment Quantization

CursorCore: Assist Programming through Aligning Anything

1 code implementation9 Oct 2024 Hao Jiang, Qi Liu, Rui Li, Shengyu Ye, Shijin Wang

In this work, we propose a new conversational framework that comprehensively integrates these information sources, collect data to train our models and evaluate their performance.

Code Completion

Learning Recommender Systems with Soft Target: A Decoupled Perspective

1 code implementation9 Oct 2024 Hao Zhang, Mingyue Cheng, Qi Liu, Yucong Luo, Rui Li, Enhong Chen

Learning recommender systems with a multi-class optimization objective is a prevalent setting in recommendation.

Recommendation Systems

Diffusion Auto-regressive Transformer for Effective Self-supervised Time Series Forecasting

3 code implementations8 Oct 2024 Daoyu Wang, Mingyue Cheng, Zhiding Liu, Qi Liu, Enhong Chen

Self-supervised learning has become a popular and effective approach for enhancing time series forecasting, enabling models to learn universal representations from unlabeled data.

Decoder Denoising +3

Temporal Reasoning Transfer from Text to Video

no code implementations8 Oct 2024 Lei LI, Yuanxin Liu, Linli Yao, Peiyuan Zhang, Chenxin An, Lean Wang, Xu sun, Lingpeng Kong, Qi Liu

Video Large Language Models (Video LLMs) have shown promising capabilities in video comprehension, yet they struggle with tracking temporal changes and reasoning about temporal relationships.

Diagnostic MME +2

DAOcc: 3D Object Detection Assisted Multi-Sensor Fusion for 3D Occupancy Prediction

1 code implementation30 Sep 2024 Zhen Yang, Yanpeng Dong, Heng Wang, Lichao Ma, Zijian Cui, Qi Liu, Haoran Pei

Multi-sensor fusion significantly enhances the accuracy and robustness of 3D semantic occupancy prediction, which is crucial for autonomous driving and robotics.

3D Object Detection 3D Semantic Occupancy Prediction +4

Generalized Protein Pocket Generation with Prior-Informed Flow Matching

no code implementations29 Sep 2024 Zaixi Zhang, Marinka Zitnik, Qi Liu

One critical step in this process involves designing protein pockets, the protein interface binding with the ligand.

valid

FlexSBDD: Structure-Based Drug Design with Flexible Protein Modeling

no code implementations29 Sep 2024 Zaixi Zhang, Mengdi Wang, Qi Liu

Structure-based drug design (SBDD), which aims to generate 3D ligand molecules binding to target proteins, is a fundamental task in drug discovery.

Data Augmentation Drug Design +1

Towards More Relevant Product Search Ranking Via Large Language Models: An Empirical Study

no code implementations26 Sep 2024 Qi Liu, Atul Singh, Jingbo Liu, Cun Mu, Zheng Yan

Training Learning-to-Rank models for e-commerce product search ranking can be challenging due to the lack of a gold standard of ranking relevance.

Learning-To-Rank

Pre-trained Language Model and Knowledge Distillation for Lightweight Sequential Recommendation

no code implementations23 Sep 2024 Li Li, Mingyue Cheng, Zhiding Liu, Hao Zhang, Qi Liu, Enhong Chen

The algorithm operates in two stages: in the first stage, we fine-tune the pre-trained language model on the recommendation dataset to transfer the pre-trained knowledge to the recommendation task; in the second stage, we distill the trained language model to transfer the learned knowledge to a lightweight model.

Knowledge Distillation Language Modeling +2

ChemEval: A Comprehensive Multi-Level Chemical Evaluation for Large Language Models

1 code implementation21 Sep 2024 Yuqing Huang, Rongyang Zhang, Xuesong He, Xuyang Zhi, Hao Wang, Xin Li, Feiyang Xu, Deguang Liu, Huadong Liang, Yi Li, Jian Cui, Zimu Liu, Shijin Wang, Guoping Hu, Guiquan Liu, Qi Liu, Defu Lian, Enhong Chen

To this end, we propose ChemEval, which provides a comprehensive assessment of the capabilities of LLMs across a wide range of chemical domain tasks.

Few-Shot Learning Instruction Following

CSS: Overcoming Pose and Scene Challenges in Crowd-Sourced 3D Gaussian Splatting

no code implementations13 Sep 2024 Runze Chen, Mingyu Xiao, Haiyong Luo, Fang Zhao, Fan Wu, Hao Xiong, Qi Liu, Meng Song

We introduce Crowd-Sourced Splatting (CSS), a novel 3D Gaussian Splatting (3DGS) pipeline designed to overcome the challenges of pose-free scene reconstruction using crowd-sourced imagery.

3DGS 3D Reconstruction +1

Revisiting the Solution of Meta KDD Cup 2024: CRAG

1 code implementation9 Sep 2024 Jie Ouyang, Yucong Luo, Mingyue Cheng, Daoyu Wang, Shuo Yu, Qi Liu, Enhong Chen

This paper presents the solution of our team APEX in the Meta KDD CUP 2024: CRAG Comprehensive RAG Benchmark Challenge.

RAG Retrieval +2

Multi-Source Knowledge Pruning for Retrieval-Augmented Generation: A Benchmark and Empirical Study

2 code implementations3 Sep 2024 Shuo Yu, Mingyue Cheng, Jiqian Yang, Jie Ouyang, Yucong Luo, Chenyi Lei, Qi Liu, Enhong Chen

Retrieval-augmented generation (RAG) is increasingly recognized as an effective approach for mitigating the hallucination of large language models (LLMs) through the integration of external knowledge.

Benchmarking Hallucination +3

Efficient Transfer Learning Framework for Cross-Domain Click-Through Rate Prediction

no code implementations29 Aug 2024 Qi Liu, Xingyuan Tang, Jianqiang Huang, Xiangqian Yu, Haoran Jin, Jin Chen, Yuanhao Pu, Defu Lian, Tan Qu, Zhe Wang, Jia Cheng, Jun Lei

The challenges include the inefficiencies arising from the management of extensive source data and the problem of 'catastrophic forgetting' that results from the CTR model's daily updating.

Click-Through Rate Prediction Recommendation Systems +1

SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning

no code implementations23 Aug 2024 Zhongjian Qiao, Jiafei Lyu, Kechen Jiao, Qi Liu, Xiu Li

Extensive experimental results on D4RL datasets demonstrate that SUMO can provide more accurate uncertainty estimation and boost the performance of base algorithms.

D4RL Offline RL +1

DimeRec: A Unified Framework for Enhanced Sequential Recommendation via Generative Diffusion Models

no code implementations22 Aug 2024 Wuchao Li, Rui Huang, Haijun Zhao, Chi Liu, Kai Zheng, Qi Liu, Na Mou, Guorui Zhou, Defu Lian, Yang song, Wentian Bao, Enyun Yu, Wenwu Ou

Nevertheless, a straightforward combination of SR and DM leads to sub-optimal performance due to discrepancies in learning objectives (recommendation vs. noise reconstruction) and the respective learning spaces (non-stationary vs. stationary).

Image Generation Representation Learning +1

RePair: Automated Program Repair with Process-based Feedback

1 code implementation21 Aug 2024 Yuze Zhao, Zhenya Huang, Yixiao Ma, Rui Li, Kai Zhang, Hao Jiang, Qi Liu, Linbo Zhu, Yu Su

The gap between the trepidation of program reliability and the expense of repairs underscores the indispensability of Automated Program Repair (APR).

Program Repair

A Review of Human-Object Interaction Detection

no code implementations20 Aug 2024 Yuxiao Wang, Qiwei Xiong, Yu Lei, Weiying Xue, Qi Liu, Zhenao Wei

Human-object interaction (HOI) detection plays a key role in high-level visual understanding, facilitating a deep comprehension of human activities.

Human-Object Interaction Detection Object +3

Experimental evaluation of offline reinforcement learning for HVAC control in buildings

1 code implementation15 Aug 2024 Jun Wang, Linyan Li, Qi Liu, Yu Yang

In summary, this paper presents our well-structured investigations and new findings when applying offline reinforcement learning to building HVAC systems.

Offline RL Reinforcement Learning (RL)

An Efficient Continuous Control Perspective for Reinforcement-Learning-based Sequential Recommendation

no code implementations15 Aug 2024 Jun Wang, Likang Wu, Qi Liu, Yu Yang

However, previous studies mainly focus on discrete action and policy spaces, which might have difficulties in handling dramatically growing items efficiently.

continuous-control Continuous Control +1

Modeling Domain and Feedback Transitions for Cross-Domain Sequential Recommendation

no code implementations15 Aug 2024 Changshuo Zhang, Teng Shi, Xiao Zhang, Qi Liu, Ruobing Xie, Jun Xu, Ji-Rong Wen

In this paper, we propose $\text{Transition}^2$, a novel method to model transitions across both domains and types of user feedback.

Representation Learning Sequential Recommendation

Mamba Retriever: Utilizing Mamba for Effective and Efficient Dense Retrieval

no code implementations15 Aug 2024 Hanqi Zhang, Chong Chen, Lang Mei, Qi Liu, Jiaxin Mao

Experimental results show that (1) on the MS MARCO passage ranking dataset and BEIR, the Mamba Retriever achieves comparable or better effectiveness compared to Transformer-based retrieval models, and the effectiveness grows with the size of the Mamba model; (2) on the long-text LoCoV0 dataset, the Mamba Retriever can extend to longer text length than its pre-trained length after fine-tuning on retrieval task, and it has comparable or better effectiveness compared to other long-text retrieval models; (3) the Mamba Retriever has superior inference speed for long-text retrieval.

Information Retrieval Mamba +2

Towards Few-shot Self-explaining Graph Neural Networks

1 code implementation14 Aug 2024 Jingyu Peng, Qi Liu, Linan Yue, Zaixi Zhang, Kai Zhang, Yunhao Sha

Subsequently, the predictor mimics the decision-making process, which makes predictions based on the generated explanation.

KnowPC: Knowledge-Driven Programmatic Reinforcement Learning for Zero-shot Coordination

no code implementations8 Aug 2024 Yin Gu, Qi Liu, Zhi Li, Kai Zhang

Zero-shot coordination (ZSC) remains a major challenge in the cooperative AI field, which aims to learn an agent to cooperate with an unseen partner in training environments or even novel environments.

Deep Reinforcement Learning reinforcement-learning

Leveraging Entity Information for Cross-Modality Correlation Learning: The Entity-Guided Multimodal Summarization

1 code implementation6 Aug 2024 Yanghai Zhang, Ye Liu, Shiwei Wu, Kai Zhang, Xukai Liu, Qi Liu, Enhong Chen

The rapid increase in multimedia data has spurred advancements in Multimodal Summarization with Multimodal Output (MSMO), which aims to produce a multimodal summary that integrates both text and relevant images.

Knowledge Distillation Language Modeling +1

Learn while Unlearn: An Iterative Unlearning Framework for Generative Language Models

no code implementations25 Jul 2024 Haoyu Tang, Ye Liu, Xukai Liu, Kai Zhang, Yanghai Zhang, Qi Liu, Enhong Chen

Recent advancements in machine learning, particularly in Natural Language Processing (NLP), have led to the development of sophisticated models trained on extensive datasets, yet raising concerns about the potential leakage of sensitive information.

Contrastive Learning Machine Unlearning

Empowering Few-Shot Relation Extraction with The Integration of Traditional RE Methods and Large Language Models

1 code implementation12 Jul 2024 Ye Liu, Kai Zhang, Aoran Gan, Linan Yue, Feng Hu, Qi Liu, Enhong Chen

Specifically, DSARE innovatively injects the prior knowledge of LLMs into traditional RE models, and conversely enhances LLMs' task-specific aptitude for RE through relation extraction augmentation.

In-Context Learning Relation +1

Detect, Investigate, Judge and Determine: A Novel LLM-based Framework for Few-shot Fake News Detection

no code implementations12 Jul 2024 Ye Liu, Jiajun Zhu, Kai Zhang, Haoyu Tang, Yanghai Zhang, Xukai Liu, Qi Liu, Enhong Chen

To address these shortcomings, we propose a Dual-perspective Augmented Fake News Detection (DAFND) model, designed to enhance LLMs from both inside and outside perspectives.

Fake News Detection In-Context Learning

Dynamic neural network with memristive CIM and CAM for 2D and 3D vision

no code implementations12 Jul 2024 Yue Zhang, Woyu Zhang, Shaocong Wang, Ning Lin, Yifei Yu, Yangu He, Bo wang, Hao Jiang, Peng Lin, Xiaoxin Xu, Xiaojuan Qi, Zhongrui Wang, Xumeng Zhang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

In contrast, AI models are static, unable to associate inputs with past experiences, and run on digital computers with physically separated memory and processing.

A Survey of Models for Cognitive Diagnosis: New Developments and Future Directions

no code implementations7 Jul 2024 Fei Wang, Weibo Gao, Qi Liu, Jiatong Li, Guanhao Zhao, Zheng Zhang, Zhenya Huang, Mengxiao Zhu, Shijin Wang, Wei Tong, Enhong Chen

Cognitive diagnosis has been developed for decades as an effective measurement tool to evaluate human cognitive status such as ability level and knowledge mastery.

cognitive diagnosis parameter estimation

A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding

1 code implementation2 Jul 2024 Jinghui Lu, Haiyang Yu, Yanjie Wang, YongJie Ye, Jingqun Tang, Ziwei Yang, Binghong Wu, Qi Liu, Hao Feng, Han Wang, Hao liu, Can Huang

However, existing methods that integrate spatial layouts with text have limitations, such as producing overly long text sequences or failing to fully leverage the autoregressive traits of LLMs.

document understanding Key Information Extraction +6

Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models

1 code implementation21 Jun 2024 Qi Liu, Bo wang, Nan Wang, Jiaxin Mao

To address these issues, in this paper, we propose PE-Rank, leveraging the single passage embedding as a good context compression for efficient listwise passage reranking.

Learning-To-Rank Passage Ranking +2

Jailbreaking as a Reward Misspecification Problem

1 code implementation20 Jun 2024 Zhihui Xie, Jiahui Gao, Lei LI, Zhenguo Li, Qi Liu, Lingpeng Kong

In this paper, we propose a novel perspective that attributes this vulnerability to reward misspecification during the alignment process.

Red Teaming

TourRank: Utilizing Large Language Models for Documents Ranking with a Tournament-Inspired Strategy

1 code implementation17 Jun 2024 Yiqun Chen, Qi Liu, Yi Zhang, Weiwei Sun, Daiting Shi, Jiaxin Mao, Dawei Yin

However, several significant challenges still persist in LLMs for ranking: (1) LLMs are constrained by limited input length, precluding them from processing a large number of documents simultaneously; (2) The output document sequence is influenced by the input order of documents, resulting in inconsistent ranking outcomes; (3) Achieving a balance between cost and ranking performance is quite challenging.

InstructRL4Pix: Training Diffusion for Image Editing by Reinforcement Learning

no code implementations14 Jun 2024 Tiancheng Li, Jinxiu Liu, Huajun Chen, Qi Liu

Instruction-based image editing has made great progress in using natural human language to manipulate the visual content of images.

Object reinforcement-learning +1

Continuous-Time Digital Twin with Analogue Memristive Neural Ordinary Differential Equation Solver

1 code implementation12 Jun 2024 Hegan Chen, Jichang Yang, Jia Chen, Songqi Wang, Shaocong Wang, Dingchen Wang, Xinyu Tian, Yifei Yu, Xi Chen, Yinan Lin, Yangu He, Xiaoshan Wu, Xinyuan Zhang, Ning Lin, Meng Xu, Yi Li, Xumeng Zhang, Zhongrui Wang, Han Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

We experimentally validate our approach by developing a digital twin of the HP memristor, which accurately extrapolates its nonlinear dynamics, achieving a 4.2-fold projected speedup and a 41.4-fold projected decrease in energy consumption compared to state-of-the-art digital hardware, while maintaining an acceptable error margin.

QAGCF: Graph Collaborative Filtering for Q&A Recommendation

no code implementations7 Jun 2024 Changshuo Zhang, Teng Shi, Xiao Zhang, Yanping Zheng, Ruobing Xie, Qi Liu, Jun Xu, Ji-Rong Wen

Traditional recommendation methods treat the question-answer pair as a whole or only consider the answer as a single item, which overlooks the two challenges and cannot effectively model user interests.

Collaborative Filtering Contrastive Learning +1

EduNLP: Towards a Unified and Modularized Library for Educational Resources

1 code implementation3 Jun 2024 Zhenya Huang, Yuting Ning, Longhu Qin, Shiwei Tong, Shangzi Xue, Tong Xiao, Xin Lin, Jiayu Liu, Qi Liu, Enhong Chen, Shijing Wang

We also provide a configurable pipeline to unify the data usage and model usage in standard ways, where users can customize their own needs.

TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy

1 code implementation3 Jun 2024 Weichao Zhao, Hao Feng, Qi Liu, Jingqun Tang, Shu Wei, Binghong Wu, Lei Liao, YongJie Ye, Hao liu, Wengang Zhou, Houqiang Li, Can Huang

In this mechanism, all the involved diverse visual table understanding (VTU) tasks and multi-source visual embeddings are abstracted as concepts.

Language Modelling Question Answering +3

PertEval: Unveiling Real Knowledge Capacity of LLMs with Knowledge-Invariant Perturbations

1 code implementation30 May 2024 Jiatong Li, Renjun Hu, Kunzhe Huang, Yan Zhuang, Qi Liu, Mengxiao Zhu, Xing Shi, Wei Lin

Our toolkit further includes a suite of response consistency analyses that compare performance on raw vs. perturbed test sets to precisely assess LLMs' genuine knowledge capacity.

Memorization

Cognitive Evolutionary Learning to Select Feature Interactions for Recommender Systems

no code implementations29 May 2024 Runlong Yu, Qixiang Shao, Qi Liu, Huan Liu, Enhong Chen

We show that CELL can adaptively evolve into different models for different tasks and data, which enables practitioners to access off-the-shelf models.

Recommendation Systems

TerDiT: Ternary Diffusion Models with Transformers

1 code implementation23 May 2024 Xudong Lu, Aojun Zhou, Ziyi Lin, Qi Liu, Yuhui Xu, Renrui Zhang, Yafei Wen, Shuai Ren, Peng Gao, Junchi Yan, Hongsheng Li

Recent developments in large-scale pre-trained text-to-image diffusion models have significantly improved the generation of high-fidelity images, particularly with the emergence of diffusion models based on transformer architecture (DiTs).

Image Generation Quantization

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

1 code implementation20 May 2024 Jingqun Tang, Qi Liu, YongJie Ye, Jinghui Lu, Shu Wei, Chunhui Lin, Wanqing Li, Mohamad Fitri Faiz Bin Mahmood, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao liu, Xiang Bai, Can Huang

Text-Centric Visual Question Answering (TEC-VQA) in its proper format not only facilitates human-machine interaction in text-centric visual environments but also serves as a de facto gold proxy to evaluate AI models in the domain of text-centric scene understanding.

Benchmarking Question Answering +4

CELA: Cost-Efficient Language Model Alignment for CTR Prediction

1 code implementation17 May 2024 Xingmei Wang, Weiwen Liu, Xiaolong Chen, Qi Liu, Xu Huang, Yichao Wang, Xiangyang Li, Yasheng Wang, Zhenhua Dong, Defu Lian, Ruiming Tang

This model-agnostic framework can be equipped with plug-and-play textual features, with item-level alignment enhancing the utilization of external information while maintaining training and inference efficiency.

Click-Through Rate Prediction Collaborative Filtering +2

Evaluation of Retrieval-Augmented Generation: A Survey

1 code implementation13 May 2024 Hao Yu, Aoran Gan, Kai Zhang, Shiwei Tong, Qi Liu, Zhaofeng Liu

Retrieval-Augmented Generation (RAG) has recently gained traction in natural language processing.

Information Retrieval RAG +3

Aerial-NeRF: Adaptive Spatial Partitioning and Sampling for Large-Scale Aerial Rendering

no code implementations10 May 2024 Xiaohan Zhang, Yukui Qiu, Zhenyu Sun, Qi Liu

To that end, we propose Aerial-NeRF with three innovative modifications for jointly adapting NeRF to large-scale aerial rendering: (1) Designing an adaptive spatial partitioning and selection method based on drones' poses to adapt to different flight trajectories; (2) Using similarity of poses instead of an (expert) network to determine which region a new viewpoint belongs to, for rendering speedup; (3) Developing an adaptive sampling approach that covers entire buildings at different heights, for rendering performance improvement.

NeRF

Tree-based Ensemble Learning for Out-of-distribution Detection

no code implementations5 May 2024 Zhaiming Shen, Menglun Wang, Guang Cheng, Ming-Jun Lai, Lin Mu, Ruihao Huang, Qi Liu, Hao Zhu

In this paper, we propose TOOD detection, a simple yet effective tree-based out-of-distribution (TOOD) detection mechanism to determine whether a set of unseen samples has a distribution similar to that of the training samples.

Ensemble Learning Out-of-Distribution Detection

TextSquare: Scaling up Text-Centric Visual Instruction Tuning

no code implementations19 Apr 2024 Jingqun Tang, Chunhui Lin, Zhen Zhao, Shu Wei, Binghong Wu, Qi Liu, Hao Feng, Yang Li, Siqi Wang, Lei Liao, Wei Shi, Yuliang Liu, Hao liu, Yuan Xie, Xiang Bai, Can Huang

Text-centric visual question answering (VQA) has made great strides with the development of Multimodal Large Language Models (MLLMs), yet open-source models still fall short of leading models like GPT4V and Gemini, partly due to a lack of extensive, high-quality instruction tuning data.

Hallucination Hallucination Evaluation +2

AG-NeRF: Attention-guided Neural Radiance Fields for Multi-height Large-scale Outdoor Scene Rendering

1 code implementation18 Apr 2024 Jingfeng Guo, Xiaohan Zhang, Baozhu Zhao, Qi Liu

Existing neural radiance fields (NeRF)-based novel view synthesis methods for large-scale outdoor scenes are mainly built on a single altitude.

NeRF Novel View Synthesis

Efficient and accurate neural field reconstruction using resistive memory

no code implementations15 Apr 2024 Yifei Yu, Shaocong Wang, Woyu Zhang, Xinyuan Zhang, Xiuzhe Wu, Yangu He, Jichang Yang, Yue Zhang, Ning Lin, Bo wang, Xi Chen, Songqi Wang, Xumeng Zhang, Xiaojuan Qi, Zhongrui Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

The GE harnesses the intrinsic stochasticity of resistive memory for efficient input encoding, while the PE achieves precise weight mapping through a Hardware-Aware Quantization (HAQ) circuit.

Novel View Synthesis Quantization

Event Grounded Criminal Court View Generation with Cooperative (Large) Language Models

1 code implementation10 Apr 2024 Linan Yue, Qi Liu, Lili Zhao, Li Wang, Weibo Gao, Yanqing An

Then, we incorporate the extracted events into court view generation by merging case facts and events.

Event Extraction

Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model

1 code implementation8 Apr 2024 Jichang Yang, Hegan Chen, Jia Chen, Songqi Wang, Shaocong Wang, Yifei Yu, Xi Chen, Bo wang, Xinyuan Zhang, Binbin Cui, Ning Lin, Meng Xu, Yi Li, Xiaoxin Xu, Xiaojuan Qi, Zhongrui Wang, Xumeng Zhang, Dashan Shang, Han Wang, Qi Liu, Kwang-Ting Cheng, Ming Liu

Demonstrating equivalent generative quality to the software baseline, our system achieved remarkable enhancements in generative speed for both unconditional and conditional generation tasks, by factors of 64.8 and 156.5, respectively.

Edge-computing

Survey of Computerized Adaptive Testing: A Machine Learning Perspective

1 code implementation31 Mar 2024 Qi Liu, Yan Zhuang, Haoyang Bi, Zhenya Huang, Weizhe Huang, Jiatong Li, Junhao Yu, Zirui Liu, Zirui Hu, Yuting Hong, Zachary A. Pardos, Haiping Ma, Mengxiao Zhu, Shijin Wang, Enhong Chen

Computerized Adaptive Testing (CAT) provides an efficient and tailored method for assessing the proficiency of examinees, by dynamically adjusting test questions based on their performance.

cognitive diagnosis Question Selection +2

An Analysis on Matching Mechanisms and Token Pruning for Late-interaction Models

no code implementations20 Mar 2024 Qi Liu, Gang Guo, Jiaxin Mao, Zhicheng Dou, Ji-Rong Wen, Hao Jiang, Xinyu Zhang, Zhao Cao

Based on these findings, we then propose several simple document pruning methods to reduce the storage overhead and compare the effectiveness of different pruning methods on different late-interaction models.

Retrieval

Advancing Time Series Classification with Multimodal Language Modeling

4 code implementations19 Mar 2024 Mingyue Cheng, Yiheng Chen, Qi Liu, Zhiding Liu, Yucong Luo

In this work, we propose InstructTime, a novel attempt to reshape time series classification as a learning-to-generate paradigm.

Classification Language Modeling +3

Cross-Domain Pre-training with Language Models for Transferable Time Series Representations

4 code implementations19 Mar 2024 Mingyue Cheng, Xiaoyu Tao, Qi Liu, Hao Zhang, Yiheng Chen, Defu Lian

To address this challenge, we propose CrossTimeNet, a novel cross-domain SSL learning framework to learn transferable knowledge from various domains to largely benefit the target downstream task.

Language Modelling Time Series +1

Towards Personalized Evaluation of Large Language Models with An Anonymous Crowd-Sourcing Platform

1 code implementation13 Mar 2024 Mingyue Cheng, Hao Zhang, Jiqian Yang, Qi Liu, Li Li, Xin Huang, Liwei Song, Zhi Li, Zhenya Huang, Enhong Chen

Through this gateway, users have the opportunity to submit their questions, testing the models on a personalized and potentially broader range of capabilities.

Language Model Evaluation Language Modelling +1

Clinically Feasible Diffusion Reconstruction for Highly-Accelerated Cardiac Cine MRI

no code implementations13 Mar 2024 Shihan Qiu, Shaoyan Pan, Yikang Liu, Lin Zhao, Jian Xu, Qi Liu, Terrence Chen, Eric Z. Chen, Xiao Chen, Shanhui Sun

The currently limited quality of accelerated cardiac cine reconstruction may potentially be improved by the emerging diffusion models, but the clinically unacceptable long processing time poses a challenge.

Towards Faithful Explanations: Boosting Rationalization with Shortcuts Discovery

1 code implementation12 Mar 2024 Linan Yue, Qi Liu, Yichao Du, Li Wang, Weibo Gao, Yanqing An

Since existing methods still suffer from adopting shortcuts in the data to compose rationales and from the limited availability of large-scale human-annotated rationales, in this paper we propose a Shortcuts-fused Selective Rationalization (SSR) method, which boosts rationalization by discovering and exploiting potential shortcuts.

Empowering Sequential Recommendation from Collaborative Signals and Semantic Relatedness

1 code implementation12 Mar 2024 Mingyue Cheng, Hao Zhang, Qi Liu, Fajie Yuan, Zhi Li, Zhenya Huang, Enhong Chen, Jun Zhou, Longfei Li

It is also significant to model the semantic relatedness reflected in content features, e.g., images and text.

Sequential Recommendation

A Dataset for the Validation of Truth Inference Algorithms Suitable for Online Deployment

1 code implementation10 Mar 2024 Fei Wang, Haoyu Liu, Haoyang Bi, Xiangzhuang Shen, Renyu Zhu, Runze Wu, Minmin Lin, Tangjie Lv, Changjie Fan, Qi Liu, Zhenya Huang, Enhong Chen

In this paper, we introduce a substantial crowdsourcing annotation dataset collected from a real-world crowdsourcing platform.

Cooperative Classification and Rationalization for Graph Generalization

1 code implementation10 Mar 2024 Linan Yue, Qi Liu, Ye Liu, Weibo Gao, Fangzhou Yao, Wenfeng Li

To address these challenges, in this paper, we propose a Cooperative Classification and Rationalization (C2R) method, consisting of the classification and the rationalization module.

Graph Classification Knowledge Distillation

Unified Uncertainty Estimation for Cognitive Diagnosis Models

no code implementations9 Mar 2024 Fei Wang, Qi Liu, Enhong Chen, Chuanren Liu, Zhenya Huang, Jinze Wu, Shijin Wang

Specifically, based on the idea of estimating the posterior distributions of cognitive diagnosis model parameters, we first provide a unified objective function for mini-batch based optimization that can be more efficiently applied to a wide range of models and large datasets.

cognitive diagnosis Diagnostic

ImgTrojan: Jailbreaking Vision-Language Models with ONE Image

1 code implementation5 Mar 2024 Xijia Tao, Shuai Zhong, Lei LI, Qi Liu, Lingpeng Kong

In this paper, we propose a novel jailbreaking attack against VLMs, aiming to bypass their safety barrier when a user inputs harmful instructions.

FreeA: Human-object Interaction Detection using Free Annotation Labels

no code implementations4 Mar 2024 Qi Liu, Yuxiao Wang, Xinyu Jiang, Wolin Liang, Zhenao Wei, Yu Lei, Nan Zhuang, Weiying Xue

Recent human-object interaction (HOI) detection methods depend on extensively annotated image datasets, which require a significant amount of manpower.

Human-Object Interaction Detection Object

PointCore: Efficient Unsupervised Point Cloud Anomaly Detector Using Local-Global Features

2 code implementations4 Mar 2024 Baozhu Zhao, Qiwei Xiong, Xiaohan Zhang, Jingfeng Guo, Qi Liu, Xiaofen Xing, Xiangmin Xu

Three-dimensional point cloud anomaly detection, which aims to detect anomalous data points from a training set, serves as the foundation for a variety of applications, including industrial inspection and autonomous driving.

Anomaly Detection Autonomous Driving

Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models

1 code implementation22 Feb 2024 Xudong Lu, Qi Liu, Yuhui Xu, Aojun Zhou, Siyuan Huang, Bo Zhang, Junchi Yan, Hongsheng Li

Specifically, we propose, for the first time to the best of our knowledge, post-training approaches for task-agnostic and task-specific expert pruning and skipping of MoE LLMs, tailored to improve deployment efficiency while maintaining model performance across a wide range of tasks.

All Mixture-of-Experts

SISSA: Real-time Monitoring of Hardware Functional Safety and Cybersecurity with In-vehicle SOME/IP Ethernet Traffic

1 code implementation21 Feb 2024 Qi Liu, Xingyu Li, Ke Sun, Yufeng Li, Yanchen Liu

Scalable service-Oriented Middleware over IP (SOME/IP) is an Ethernet communication standard protocol in the Automotive Open System Architecture (AUTOSAR), promoting ECU-to-ECU communication over the IP stack.

ViTree: Single-path Neural Tree for Step-wise Interpretable Fine-grained Visual Categorization

no code implementations30 Jan 2024 Danning Lao, Qi Liu, Jiazi Bu, Junchi Yan, Wei Shen

As computer vision continues to advance and finds widespread applications across various domains, the need for interpretability in deep learning models becomes paramount.

Decision Making Fine-Grained Visual Categorization

FedGT: Federated Node Classification with Scalable Graph Transformer

no code implementations26 Jan 2024 Zaixi Zhang, Qingyong Hu, Yang Yu, Weibo Gao, Qi Liu

However, existing methods have the following limitations: (1) The links between local subgraphs are missing in subgraph federated learning.

Classification Federated Learning +2

TED-Net: Dispersal Attention for Perceiving Interaction Region in Indirectly-Contact HOI Detection

1 code implementation IEEE Transactions on Circuits and Systems for Video Technology 2024 Yuxiao Wang, Qi Liu, Yu Lei

Human-Object Interaction (HOI) detection is a fertile research ground that merits further investigation in computer vision, and plays an important role in image high-level semantic information understanding.

Human-Object Interaction Detection object-detection +1

Red Teaming Visual Language Models

no code implementations23 Jan 2024 Mukai Li, Lei LI, Yuwei Yin, Masood Ahmed, Zhenguang Liu, Qi Liu

Additionally, we simply apply red teaming alignment to LLaVA-v1.5 with Supervised Fine-tuning (SFT) using RTVLM, and this bolsters the models' performance by 10% on the RTVLM test set and 13% on MM-Hal, without a noticeable decline on MM-Bench, surpassing other LLaVA-based models with regular alignment data.

Fairness Red Teaming

A locally statistical active contour model for SAR image segmentation can be solved by denoising algorithms

no code implementations10 Jan 2024 Guangming Liu, Quanying Sun, Jing Liang, Qi Liu

In this paper, we propose a novel locally statistical variational active contour model based on the I-divergence-TV denoising model, which hybridizes the geodesic active contour (GAC) model with the active contours without edges (ACWE) model, and can be used to segment images corrupted by multiplicative gamma noise.

Denoising Image Segmentation +1

LMaaS: Exploring Pricing Strategy of Large Model as a Service for Communication

no code implementations5 Jan 2024 Panlong Wu, Qi Liu, Yanjie Dong, Fangxin Wang

In the first step, we optimize the seller's pricing decision and propose an Iterative Model Pricing (IMP) algorithm that optimizes the prices of large models iteratively by reasoning about customers' future rental decisions, which is able to achieve a near-optimal pricing solution.

Intelligent Communication

Improving Depth Completion via Depth Feature Upsampling

no code implementations CVPR 2024 YuFei Wang, Ge Zhang, Shaoqian Wang, Bo Li, Qi Liu, Le Hui, Yuchao Dai

In this paper we visualize the internal feature maps to analyze how the network densifies the input sparse depth.

Decoder Depth Completion +1

Unlocking the Potential of Large Language Models for Explainable Recommendations

1 code implementation25 Dec 2023 Yucong Luo, Mingyue Cheng, Hao Zhang, Junyu Lu, Qi Liu, Enhong Chen

In this study, we propose LLMXRec, a simple yet effective two-stage explainable recommendation framework aimed at further boosting the explanation quality by employing LLMs.

Decision Making Explainable Recommendation +2

Active contours driven by local and global intensity fitting energy with application to SAR image segmentation and its fast solvers

no code implementations19 Dec 2023 Guangming Liu, Qi Liu, Jing Liang, Quanying Sun

In this paper, we propose a novel variational active contour model based on the Aubert-Aujol (AA) denoising model, which hybridizes the geodesic active contour (GAC) model with the active contours without edges (ACWE) model and can be used to segment images corrupted by multiplicative gamma noise.

Denoising Image Segmentation +2

Random resistive memory-based deep extreme point learning machine for unified visual processing

no code implementations14 Dec 2023 Shaocong Wang, Yizhao Gao, Yi Li, Woyu Zhang, Yifei Yu, Bo wang, Ning Lin, Hegan Chen, Yue Zhang, Yang Jiang, Dingchen Wang, Jia Chen, Peng Dai, Hao Jiang, Peng Lin, Xumeng Zhang, Xiaojuan Qi, Xiaoxin Xu, Hayden So, Zhongrui Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

Our random resistive memory-based deep extreme point learning machine may pave the way for energy-efficient and training-friendly edge AI across various data modalities and tasks.

AT4CTR: Auxiliary Match Tasks for Enhancing Click-Through Rate Prediction

no code implementations9 Dec 2023 Qi Liu, Xuyang Hou, Defu Lian, Zhe Wang, Haoran Jin, Jia Cheng, Jun Lei

Most existing methods focus on the network architecture design of the CTR model for better accuracy and suffer from the data sparsity problem.

Click-Through Rate Prediction Collaborative Filtering +1

A global optimization SAR image segmentation model can be easily transformed to a general ROF denoising model

no code implementations8 Dec 2023 Guangming Liu, Qi Liu, Jing Liang

In the second model, we use a different splitting approach from that of the first model to transform the global optimization model into a differentiable term and a general ROF model term, which can be solved by the same technique as the first model.

Denoising global-optimization +3

SigFormer: Sparse Signal-Guided Transformer for Multi-Modal Human Action Segmentation

1 code implementation29 Nov 2023 Qi Liu, Xinchen Liu, Kun Liu, Xiaoyan Gu, Wu Liu

Nowadays, the majority of approaches concentrate on the fusion of dense signals (i.e., RGB, optical flow, and depth maps).

Action Segmentation Optical Flow Estimation +1

DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding

no code implementations20 Nov 2023 Hao Feng, Qi Liu, Hao liu, Jingqun Tang, Wengang Zhou, Houqiang Li, Can Huang

This work presents DocPedia, a novel large multimodal model (LMM) for versatile OCR-free document understanding, capable of parsing images up to 2,560$\times$2,560 resolution.

document understanding Language Modeling +3

Deep Group Interest Modeling of Full Lifelong User Behaviors for CTR Prediction

no code implementations15 Nov 2023 Qi Liu, Xuyang Hou, Haoran Jin, Xiaolong Chen, Jin Chen, Defu Lian, Zhe Wang, Jia Cheng, Jun Lei

The insights from this subset reveal the user's decision-making process related to the candidate item, improving prediction accuracy.

Click-Through Rate Prediction

Pruning random resistive memory for optimizing analogue AI

no code implementations13 Nov 2023 Yi Li, Songqi Wang, Yaping Zhao, Shaocong Wang, Woyu Zhang, Yangu He, Ning Lin, Binbin Cui, Xi Chen, Shiming Zhang, Hao Jiang, Peng Lin, Xumeng Zhang, Xiaojuan Qi, Zhongrui Wang, Xiaoxin Xu, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning to optimize the topology of a randomly weighted analogue resistive memory neural network.

Audio Classification Image Segmentation +1

Sparse Attention-Based Neural Networks for Code Classification

no code implementations11 Nov 2023 Ziyang Xiang, Zaixi Zhang, Qi Liu

We introduce an approach named the Sparse Attention-based neural network for Code Classification (SACC) in this paper.

Classification Code Classification

AutoSAM: Towards Automatic Sampling of User Behaviors for Sequential Recommender Systems

1 code implementation1 Nov 2023 Hao Zhang, Mingyue Cheng, Qi Liu, Zhiding Liu, Junzhe Jiang, Enhong Chen

Sequential recommender systems (SRS) have gained widespread popularity in recommendation due to their ability to effectively capture dynamic user preferences.

Future prediction Sequential Recommendation

SoulChat: Improving LLMs' Empathy, Listening, and Comfort Abilities through Fine-tuning with Multi-turn Empathy Conversations

1 code implementation1 Nov 2023 YiRong Chen, Xiaofen Xing, Jingkai Lin, huimin zheng, Zhenyu Wang, Qi Liu, Xiangmin Xu

Large language models (LLMs) have been widely applied in various fields due to their excellent capability for memorizing knowledge and chain of thought (CoT).

BianQue: Balancing the Questioning and Suggestion Ability of Health LLMs with Multi-turn Health Conversations Polished by ChatGPT

1 code implementation24 Oct 2023 YiRong Chen, Zhenyu Wang, Xiaofen Xing, huimin zheng, Zhipei Xu, Kai Fang, Junhong Wang, Sihang Li, Jieling Wu, Qi Liu, Xiangmin Xu

Large language models (LLMs) have performed well in providing general and extensive health suggestions in single-turn conversations, exemplified by systems such as ChatGPT, ChatGLM, ChatDoctor, and DoctorGLM.

AdaptSSR: Pre-training User Model with Augmentation-Adaptive Self-Supervised Ranking

1 code implementation NeurIPS 2023 Yang Yu, Qi Liu, Kai Zhang, Yuren Zhang, Chao Song, Min Hou, Yuqing Yuan, Zhihao Ye, Zaixi Zhang, Sanshi Lei Yu

Specifically, we adopt a multiple pairwise ranking loss which trains the user model to capture the similarity orders between the implicitly augmented view, the explicitly augmented view, and views from other users.

Contrastive Learning Data Augmentation

LRRU: Long-short Range Recurrent Updating Networks for Depth Completion

no code implementations ICCV 2023 YuFei Wang, Bo Li, Ge Zhang, Qi Liu, Tao Gao, Yuchao Dai

Existing deep learning-based depth completion methods generally employ massive stacked layers to predict the dense depth map from sparse input data.

Depth Completion

Full-Atom Protein Pocket Design via Iterative Refinement

1 code implementation NeurIPS 2023 Zaixi Zhang, Zepu Lu, Zhongkai Hao, Marinka Zitnik, Qi Liu

In the initial stage, the residue types and backbone coordinates are refined using a hierarchical context encoder, complemented by two structure refinement modules that capture both inter-residue and pocket-ligand interactions.

Synthetic Data Generation in Low-Resource Settings via Fine-Tuning of Large Language Models

1 code implementation2 Oct 2023 Jean Kaddour, Qi Liu

The in-context learning ability of large language models (LLMs) enables them to generalize to novel downstream tasks with relatively few labeled examples.

Data Augmentation In-Context Learning +4

Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition

1 code implementation21 Sep 2023 Shuai Wang, Qibing Bai, Qi Liu, Jianwei Yu, Zhengyang Chen, Bing Han, Yanmin Qian, Haizhou Li

Current speaker recognition systems primarily rely on supervised approaches, constrained by the scale of labeled datasets.

Speaker Recognition

Reformulating Sequential Recommendation: Learning Dynamic User Interest with Content-enriched Language Modeling

1 code implementation19 Sep 2023 Junzhe Jiang, Shang Qu, Mingyue Cheng, Qi Liu, Zhiding Liu, Hao Zhang, Rujiao Zhang, Kai Zhang, Rui Li, Jiatong Li, Min Gao

Recommender systems are indispensable in the realm of online applications, and sequential recommendation has enjoyed considerable prevalence due to its capacity to encapsulate the dynamic shifts in user interests.

Language Modeling Language Modelling +2

FedJudge: Federated Legal Large Language Model

2 code implementations15 Sep 2023 Linan Yue, Qi Liu, Yichao Du, Weibo Gao, Ye Liu, Fangzhou Yao

To this end, in this paper, we propose the first Federated Legal Large Language Model (FedJudge) framework, which fine-tunes Legal LLMs efficiently and effectively.

Continual Learning Federated Learning +5

Beyond Static Datasets: A Deep Interaction Approach to LLM Evaluation

no code implementations8 Sep 2023 Jiatong Li, Rui Li, Qi Liu

Existing LLM evaluation methods are mainly supervised signal-based: they depend on static datasets and cannot evaluate the ability of LLMs in dynamic real-world scenarios where deep interaction widely exists.

Code Generation Machine Translation

Decomposed Guided Dynamic Filters for Efficient RGB-Guided Depth Completion

no code implementations5 Sep 2023 YuFei Wang, Yuxin Mao, Qi Liu, Yuchao Dai

The decomposed filters not only maintain the favorable properties of guided dynamic filters as being content-dependent and spatially-variant, but also reduce model parameters and hardware costs, as the learned adaptors are decoupled from the number of feature channels.

Depth Completion object-detection +2
