Search Results for author: Wei zhang

Found 732 papers, 218 papers with code

HARD-Net: Hardness-AwaRe Discrimination Network for 3D Early Activity Prediction

no code implementations ECCV 2020 Tianjiao Li, Jun Liu, Wei zhang, Ling-Yu Duan

In this paper, we propose a novel Hardness-AwaRe Discrimination Network (HARD-Net) to specifically investigate the relationships between similar activity pairs that are hard to discriminate.

Activity Prediction Skeleton Based Action Recognition

Unsupervised Multi-View CNN for Salient View Selection of 3D Objects and Scenes

1 code implementation ECCV 2020 Ran Song, Wei zhang, Yitian Zhao, Yonghuai Liu

We present an unsupervised 3D deep learning framework based on a ubiquitously true proposition, which we name view-object consistency, stating that a 3D object and its projected 2D views always belong to the same object class.

Object

Towards Generalizeable Semantic Product Search by Text Similarity Pre-training on Search Click Logs

no code implementations ECNLP (ACL) 2022 Zheng Liu, Wei zhang, Yan Chen, Weiyi Sun, Tianchuan Du, Benjamin Schroeder

Recently, semantic search has been successfully applied to E-commerce product search, and the learned semantic space for query and product encoding is expected to generalize well to unseen queries or products.

text similarity

Flow2Code: Evaluating Large Language Models for Flowchart-based Code Generation Capability

no code implementations 2 Jun 2025 Mengliang He, Jiayi Zeng, Yankai Jiang, Wei zhang, Zeming Liu, Xiaoming Shi, Aimin Zhou

While large language models (LLMs) show promise in code generation, existing benchmarks neglect flowchart-based code generation.

Code Generation

Distributed perception of social power in influence networks with stubborn individuals

no code implementations 1 Jun 2025 Ye Tian, Yu Kawano, Wei zhang, Kenji Kashima

We propose two dynamical models for distributed perception of social power based on the Friedkin-Johnsen (FJ) opinion dynamics: one without and one with reflected appraisals.

Biological Pathway Guided Gene Selection Through Collaborative Reinforcement Learning

1 code implementation 30 May 2025 Ehtesamul Azim, Dongjie Wang, Tae Hyun Hwang, Yanjie Fu, Wei zhang

Gene selection in high-dimensional genomic data is essential for understanding disease mechanisms and improving therapeutic outcomes.

Dimensionality Reduction feature selection +4

LPCM: Learning-based Predictive Coding for LiDAR Point Cloud Compression

no code implementations 26 May 2025 Chang Sun, Hui Yuan, Shiqi Jiang, Da Ai, Wei zhang, Raouf Hamzaoui

The predictive geometry coding method in the geometry-based point cloud compression (G-PCC) standard uses the inherent angular resolution to predict the azimuth angles.

Quantization

Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning

1 code implementation 22 May 2025 Xinghao Chen, Anhao Zhao, Heming Xia, Xuan Lu, Hanlin Wang, Yanjun Chen, Wei zhang, Jian Wang, Wenjie Li, Xiaoyu Shen

By decoupling reasoning from language, latent reasoning promises richer cognitive representations and more flexible, faster inference.

LIFEBench: Evaluating Length Instruction Following in Large Language Models

1 code implementation 22 May 2025 Wei zhang, Zhenhong Zhou, Junfeng Fang, Rongwu Xu, Kun Wang, Yuanhe Zhang, Rui Wang, Ge Zhang, Xinfeng Li, Li Sun, Lingjuan Lyu, Yang Liu, Sen Su

To this end, we introduce Length Instruction Following Evaluation Benchmark (LIFEBench) to comprehensively evaluate LLMs' ability to follow length instructions across diverse tasks and a wide range of specified lengths.

Instruction Following Text Generation

SCOPE: Compress Mathematical Reasoning Steps for Efficient Automated Process Annotation

1 code implementation 20 May 2025 Huimin Xu, Xin Mao, Feng-Lin Li, Xiaobao Wu, Wang Chen, Wei zhang, Anh Tuan Luu

Process Reward Models (PRMs) have demonstrated promising results in mathematical reasoning, but existing process annotation approaches, whether through human annotations or Monte Carlo simulations, remain computationally expensive.

Mathematical Reasoning

Improving the Euclidean Diffusion Generation of Manifold Data by Mitigating Score Function Singularity

no code implementations 15 May 2025 Zichen Liu, Wei zhang, Tiejun Li

Euclidean diffusion models have achieved remarkable success in generative modeling across diverse domains, and recent advances have extended them to the manifold case.

SpNeRF: Memory Efficient Sparse Volumetric Neural Rendering Accelerator for Edge Devices

no code implementations 13 May 2025 Yipu Zhang, Jiawei Liang, Jian Peng, Jiang Xu, Wei zhang

The preprocessing step employs hash mapping to support irregular data access while maintaining a minimal memory size.

Neural Rendering

Efficient and Scalable Neural Symbolic Search for Knowledge Graph Complex Query Answering

no code implementations 13 May 2025 Weizhi Fei, ZiHao Wang, Hang Yin, Shukai Zhao, Wei zhang, Yangqiu Song

While neuro-symbolic search methods that utilize neural link predictions achieve superior accuracy, they encounter significant complexity bottlenecks: (i) data complexity typically scales quadratically with the number of entities in the knowledge graph, and (ii) query complexity becomes NP-hard for cyclic queries.

Complex Query Answering

A Practical Introduction to Deep Reinforcement Learning

no code implementations 13 May 2025 Yinghan Sun, Hongxi Wang, Hua Chen, Wei zhang

Deep reinforcement learning (DRL) has emerged as a powerful framework for solving sequential decision-making problems, achieving remarkable success in a wide range of applications, including game AI, autonomous driving, biomedicine, and large language models.

Autonomous Driving Decision Making +4

MMiC: Mitigating Modality Incompleteness in Clustered Federated Learning

no code implementations 11 May 2025 Lishan Yang, Wei zhang, Quan Z. Sheng, Lina Yao, Weitong Chen, Ali Shakeri

Multimodal Federated Learning (MFL) is a distributed approach that enhances the efficiency and quality of multimodal learning, ensuring collaborative work and privacy protection.

Federated Learning Portfolio Optimization

Riemannian Denoising Diffusion Probabilistic Models

no code implementations 7 May 2025 Zichen Liu, Wei zhang, Christof Schütte, Tiejun Li

We propose Riemannian Denoising Diffusion Probabilistic Models (RDDPMs) for learning distributions on submanifolds of Euclidean space that are level sets of functions, including most of the manifolds relevant to applications.

Denoising

Tempo: Application-aware LLM Serving with Mixed SLO Requirements

no code implementations 24 Apr 2025 Wei zhang, Zhiyu Wu, Yi Mu, Banruo Liu, Myungjin Lee, Fan Lai

The integration of Large Language Models (LLMs) into diverse applications, ranging from interactive chatbots and cloud AIOps to intelligent agents, has introduced a wide spectrum of Service Level Objectives (SLOs) for responsiveness.

Graph Matching Scheduling

Joint Topology and Power Optimization for Multi-UAV Collaborative Secure Communication

no code implementations 23 Apr 2025 Bin Qiu, Wenchi Cheng, Hongxiang He, Wei zhang

Subsequently, the non-convex constraints are converted into convex terms, and a double-loop search algorithm is proposed to solve the transmit power minimization problem.

The PHD/CPHD filter for Multiple Extended Target Tracking with Trajectory Set Theory and Explicit Shape Estimation

no code implementations 21 Apr 2025 Yuanhao Cheng, Yunhe Cao, Tat-Soon Yeo, Fu Jie, Wei zhang

These two methods are deduced from the famous Probability Hypothesis Density (PHD) filter and the Cardinality-PHD (CPHD) filter, respectively.

Knowledge Distillation and Dataset Distillation of Large Language Models: Emerging Trends, Challenges, and Future Directions

no code implementations 20 Apr 2025 Luyang Fang, Xiaowei Yu, Jiazhang Cai, Yongkai Chen, Shushan Wu, Zhengliang Liu, Zhenyuan Yang, Haoran Lu, Xilin Gong, Yufang Liu, Terry Ma, Wei Ruan, Ali Abbasi, Jing Zhang, Tao Wang, Ehsan Latif, Wei Liu, Wei zhang, Soheil Kolouri, Xiaoming Zhai, Dajiang Zhu, Wenxuan Zhong, Tianming Liu, Ping Ma

Despite substantial progress, open challenges remain in preserving emergent reasoning and linguistic diversity, enabling efficient adaptation to continually evolving teacher models and datasets, and establishing comprehensive evaluation protocols.

Dataset Distillation Diversity +2

SlimPipe: Memory-Thrifty and Efficient Pipeline Parallelism for Long-Context LLM Training

no code implementations 20 Apr 2025 Zhouyang Li, Yuliang Liu, Wei zhang, TaiLing Yuan, Bin Chen, Chengru Song, Di Zhang

For example, on the Llama 70B model, compared to state-of-the-art methods, SlimPipe significantly boosts the Model FLOPs Utilization (MFU) to up to $1.57\times$ for a context length of 512K.

EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery

no code implementations 17 Apr 2025 Wei zhang, Miaoxin Cai, Yaqian Ning, Tong Zhang, Yin Zhuang, He Chen, Jun Li, Xuerui Mao

Recent advances in the visual-language area have developed natural multi-modal large language models (MLLMs) for spatial reasoning through visual prompting.

Large Language Model Multi-Task Learning +2

Dynamic Compressing Prompts for Efficient Inference of Large Language Models

no code implementations 15 Apr 2025 Jinwu Hu, Wei zhang, Yufeng Wang, Yu Hu, Bin Xiao, Mingkui Tan, Qing Du

We model prompt compression as a Markov Decision Process (MDP), enabling the DCP-Agent to sequentially remove redundant tokens by adapting to dynamic contexts and retaining crucial content.

Token Reduction

CliniChat: A Multi-Source Knowledge-Driven Framework for Clinical Interview Dialogue Reconstruction and Evaluation

no code implementations 14 Apr 2025 Jing Chen, Zhihua Wei, Wei zhang, Yingying Hu, Qiong Zhang

So we propose CliniChat, a framework that integrates multi-source knowledge to enable LLMs to simulate real-world clinical interviews.

MedQA

BioChemInsight: An Open-Source Toolkit for Automated Identification and Recognition of Optical Chemical Structures and Activity Data in Scientific Publications

1 code implementation 12 Apr 2025 Zhe Wang, Fangtian Fu, Wei zhang, Lige Yan, Yan Meng, Jianping Wu, Hui Wu, Gang Xu, Si Chen

Automated extraction of chemical structures and their bioactivity data is crucial for accelerating drug discovery and enabling data-driven pharmaceutical research.

Drug Design Drug Discovery

GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill

no code implementations CVPR 2025 Jieming Cui, Tengyu Liu, Ziyu Meng, Jiale Yu, Ran Song, Wei zhang, Yixin Zhu, Siyuan Huang

Extensive experiments across diverse embodiments and learning paradigms demonstrate GROVE's effectiveness, achieving 22.2% higher motion naturalness and 25.7% better task completion scores while training 8.4x faster than previous methods.

Dexterous Manipulation through Imitation Learning: A Survey

no code implementations 4 Apr 2025 Shan An, Ziyu Meng, Chao Tang, Yuning Zhou, Tengyu Liu, Fangqiang Ding, Shufang Zhang, Yao Mu, Ran Song, Wei zhang, Zeng-Guang Hou, Hong Zhang

Dexterous manipulation, which refers to the ability of a robotic hand or multi-fingered end-effector to skillfully control, reorient, and manipulate objects through precise, coordinated finger movements and adaptive force modulation, enables complex interactions similar to human hand dexterity.

Imitation Learning Reinforcement Learning (RL) +1

Gaussian Process Tilted Nonparametric Density Estimation using Fisher Divergence Score Matching

no code implementations 4 Apr 2025 John Paisley, Wei zhang, Brian Barr

We present three Fisher divergence (FD) minimization algorithms for learning Gaussian process (GP) based score models for lower dimensional density estimation problems.

Density Estimation Form +1

Reconfigurable Codebook-Based Beamforming for RDARS-Aided mmWave MU-MIMO Systems

no code implementations 2 Apr 2025 Chengwang Ji, Qing Xue, Haiquan Lu, Jintao Wang, Qiaoyan Peng, Shaodan Ma, Wei zhang

In RDARS-aided mmWave systems, the active and passive beamforming design and working mode configuration for reconfigurable elements are crucial for system performance.

ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement

no code implementations 2 Apr 2025 Runhui Huang, Chunwei Wang, Junwei Yang, Guansong Lu, Yunlong Yuan, Jianhua Han, Lu Hou, Wei zhang, Lanqing Hong, Hengshuang Zhao, Hang Xu

We present ILLUME+ that leverages dual visual tokenization and a diffusion decoder to improve both deep semantic understanding and high-fidelity image generation.

Decoder Image Generation +1

Hierarchical Attention Networks for Lossless Point Cloud Attribute Compression

no code implementations 1 Apr 2025 Yueru Chen, Wei zhang, Dingquan Li, Jing Wang, Ge Li

In this paper, we propose a deep hierarchical attention context model for lossless attribute compression of point clouds, leveraging a multi-resolution spatial structure and residual learning.

Attribute

Mapping Geopolitical Bias in 11 Large Language Models: A Bilingual, Dual-Framing Analysis of U.S.-China Tensions

no code implementations 31 Mar 2025 William Guey, Pierrick Bougault, Vitor D. de Moura, Wei zhang, Jose O. Gomes

This study systematically analyzes geopolitical bias across 11 prominent Large Language Models (LLMs) by examining their responses to seven critical topics in U.S.-China relations.

PAPI-Reg: Patch-to-Pixel Solution for Efficient Cross-Modal Registration between LiDAR Point Cloud and Camera Image

no code implementations 19 Mar 2025 Yuanchao Yue, Zhengxin Li, Wei zhang, Hui Yuan

To address this issue, we propose a framework that projects point clouds into several 2D representations for matching with camera images, which not only leverages the geometric characteristics of LiDAR point clouds more effectively but also bridges the domain gap between the point cloud and image.

StyleLoco: Generative Adversarial Distillation for Natural Humanoid Robot Locomotion

no code implementations 19 Mar 2025 Le Ma, Ziyu Meng, Tengyu Liu, Yuhan Li, Ran Song, Wei zhang, Siyuan Huang

Existing methods encounter a fundamental dilemma in learning humanoid locomotion: reinforcement learning with handcrafted rewards can achieve agile locomotion but produces unnatural gaits, while Generative Adversarial Imitation Learning (GAIL) with motion capture data yields natural movements but suffers from unstable training processes and restricted agility.

Imitation Learning reinforcement-learning +1

Fuzzy Rule-based Differentiable Representation Learning

no code implementations 16 Mar 2025 Wei zhang, Zhaohong Deng, Guanjin Wang, Kup-Sze Choi

Subsequently, a novel differentiable optimization method is proposed for the consequence part learning which can preserve the model's interpretability and transparency while further exploring the nonlinear relationships within the data.

Representation Learning

Evaluating Mathematical Reasoning Across Large Language Models: A Fine-Grained Approach

no code implementations 13 Mar 2025 Afrar Jahin, Arif Hassan Zidan, Wei zhang, Yu Bao, Tianming Liu

With the rapid advancement of Artificial Intelligence (AI), Large Language Models (LLMs) have significantly impacted a wide array of domains, including healthcare, engineering, science, education, and mathematical reasoning.

Formal Logic MMLU

Integrating Chain-of-Thought for Multimodal Alignment: A Study on 3D Vision-Language Learning

no code implementations 8 Mar 2025 Yanjun Chen, Yirong Sun, Xinghao Chen, Jian Wang, Xiaoyu Shen, Wenjie Li, Wei zhang

Chain-of-Thought (CoT) reasoning has proven effective in natural language tasks but remains underexplored in multimodal alignment.

Multimodal Reasoning

We Care Each Pixel: Calibrating on Medical Segmentation Model

1 code implementation 7 Mar 2025 Wenhao Liang, Wei zhang, Yue Lin, Miao Xu, Olaf Maennel, Weitong Chen

Medical image segmentation is fundamental for computer-aided diagnostics, providing accurate delineation of anatomical structures and pathological regions.

Image Segmentation Medical Image Segmentation +2

FGS-SLAM: Fourier-based Gaussian Splatting for Real-time SLAM with Sparse and Dense Map Fusion

no code implementations 3 Mar 2025 Yansong Xu, Junlin Li, Wei zhang, Siyu Chen, Shengyong Zhang, Yuquan Leng, Weijia Zhou

3D Gaussian splatting has advanced simultaneous localization and mapping (SLAM) technology by enabling real-time positioning and the construction of high-fidelity maps.

Simultaneous Localization and Mapping

EasyCraft: A Robust and Efficient Framework for Automatic Avatar Crafting

no code implementations CVPR 2025 Suzhen Wang, WeiJie Chen, Wei zhang, Minda Zhao, Lincheng Li, Rongsheng Zhang, Zhipeng Hu, Xin Yu

We first establish a unified feature distribution in the translator's image encoder through self-supervised learning on a large-scale dataset, enabling photos of any style to be embedded into a unified feature representation.

Self-Supervised Learning

Coherency Improved Explainable Recommendation via Large Language Model

no code implementations 21 Feb 2025 Shijie Liu, Ruixing Ding, Weihai Lu, Jun Wang, Mo Yu, Xiaoming Shi, Wei zhang

Explainable recommender systems are designed to elucidate the explanation behind each recommendation, enabling users to comprehend the underlying logic.

Explainable Recommendation Explanation Generation +5

3D Gaussian Splatting aided Localization for Large and Complex Indoor-Environments

no code implementations 19 Feb 2025 Vincent Ress, Jonas Meyer, Wei zhang, David Skuddis, Uwe Soergel, Norbert Haala

The field of visual localization has been researched for several decades and has meanwhile found many practical applications.

3DGS Visual Localization

SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models

no code implementations 18 Feb 2025 Xianfu Cheng, Wei zhang, Shiwei Zhang, Jian Yang, Xiangyuan Guan, Xianjie Wu, Xiang Li, Ge Zhang, Jiaheng Liu, Yuying Mai, Yutao Zeng, Zhoufutu Wen, Ke Jin, Baorui Wang, Weixiao Zhou, Yunhong Lu, Tongliang Li, Wenhao Huang, Zhoujun Li

The increasing application of multi-modal large language models (MLLMs) across various sectors has spotlighted the importance of their output reliability and accuracy, particularly their ability to produce content grounded in factual information (e.g., common and domain-specific knowledge).

Image Comprehension Question Answering +2

Continuous-Aperture Array Based OAM High-Capacity Communication For Metaverse

no code implementations 12 Feb 2025 Hongyun Jin, Wenchi Cheng, Jingqing Wang, Wei zhang

The extensive data interaction demands of an immersive metaverse necessitate the adoption of emerging technologies to enable high-capacity communication.

Data Interaction

Collaborative Filtering Meets Spectrum Shift: Connecting User-Item Interaction with Graph-Structured Side Information

1 code implementation 12 Feb 2025 Yunhang He, Cong Xu, Jun Wang, Wei zhang

However, when graph-structured side information (e.g., multimodal similarity graphs or social networks) is integrated into the U-I bipartite graph, existing graph collaborative filtering methods fall short of achieving satisfactory performance.

Collaborative Filtering Multimodal Recommendation

Multi-Agent Collaboration for Multilingual Code Instruction Tuning

no code implementations 11 Feb 2025 Jian Yang, Wei zhang, Jiaxi Yang, Yibo Miao, Shanghaoran Quan, Zhenhe Wu, Qiyao Peng, Liqun Yang, Tianyu Liu, Zeyu Cui, Binyuan Hui, Junyang Lin

Recent advancement in code understanding and generation demonstrates that code LLMs fine-tuned on a high-quality instruction dataset can gain powerful capabilities to address wide-ranging code-related tasks.

Cross-Lingual Transfer Transfer Learning

HetSSNet: Spatial-Spectral Heterogeneous Graph Learning Network for Panchromatic and Multispectral Images Fusion

no code implementations 7 Feb 2025 Mengting Ma, Yizhen Jiang, Mengjiao Zhao, Jiaxin Li, Wei zhang

A graph is the more flexible structure; however, there are two major challenges when modeling spatial-spectral properties with a graph: 1) constructing the customized graph structure for spatial-spectral relationship priors; 2) learning the unified spatial-spectral representation through the graph.

Graph Learning Pansharpening

DobLIX: A Dual-Objective Learned Index for Log-Structured Merge Trees

no code implementations 7 Feb 2025 Alireza Heidari, Amirhossein Ahmadi, Wei zhang

In this paper, we introduce DobLIX, a dual-objective learned index specifically designed for Log-Structured Merge (LSM) tree-based key-value stores.

Middleman Bias in Advertising: Aligning Relevance of Keyphrase Recommendations with Search

no code implementations 31 Jan 2025 Soumik Dey, Wei zhang, Hansi Wu, Bingfeng Dong, Binbin Li

E-commerce sellers are recommended keyphrases based on their inventory on which they advertise to increase buyer engagement (clicks/sales).

Advancing Generative Artificial Intelligence and Large Language Models for Demand Side Management with Internet of Electric Vehicles

no code implementations 26 Jan 2025 Hanwen Zhang, Ruichen Zhang, Wei zhang, Dusit Niyato, Yonggang Wen

This paper explores the integration of LLMs into energy management, emphasizing their roles in automating the optimization of DSM strategies with Internet of electric vehicles.

Code Generation energy management +4

CD-Lamba: Boosting Remote Sensing Change Detection via a Cross-Temporal Locally Adaptive State Space Model

1 code implementation 26 Jan 2025 Zhenkai Wu, Xiaowen Ma, Rongrong Lian, Kai Zheng, Mengting Ma, Wei zhang, Siyang Song

However, existing remote sensing change detection (RSCD) approaches based on Mamba frequently struggle to effectively perceive the inherent locality of change regions as they directly flatten and scan RS images (i.e., the features of the same region of changes are not distributed continuously within the sequence but are mixed with features from other regions throughout the sequence).

Change Detection Mamba

SpikSSD: Better Extraction and Fusion for Object Detection with Spiking Neuron Networks

1 code implementation 25 Jan 2025 Yimeng Fan, Changsong Liu, Mingyang Li, Wei zhang

However, efficiently conducting feature extraction and fusion under the spiking characteristics of SNNs for object detection remains a pressing challenge.

object-detection Object Detection

A Novel Scene Coupling Semantic Mask Network for Remote Sensing Image Segmentation

1 code implementation 22 Jan 2025 Xiaowen Ma, Rongrong Lian, Zhenkai Wu, Renxiang Guan, Tingfeng Hong, Mengjiao Zhao, Mengting Ma, Jiangtao Nie, Zhenhong Du, Siyang Song, Wei zhang

To deal with such limitations, this paper proposes a novel scene-Coupling semantic mask network, which reconstructs the vanilla attention with scene coupling and local global semantic masks strategies.

Image Segmentation Semantic Segmentation

GiNet: Integrating Sequential and Context-Aware Learning for Battery Capacity Prediction

no code implementations 9 Jan 2025 Sara Sameer, Wei zhang, Xin Lou, Qingyu Yan, Terence Goh, Yulin Gao

The surging demand for batteries requires advanced battery management systems, where battery capacity modelling is a key functionality.

LDMapNet-U: An End-to-End System for City-Scale Lane-Level Map Updating

no code implementations 6 Jan 2025 Deguo Xia, Weiming Zhang, Xiyan Liu, Wei zhang, Chenting Gong, Xiao Tan, Jizhou Huang, Mengmeng Yang, Diange Yang

By reconceptualizing the update task as an end-to-end map generation process grounded in historical map data, we introduce a paradigm shift in map updating that simultaneously generates vectorized maps and change information.

Autonomous Driving Change Detection

Balanced Multi-view Clustering

no code implementations 5 Jan 2025 Zhenglai Li, Jun Wang, Chang Tang, Xinzhong Zhu, Wei zhang, Xinwang Liu

The widely used joint training paradigm in MvC potentially does not fully leverage the multi-view information, owing to the imbalanced and under-optimized view-specific features caused by the uniform learning objective for all views.

Clustering MULTI-VIEW LEARNING

Digital-Analog Transmission based Emergency Semantic Communications

no code implementations 3 Jan 2025 Yuzhou Fu, Wenchi Cheng, Jingqing Wang, Liuguo Yin, Wei zhang

For EWC scene, we propose a performance-constrained semantic coding model, which considers the effects of the semantic noise and the channel noise.

Semantic Communication

Unleashing Correlation and Continuity for Hyperspectral Reconstruction from RGB Images

no code implementations 2 Jan 2025 Fuxiang Feng, Runmin Cong, Shoushui Wei, YiPeng Zhang, Jun Li, Sam Kwong, Wei zhang

Therefore, we fully explore these inter-spectral relationships and propose a Correlation and Continuity Network (CCNet) for HSI reconstruction from RGB images.

Spectral Reconstruction

Decoupled Motion Expression Video Segmentation

no code implementations CVPR 2025 Hao Fang, Runmin Cong, Xiankai Lu, Xiaofei Zhou, Sam Kwong, Wei zhang

In this work, we propose DMVS, a simple framework constructed on the existing query-based VIS model, emphasizing decoupling the task into video instance segmentation and motion expression understanding.

Instance Segmentation Referring Video Object Segmentation +5

Less Attention is More: Prompt Transformer for Generalized Category Discovery

no code implementations CVPR 2025 Wei zhang, Baopeng Zhang, Zhu Teng, Wenxin Luo, Junnan Zou, Jianping Fan

This results in a model with more yet scattered attention, where neither excessive nor insufficient focus can grasp subtle differences to classify fine-grained unknown and known categories.

Contrastive Learning Self-Learning

STeInFormer: Spatial-Temporal Interaction Transformer Architecture for Remote Sensing Change Detection

1 code implementation 23 Dec 2024 Xiaowen Ma, Zhenkai Wu, Mengting Ma, Mengjiao Zhao, Fan Yang, Zhenhong Du, Wei zhang

To address this problem, we present STeInFormer, a spatial-temporal interaction Transformer architecture for multi-temporal feature extraction, which is the first general backbone network specifically designed for RSCD.

Change Detection

Mathematics and Machine Creativity: A Survey on Bridging Mathematics with AI

no code implementations 21 Dec 2024 Shizhe Liang, Wei zhang, Tianyang Zhong, Tianming Liu

This paper presents a comprehensive overview of the applications of artificial intelligence (AI) in mathematical research, highlighting the transformative role AI has begun to play in this domain.

Reinforcement Learning (RL) Survey

Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings

1 code implementation 18 Dec 2024 Yuanhe Zhang, Zhenhong Zhou, Wei zhang, Xinyue Wang, Xiaojun Jia, Yang Liu, Sen Su

Large Language Models (LLMs) have demonstrated remarkable performance across diverse tasks yet still are vulnerable to external threats, particularly LLM Denial-of-Service (LLM-DoS) attacks.

Learning Implicit Features with Flow Infused Attention for Realistic Virtual Try-On

no code implementations 16 Dec 2024 Delong Zhang, Qiwei Huang, Yuanliu liu, Yang Sun, Wei-Shi Zheng, Pengfei Xiong, Wei zhang

Image-based virtual try-on is challenging since the generated image should fit the garment to model images in various poses and keep the characteristics and details of the garment simultaneously.

Virtual Try-on

High-speed and High-quality Vision Reconstruction of Spike Camera with Spike Stability Theorem

no code implementations 16 Dec 2024 Wei zhang, Weiquan Yan, Yun Zhao, Wenxiang Cheng, Gang Chen, Huihui Zhou, Yonghong Tian

To realize high-speed and high-quality vision reconstruction of the spike camera, we propose a new spike stability theorem that reveals the relationship between spike stream characteristics and stable light intensity.

STAIR: Manipulating Collaborative and Multimodal Information for E-Commerce Recommendation

1 code implementation 16 Dec 2024 Cong Xu, Yunhang He, Jun Wang, Wei zhang

In order to combine the two distinct types of information, some additional challenges are encountered: 1) Modality erasure: Vanilla graph convolution, which proves rather useful in collaborative filtering, however erases multimodal information; 2) Modality forgetting: Multimodal information tends to be gradually forgotten as the recommendation loss essentially facilitates the learning of collaborative information.

Collaborative Filtering Multimodal Recommendation

Predicting Human Brain States with Transformer

1 code implementation 11 Dec 2024 Yifei Sun, Mariano Cabezas, JiAh Lee, Chenyu Wang, Wei zhang, Fernando Calamante, Jinglei Lv

The human brain is a complex and highly dynamic system, and our current knowledge of its functional mechanism is still very limited.

Language Modelling Music Generation

Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset

1 code implementation CVPR 2025 Xiao Wang, Yu Jin, Wentao Wu, Wei zhang, Lin Zhu, Bo Jiang, Yonghong Tian

Object detection in event streams has emerged as a cutting-edge research area, demonstrating superior performance in low-light conditions, scenarios with motion blur, and rapid movements.

Computational Efficiency Mixture-of-Experts +3

ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance

no code implementations 9 Dec 2024 Chunwei Wang, Guansong Lu, Junwei Yang, Runhui Huang, Jianhua Han, Lu Hou, Wei zhang, Hang Xu

In this paper, we introduce ILLUME, a unified multimodal large language model (MLLM) that seamlessly integrates multimodal understanding and generation capabilities within a single large language model through a unified next-token prediction formulation.

Image Generation Language Modeling +4

Rate-Distortion Optimized Skip Coding of Region Adaptive Hierarchical Transform Coefficients for MPEG G-PCC

no code implementations 7 Dec 2024 Zehan Wang, Yuxuan Wei, Hui Yuan, Wei zhang, Peng Li

To address this problem, we propose an adaptive skip coding method for RAHT, which adaptively determines whether to encode the residuals of the last several layers or not, thereby improving the coding efficiency.

LAA-Net: A Physical-prior-knowledge Based Network for Robust Nighttime Depth Estimation

no code implementations 5 Dec 2024 KeBin Peng, Haotang Li, Zhenyu Qi, Huashan Chen, Zi Wang, Wei zhang, Sen He

Existing self-supervised monocular depth estimation (MDE) models attempt to improve nighttime performance by using GANs to transfer nighttime images into their daytime versions.

Monocular Depth Estimation

Learning Whole-Body Loco-Manipulation for Omni-Directional Task Space Pose Tracking with a Wheeled-Quadrupedal-Manipulator

no code implementations 4 Dec 2024 Kaiwen Jiang, Zhen Fu, Junde Guo, Wei zhang, Hua Chen

Specifically, we focus on the problem of how to coordinate the floating base and the robotic arm of a wheeled-quadrupedal manipulator robot to achieve direct six-dimensional (6D) end-effector (EE) pose tracking in task space.

Pose Tracking Reinforcement Learning (RL)

CNNSum: Exploring Long-Context Summarization with Large Language Models in Chinese Novels

1 code implementation 3 Dec 2024 Lingxiao Wei, He Yan, Xiangju Lu, Junmin Zhu, Jun Wang, Wei zhang

However, the scarcity of high-quality long-context summarization datasets has hindered further advancements in this area.

16k

CAdam: Confidence-Based Optimization for Online Learning

no code implementations 29 Nov 2024 Shaowen Wang, AnAn Liu, Jian Xiao, Huan Liu, Yuekui Yang, Cong Xu, Qianqian Pu, Suncong Zheng, Wei zhang, Jian Li

Modern recommendation systems frequently employ online learning to dynamically update their models with freshly collected data.

Recommendation Systems

HI-SLAM2: Geometry-Aware Gaussian SLAM for Fast Monocular Scene Reconstruction

1 code implementation 27 Nov 2024 Wei zhang, Qing Cheng, David Skuddis, Niclas Zeller, Daniel Cremers, Norbert Haala

We present HI-SLAM2, a geometry-aware Gaussian SLAM system that achieves fast and accurate monocular scene reconstruction using only RGB input.

3DGS

Generative Fuzzy System for Sequence Generation

no code implementations 21 Nov 2024 Hailong Yang, Zhaohong Deng, Wei zhang, Zhuangzhuang Zhao, Guanjin Wang, Kup-Sze Choi

In this work, we introduce the fuzzy system, a classical modeling method that combines data and knowledge-driven mechanisms, to generative tasks.

Code Generation Machine Translation

Exploring Feature-based Knowledge Distillation for Recommender System: A Frequency Perspective

1 code implementation 16 Nov 2024 Zhangchi Zhu, Wei zhang

By defining knowledge as different frequency components of the features, we theoretically demonstrate that regular feature-based knowledge distillation is equivalent to equally minimizing losses on all knowledge and further analyze how this equal loss weight allocation method leads to important knowledge being overlooked.

Knowledge Distillation Recommendation Systems

Counterfactual Learning-Driven Representation Disentanglement for Search-Enhanced Recommendation

no code implementations 14 Nov 2024 Jiajun Cui, Xu Chen, Shuai Xiao, Chen Ju, Jinsong Lan, Qingwen Liu, Wei zhang

To address this, we propose a Counterfactual learning-driven representation disentanglement framework for search-enhanced recommendation, based on the common belief that a user would click an item under a query not solely because of the item-query match but also due to the item's query-independent general features (e.g., color or style) that interest the user.

Collaborative Filtering counterfactual +3

CDXLSTM: Boosting Remote Sensing Change Detection with Extended Long Short-Term Memory

1 code implementation 12 Nov 2024 Zhenkai Wu, Xiaowen Ma, Rongrong Lian, Kai Zheng, Wei zhang

In complex scenes and varied conditions, effectively integrating spatial-temporal context is crucial for accurately identifying changes.

Change Detection

LIFBench: Evaluating the Instruction Following Performance and Stability of Large Language Models in Long-Context Scenarios

1 code implementation 11 Nov 2024 Xiaodong Wu, Minhao Wang, Yichen Liu, Xiaoming Shi, He Yan, Xiangju Lu, Junmin Zhu, Wei zhang

As Large Language Models (LLMs) evolve in natural language processing (NLP), their ability to stably follow instructions in long-context inputs has become critical for real-world applications.

Instruction Following

MdEval: Massively Multilingual Code Debugging

no code implementations 4 Nov 2024 Shukai Liu, Linzheng Chai, Jian Yang, Jiajun Shi, He Zhu, Liran Wang, Ke Jin, Wei zhang, Hualei Zhu, Shuyue Guo, Tao Sun, Jiaheng Liu, Yunlong Duan, Yu Hao, Liqun Yang, Guanglin Niu, Ge Zhang, Zhoujun Li

Code large language models (LLMs) have made significant progress in code debugging by directly generating the correct code based on the buggy code snippet.

Program Repair

E2E-AFG: An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation

no code implementations 1 Nov 2024 Yun Jiang, Zilong Xie, Wei zhang, Yun Fang, Shuai Pan

Retrieval-augmented generation methods often neglect the quality of content retrieved from external knowledge bases, resulting in irrelevant information or potential misinformation that negatively affects the generation results of large language models.

Misinformation Retrieval +2

Beyond Content Relevance: Evaluating Instruction Following in Retrieval Models

1 code implementation 31 Oct 2024 Jianqun Zhou, Yuanlei Zheng, Wei Chen, Qianqian Zheng, Hui Su, Wei zhang, Rui Meng, Xiaoyu Shen

Instruction-following capabilities in LLMs have progressed significantly, enabling more complex user interactions through detailed prompts.

Instruction Following Reranking +1

Integration of Communication and Computational Imaging

no code implementations 25 Oct 2024 Zhenming Yu, Liming Cheng, Hongyu Huang, Wei zhang, Liang Lin, Kun Xu

Herein, we propose a novel framework that integrates communication and computational imaging (ICCI) to break through the inherent isolation between communication and computational imaging for remote perception.

Data Compression

CLAP. I. Resolving miscalibration for deep learning-based galaxy photometric redshift estimation

1 code implementation 25 Oct 2024 Qiufan Lin, Hengxin Ruan, Dominique Fouchez, Shupei Chen, Rui Li, Paulo Montero-Camacho, Nicola R. Napolitano, Yuan-Sen Ting, Wei zhang

It leverages supervised contrastive learning (SCL) and k-nearest neighbours (KNN) to construct and calibrate raw probability density estimates, and implements a refitting procedure to resume end-to-end discriminative models ready to produce final estimates for large-scale imaging data.

Computational Efficiency Contrastive Learning +2

Brain-like Functional Organization within Large Language Models

no code implementations 25 Oct 2024 Haiyang Sun, Lin Zhao, Zihao Wu, Xiaohui Gao, Yutao Hu, Mengfei Zuo, Wei zhang, Junwei Han, Tianming Liu, Xintao Hu

In this study, we bridge this gap by directly coupling sub-groups of artificial neurons with functional brain networks (FBNs), the foundational organizational structure of the human brain.

Calibrating Deep Neural Network using Euclidean Distance

no code implementations 23 Oct 2024 Wenhao Liang, Chang Dong, Liangwei Zheng, Zhengyang Li, Wei zhang, Weitong Chen

This research introduces a novel loss function called Focal Calibration Loss (FCL), designed to improve probability calibration while retaining the advantages of Focal Loss in handling difficult samples.

Navigate

Corrected Soft Actor Critic for Continuous Control

no code implementations 22 Oct 2024 Yanjun Chen, Xinming Zhang, Xianghui Wang, Zhiqiang Xu, Xiaoyu Shen, Wei zhang

The Soft Actor-Critic (SAC) algorithm is known for its stability and high sample efficiency in deep reinforcement learning.

continuous-control Continuous Control +1

MAC Revivo: Artificial Intelligence Paves the Way

no code implementations 21 Oct 2024 Jinzhe Pan, Jingqing Wang, Zelin Yun, Zhiyong Xiao, Yuehui Ouyang, Wenchi Cheng, Wei zhang

The vast adoption of Wi-Fi and/or Bluetooth capabilities in Internet of Things (IoT) devices, along with the rapid growth of deployed smart devices, has caused significant interference and congestion in the industrial, scientific, and medical (ISM) bands.

Real-time Stereo-based 3D Object Detection for Streaming Perception

1 code implementation 16 Oct 2024 Changcai Li, Zonghua Gu, Gang Chen, Libo Huang, Wei zhang, Huihui Zhou

StreamDSGN is an end-to-end framework that directly predicts the 3D properties of objects in the next moment by leveraging historical information, thereby alleviating the accuracy degradation of streaming perception.

3D Object Detection Autonomous Driving +1

AoI-Aware Resource Allocation for Smart Multi-QoS Provisioning

no code implementations 16 Oct 2024 Jingqing Wang, Wenchi Cheng, Wei zhang

To address these challenges, we propose a DRL-based framework for AoI-aware optimal resource allocation in mURLLC-driven multi-QoS schemes, leveraging AoI as a core metric within the finite blocklength regime.

Deep Reinforcement Learning

FBC-Enhanced ε-Effective Capacity Optimization for NOMA

no code implementations 15 Oct 2024 Jingqing Wang, Wenchi Cheng, Wei zhang

The advent of massive ultra-reliable and low-latency communications (mURLLC) has introduced a critical class of time- and reliability-sensitive services within next-generation wireless networks.

The Accuracy Paradox in RLHF: When Better Reward Models Don't Yield Better Language Models

1 code implementation 9 Oct 2024 Yanjun Chen, Dawei Zhu, Yirong Sun, Xinghao Chen, Wei zhang, Xiaoyu Shen

Reinforcement Learning from Human Feedback significantly enhances Natural Language Processing by aligning language models with human expectations.

As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss

no code implementations 7 Oct 2024 Xin Mao, Feng-Lin Li, Huimin Xu, Wei zhang, Wang Chen, Anh Tuan Luu

The experimental results show that BNF achieves comparable performance to the best methods on QA benchmarks, while its performance decrease on the four reasoning benchmarks is significantly lower compared to the best methods, thus striking a better balance between value alignment and reasoning ability.

EEG Emotion Copilot: Optimizing Lightweight LLMs for Emotional EEG Interpretation with Assisted Medical Record Generation

no code implementations 30 Sep 2024 Hongyu Chen, Weiming Zeng, Chengcheng Chen, Luhui Cai, Fei Wang, Yuhu Shi, Lei Wang, Wei zhang, Yueyang Li, Hongjie Yan, Wai Ting Siok, Nizhuan Wang

This paper presents the EEG Emotion Copilot, a system optimizing a lightweight large language model (LLM) with 0.5B parameters operating in a local setting, which first recognizes emotional states directly from EEG signals, subsequently generates personalized diagnostic and treatment suggestions, and finally supports the automation of assisted electronic medical records.

Computational Efficiency Diagnostic +4

X-Prompt: Multi-modal Visual Prompt for Video Object Segmentation

1 code implementation 28 Sep 2024 Pinxue Guo, Wanyun Li, Hao Huang, Lingyi Hong, Xinyu Zhou, Zhaoyu Chen, Jinglun Li, Kaixun Jiang, Wei zhang, Wenqiang Zhang

The X-Prompt framework first pre-trains a video object segmentation foundation model using RGB data, and then utilizes the additional modality of the prompt to adapt it to downstream multi-modal tasks with limited data.

Semantic Segmentation Video Object Segmentation +1

Multi-Atlas Brain Network Classification through Consistency Distillation and Complementary Information Fusion

no code implementations 28 Sep 2024 Jiaxing Xu, Mengcheng Lan, Xia Dong, Kai He, Wei zhang, Qingtian Bian, Yiping Ke

Some recent methods have proposed utilizing multiple atlases, but they neglect consistency across atlases and lack ROI-level information exchange.

General Compression Framework for Efficient Transformer Object Tracking

no code implementations 26 Sep 2024 Lingyi Hong, Jinglun Li, Xinyu Zhou, Shilin Yan, Pinxue Guo, Kaixun Jiang, Zhaoyu Chen, Shuyong Gao, Wei zhang, Hong Lu, Wenqiang Zhang

Thus, we propose a general model compression framework for efficient transformer object tracking, named CompressTracker, to reduce the size of a pre-trained tracking model into a lightweight tracker with minimal performance degradation.

Model Compression Object +1

Performance Boundary Analyses for Statistical Multi-QoS Framework Over 6G SAGINs

no code implementations 25 Sep 2024 Jingqing Wang, Wenchi Cheng, Wei zhang

In response to the complex, heterogeneous, and dynamic serving scenarios and stringent performance expectations for 6G SAGINs, it is crucial to undertake modeling, assurance, and analysis of the key technologies, aligned with the diverse demands for QoS provisioning in the non-asymptotic regime, i.e., when implementing finite blocklength coding (FBC) as a new dimension for error-rate bounded QoS metric.

The Roles of Generative Artificial Intelligence in Internet of Electric Vehicles

no code implementations 24 Sep 2024 Hanwen Zhang, Dusit Niyato, Wei zhang, Changyuan Zhao, Hongyang Du, Abbas Jamalipour, Sumei Sun, Yiyang Pei

With the advancements of generative artificial intelligence (GenAI) models, their capabilities are expanding significantly beyond content generation and the models are increasingly being used across diverse applications.

Survey

Reinforcement Learning for Infinite-Dimensional Systems

no code implementations 24 Sep 2024 Wei zhang, Jr-Shin Li

We then develop a moment kernel transform to map the parameterized system and the value function of an RL problem into a reproducing kernel Hilbert space.

Reinforcement Learning (RL)

FreeAvatar: Robust 3D Facial Animation Transfer by Learning an Expression Foundation Model

no code implementations 20 Sep 2024 Feng Qiu, Wei zhang, Chen Liu, Rudong An, Lincheng Li, Yu Ding, Changjie Fan, Zhipeng Hu, Xin Yu

In the facial animation transfer component, we propose a novel Expression-driven Multi-avatar Animator, which first maps expressive semantics to the facial control parameters of 3D avatars and then imposes perceptual constraints between the input and output images to maintain expression consistency.

Face Reconstruction

Atmospheric Turbulence-Immune Free Space Optical Communication System based on Discrete-Time Analog Transmission

no code implementations 18 Sep 2024 Hongyu Huang, Zhenming Yu, Yi Lei, Wei zhang, Yongli Zhao, Shanguo Huang, Kun Xu

To effectively mitigate the influence of atmospheric turbulence, a novel discrete-time analog transmission free-space optical (DTAT-FSO) communication scheme is proposed.

IW-Bench: Evaluating Large Multimodal Models for Converting Image-to-Web

no code implementations 14 Sep 2024 Hongcheng Guo, Wei zhang, JunHao Chen, Yaonan Gu, Jian Yang, Junjia Du, Binyuan Hui, Tianyu Liu, Jianxin Ma, Chang Zhou, Zhoujun Li

We have conducted extensive experiments on existing large multimodal models, offering insights into their performance and areas for improvement in the image-to-web domain.

Image Comprehension

Bayesian Dynamic Factor Models for High-dimensional Matrix-valued Time Series

no code implementations 12 Sep 2024 Wei zhang

To determine the dimension of the factor matrix, we employ an importance-sampling estimator based on the cross-entropy method to estimate marginal likelihoods.

Time Series

E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning

no code implementations 10 Sep 2024 Zihan Liao, Hang Yu, Lingxiao Wei, Jianguo Li, Jun Wang, Wei zhang

In the realm of Large Language Models (LLMs), the ability to process long contexts is increasingly crucial for tasks such as multi-round dialogues, code generation, and document summarization.

Code Generation Decoder +2

Preserving Individuality while Following the Crowd: Understanding the Role of User Taste and Crowd Wisdom in Online Product Rating Prediction

no code implementations 6 Sep 2024 Liang Wang, Shubham Jain, Yingtong Dou, Junpeng Wang, Chin-Chia Michael Yeh, Yujie Fan, Prince Aboagye, Yan Zheng, Xin Dai, Zhongfang Zhuang, Uday Singh Saini, Wei zhang

Our findings underscore the significance of individual user tastes in the context of online product rating prediction and the robustness of our approach across different model architectures.

Prediction

Cycle Pixel Difference Network for Crisp Edge Detection

no code implementations 6 Sep 2024 Changsong Liu, Wei zhang, Yanyan Liu, Mingyang Li, Wenlin Li, Yimeng Fan, Xiangnan Bai, Liang Zhang

We construct a U-shape encoder-decoder model named CPD-Net that successfully addresses these two issues simultaneously.

Decoder Edge Detection

Interpreting and Improving Large Language Models in Arithmetic Calculation

no code implementations 3 Sep 2024 Wei zhang, Chaoqun Wan, Yonggang Zhang, Yiu-ming Cheung, Xinmei Tian, Xu Shen, Jieping Ye

In this work, we delve into uncovering a specific mechanism by which LLMs execute calculations.

MOOSS: Mask-Enhanced Temporal Contrastive Learning for Smooth State Evolution in Visual Reinforcement Learning

1 code implementation 2 Sep 2024 Jiarui Sun, M. Ugur Akcal, Wei zhang, Girish Chowdhary

In visual Reinforcement Learning (RL), learning from pixel-based observations poses significant challenges on sample efficiency, primarily due to the complexity of extracting informative state representations from high-dimensional data.

Contrastive Learning graph construction +1

Are LLM-based Recommenders Already the Best? Simple Scaled Cross-entropy Unleashes the Potential of Traditional Sequential Recommenders

1 code implementation 26 Aug 2024 Cong Xu, Zhangchi Zhu, Mo Yu, Jun Wang, Jianyong Wang, Wei zhang

Some studies have observed that LLMs, when fine-tuned by the cross-entropy (CE) loss with a full softmax, could achieve 'state-of-the-art' performance in sequential recommendation.

Sequential Recommendation

When Diffusion MRI Meets Diffusion Model: A Novel Deep Generative Model for Diffusion MRI Generation

no code implementations 23 Aug 2024 Xi Zhu, Wei zhang, Yijie Li, Lauren J. O'Donnell, Fan Zhang

This achievement underscores a substantial progression in enhancing dMRI quality, highlighting the potential of our novel generative approach to revolutionize dMRI imaging standards.

Diffusion MRI model

Coarse-to-Fine Detection of Multiple Seams for Robotic Welding

no code implementations 20 Aug 2024 Pengkun Wei, Shuo Cheng, Dayou Li, Ran Song, YiPeng Zhang, Wei zhang

The RGB image is used to obtain the region of interest by approximately localizing the weld seams, and the point cloud is used to achieve the fine-edge extraction of the weld seams within the region of interest using region growth.

UNINEXT-Cutie: The 1st Solution for LSVOS Challenge RVOS Track

no code implementations 19 Aug 2024 Hao Fang, Feiyu Pan, Xiankai Lu, Wei zhang, Runmin Cong

Referring video object segmentation (RVOS) relies on natural language expressions to segment target objects in video.

Referring Video Object Segmentation Semantic Segmentation +1

Video Object Segmentation via SAM 2: The 4th Solution for LSVOS Challenge VOS Track

no code implementations 19 Aug 2024 Feiyu Pan, Hao Fang, Runmin Cong, Wei zhang, Xiankai Lu

The Video Object Segmentation (VOS) task aims to segment a particular object instance throughout the entire video sequence given only the object mask of the first frame.

Object Segmentation +4

Towards Boosting LLMs-driven Relevance Modeling with Progressive Retrieved Behavior-augmented Prompting

no code implementations 18 Aug 2024 Zeyuan Chen, Haiyan Wu, Kaixin Wu, Wei Chen, Mingjie Zhong, Jia Xu, Zhongyi Liu, Wei zhang

In response, we propose ProRBP, a novel Progressive Retrieved Behavior-augmented Prompting framework for integrating search scenario-oriented knowledge with LLMs effectively.

Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions

1 code implementation 14 Aug 2024 Quan Liu, Zhenhong Zhou, Longzhu He, Yi Liu, Wei zhang, Sen Su

Large language models are susceptible to jailbreak attacks, which can result in the generation of harmful content.

Safety Alignment

Breaking Limits of Line-of-Sight MIMO Capacity in 6G Wireless Communications

no code implementations 13 Aug 2024 Haiyue Jing, Wenchi Cheng, Wei zhang

Multiple-input-multiple-output (MIMO) has proved its success in the fourth generation (4G) long term evolution (LTE) and is one of the key technical enablers for evolved mobile broadband (eMBB) in the fifth generation (5G) wireless communications.

Achieving Practical OAM Based Wireless Communications With Misaligned Transceiver

no code implementations 13 Aug 2024 Wenchi Cheng, Haiyue Jing, Wei zhang, Zan Li, Hailin Zhang

To maintain the orthogonality among different OAM modes at the receiver, the strict alignment between transmit and receive antennas is highly demanded.

Quasi-Fractal UCA Based OAM for Highly Efficient Orthogonal Transmission

no code implementations 10 Aug 2024 Wenchi Cheng, Haiyue Jing, Wei zhang, Keyi Zhang, Hailin Zhang

We perform the two-dimension OAM modulation (TOM) and demodulation (TOD) schemes with the orthogonal OAM mode number exceeding the array-element number, which is beyond the traditional concept of multiple antennas based wireless communications.

Enhanced Traffic Flow Prediction with Multi-Segment Fusion Tensor Graph Convolutional Networks

no code implementations 8 Aug 2024 Wei zhang, Peng Tang

Accurate traffic flow prediction can assist in traffic management, route planning, and congestion mitigation, which holds significant importance in enhancing the efficiency and reliability of intelligent transportation systems (ITS).

Management Prediction

UpLIF: An Updatable Self-Tuning Learned Index Framework

no code implementations 7 Aug 2024 Alireza Heidari, Amirhossein Ahmadi, Wei zhang

The emergence of learned indexes has caused a paradigm shift in our perception of indexing by considering indexes as predictive models that estimate keys' positions within a data set, resulting in notable improvements in key search efficiency and index size reduction; however, a significant challenge inherent in learned index modeling is its constrained support for update operations, necessitated by the requirement for a fixed distribution of records.

DRFormer: Multi-Scale Transformer Utilizing Diverse Receptive Fields for Long Time-Series Forecasting

1 code implementation 5 Aug 2024 Ruixin Ding, Yuqi Chen, Yu-Ting Lan, Wei zhang

Our proposed model, named DRFormer, is evaluated on various real-world datasets, and experimental results demonstrate its superiority compared to existing methods.

Position Sparse Learning +3

Rate Maximization for RIS-Assisted OAM Multiuser Wireless Communications

no code implementations 2 Aug 2024 Jun Lan, Liping Liang, Wenchi Cheng, Wei zhang

Conventional multiple-input multiple-output (MIMO) technologies have encountered bottlenecks in significantly increasing the spectrum efficiency of wireless communications due to the low degrees of freedom in practical line-of-sight scenarios and the severe path loss of high-frequency carriers.

A new approach for encoding code and assisting code understanding

no code implementations 1 Aug 2024 Mengdan Fan, Wei zhang, Haiyan Zhao, Zhi Jin

Some companies (e.g., Microsoft Research and Google DeepMind) have discovered some of the limitations of GPTs' autoregressive next-word prediction paradigm, manifested in the models' lack of planning, working memory, backtracking, and reasoning skills.

Code Generation Image Generation

STANet: A Novel Spatio-Temporal Aggregation Network for Depression Classification with Small and Unbalanced FMRI Data

no code implementations 31 Jul 2024 Wei zhang, Weiming Zeng, Hongyu Chen, Jie Liu, Hongjie Yan, Kaile Zhang, Ran Tao, Wai Ting Siok, Nizhuan Wang

In this study, we propose the Spatio-Temporal Aggregation Network (STANet) for diagnosing depression by integrating CNN and RNN to capture both temporal and spatial features of brain activity.

Diagnostic Functional Connectivity +1

Exploring Loss Landscapes through the Lens of Spin Glass Theory

no code implementations 30 Jul 2024 Hao Liao, Wei zhang, Zhanyi Huang, Zexiao Long, Mingyang Zhou, Xiaoqun Wu, Rui Mao, Chi Ho Yeung

Specifically, we used (1) random walk in the parameter space of DNNs to unravel the structures in their loss landscape; (2) a permutation-interpolation protocol to study the connection between copies of identical regions in the loss landscape due to the permutation symmetry in the hidden layers; (3) hierarchical clustering to reveal the hierarchy among trained solutions of DNNs, reminiscent of the so-called Replica Symmetry Breaking (RSB) phenomenon (i.e., the Parisi solution) in spin glass; (4) finally, we examine the relationship between the ruggedness of DNN's loss landscape and its generalizability, showing an improvement of flattened minima.

MetaHive: A Cache-Optimized Metadata Management for Heterogeneous Key-Value Stores

no code implementations 26 Jul 2024 Alireza Heidari, Amirhossein Ahmadi, Zefeng Zhi, Wei zhang

KV stores frequently consist of heterogeneous clusters, characterized by varying hardware specifications of the deployment nodes, with each node potentially running a distinct version of the KV store software.

Management

Channel Estimation for Movable-Antenna MIMO Systems Via Tensor Decomposition

no code implementations 26 Jul 2024 Ruoyu Zhang, Lei Cheng, Wei zhang, Xinrong Guan, Yueming Cai, Wen Wu, Rui Zhang

In this letter, we investigate the channel estimation problem for MIMO wireless communication systems with movable antennas (MAs) at both the transmitter (Tx) and receiver (Rx).

Tensor Decomposition

Virtual Full-Duplex Wireless Communications with Zero-Interval Modulation and Sampling

no code implementations 24 Jul 2024 Jianyu Wang, Wenchi Cheng, Wei zhang, Hailin Zhang

In ZIMS-VFD, the transceiver inserts a zero-interval for each symbol in the transmit signal and provides self-interference (SI)-free intervals for itself.

Resource Allocation for 5G-UAV Based Emergency Wireless Communications

no code implementations 24 Jul 2024 Zhuohui Yao, Wenchi Cheng, Wei zhang, Hailin Zhang

Numerical results show that the new heterogeneous Fisher-Snedecor $\mathcal{F}$ composite fading channel adapted resource allocation schemes can achieve higher capacity and energy efficiency than those of traditional channel model adapted resource allocation schemes, thus providing better communications service for post-disaster areas.

Affective Behaviour Analysis via Progressive Learning

no code implementations 24 Jul 2024 Chen Liu, Wei zhang, Feng Qiu, Lincheng Li, Xin Yu

To advance this, the 7th Affective Behavior Analysis in-the-wild (ABAW) competition establishes two tracks, i.e., the Multi-task Learning (MTL) Challenge and the Compound Expression (CE) Challenge, based on the Aff-Wild2 and C-EXPR-DB datasets.

Multi-Task Learning

Enhancing LLM's Cognition via Structurization

1 code implementation 23 Jul 2024 Kai Liu, Zhihang Fu, Chao Chen, Wei zhang, Rongxin Jiang, Fan Zhou, Yaowu Chen, Yue Wu, Jieping Ye

Besides, we show the feasibility of distilling advanced LLMs' language processing abilities to a smaller yet effective StruXGPT-7B to execute structurization, addressing the practicality of our approach.

Hallucination Hallucination Evaluation +1

Norface: Improving Facial Expression Analysis by Identity Normalization

1 code implementation 22 Jul 2024 Hanwei Liu, Rudong An, Zhimeng Zhang, Bowen Ma, Wei zhang, Yan Song, Yujing Hu, Wei Chen, Yu Ding

First, the carefully designed normalization network struggles to directly remove the above task-irrelevant noise, by maintaining facial expression consistency but normalizing all original images to a common identity with consistent pose and background.

Classification Facial Action Unit Detection +3

Robust Multi-Beam Secure mmWave Wireless Communication for Hybrid Wiretapping Systems

no code implementations 19 Jul 2024 Bin Qiu, Wenchi Cheng, Wei zhang

In this paper, we consider the physical layer (PHY) security problem for hybrid wiretapping wireless systems in millimeter wave transmission, where active eavesdroppers (AEs) and passive eavesdroppers (PEs) coexist to intercept the confidential messages and emit jamming signals.

valid

Joint Information and Jamming Beamforming for Securing IoT Networks With Rate-Splitting

no code implementations 19 Jul 2024 Bin Qiu, Wenchi Cheng, Wei zhang

The goal of this paper is to address the physical layer (PHY) security problem for multi-user multi-input single-output (MU-MISO) Internet of Things (IoT) systems in the presence of passive eavesdroppers (Eves).

Mode Hopping with OAM-Based Index Modulation

no code implementations 18 Jul 2024 Liping Liang, Wenchi Cheng, Wei zhang, Hailin Zhang

In this paper, we propose an MH with OAM-based index modulation scheme, where several OAM-modes are activated for hopping, to achieve high SE at a given bit error rate in radio vortex wireless communications.

Mode Hopping for Anti-Jamming in Radio Vortex Wireless Communications

no code implementations 18 Jul 2024 Liping Liang, Wenchi Cheng, Wei zhang, Hailin Zhang

In particular, we propose the mode hopping (MH) scheme for anti-jamming within the narrow frequency band.

Joint OAM Multiplexing and OFDM in Sparse Multipath Environments

no code implementations 18 Jul 2024 Liping Liang, Wenchi Cheng, Wei zhang, Hailin Zhang

In this paper, a hybrid orthogonal division multiplexing (HODM) scheme by using OAM multiplexing and orthogonal frequency division multiplexing (OFDM) in conjunction is proposed to achieve high-capacity wireless communications in sparse multipath environments, where the scatterers are sparse.

Decomposed and Distributed Directional Modulation for Secure Wireless Communication

no code implementations 18 Jul 2024 Bin Qiu, Wenchi Cheng, Wei zhang

This paper presents an AN-aided decomposed and distributed directional modulation (D3M) scheme for secure wireless communications, which takes advantage of the spatial signatures to achieve an extra range-dimension security apart from the angles.

EarthMarker: A Visual Prompting Multi-modal Large Language Model for Remote Sensing

1 code implementation 18 Jul 2024 Wei zhang, Miaoxin Cai, Tong Zhang, Jun Li, Yin Zhuang, Xuerui Mao

Specifically, a shared visual encoding method is developed to establish the spatial pattern interpretation relationships between the multi-scale representations of input images and various visual prompts.

Instruction Following Language Modeling +4

JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation

no code implementations 17 Jul 2024 Chenhan Jiang, Yihan Zeng, Tianyang Hu, Songcun Xu, Wei zhang, Hang Xu, Dit-yan Yeung

However, this paradigm distills view-agnostic 2D image distributions into the rendering distribution of 3D representation for each view independently, overlooking the coherence across views and yielding 3D inconsistency in generations.

3D Generation Text to 3D

Index Modulation Embedded Mode Hopping for Anti-Jamming

no code implementations 17 Jul 2024 Liping Liang, Wenchi Cheng, Wei zhang, Hailin Zhang

To achieve efficient anti-jamming and increase SE of wireless communications with slight computational complexity cost, in this paper we propose an index-modulation embedded mode-hopping (IM-MH) scheme, which simultaneously activates several OAM-modes for hopping along with additional index information and signal information transmission.

RIS-Based Self-Interference Cancellation for Full-Duplex Broadband Transmission

no code implementations 17 Jul 2024 Jiayan Wu, Wenchi Cheng, Jianyu Wang, Jingqing Wang, Wei zhang

The problem is solved with alternate optimization (AO) algorithm in three cases: ideal case, where both the amplitude and phase of each RIS unit cell can be controlled independently and continuously, continuous phases, where the phase of each RIS unit cell can be controlled independently, while the amplitude is fixed to one, and discrete phases, where the RC of each RIS unit cell can only take discrete values and these discrete values are equally spaced on the unit circle.

Generative AI Driven Task-Oriented Adaptive Semantic Communications

no code implementations 16 Jul 2024 Yuzhou Fu, Wenchi Cheng, Jingqing Wang, Liuguo Yin, Wei zhang

The existing TOSC frameworks focus on extracting the full semantic features of source data and learning low-dimensional channel inputs to transmit them within limited bandwidth resources.

Instance Segmentation object-detection +3

Phases Calibration of RIS Using Backpropagation Algorithm

1 code implementation 16 Jul 2024 Wei zhang, Bin Zhou, Tianyi Zhang, Yi Jiang, Zhiyong Bu

Reconfigurable intelligent surface (RIS) technology has emerged in recent years as a promising solution to the ever-increasing demand for wireless communication capacity.

Reconfigurable-Intelligent-Surface Assisted Orbital-Angular-Momentum Secure Communications

no code implementations 16 Jul 2024 Minmin Wang, Liping Liang, Wenchi Cheng, Wei zhang, Ruirui Chen, Hailin Zhang

As a kind of wavefront with helical phase, orbital angular momentum (OAM) shows great potential to enhance the security of wireless communications due to its unique orthogonality and central hollow electromagnetic wave structure.

Scalable Extraction Based Semantic Communication for 6G Wireless Networks

no code implementations 16 Jul 2024 Yuzhou Fu, Wenchi Cheng, Wei zhang, Jingqing Wang

In this article, we introduce a novel Scalable Extraction based Semantic Communication (SE-SC) model to support the potential applications in 6G wireless networks and then analyze its feasibility.

Semantic Communication

Digital-Analog Transmission Framework for Task-Oriented Semantic Communications

no code implementations 16 Jul 2024 Yuzhou Fu, Wenchi Cheng, Wei zhang

The semantic feature, constituted by analog vectors of a lower dimensionality relative to the original source data, reserves the meaning of the source data.

Semantic Communication

Enhanced Battery Degradation-Aware Scheduling for Distribution Network with Electric Vehicle Load

no code implementations 9 Jul 2024 Vijay Babu Pamshetti, Wei zhang, Andy Man-Fai Ng, Qingyu Yan, Kuan Tak Tan

We formulate a multi-objective framework for optimizing battery scheduling with the goals of minimizing monetary costs and improving network performance.

Scheduling

Evolutionary Morphology Towards Overconstrained Locomotion via Large-Scale, Multi-Terrain Deep Reinforcement Learning

no code implementations 1 Jul 2024 Yenan Chen, Chuye Zhang, Pengxi Gu, Jianuo Qiu, Jiayi Yin, Nuofan Qiu, Guojing Huang, Bangchao Huang, Zishang Zhang, Hui Deng, Wei zhang, Fang Wan, Chaoyang Song

Then, we implemented a large-scale, multi-terrain deep reinforcement learning framework to train these reconfigurable limbs for a comparative analysis of overconstrained locomotion in energy efficiency.

Deep Reinforcement Learning

Stochastic Solutions for Simultaneous Seismic Data Denoising and Reconstruction via Score-based Generative Models

1 code implementation IEEE Transactions on Geoscience and Remote Sensing 2024 Chuangji Meng, Jinghuai Gao, Yajun Tian, Hongling Chen, Wei zhang, Renyu Luo

We also analyze the advantages of our approach and conclude that successful generative modeling of seismic data by score-based generative models (SGMs) is the key to posterior sampling for the inverse problems, all of which benefit from the seismic data prior implicit in the trained score network of the SGMs.

Denoising Geophysics +1

Retain, Blend, and Exchange: A Quality-aware Spatial-Stereo Fusion Approach for Event Stream Recognition

1 code implementation 27 Jun 2024 Lan Chen, Dong Li, Xiao Wang, Pengpeng Shao, Wei zhang, YaoWei Wang, Yonghong Tian, Jin Tang

In this paper, we propose a novel dual-stream framework for event stream-based pattern recognition via differentiated fusion, termed EFV++.

Graph Neural Network

Assessing "Implicit" Retrieval Robustness of Large Language Models

no code implementations 26 Jun 2024 Xiaoyu Shen, Rexhina Blloshmi, Dawei Zhu, Jiahuan Pei, Wei zhang

Our findings reveal that fine-tuning on a mix of gold and distracting context significantly enhances the model's robustness to retrieval inaccuracies, while still maintaining its ability to extract correct answers when retrieval is accurate.

Retrieval Retrieval-augmented Generation

D2LLM: Decomposed and Distilled Large Language Models for Semantic Search

1 code implementation 25 Jun 2024 Zihan Liao, Hang Yu, Jianguo Li, Jun Wang, Wei zhang

In this paper, we present D2LLMs (Decomposed and Distilled LLMs for semantic search), which combine the best of both worlds.

LOGCAN++: Adaptive Local-global class-aware network for semantic segmentation of remote sensing imagery

1 code implementation 24 Jun 2024 Xiaowen Ma, Rongrong Lian, Zhenkai Wu, Hongbo Guo, Mengting Ma, Sensen Wu, Zhenhong Du, Siyang Song, Wei zhang

In particular, we introduce affine transformations in the LCA module for adaptive extraction of local class representations to effectively tolerate scale and orientation variations in remotely sensed images.

Image Segmentation Segmentation +2

Rethinking Remote Sensing Change Detection With A Mask View

1 code implementation 21 Jun 2024 Xiaowen Ma, Zhenkai Wu, Rongrong Lian, Wei zhang, Siyang Song

Consequently, we further propose the instance network CDMaskFormer customized for the change detection task, which includes: (i) a Spatial-temporal convolutional attention-based instantiated change extractor to capture spatio-temporal context simultaneously with lightweight operations; and (ii) a scene-guided axial attention-instantiated transformer decoder to extract more spatial details.

Change Detection Decoder

DuMapNet: An End-to-End Vectorization System for City-Scale Lane-Level Map Generation

no code implementations 20 Jun 2024 Deguo Xia, Weiming Zhang, Xiyan Liu, Wei zhang, Chenting Gong, Jizhou Huang, Mengmeng Yang, Diange Yang

This paper overcomes these limitations and presents an industrial-grade solution named DuMapNet that outputs standardized, vectorized map elements and their topology in an end-to-end paradigm.

DDLNet: Boosting Remote Sensing Change Detection with Dual-Domain Learning

1 code implementation 19 Jun 2024 Xiaowen Ma, Jiawei Yang, Rui Che, Huanting Zhang, Wei zhang

Remote sensing change detection (RSCD) aims to identify the changes of interest in a region by analyzing multi-temporal remote sensing images, and is of great value for local development monitoring.

Change Detection

Estimating Difficulty Levels of Programming Problems with Pre-trained Model

no code implementations 13 Jun 2024 Zhiyuan Wang, Wei zhang, Jun Wang

As the demand for programming skills grows across industries and academia, students often turn to Programming Online Judge (POJ) platforms for coding practice and competition.

Integrated Sensing and Communication for Anti-Jamming with OAM

no code implementations 9 Jun 2024 Liping Liang, Wenchi Cheng, Wei zhang, Zhuohui Yao

The spectrum sharing and open nature of wireless channels make integrated sensing and communication (ISAC) susceptible to hostile jamming attacks.

Integrated sensing and communication ISAC +1

Double-RIS-Assisted Orbital Angular Momentum Near-Field Secure Communications

no code implementations 9 Jun 2024 Liping Liang, Minmin Wang, Wenchi Cheng, Wei zhang

To satisfy the various demands of growing devices and services, emerging high-frequency-based technologies promote near-field wireless communications.

Achieving High Capacity Transmission With N-Dimensional Quasi-Fractal UCA

no code implementations 9 Jun 2024 Hongyun Jin, Wenchi Cheng, Haiyue Jing, Jingqing Wang, Wei zhang

Then, we investigate different dimensional multiplexing transmission schemes based on the corresponding QF-UCA antenna structure with various array-element layouts and evaluate the optimal layout type and dimension to obtain the highest channel capacity with a fixed number of array-elements.

Learning to utilize image second-order derivative information for crisp edge detection

no code implementations 9 Jun 2024 Changsong Liu, Yimeng Fan, Mingyang Li, Wei zhang, Yanyan Liu, Yuming Li, Wenlin Li, Liang Zhang

In the end, we propose a U-shape network named LUS-Net which is based on the SDMCM and BRM for crisp edge detection.

Edge Detection

Fractal OAM Generation and Detection Schemes

no code implementations 8 Jun 2024 Runyu Lyu, Wenchi Cheng, Muyao Wang, Wei zhang

In this paper, we propose Talbot-effect-based fractal OAM generation and detection schemes using a uniform circular array (UCA) to significantly improve capacity and BER performance in unaligned OAM transmissions.

Modeling and Performance Analysis of OAM-NFC Systems

no code implementations 8 Jun 2024 Runyu Lyu, Wenchi Cheng, Wei zhang

In this paper, we model and analyze the performance of the orbital angular momentum based NFC (OAM-NFC) system, which can significantly increase the capacity of NFC.

Activation Map-based Vector Quantization for 360-degree Image Semantic Communication

no code implementations 7 Jun 2024 Yang Ma, Wenchi Cheng, Jingqing Wang, Wei zhang

In virtual reality (VR) applications, 360-degree images play a pivotal role in crafting immersive experiences and offering panoramic views, thus improving user Quality of Experience (QoE).

Quantization Semantic Communication

Statistical QoS Provisioning Architecture for 6G Satellite-Terrestrial Integrated Networks

no code implementations 7 Jun 2024 Jingqing Wang, Wenchi Cheng, Wei zhang, Hui Liang

To effectively measure data freshness in satellite-terrestrial integrated communications, age of information (AoI) has recently surfaced as a new dimension of QoS metric to support time-sensitive applications.

Leveraging Pedagogical Theories to Understand Student Learning Process with Graph-based Reasonable Knowledge Tracing

1 code implementation 7 Jun 2024 Jiajun Cui, Hong Qian, Bo Jiang, Wei zhang

The advancement of deep learning in this field has led to deep-learning knowledge tracing (DLKT) models that prioritize high predictive accuracy.

Knowledge Tracing

Nutrition Estimation for Dietary Management: A Transformer Approach with Depth Sensing

no code implementations 4 Jun 2024 Zhengyi Kwan, Wei zhang, Zhengkui Wang, Aik Beng Ng, Simon See

In this paper, we propose NuNet, a transformer-based network designed for nutrition estimation that utilizes both RGB and depth information from food images.

Decoder Management +1

LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising

no code implementations CVPR 2024 Yuxing Duan, Shihan Peng, Lin Zhu, Wei zhang, Yi Chang, Sheng Zhong, Luxin Yan

Event cameras have significant advantages in capturing dynamic scene information, but they are prone to noise interference, particularly in challenging conditions like low threshold and low illumination.

Denoising
