no code implementations • 5 Jun 2025 • Lorenzo Rossi, Michael Aerni, Jie Zhang, Florian Tramèr
Sequence models, such as Large Language Models (LLMs) and autoregressive image generators, have a tendency to memorize and inadvertently leak sensitive information.
no code implementations • 30 May 2025 • Jie Zhang, Haoyin Yan, Xiaofei Li
It is promising to design a single model that can suppress various distortions and improve speech quality, i.e., universal speech enhancement (USE).
no code implementations • 30 May 2025 • Ying Yang, Jie Zhang, Xiao Lv, Di Lin, Tao Xiang, Qing Guo
To address this, we propose LightD, a novel framework that generates natural adversarial samples for VLP models via semantically guided relighting.
no code implementations • 27 May 2025 • Yiding Shi, Jianan Zhou, Wen Song, Jieyi Bi, Yaoxin Wu, Jie Zhang
Heuristic design with large language models (LLMs) has emerged as a promising approach for tackling combinatorial optimization problems (COPs).
no code implementations • 23 May 2025 • Yingpeng Du, Tianjun Wei, Zhu Sun, Jie Zhang
Although speculative decoding (SD) methods can be a remedy with verification at different positions, they face challenges in ranking systems due to their left-to-right decoding paradigm.
no code implementations • 21 May 2025 • Zhehao Huang, Xinwen Cheng, Jie Zhang, JingHao Zheng, Haoran Wang, Zhengbao He, Tao Li, Xiaolin Huang
Recent advancements in deep models have highlighted the need for intelligent systems that combine continual learning (CL) for knowledge acquisition with machine unlearning (MU) for data removal, forming the Continual Learning-Unlearning (CLU) paradigm.
no code implementations • 18 May 2025 • Zhenhe Wu, Jian Yang, Jiaheng Liu, Xianjie Wu, Changzai Pan, Jie Zhang, Yu Zhao, Shuangyong Song, Yongxiang Li, Zhoujun Li
Tables present unique challenges for language models due to their structured row-column interactions, necessitating specialized approaches for effective comprehension.
1 code implementation • 18 May 2025 • Jie Zhang, Cezara Petrui, Kristina Nikolić, Florian Tramèr
Existing benchmarks for evaluating mathematical reasoning in large language models (LLMs) rely primarily on competition problems, formal proofs, or artificially challenging questions -- failing to capture the nature of mathematics encountered in actual research environments.
1 code implementation • 17 May 2025 • Qi Zhou, Jie Zhang, Dongxia Wang, Qiang Liu, Tianlin Li, Jin Song Dong, Wenhai Wang, Qing Guo
Human preference plays a crucial role in the refinement of large language models (LLMs).
no code implementations • 16 May 2025 • Qiuyu Zhu, Liang Zhang, Qianxiong Xu, Cheng Long, Jie Zhang
Graph-based retrieval-augmented generation (RAG) enables large language models (LLMs) to incorporate structured knowledge via graph retrieval as contextual input, supporting more accurate and context-aware reasoning.
no code implementations • 9 May 2025 • Yize Zhou, Jie Zhang, Meijie Wang, Lun Yu
Academic misconduct detection in biomedical research remains challenging due to algorithmic narrowness in existing methods and fragmented analytical pipelines.
no code implementations • 8 May 2025 • Cong Qi, Yeqing Chen, Jie Zhang, Wei Zhi
Single-cell RNA sequencing (scRNA-seq) has revealed complex cellular heterogeneity, but recent studies emphasize that understanding biological function also requires modeling cell-cell communication (CCC), the signaling interactions mediated by ligand-receptor pairs that coordinate cellular behavior.
no code implementations • 6 May 2025 • Haoran Ou, Gelei Deng, Xingshuo Han, Jie Zhang, Xinlei He, Han Qiu, Shangwei Guo, Tianwei Zhang
The rise of Internet connectivity has accelerated the spread of disinformation, threatening societal trust, decision-making, and national security.
1 code implementation • 30 Apr 2025 • Huizhong Guo, Zhu Sun, Dongxia Wang, Tianjun Wei, Jinfeng Li, Jie Zhang
In addition, FairAgent introduces a novel reward mechanism for recommendation tailored to the characteristics of DRSs, which consists of three components: 1) a new-item exploration reward to promote the exposure of dynamically introduced new-items, 2) a fairness reward to adapt to users' personalized fairness requirements for new-items, and 3) an accuracy reward which leverages users' dynamic feedback to enhance recommendation accuracy.
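A minimal sketch of how such a three-part reward could be combined; the reward shapes and weights below are hypothetical, not FairAgent's actual formulation:

```python
def fairagent_style_reward(is_new_item, user_fairness_pref, user_feedback,
                           w_explore=1.0, w_fair=1.0, w_acc=1.0):
    """Toy sketch of a three-part recommendation reward.

    The weights and the exact reward shapes are assumptions for illustration.
    """
    r_explore = 1.0 if is_new_item else 0.0              # promote exposure of new items
    r_fair = user_fairness_pref if is_new_item else 0.0  # personalized fairness signal in [0, 1]
    r_acc = float(user_feedback)                         # e.g., 1.0 for a click, 0.0 otherwise
    return w_explore * r_explore + w_fair * r_fair + w_acc * r_acc

# Example: a new item shown to a user with moderate fairness preference who clicked it.
print(fairagent_style_reward(is_new_item=True, user_fairness_pref=0.6, user_feedback=1))
```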
no code implementations • 29 Apr 2025 • Shiqian Zhao, Jiayang Liu, Yiming Li, Runyi Hu, Xiaojun Jia, Wenshu Fan, Xinfeng Li, Jie Zhang, Wei Dong, Tianwei Zhang, Luu Anh Tuan
Different from previous attacks that fuse the unsafe target prompt into one ultimate adversarial prompt, which can be easily detected or may generate non-unsafe images due to under- or over-optimization, we propose Inception, the first multi-turn jailbreak attack against the memory mechanism in real-world text-to-image generation systems.
1 code implementation • 29 Apr 2025 • Zhongqi Wang, Jie Zhang, Shiguang Shan, Xilin Chen
To quantify these dynamic anomalies, we first introduce DAA-I, which treats the tokens' attention maps as spatially independent and measures dynamic feature using the Frobenius norm.
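A rough illustration of a Frobenius-norm-based dynamic score over per-token attention maps; the normalization and aggregation are assumptions, not the paper's exact DAA-I definition:

```python
import numpy as np

def daa_i_style_score(attn_maps):
    """Toy sketch of a DAA-I-style statistic.

    `attn_maps` has shape (steps, tokens, H, W): one spatial attention map per
    text token at each diffusion step. Treating tokens as spatially independent,
    we measure how strongly each token's map changes between consecutive steps
    via the Frobenius norm, then average over steps.
    """
    diffs = np.diff(attn_maps, axis=0)          # (steps-1, tokens, H, W)
    fro = np.linalg.norm(diffs, axis=(-2, -1))  # Frobenius norm per token per step
    return fro.mean(axis=0)                      # one dynamic score per token

scores = daa_i_style_score(np.random.rand(10, 5, 16, 16))
print(scores.shape)  # (5,)
```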
no code implementations • 22 Apr 2025 • Kun Wang, Guibin Zhang, Zhenhong Zhou, Jiahao Wu, Miao Yu, Shiqian Zhao, Chenlong Yin, Jinhu Fu, Yibo Yan, Hanjun Luo, Liang Lin, Zhihao Xu, Haolang Lu, Xinye Cao, Xinyun Zhou, Weifei Jin, Fanci Meng, Shicheng Xu, Junyuan Mao, Yu Wang, Hao Wu, Minghe Wang, Fan Zhang, Junfeng Fang, Wenjie Qu, Yue Liu, Chengwei Liu, Yifan Zhang, Qiankun Li, Chongye Guo, Yalan Qin, Zhaoxin Fan, Kai Wang, Yi Ding, Donghai Hong, Jiaming Ji, Yingxin Lai, Zitong Yu, Xinfeng Li, Yifan Jiang, Yanhui Li, Xinyu Deng, Junlin Wu, Dongxia Wang, Yihao Huang, Yufei Guo, Jen-tse Huang, Qiufeng Wang, Xiaolong Jin, Wenxuan Wang, Dongrui Liu, Yanwei Yue, Wenke Huang, Guancheng Wan, Heng Chang, Tianlin Li, Yi Yu, Chenghao Li, Jiawei Li, Lei Bai, Jie Zhang, Qing Guo, Jingyi Wang, Tianlong Chen, Joey Tianyi Zhou, Xiaojun Jia, Weisong Sun, Cong Wu, Jing Chen, Xuming Hu, Yiming Li, Xiao Wang, Ningyu Zhang, Luu Anh Tuan, Guowen Xu, Jiaheng Zhang, Tianwei Zhang, Xingjun Ma, Jindong Gu, Liang Pang, Xiang Wang, Bo An, Jun Sun, Mohit Bansal, Shirui Pan, Lingjuan Lyu, Yuval Elovici, Bhavya Kailkhura, Yaodong Yang, Hongwei Li, Wenyuan Xu, Yizhou Sun, Wei Wang, Qing Li, Ke Tang, Yu-Gang Jiang, Felix Juefei-Xu, Hui Xiong, XiaoFeng Wang, DaCheng Tao, Philip S. Yu, Qingsong Wen, Yang Liu
Currently, existing surveys on LLM safety primarily focus on specific stages of the LLM lifecycle, e.g., deployment phase or fine-tuning phase, lacking a comprehensive understanding of the entire "lifechain" of LLMs.
no code implementations • 22 Apr 2025 • Zimo Yan, Jie Zhang, Zheng Xie, Chang Liu, Yizhen Liu, Yiping Song
Molecular generation plays an important role in drug discovery and materials science, especially in data-scarce scenarios where traditional generative models often struggle to achieve satisfactory conditional generalization.
1 code implementation • 17 Apr 2025 • Runyi Hu, Jie Zhang, Shiqian Zhao, Nils Lukas, Jiwei Li, Qing Guo, Han Qiu, Tianwei Zhang
MaskMark has two variants: (1) MaskMark-D, which supports global watermark embedding, watermark localization, and local watermark extraction for applications such as tamper detection; (2) MaskMark-ED, which focuses on local watermark embedding and extraction, offering enhanced robustness in small regions to support fine-grained image protection.
1 code implementation • 14 Apr 2025 • Kristina Nikolić, Luze Sun, Jie Zhang, Florian Tramèr
For example, while all jailbreaks we tested bypass guardrails in models aligned to refuse to answer math, this comes at the expense of a drop of up to 92% in accuracy.
no code implementations • 12 Apr 2025 • Lingyou Zhou, Xin Dong, Kehai Qiu, Gang Yu, Jie Zhang, Jiliang Zhang
In this paper, we characterize the adaptive multiple path loss exponent (AMPLE) radio propagation model under urban macrocell (UMa) and urban microcell (UMi) scenarios from 0.85-5 GHz using Ranplan Professional.
1 code implementation • 9 Apr 2025 • Ran Zhang, Xuanhua He, Ke Cao, Liu Liu, Li Zhang, Man Zhou, Jie Zhang
The distilled network, requiring only 10% of the parameters and inference time of the teacher network, retains 90% of its performance and outperforms existing SOTA methods.
1 code implementation • 7 Apr 2025 • Jinxiang Lai, Wenlong Wu, Jiawei Zhan, Jian Li, Bin-Bin Gao, Jun Liu, Jie Zhang, Song Guo
Box-supervised instance segmentation methods aim to achieve instance segmentation with only box annotations.
no code implementations • 27 Mar 2025 • Shuai Li, Jie Zhang, Yuang Qi, Kejiang Chen, Tianwei Zhang, Weiming Zhang, Nenghai Yu
It is worth noting that these attacks typically involve altering the query images, which is not a practical concern in real-world scenarios.
no code implementations • 24 Mar 2025 • Mengya Xu, Zhongzhen Huang, Jie Zhang, Xiaofan Zhang, Qi Dou
Large Language Models (LLMs) show promise in understanding surgical video content but remain underexplored for predictive decision-making in SAP, as they focus mainly on retrospective analysis.
1 code implementation • 22 Mar 2025 • Jie Zhang, Zhongqi Wang, Shiguang Shan, Xilin Chen
Backdoor attacks targeting text-to-image diffusion models have advanced rapidly, enabling attackers to implant malicious triggers into these models to manipulate their outputs.
no code implementations • 5 Mar 2025 • Tao Feng, Jie Zhang, Xiangjian Li, Rong Huang, Huashan Liu, Zhijie Wang
In contrast to FL, personalized federated learning aims to serve each client by learning a personalized model.
no code implementations • 2 Mar 2025 • Chang Liu, Haolin Wu, Xi Yang, Kui Zhang, Cong Wu, Weiming Zhang, Nenghai Yu, Tianwei Zhang, Qing Guo, Jie Zhang
As speech translation (ST) systems become increasingly prevalent, understanding their vulnerabilities is crucial for ensuring robust and reliable communication.
no code implementations • 1 Mar 2025 • Yujie Lei, Wenjie Sun, Sen Jia, Qingquan Li, Jie Zhang
Challenges in remote sensing object detection (RSOD), such as high inter-class similarity, imbalanced foreground-background distribution, and the small size of objects in remote sensing images significantly hinder detection accuracy.
no code implementations • 20 Feb 2025 • Ke Cao, Jing Wang, Ao Ma, Jiasong Feng, Zhanjie Zhang, Xuanhua He, Shanyuan Liu, Bo Cheng, Dawei Leng, Yuhui Yin, Jie Zhang
The Diffusion Transformer plays a pivotal role in advancing text-to-image and text-to-video generation, owing primarily to its inherent scalability.
1 code implementation • 20 Feb 2025 • Zhenhong Zhou, Zherui Li, Jie Zhang, Yuanhe Zhang, Kun Wang, Yang Liu, Qing Guo
We evaluate Corba on two widely used LLM-MASs, namely AutoGen and Camel, across various topologies and commercial models.
1 code implementation • 18 Feb 2025 • Yunpeng Zhao, Jie Zhang
As synthetic data becomes increasingly popular in machine learning tasks, numerous methods--without formal differential privacy guarantees--use synthetic data for training.
1 code implementation • 8 Feb 2025 • Zenghao Duan, Wenbin Duan, Zhiyi Yin, Yinghan Shen, Shaoling Jing, Jie Zhang, HuaWei Shen, Xueqi Cheng
Knowledge editing has become a promising approach for efficiently and precisely updating knowledge embedded in large language models (LLMs).
no code implementations • 4 Feb 2025 • Javier Rando, Jie Zhang, Nicholas Carlini, Florian Tramèr
In the past decade, considerable research effort has been devoted to securing machine learning (ML) models that operate in adversarial settings.
1 code implementation • 1 Feb 2025 • Jie Zhang, Kuan-Chieh Wang, Bo-Wei Chiu, Min-Te Sun
Recent advances in deep learning have established Transformer architectures as the predominant modeling paradigm.
Ranked #3 on Long-range modeling on LRA
1 code implementation • 24 Jan 2025 • Runyi Hu, Jie Zhang, Yiming Li, Jiwei Li, Qing Guo, Han Qiu, Tianwei Zhang
Artificial Intelligence Generated Content (AIGC) has advanced significantly, particularly with the development of video generation models such as text-to-video (T2V) models and image-to-video (I2V) models.
no code implementations • 24 Jan 2025 • Zhenhao Jiang, Chenghao Chen, Hao Feng, Yu Yang, Jin Liu, Jie Zhang, Jia Jia, Ning Hu
We first propose the theory of the information bottleneck for fine-tuning and present an explanation for the fine-tuning technique in recommenders.
no code implementations • 23 Jan 2025 • Stefanos Bakirtzis, Çağkan Yapar, Kehai Qiu, Ian Wassell, Jie Zhang
To encourage further research and to facilitate fair comparisons in the development of deep learning-based radio propagation models, in the less explored case of directional radio signal emissions in indoor propagation environments, we have launched the ICASSP 2025 First Indoor Pathloss Radio Map Prediction Challenge.
no code implementations • 20 Jan 2025 • Chenrui Sun, Swarna Bindu Chetty, Gianluca Fontanesi, Jie Zhang, Amirhossein Mohajerzadeh, David Grace, Hamed Ahmadi
The advent of 6G technology demands flexible, scalable wireless architectures to support ultra-low latency, high connectivity, and high device density.
1 code implementation • 15 Jan 2025 • Baoming Zhang, Mingcai Chen, Jianqing Song, Shuangjie Li, Jie Zhang, Chongjun Wang
In this paper, we first analyze the restrictions of GNNs generalization from the perspective of supervision signals in the context of few-shot semi-supervised node classification.
no code implementations • 14 Jan 2025 • Jie Zhang, Yiyang Ni, Jun Li, Guangji Chen, Zhe Wang, Long Shi, Shi Jin, Wen Chen, H. Vincent Poor
Reconfigurable intelligent surfaces (RISs) have been recognized as a revolutionary technology for future wireless networks.
no code implementations • 7 Jan 2025 • Yanqing Ye, Weilong Yang, Kai Qiu, Jie Zhang
Situation assessment in Real-Time Strategy (RTS) games is crucial for understanding decision-making in complex adversarial environments.
no code implementations • 7 Jan 2025 • Weilong Yang, Jie Zhang, Xunyun Liu, Yanqing Ye
Building on traditional static evaluation functions, the method employs gradient descent in online reinforcement learning to update weights dynamically, incorporating weight decay techniques to ensure stability.
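A minimal sketch of such an online update for a linear evaluation function, assuming a squared-error target and plain L2 weight decay; the paper's exact targets and schedule may differ:

```python
import numpy as np

def update_eval_weights(w, features, target, lr=1e-3, weight_decay=1e-4):
    """One online update of a linear evaluation function v(s) = w . phi(s).

    Squared-error gradient step toward the observed target (e.g., a bootstrapped
    or final-outcome value), plus L2 weight decay for stability.
    """
    pred = np.dot(w, features)
    grad = (pred - target) * features          # d/dw of 0.5 * (pred - target)^2
    return w - lr * (grad + weight_decay * w)  # gradient descent with weight decay

w = np.zeros(4)
w = update_eval_weights(w, features=np.array([1.0, 0.5, -0.2, 0.0]), target=1.0)
print(w)
```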
no code implementations • CVPR 2025 • Zonghui Guo, Yingjie Liu, Jie Zhang, Haiyong Zheng, Shiguang Shan
Then, we devise a future guide module to unravel inconsistency cues by iteratively aggregating historical anomaly cues and gradually propagating them into future frames.
no code implementations • CVPR 2025 • Hui Han, Siyuan Li, Jiaqi Chen, Yiwen Yuan, Yuling Wu, Yufan Deng, Chak Tou Leong, Hanwen Du, Junchen Fu, Youhua Li, Jie Zhang, Chi Zhang, Li-Jia Li, Yongxin Ni
This benchmark represents the first attempt to systematically leverage MLLMs across all dimensions relevant to video generation assessment in generative models.
1 code implementation • CVPR 2025 • Xuecheng Wu, Heli Sun, Yifan Wang, Jiayu Nie, Jie Zhang, Yabing Wang, Junxiao Xue, Liang He
To tackle these gaps, we introduce AVF-MAE++, a series of audio-visual MAEs designed to explore the impact of scaling on AVFA with a focus on advanced correlation modeling.
1 code implementation • 30 Dec 2024 • Bei Yan, Jie Zhang, ZhiYuan Chen, Shiguang Shan, Xilin Chen
To bridge this gap, we introduce M$^3$oralBench, the first MultiModal Moral Benchmark for LVLMs.
1 code implementation • 29 Dec 2024 • Daiheng Gao, Shilin Lu, Shaw Walters, Wenbo Zhou, Jiaming Chu, Jie Zhang, Bang Zhang, Mengxi Jia, Jian Zhao, Zhaoxin Fan, Weiming Zhang
Removing unwanted concepts from large-scale text-to-image (T2I) diffusion models while maintaining their overall generative quality remains an open challenge.
1 code implementation • 27 Dec 2024 • Jie Zhang, Xiangkui Cao, Zhouyu Han, Shiguang Shan, Xilin Chen
Multi-P$^2$A covers 26 categories of personal privacy, 15 categories of trade secrets, and 18 categories of state secrets, totaling 31,962 samples.
no code implementations • 21 Dec 2024 • Wenjie Xi, Rundong Zuo, Alejandro Alvarez, Jie Zhang, Byron Choi, Jessica Lin
Additionally, the recent success of Transformer models has inspired many studies.
1 code implementation • 18 Dec 2024 • Yanhua Li, Xiaocao Ouyang, Chaofan Pan, Jie Zhang, Sen Zhao, Shuyin Xia, Xin Yang, Guoyin Wang, Tianrui Li
To tackle these issues, we propose a Multi-granularity Open intent classification method via adaptive Granular-Ball decision boundary (MOGB).
no code implementations • 16 Dec 2024 • Jie Zhang, Xun Gong, Zhonglin Sun
However, face recognition performance is heavily affected by the label noise, especially closed-set noise.
no code implementations • 15 Dec 2024 • Yingpeng Du, Zhu Sun, Ziyan Wang, Haoyan Chua, Jie Zhang, Yew-Soon Ong
Knowledge distillation (KD)-based methods can alleviate these issues by transferring the knowledge to a small student, which trains a student based on the predictions of a cumbersome teacher.
no code implementations • 13 Dec 2024 • Runyi Hu, Jie Zhang, Yiming Li, Jiwei Li, Qing Guo, Han Qiu, Tianwei Zhang
For extraction, the process is reversed: the watermarked image is inverted back to the initial watermarked noise via DDIM Inversion, from which the embedded watermark is extracted.
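A toy sketch of the DDIM inversion update that maps an image back toward its initial noise; real systems run this on latents with a trained noise predictor, so the schedule and predictor below are dummies:

```python
import numpy as np

def ddim_invert(x0, eps_model, alpha_bar):
    """Toy DDIM inversion: map an image back toward its initial noise.

    `alpha_bar` is the cumulative noise schedule (decreasing in t) and
    `eps_model(x, t)` predicts the noise. This only illustrates the update rule
    used to recover the watermarked noise before extracting the watermark.
    """
    x = x0
    for t in range(1, len(alpha_bar)):
        a_prev, a_t = alpha_bar[t - 1], alpha_bar[t]
        eps = eps_model(x, t - 1)
        x0_pred = (x - np.sqrt(1 - a_prev) * eps) / np.sqrt(a_prev)
        x = np.sqrt(a_t) * x0_pred + np.sqrt(1 - a_t) * eps
    return x  # approximation of the initial noise

alpha_bar = np.linspace(0.999, 0.01, 50)  # toy schedule, decreasing in t
noise_like = ddim_invert(np.random.randn(4, 4), lambda x, t: 0.1 * x, alpha_bar)
print(noise_like.shape)
```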
no code implementations • 12 Dec 2024 • Jianwei Cui, Yu Gu, Shihao Chen, Jie Zhang, Liping Chen, LiRong Dai
Singing Voice Synthesis (SVS) aims to generate singing voices of high fidelity and expressiveness.
no code implementations • 11 Dec 2024 • Jiaqi Chen, Xiaoye Zhu, Tianyang Liu, Ying Chen, Xinhui Chen, Yiwen Yuan, Chak Tou Leong, Zuchao Li, Tang Long, Lei Zhang, Chenyu Yan, Guanghao Mei, Jie Zhang, Lefei Zhang
Large Language Models (LLMs) have revolutionized text generation, making detecting machine-generated text increasingly challenging.
no code implementations • 11 Dec 2024 • Zhongyi Zhang, Jie Zhang, Wenbo Zhou, Xinghui Zhou, Qing Guo, Weiming Zhang, Tianwei Zhang, Nenghai Yu
Face-swapping techniques have advanced rapidly with the evolution of deep learning, leading to widespread use and growing concerns about potential misuse, especially in cases of fraud.
no code implementations • 11 Dec 2024 • Yun Xing, Nhat Chung, Jie Zhang, Yue Cao, Ivor Tsang, Yang Liu, Lei Ma, Qing Guo
We validate our method at both the digital and physical levels, i.e., on nuImage and manually captured real scenes, where both statistical and visual results show that our MAGIC is powerful and effective at attacking widely used object detection systems.
1 code implementation • 5 Dec 2024 • Shuhe Wang, Shengyu Zhang, Jie Zhang, Runyi Hu, Xiaoya Li, Tianwei Zhang, Jiwei Li, Fei Wu, Guoyin Wang, Eduard Hovy
This paper surveys research in the rapidly growing field of enhancing large language models (LLMs) with reinforcement learning (RL), a technique that enables LLMs to improve their performance by receiving feedback in the form of rewards based on the quality of their outputs, allowing them to generate more accurate, coherent, and contextually appropriate responses.
no code implementations • 5 Dec 2024 • Shuhao Ma, Jie Zhang, Chaoyang Shi, Pei Di, Ian D. Robertson, Zhi-Qiang Zhang
To achieve this, the Hill muscle model-based forward dynamics is embedded into the deep neural network as the additional loss to further regulate the behavior of the deep neural network.
no code implementations • 3 Dec 2024 • Ziqing Wu, Zhu Sun, Dongxia Wang, Lu Zhang, Jie Zhang, Yew Soon Ong
The Neighbor Preference Retrieval Module retrieves and summarizes the preferences of similar users from the KB to obtain collaborative signals.
1 code implementation • CVPR 2025 • Aniruddha Ganguly, Debolina Chatterjee, Wentao Huang, Jie Zhang, Alisa Yurovsky, Travis Steele Johnson, Chao Chen
Recent advances in Spatial Transcriptomics (ST) pair histology images with spatially resolved gene expression profiles, enabling predictions of gene expression across different tissue locations based on image patches.
no code implementations • 3 Dec 2024 • Yunkai Dang, Kaichen Huang, Jiahao Huo, Yibo Yan, Sirui Huang, Dongrui Liu, Mengxi Gao, Jie Zhang, Chen Qian, Kun Wang, Yong liu, Jing Shao, Hui Xiong, Xuming Hu
The rapid development of Artificial Intelligence (AI) has revolutionized numerous fields, with large language models (LLMs) and computer vision (CV) systems driving advancements in natural language understanding and visual processing, respectively.
no code implementations • 3 Dec 2024 • Hao Chen, Han Tao, Guo Song, Jie Zhang, Yunlong Yu, Yonghan Dong, Chuang Yang, Lei Bai
Atmospheric science is intricately connected with other fields, e.g., geography and aerospace.
1 code implementation • 2 Dec 2024 • Jie Zhang, Min-Te Sun
Different polynomial bases, such as Bernstein, Chebyshev, and monomial basis, have various convergence rates that will affect the error in polynomial interpolation.
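A small, self-contained illustration of the general point (not the paper's experiment): the choice of basis changes the conditioning of the interpolation problem and hence the numerical error:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Compare conditioning of the design matrices for two bases on the same nodes.
x = np.linspace(-1, 1, 30)
deg = 15
V_mono = np.vander(x, deg + 1)   # monomial (Vandermonde) basis
V_cheb = C.chebvander(x, deg)    # Chebyshev basis

print("cond(monomial) :", np.linalg.cond(V_mono))
print("cond(Chebyshev):", np.linalg.cond(V_cheb))
# The Chebyshev design matrix is far better conditioned, so polynomial fits in
# that basis accumulate less floating-point error.
```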
no code implementations • CVPR 2025 • Yue Cao, Yun Xing, Jie Zhang, Di Lin, Tianwei Zhang, Ivor Tsang, Yang Liu, Qing Guo
In this paper, we present the first approach to generate scene-coherent typographic adversarial attacks that mislead advanced LVLMs while maintaining visual naturalness through the capability of the LLM-based agent.
no code implementations • 22 Nov 2024 • Jie Zhang, Christian Schlarmann, Kristina Nikolić, Nicholas Carlini, Francesco Croce, Matthias Hein, Florian Tramèr
Ensemble everything everywhere is a defense to adversarial examples that was recently proposed to make image classifiers robust.
no code implementations • 19 Nov 2024 • Mahmut S. Gokmen, Jie Zhang, Ge Wang, Jin Chen, Cody Bumgardner
This is combined with a sinusoidal curriculum that enhances the learning of the trajectory between the noise distribution and the posterior distribution of interest, allowing High Noise Improved Consistency Training (HN-iCT) to be trained in a supervised fashion.
no code implementations • 18 Nov 2024 • Xingming Long, Jie Zhang, Shiguang Shan, Xilin Chen
The primary goal of out-of-distribution (OOD) detection tasks is to identify inputs with semantic shifts, i.e., if samples from novel classes are absent in the in-distribution (ID) dataset used for training, we should reject these OOD samples rather than misclassifying them into existing ID classes.
Out-of-Distribution Detection
1 code implementation • 14 Nov 2024 • Jinxiang Lai, Jie Zhang, Jun Liu, Jian Li, Xiaocheng Lu, Song Guo
To address this limitation, we introduce Spider, a novel efficient Any-to-Many Modalities Generation (AMMG) framework, which can generate an arbitrary combination of modalities 'Text + Xs', such as Text + {Image and Audio and Video}.
1 code implementation • 12 Nov 2024 • Xin Zhou, Lei Zhang, Honglei Zhang, Yixin Zhang, Xiaoxiong Zhang, Jie Zhang, Zhiqi Shen
Human behavioral patterns and consumption paradigms have emerged as pivotal determinants in environmental degradation and climate change, with quotidian decisions pertaining to transportation, energy utilization, and resource consumption collectively precipitating substantial ecological impacts.
no code implementations • 2 Nov 2024 • Xingming Long, Jie Zhang, Shiguang Shan
The prediction confidence for each sample is subsequently assessed using the Mahalanobis distance between the sample and the Gaussians for the "known data".
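A minimal sketch of Mahalanobis-distance scoring against class Gaussians; whether a shared or per-class covariance is used is an assumption here:

```python
import numpy as np

def mahalanobis_confidence(x, class_means, shared_cov):
    """Score a sample by its Mahalanobis distance to per-class Gaussians.

    Returns the negative minimum distance from `x` to the class means
    (higher = more confident the sample looks like known data).
    """
    cov_inv = np.linalg.inv(shared_cov)
    dists = [np.sqrt((x - mu) @ cov_inv @ (x - mu)) for mu in class_means]
    return -min(dists)

rng = np.random.default_rng(0)
means = [np.zeros(3), np.ones(3) * 3]
cov = np.eye(3)
print(mahalanobis_confidence(rng.normal(size=3), means, cov))
```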
1 code implementation • 28 Oct 2024 • Jieyi Bi, Yining Ma, Jianan Zhou, Wen Song, Zhiguang Cao, Yaoxin Wu, Jie Zhang
Vehicle Routing Problems (VRPs) can model many real-world scenarios and often involve complex constraints.
1 code implementation • 22 Oct 2024 • Chen Qian, Dongrui Liu, Jie Zhang, Yong liu, Jing Shao
Extensive experimental results demonstrate that DEAN eliminates the trade-off phenomenon and significantly improves LLMs' fairness and privacy awareness simultaneously, e.g., improving Qwen-2-7B-Instruct's fairness awareness by 12.2% and privacy awareness by 14.0%.
1 code implementation • 18 Oct 2024 • Enneng Yang, Li Shen, Zhenyi Wang, Guibing Guo, Xingwei Wang, Xiaocun Cao, Jie Zhang, DaCheng Tao
However, in this paper, we examine the merged model's representation distribution and uncover a critical issue of "representation bias".
1 code implementation • 18 Oct 2024 • Jie Zhang, Dongrui Liu, Chen Qian, Linfeng Zhang, Yong liu, Yu Qiao, Jing Shao
Therefore, model owners and third parties need to identify whether a suspect model is a subsequent development of the victim model.
no code implementations • 16 Oct 2024 • Jianwei Cui, Yu Gu, Chao Weng, Jie Zhang, Liping Chen, LiRong Dai
This paper presents an advanced end-to-end singing voice synthesis (SVS) system based on the source-filter mechanism that directly translates lyrical and melodic cues into expressive and high-fidelity human-like singing.
1 code implementation • 14 Oct 2024 • Boheng Li, Yanhao Wei, Yankai Fu, Zhenting Wang, Yiming Li, Jie Zhang, Run Wang, Tianwei Zhang
In this paper, we introduce SIREN, a novel methodology to proactively trace unauthorized data usage in black-box personalized text-to-image diffusion models.
no code implementations • 7 Oct 2024 • Jinbo Hou, Kehai Qiu, Zitian Zhang, Yong Yu, Kezhi Wang, Stefano Capolongo, Jiliang Zhang, Zeyang Li, Jie Zhang
This paper aims to simultaneously optimize indoor wireless and daylight performance by adjusting the positions of windows and the beam directions of window-deployed reconfigurable intelligent surfaces (RISs) for RIS-aided outdoor-to-indoor (O2I) networks utilizing large language models (LLMs) as optimizers.
1 code implementation • 7 Oct 2024 • Jianan Zhou, Yaoxin Wu, Zhiguang Cao, Wen Song, Jie Zhang, Zhiqi Shen
Given a neural VRP method, we adversarially train multiple models in a collaborative manner to synergistically promote robustness against attacks, while boosting standard generalization on clean instances.
no code implementations • 5 Oct 2024 • Rabeya Tus Sadia, Jie Zhang, Jin Chen
In response to these challenges, we propose LTDiff++, a multiscale latent diffusion model designed to enhance feature extraction in medical imaging.
no code implementations • 29 Sep 2024 • Jie Zhang, Debeshee Das, Gautam Kamath, Florian Tramèr
We argue that this approach is fundamentally unsound: to provide convincing evidence, the data creator needs to demonstrate that their attack has a low false positive rate, i.e., that the attack's output is unlikely under the null hypothesis that the model was not trained on the target data.
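A sketch of how such a false-positive-rate check could be estimated empirically, assuming access to attack scores on reference models known not to be trained on the target data:

```python
import numpy as np

def empirical_fpr(attack_scores_null, threshold):
    """Estimate an attack's false positive rate under the null hypothesis.

    `attack_scores_null` are scores the attack assigns to models that were
    provably NOT trained on the target data; the FPR is the fraction of those
    scores that still exceed the decision threshold. A convincing claim of
    training-data usage requires this rate to be very low.
    """
    scores = np.asarray(attack_scores_null)
    return float((scores >= threshold).mean())

# Hypothetical scores from 1000 "clean" reference models:
rng = np.random.default_rng(1)
print(empirical_fpr(rng.normal(0.0, 1.0, size=1000), threshold=3.0))
```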
1 code implementation • 25 Sep 2024 • Zonghui Guo, Yingjie Liu, Jie Zhang, Haiyong Zheng, Shiguang Shan
Specifically, we analyze the crucial contributions of backbones with different configurations in FFD task and propose leveraging the ViT network with self-supervised learning on real-face datasets to pre-train a backbone, equipping it with superior facial representation capabilities.
no code implementations • 24 Sep 2024 • Ying Dong, Hang Xiao, Haonan Hu, Jiliang Zhang, Qianbin Chen, Jie Zhang
The results show that by jointly optimising the COR and TGR, the partial offloading scheme outperforms the local and remote computing schemes in terms of the MAoI, which can be improved by up to 51% and 61%, respectively.
1 code implementation • 20 Sep 2024 • Haoyin Yan, Jie Zhang, Cunhang Fan, Yeping Zhou, Peiqi Liu
Speech enhancement (SE) aims to extract the clean waveform from noise-contaminated measurements to improve the speech quality and intelligibility.
no code implementations • 19 Sep 2024 • Keying Zuo, Qingtian Xu, Jie Zhang, ZhenHua Ling
Brain-assisted speech enhancement (BASE) aims to extract the target speaker in complex multi-talker scenarios using electroencephalogram (EEG) signals as an assistive modality, as the auditory attention of the listener can be decoded from electroneurographic signals of the brain.
1 code implementation • 19 Sep 2024 • Jingyuan Wang, Jie Zhang, Shihao Chen, Miao Sun
Binaural speech enhancement (BSE) aims to jointly improve the speech quality and intelligibility of noisy signals received by hearing devices and preserve the spatial cues of the target for natural listening.
no code implementations • 8 Sep 2024 • Xiurui Pan, Endian Li, Qiao Li, Shengwen Liang, Yizhou Shan, Ke Zhou, Yingwei Luo, Xiaolin Wang, Jie Zhang
Several cost-effective solutions leverage host memory or SSDs to reduce storage costs for offline inference scenarios and improve the throughput.
no code implementations • 5 Sep 2024 • Yujie Wang, Shenhan Zhu, Fangcheng Fu, Xupeng Miao, Jie Zhang, Juan Zhu, Fan Hong, Yong Li, Bin Cui
Recent foundation models are capable of handling multiple tasks and multiple data modalities with the unified base model structure and several specialized model components.
no code implementations • 3 Sep 2024 • Ke Cao, Xuanhua He, Tao Hu, Chengjun Xie, Jie Zhang, Man Zhou, Danfeng Hong
Multi-modal image fusion integrates complementary information from different modalities to produce enhanced and informative images.
no code implementations • 23 Aug 2024 • Yonghui Nie, Zhi Li, Jie Zhang, Lei Gao, Yang Li, Hengyu Zhou
To coordinate resources among multi-level stakeholders and enhance the integration of electric vehicles (EVs) into multi-microgrids, this study proposes an optimal dispatch strategy within a multi-microgrid cooperative alliance using a nuanced two-stage pricing mechanism.
no code implementations • 22 Aug 2024 • Shihao Chen, Yu Gu, Jianwei Cui, Jie Zhang, Rilin Chen, LiRong Dai
We achieved one-step or few-step inference while maintaining high performance by distilling a pre-trained LDM-based SVC model, which has the advantages of timbre decoupling and high sound quality.
no code implementations • 22 Aug 2024 • Stefanos Bakirtzis, Cagkan Yapar, Marco Fiore, Jie Zhang, Ian Wassell
The efficient deployment and operation of any wireless communication ecosystem rely on knowledge of the received signal quality over the target coverage area.
1 code implementation • 22 Aug 2024 • Xingtong Yu, Jie Zhang, Yuan Fang, Renhe Jiang
In particular, many real-world graphs are non-homophilic: they are not strictly or uniformly homophilic but instead mix homophilic and heterophilic patterns, exhibiting varying non-homophilic characteristics across graphs and nodes.
1 code implementation • 22 Aug 2024 • Kunsheng Tang, Wenbo Zhou, Jie Zhang, Aishan Liu, Gelei Deng, Shuai Li, Peigui Qi, Weiming Zhang, Tianwei Zhang, Nenghai Yu
By offering a realistic assessment and tailored reduction of gender biases, we hope that our GenderCARE can represent a significant step towards achieving fairness and equity in LLMs.
1 code implementation • 14 Aug 2024 • Enneng Yang, Li Shen, Guibing Guo, Xingwei Wang, Xiaochun Cao, Jie Zhang, DaCheng Tao
Model merging is an efficient empowerment technique in the machine learning community that does not require the collection of raw training data and does not require expensive computation.
1 code implementation • 7 Aug 2024 • Ruiqi Wang, Jinyang Huang, Jie Zhang, Xin Liu, Xiang Zhang, Zhi Liu, Peng Zhao, Sigui Chen, Xiao Sun
Depression is a prevalent mental health disorder that significantly impacts individuals' lives and well-being.
no code implementations • 6 Aug 2024 • Xin Zhou, Aixin Sun, Jie Zhang, Donghui Lin
The increasing availability of learning activity data in Massive Open Online Courses (MOOCs) enables us to conduct a large-scale analysis of learners' learning behavior.
no code implementations • 18 Jul 2024 • Xuanhua He, Lang Li, Yingying Wang, Hui Zheng, Ke Cao, Keyu Yan, Rui Li, Chengjun Xie, Jie Zhang, Man Zhou
To address this issue, we propose Large Model Driven Image Restoration framework (LMDIR), a novel multiple-in-one image restoration paradigm that leverages the generic priors from large multi-modal language models (MMLMs) and the pretrained diffusion models.
1 code implementation • 17 Jul 2024 • Jie Zhang, Dongrui Liu, Chen Qian, Ziyue Gan, Yong liu, Yu Qiao, Jing Shao
In this paper, we discover that LLMs' personality traits are closely related to their safety abilities, i.e., toxicity, privacy, and fairness, based on the reliable MBTI-M scale.
no code implementations • 11 Jul 2024 • Zhenhe Wu, Zhongqiu Li, Jie Zhang, Mengxiang Li, Yu Zhao, Ruiyu Fang, Zhongjiang He, Xuelong Li, Zhoujun Li, Shuangyong Song
Large language models (LLMs) with in-context learning have significantly improved the performance of text-to-SQL task.
1 code implementation • 5 Jul 2024 • Zhongqi Wang, Jie Zhang, Shiguang Shan, Xilin Chen
In this paper, for the first time, we propose a comprehensive defense method named T2IShield to detect, localize, and mitigate such attacks.
no code implementations • 2 Jul 2024 • Minghui Wu, Luzhen Xu, Jie Zhang, Haitao Tang, Yanyan Yue, Ruizhi Liao, Jintao Zhao, Zhengzhe Zhang, Yichi Wang, Haoyin Yan, Hongliang Yu, Tongle Ma, Jiachen Liu, Chongliang Wu, Yongchao Li, Yanyong Zhang, Xin Fang, Yue Zhang
This report describes the submitted system to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlapping and Mandarin accent dynamics in the ICMC case.
no code implementations • 1 Jul 2024 • Yurui Huang, Jie Zhang, HengDa Bao, Yang Yang, Jian Yang
Estimated time of arrival (ETA) is a very important factor in the transportation system.
no code implementations • 28 Jun 2024 • Jie Zhang, Jun Li, Zhe Wang, Yu Han, Long Shi, Bin Cao
In this paper, we propose a novel diffusion-decision transformer (D2T) architecture to optimize the beamforming strategies for intelligent reflecting surface (IRS)-assisted multiple-input single-output (MISO) communication systems.
2 code implementations • 27 Jun 2024 • Jie Zhang, Zhongqi Wang, Mengqi Lei, Zheng Yuan, Bei Yan, Shiguang Shan, Xilin Chen
Currently many benchmarks have been proposed to evaluate the perception ability of the Large Vision-Language Models (LVLMs).
1 code implementation • 24 Jun 2024 • Bei Yan, Jie Zhang, Zheng Yuan, Shiguang Shan, Xilin Chen
Furthermore, based on the results of our quality measurement, we construct a High-Quality Hallucination Benchmark (HQH) for LVLMs, which demonstrates superior reliability and validity under our HQM framework.
1 code implementation • 23 Jun 2024 • Debeshee Das, Jie Zhang, Florian Tramèr
Membership inference (MI) attacks try to determine if a data sample was used to train a machine learning model.
no code implementations • 20 Jun 2024 • Tingyi Lin, Pengju Lyu, Jie Zhang, Yuqing Wang, Cheng Wang, Jianjun Zhu
Typical conditional diffusion models commonly generate images with guidance of segmentation labels for medical modal transformation.
1 code implementation • 20 Jun 2024 • Sibo Wang, Xiangkui Cao, Jie Zhang, Zheng Yuan, Shiguang Shan, Xilin Chen, Wen Gao
The emergence of Large Vision-Language Models (LVLMs) marks significant strides towards achieving general artificial intelligence.
1 code implementation • 19 Jun 2024 • Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, Florian Tramèr
Unfortunately, AI agents are vulnerable to prompt injection attacks where data returned by external tools hijacks the agent to execute malicious tasks.
no code implementations • 19 Jun 2024 • Jiacheng Du, Zhibo Wang, Jie Zhang, Xiaoyi Pang, Jiahui Hu, Kui Ren
Language Models (LMs) are prone to "memorizing" training data, including substantial sensitive user information.
no code implementations • 14 Jun 2024 • Jiajia Tang, Jie Zhang, Jiulou Zhang, Yuxia Tang, Hao Ni, Shouju Wang
For each tumor in the test set, the radiomics signature predicted gold nanoparticle uptake.
1 code implementation • 14 Jun 2024 • Xingming Long, Jie Zhang, Shiguang Shan, Xilin Chen
In this paper, we construct a benchmark named Incremental Shift OOD (IS-OOD) to address the issue, in which we divide the test samples into subsets with different semantic and covariate shift degrees relative to the ID dataset.
Out-of-Distribution Detection
no code implementations • 12 Jun 2024 • Hengyu Li, Kangdi Mei, Zhaoci Liu, Yang Ai, Liping Chen, Jie Zhang, ZhenHua Ling
It has been shown in the literature that speech representations extracted by self-supervised pre-trained models exhibit similarities with human brain activations for speech perception, and that fine-tuning speech representation models on downstream tasks can further improve this similarity.
no code implementations • 8 Jun 2024 • Shihao Chen, Yu Gu, Jie Zhang, Na Li, Rilin Chen, Liping Chen, LiRong Dai
We pretrain a variational autoencoder structure using the noted open-source So-VITS-SVC project based on the VITS framework, which is then used for the LDM training.
no code implementations • 27 May 2024 • Liang Shi, Jie Zhang, Shiguang Shan
Specifically, we train a learnable prompt prefix for text-to-image diffusion models, which forces the model to generate anonymized facial identities, even when prompted to produce images of specific individuals.
1 code implementation • 24 May 2024 • Guanlin Li, Kangjie Chen, Shudong Zhang, Jie Zhang, Tianwei Zhang
Additionally, we introduce three large-scale red-teaming datasets for studying the safety risks associated with text-to-image models.
1 code implementation • 24 May 2024 • Hanlin Gu, Gongxi Zhu, Jie Zhang, Xinyuan Zhao, Yuxing Han, Lixin Fan, Qiang Yang
To facilitate the implementation of the right to be forgotten, the concept of federated machine unlearning (FMU) has also emerged.
no code implementations • 23 May 2024 • Nhat Chung, Sensen Gao, Tuan-Anh Vu, Jie Zhang, Aishan Liu, Yun Lin, Jin Song Dong, Qing Guo
To further explore the risk in AD systems and the transferability of practical threats, we propose to leverage typographic attacks against AD systems relying on the decision-making capabilities of Vision-LLMs.
1 code implementation • 21 May 2024 • Zhifan Wan, Jie Zhang, Changzhen Li, Shiguang Shan
The visual pathway of the human brain includes two sub-pathways, i.e., the ventral pathway and the dorsal pathway, which focus on object identification and dynamic information modeling, respectively.
no code implementations • 14 May 2024 • Jie Zhang, Yuhan Li, Yude Wang, Stephen Lin, Shiguang Shan
Few-shot segmentation (FSS) aims to train a model which can segment the object from novel classes with a few labeled samples.
1 code implementation • 12 May 2024 • Weiwei Weng, Mahardhika Pratama, Jie Zhang, Chen Chen, Edward Yapp Kien Yee, Ramasamy Savitha
To this end, this article proposes a cross-domain CL approach that makes it possible to deploy a single model in such environments without additional labelling costs.
2 code implementations • 9 May 2024 • Lang He, Kai Chen, Junnan Zhao, Yimeng Wang, Ercheng Pei, Haifeng Chen, Jiewei Jiang, Shiqing Zhang, Jie Zhang, Zhongmin Wang, Tao He, Prayag Tiwari
Depression can significantly impact many aspects of an individual's life, including their personal and social functioning, academic and work performance, and overall quality of life.
1 code implementation • 6 May 2024 • Jie Zhang, Haoyu Bu, Hui Wen, Yongji Liu, Haiqiang Fei, Rongrong Xi, Lun Li, Yun Yang, Hongsong Zhu, Dan Meng
The rapid development of large language models (LLMs) has opened new avenues across various fields, including cybersecurity, which faces an evolving threat landscape and demand for innovative technologies.
2 code implementations • 2 May 2024 • Jianan Zhou, Zhiguang Cao, Yaoxin Wu, Wen Song, Yining Ma, Jie Zhang, Chi Xu
Learning to solve vehicle routing problems (VRPs) has garnered much attention.
2 code implementations • 26 Apr 2024 • Michael Aerni, Jie Zhang, Florian Tramèr
Empirical defenses for machine learning privacy forgo the provable guarantees of differential privacy in the hope of achieving higher utility while resisting realistic adversaries.
1 code implementation • 25 Apr 2024 • Zhijie Rao, Jingcai Guo, Xiaocheng Lu, Jingming Liang, Jie Zhang, Haozhao Wang, Kang Wei, Xiaofeng Cao
Zero-shot learning has consistently yielded remarkable progress via modeling nuanced one-to-one visual-attribute correlation.
1 code implementation • 23 Apr 2024 • Xuanhua He, Quande Liu, Shengju Qian, Xin Wang, Tao Hu, Ke Cao, Keyu Yan, Jie Zhang
In this study, we present ID-Animator, a zero-shot human-video generation approach that can perform personalized video generation given a single reference facial image without further training.
no code implementations • 9 Apr 2024 • Mahmut S. Gokmen, Cody Bumgardner, Jie Zhang, Ge Wang, Jin Chen
The results show that the polynomial noise distribution outperforms the model trained with log-normal noise distribution, yielding a 33.54 FID score after 100,000 training steps with constant discretization steps.
no code implementations • 8 Apr 2024 • Jie Zhang, Jun Li, Long Shi, Zhe Wang, Shi Jin, Wen Chen, H. Vincent Poor
By leveraging the power of DT models learned over offline datasets, the proposed architecture is expected to achieve rapid convergence with many fewer training epochs and higher performance in new scenarios with different state and action spaces, compared with DRL.
no code implementations • 2 Apr 2024 • Jiachen Ma, Yijiang Li, Zhiqing Xiao, Anda Cao, Jie Zhang, Chao Ye, Junbo Zhao
In this work, we investigate a more practical and universal attack that does not require the presence of a target model and demonstrate that the high-dimensional text embedding space inherently contains NSFW concepts that can be exploited to generate harmful images.
no code implementations • 25 Mar 2024 • Jie Zhang
The ability to form memories is a basic feature of learning and accumulating knowledge.
1 code implementation • 25 Mar 2024 • Yirong Zeng, Xiao Ding, Yi Zhao, Xiangyu Li, Jie Zhang, Chao Yao, Ting Liu, Bing Qin
Furthermore, we construct RU22Fact, a novel multilingual explainable fact-checking dataset on the Russia-Ukraine conflict in 2022 of 16K samples, each containing real-world claims, optimized evidence, and referenced explanation.
no code implementations • 25 Mar 2024 • Ziyan Wang, Yingpeng Du, Zhu Sun, Haoyan Chua, Kaidong Feng, Wenya Wang, Jie Zhang
However, the former methods struggle with optimal prompts to elicit the correct reasoning of LLMs due to the lack of task-specific feedback, leading to unsatisfactory recommendations.
no code implementations • CVPR 2024 • Sikai Bai, Jie Zhang, Shuaicheng Li, Song Guo, Jingcai Guo, Jun Hou, Tao Han, Xiaocheng Lu
Federated learning (FL) has emerged as a powerful paradigm for learning from decentralized data, and federated domain generalization further considers the test dataset (target domain) is absent from the decentralized training data (source domains).
no code implementations • 8 Mar 2024 • Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao, Yanzhi Wang
On top of that, Exemplar-free Class Incremental Learning is even more challenging due to forbidden access to previous task data.
no code implementations • 29 Feb 2024 • Jie Zhang, Xubing Yang, Rui Jiang, Wei Shao, Li Zhang
Since the direct application of SAM to remote sensing image segmentation tasks does not yield satisfactory results, we propose RSAM-Seg, which stands for Remote Sensing SAM with Semantic Segmentation, a tailored modification of SAM for the remote sensing field that eliminates the need for manual intervention to provide prompts.
1 code implementation • 29 Feb 2024 • Chen Qian, Jie Zhang, Wei Yao, Dongrui Liu, Zhenfei Yin, Yu Qiao, Yong liu, Jing Shao
This research provides an initial exploration of trustworthiness modeling during LLM pre-training, seeking to unveil new insights and spur further developments in the field.
2 code implementations • 28 Feb 2024 • Wei zhang, Xiangyuan Guan, Lu Yunhong, Jie Zhang, Shuangyong Song, Xianfu Cheng, Zhenhe Wu, Zhoujun Li
Log parsing, which entails transforming raw log messages into structured templates, constitutes a critical phase in the automation of log analytics.
1 code implementation • 27 Feb 2024 • Yanghao Su, Jie Zhang, Ting Xu, Tianwei Zhang, Weiming Zhang, Nenghai Yu
By accessing the model to obtain hard labels, we construct decision boundaries within the convex combination of three samples.
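A hedged sketch of probing a hard-label model on convex combinations of three samples; how the paper turns these probes into a backdoor indicator is not reproduced here:

```python
import numpy as np

def probe_convex_hull(model_hard_label, x1, x2, x3, n_probes=200, seed=0):
    """Probe a hard-label model on convex combinations of three samples.

    Points are sampled with Dirichlet (barycentric) weights so they stay inside
    the triangle spanned by x1, x2, x3; the returned labels trace where decision
    boundaries cut through that region.
    """
    rng = np.random.default_rng(seed)
    weights = rng.dirichlet(np.ones(3), size=n_probes)  # (n_probes, 3), rows sum to 1
    points = weights @ np.stack([x1, x2, x3])           # convex combinations
    return np.array([model_hard_label(p) for p in points])

# Toy model: label depends on the sign of the first coordinate.
labels = probe_convex_hull(lambda p: int(p[0] > 0),
                           np.array([-1.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0]))
print(np.bincount(labels))
```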
no code implementations • 21 Feb 2024 • Xiao-Yang Liu, Jie Zhang, Guoxuan Wang, Weiqing Tong, Anwar Walid
However, the resulting model still consumes a large amount of GPU memory.
1 code implementation • 19 Feb 2024 • Xuanhua He, Ke Cao, Keyu Yan, Rui Li, Chengjun Xie, Jie Zhang, Man Zhou
To the best of our knowledge, this work is the first attempt to explore the potential of the Mamba model and establishes a new frontier in pan-sharpening techniques.
1 code implementation • 19 Feb 2024 • Marcus de Carvalho, Mahardhika Pratama, Jie Zhang, Chua Haoyan, Edward Yapp
In this work, we introduce a novel approach called Cross-Domain Continual Learning (CDCL) that addresses the limitations of being limited to single supervised domains.
no code implementations • 19 Feb 2024 • Zhihao Wen, Jie Zhang, Yuan Fang
Fine-tuning all parameters of large language models (LLMs) necessitates substantial computational power and extended time.
no code implementations • 14 Feb 2024 • Yingpeng Du, Ziyan Wang, Zhu Sun, Haoyan Chua, Hongzhi Liu, Zhonghai Wu, Yining Ma, Jie Zhang, Youchen Sun
To adapt text-based LLMs to structured graphs, we use the LLM as an aggregator in graph processing, allowing it to understand graph-based information step by step.
1 code implementation • 4 Feb 2024 • Jiacheng Chen, Zeyuan Ma, Hongshu Guo, Yining Ma, Jie Zhang, Yue-Jiao Gong
Recent Meta-learning for Black-Box Optimization (MetaBBO) methods harness neural networks to meta-learn configurations of traditional black-box optimizers.
1 code implementation • 2 Feb 2024 • Guanlin Li, Shuai Yang, Jie Zhang, Tianwei Zhang
With the development of generative models, the quality of generated content keeps increasing.
1 code implementation • 30 Jan 2024 • Yijie Lin, Jie Zhang, Zhenyu Huang, Jia Liu, Zujie Wen, Xi Peng
Existing video-language studies mainly focus on learning short video clips, leaving long-term temporal dependencies rarely explored due to over-high computational cost of modeling long videos.
Action Segmentation, Long Video Retrieval (Background Removed)
no code implementations • 26 Jan 2024 • Chaochao Lu, Chen Qian, Guodong Zheng, Hongxing Fan, Hongzhi Gao, Jie Zhang, Jing Shao, Jingyi Deng, Jinlan Fu, Kexin Huang, Kunchang Li, Lijun Li, LiMin Wang, Lu Sheng, Meiqi Chen, Ming Zhang, Qibing Ren, Sirui Chen, Tao Gui, Wanli Ouyang, Yali Wang, Yan Teng, Yaru Wang, Yi Wang, Yinan He, Yingchun Wang, Yixu Wang, Yongting Zhang, Yu Qiao, Yujiong Shen, Yurong Mou, Yuxi Chen, Zaibin Zhang, Zhelun Shi, Zhenfei Yin, Zhipin Wang
Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal contents.
no code implementations • 22 Jan 2024 • Shihao Chen, Liping Chen, Jie Zhang, KongAik Lee, ZhenHua Ling, LiRong Dai
For validation, we employ the open-source pre-trained YourTTS model for speech generation and protect the target speaker's speech in the white-box scenario.
1 code implementation • 19 Jan 2024 • Xiangshuo Qiao, Xianxin Li, Xiaozhe Qu, Jie Zhang, Yang Liu, Yu Luo, Cihang Jin, Jin Ma
In contrast, video covers in short video search scenarios are presented as user-originated content that provides important visual summaries of videos.
Ranked #1 on Image Retrieval on CBVS
1 code implementation • 17 Jan 2024 • Xingming Long, Jie Zhang, Shiguang Shan
Previous Face Anti-spoofing (FAS) methods face the challenge of generalizing to unseen domains, mainly because most existing FAS datasets are relatively small and lack data diversity.
no code implementations • 15 Jan 2024 • Jie Zhang, Zhifan Wan, Lanqing Hu, Stephen Lin, Shuzhe Wu, Shiguang Shan
Considering the close connection between action recognition and human pose estimation, we design a Collaboratively Self-supervised Video Representation (CSVR) learning framework specific to action recognition by jointly factoring in generative pose prediction and discriminative context matching as pretext tasks.
1 code implementation • CVPR 2024 • Sibo Wang, Jie Zhang, Zheng Yuan, Shiguang Shan
Specifically, PMG-AFT minimizes the distance between the features of adversarial examples in the target model and those in the pre-trained model, aiming to preserve the generalization features already captured by the pre-trained model.
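A sketch of such a feature-distance regularizer in PyTorch; the `.features`/`.classifier` accessors and the MSE distance are assumptions for illustration, not PMG-AFT's exact implementation:

```python
import torch
import torch.nn.functional as F

def feature_preserving_loss(target_model, pretrained_model, x_adv, y, alpha=1.0):
    """Adversarial fine-tuning loss with a pre-trained-feature preservation term.

    Adds to the usual adversarial classification loss a penalty on the distance
    between features of adversarial examples under the fine-tuned target model
    and under the frozen pre-trained model, so that generalizable pre-trained
    features are preserved.
    """
    feat_target = target_model.features(x_adv)
    with torch.no_grad():  # the pre-trained model stays frozen
        feat_pretrained = pretrained_model.features(x_adv)
    ce = F.cross_entropy(target_model.classifier(feat_target), y)
    dist = F.mse_loss(feat_target, feat_pretrained)
    return ce + alpha * dist
```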
no code implementations • 8 Jan 2024 • Zhongjiang He, Zihan Wang, Xinzhang Liu, Shixuan Liu, Yitong Yao, Yuyao Huang, Xuelong Li, Yongxiang Li, Zhonghao Che, Zhaoxi Zhang, Yan Wang, Xin Wang, Luwen Pu, Huinan Xu, Ruiyu Fang, Yu Zhao, Jie Zhang, Xiaomeng Huang, Zhilong Lu, Jiaxin Peng, Wenjun Zheng, Shiquan Wang, Bingkai Yang, Xuewei he, Zhuoru Jiang, Qiyi Xie, Yanhan Zhang, Zhongqiu Li, Lingling Shi, Weiwei Fu, Yin Zhang, Zilu Huang, Sishi Xiong, Yuxiang Zhang, Chao Wang, Shuangyong Song
Subsequently, the model undergoes fine-tuning to align with human preferences, following a detailed methodology that we describe.
1 code implementation • 7 Jan 2024 • Qiushi Zhu, Jie Zhang, Yu Gu, Yuchen Hu, LiRong Dai
Considering that visual information helps to improve speech recognition performance in noisy scenes, in this work we propose a multichannel multi-modal speech self-supervised learning framework AV-wav2vec2, which utilizes video and multichannel audio data as inputs.
Audio-Visual Speech Recognition, Automatic Speech Recognition
1 code implementation • 4 Jan 2024 • Xuanhua He, Tao Hu, Guoli Wang, Zejin Wang, Run Wang, Qian Zhang, Keyu Yan, Ziyi Chen, Rui Li, Chenjun Xie, Jie Zhang, Man Zhou
However, current methods often ignore the difference between cell phone RAW images and DSLR camera RGB images, a difference that goes beyond the color matrix and extends to spatial structure due to resolution variations.
1 code implementation • 4 Jan 2024 • Xuanhua He, Keyu Yan, Rui Li, Chengjun Xie, Jie Zhang, Man Zhou
Pan-sharpening involves reconstructing missing high-frequency information in multi-spectral images with low spatial resolution, using a higher-resolution panchromatic image as guidance.
no code implementations • 3 Jan 2024 • Zheng Yuan, Jie Zhang, Shiguang Shan
In recent years, the Vision Transformer (ViT) model has gradually become mainstream in various computer vision tasks, and the robustness of the model has received increasing attention.
no code implementations • 3 Jan 2024 • Zheng Yuan, Jie Zhang, Yude Wang, Shiguang Shan, Xilin Chen
The attention mechanism has been proven effective on various visual tasks in recent years.
1 code implementation • CVPR 2024 • Zonghui Guo, Xinyu Han, Jie Zhang, Shiguang Shan, Haiyong Zheng
Video harmonization is an important and challenging task that aims to obtain visually realistic composite videos by automatically adjusting the foreground's appearance to harmonize with the background.
1 code implementation • 17 Dec 2023 • Yi Xie, Jie Zhang, Shiqian Zhao, Tianwei Zhang, Xiaofeng Chen
While deep learning models have shown significant performance across various domains, their deployment needs extensive resources and advanced computing infrastructure.
no code implementations • 15 Dec 2023 • Jingcai Guo, Qihua Zhou, Ruibing Li, Xiaocheng Lu, Ziming Liu, Junyang Chen, Xin Xie, Jie Zhang
Then, to facilitate the generalization of local linearities, we construct a maximal margin geometry on the learned features by enforcing low-rank constraints on intra-class samples and high-rank constraints on inter-class samples, resulting in orthogonal subspaces for different classes and each subspace lies on a compact manifold.
1 code implementation • 14 Dec 2023 • Yuan Sun, Xuan Wang, Yunfan Zhang, Jie Zhang, Caigui Jiang, Yu Guo, Fei Wang
We present a method named iComMa to address the 6D camera pose estimation problem in computer vision.
1 code implementation • 12 Dec 2023 • Kangneng Zhou, Daiheng Gao, Xuan Wang, Jie Zhang, Peng Zhang, Xusen Sun, Longhao Zhang, Shiqi Yang, Bang Zhang, Liefeng Bo, Yaxing Wang, Ming-Ming Cheng
This enhances masked-based editing in local areas; second, we present a novel distillation strategy: Conditional Distillation on Geometry and Texture (CDGT).
no code implementations • 12 Dec 2023 • Jiawei Sun, Bin Zhao, Dong Wang, Zhigang Wang, Jie Zhang, Nektarios Koukourakis, Juergen W. Czarske, Xuelong Li
Quantitative phase imaging (QPI) through multi-core fibers (MCFs) has been an emerging in vivo label-free endoscopic imaging modality with minimal invasiveness.
1 code implementation • 11 Dec 2023 • Jiyan He, Weitao Feng, Yaosen Min, Jingwei Yi, Kunsheng Tang, Shuai Li, Jie Zhang, Kejiang Chen, Wenbo Zhou, Xing Xie, Weiming Zhang, Nenghai Yu, Shuxin Zheng
In this study, we aim to raise awareness of the dangers of AI misuse in science, and call for responsible AI development and use in this domain.
1 code implementation • 10 Dec 2023 • Xiaojian Yuan, Kejiang Chen, Wen Huang, Jie Zhang, Weiming Zhang, Nenghai Yu
In response to these identified gaps, we introduce a novel Data-Free Hard-Label Robustness Stealing (DFHL-RS) attack in this paper, which enables the stealing of both model accuracy and robustness by simply querying hard labels of the target model without the help of any natural data.
no code implementations • 4 Dec 2023 • Guanlin Li, Naishan Zheng, Man Zhou, Jie Zhang, Tianwei Zhang
However, these works lack analysis of adversarial information or perturbation, which cannot reveal the mystery of adversarial examples and lose proper interpretation.
no code implementations • 25 Nov 2023 • Ruibin Li, Jingcai Guo, Song Guo, Qihua Zhou, Jie Zhang
Specifically, we find that the very last few steps of the denoising (i.e., generation) process strongly correspond to the stylistic information of images, and based on this, we propose to augment the latent features of both the foreground and background images with Gaussians for a direct denoising-based harmonization.
1 code implementation • 23 Nov 2023 • Zhijie Rao, Jingcai Guo, Xiaocheng Lu, Qihua Zhou, Jie Zhang, Kang Wei, Chenxin Li, Song Guo
In this paper, we propose a simple yet effective Attribute-Aware Representation Rectification framework for GZSL, dubbed $\mathbf{(AR)^{2}}$, to adaptively rectify the feature extractor to learn novel features while keeping original valuable features.
no code implementations • 22 Nov 2023 • Jie Zhang, Qing-Tian Xu, Zhen-Hua Ling, Haizhou Li
In this work, we therefore propose a novel end-to-end brain-assisted speech enhancement network (BASEN), which incorporates the listeners' EEG signals and adopts a temporal convolutional network together with a convolutional multi-layer cross attention module to fuse EEG-audio features.
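A minimal PyTorch sketch of cross-attention fusion between audio and EEG feature sequences; BASEN's actual module (temporal convolutions, multiple cross-attention layers) is more elaborate, so treat this only as an illustration of the idea:

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Audio frames attend to EEG frames via multi-head cross attention."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, audio_feats, eeg_feats):
        # audio_feats: (batch, T_audio, dim), eeg_feats: (batch, T_eeg, dim)
        fused, _ = self.attn(query=audio_feats, key=eeg_feats, value=eeg_feats)
        return audio_feats + fused  # residual fusion

x = torch.randn(2, 100, 64)  # audio features
e = torch.randn(2, 50, 64)   # EEG features
print(CrossModalFusion()(x, e).shape)  # torch.Size([2, 100, 64])
```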
no code implementations • 18 Nov 2023 • Jiayang Liu, Siyu Zhu, Siyuan Liang, Jie Zhang, Han Fang, Weiming Zhang, Ee-Chien Chang
Various techniques have emerged to enhance the transferability of adversarial attacks for the black-box scenario.
no code implementations • 24 Oct 2023 • Zhiling Zhang, Jie Zhang, Kui Zhang, Wenbo Zhou, Weiming Zhang, Nenghai Yu
To address these concerns, researchers are actively exploring the concept of "unlearnable examples", by adding imperceptible perturbation to data in the model training stage, which aims to prevent the model from learning discriminate features of the target face.
no code implementations • 24 Oct 2023 • Caixin Wang, Jie Zhang, Matthew A. Wilson, Ralph Etienne-Cummings
By combining the versatility of pixel-wise sampling patterns with the strength of deep neural networks at decoding complex scenes, our method greatly enhances the vision system's adaptability and performance in dynamic conditions.
no code implementations • 18 Oct 2023 • Zengguang Hao, Jie Zhang, Binxia Xu, Yafang Wang, Gerard de Melo, Xiaolong Li
Intent detection and identification from multi-turn dialogue has become a widely explored technique in conversational agents, for example, voice assistants and intelligent customer services.
1 code implementation • 16 Oct 2023 • Jianhao Yuan, Jie Zhang, Shuyang Sun, Philip Torr, Bo Zhao
Synthetic training data has gained prominence in numerous learning tasks and scenarios, offering advantages such as dataset augmentation, generalization evaluation, and privacy preservation.
no code implementations • 11 Oct 2023 • Jie Zhang, Yongshan Zhang, Yicong Zhou
To aggregate the multiview information, a fully-convolutional SED with a U-shape in spectral dimension is introduced to extract a multiview feature map.
no code implementations • 8 Oct 2023 • Md Selim, Jie Zhang, Faraneh Fathi, Michael A. Brooks, Ge Wang, Guoqiang Yu, Jin Chen
Finally, the decoder uses the transformed latent representation to generate a standardized CT image, providing a more consistent basis for downstream analysis.
1 code implementation • IEEE Transactions on Neural Networks and Learning Systems 2023 • Wenhui Huang, Cong Zhang, Jingda Wu, Xiangkun He, Jie Zhang, Chen Lv.
Stochastic exploration is the key to the success of the Deep Q-network (DQN) algorithm.
1 code implementation • 27 Sep 2023 • Guanlin Li, Yifei Chen, Jie Zhang, Shangwei Guo, Han Qiu, Guoyin Wang, Jiwei Li, Tianwei Zhang
AI-Generated Content (AIGC) is rapidly expanding, with services using advanced generative models to create realistic images and fluent text.
no code implementations • 25 Sep 2023 • Haokun Song, Rui Lin, Andrea Sgambelluri, Filippo Cugini, Yajie Li, Jie Zhang, Paolo Monti
We propose a cluster-based method to detect and locate eavesdropping events in optical line systems characterized by small power losses.
no code implementations • 11 Sep 2023 • Haotian Wang, Yuxuan Xi, Hang Chen, Jun Du, Yan Song, Qing Wang, Hengshun Zhou, Chenxi Wang, Jiefeng Ma, Pengfei Hu, Ya Jiang, Shi Cheng, Jie Zhang, Yuzhe Weng
Three different structures based on attention-guided feature gathering (AFG) are designed for deep feature fusion.
no code implementations • 7 Sep 2023 • Zhendong Liu, Jie Zhang, Qiangqiang He, Chongjun Wang
In the realm of visual recognition, data augmentation stands out as a pivotal technique to amplify model robustness.
no code implementations • 28 Aug 2023 • Qiushi Zhu, Yu Gu, Rilin Chen, Chao Weng, Yuchen Hu, LiRong Dai, Jie Zhang
Noise-robust TTS models are often trained using the enhanced speech, which thus suffer from speech distortion and background noise that affect the quality of the synthesized speech.
no code implementations • 21 Aug 2023 • Yutong Wu, Jie Zhang, Florian Kerschbaum, Tianwei Zhang
Users can easily download the word embedding from public websites like Civitai and add it to their own stable diffusion model without fine-tuning for personalization.
no code implementations • 21 Aug 2023 • Changzhen Li, Jie Zhang, Yang Wei, Zhilong Ji, Jinfeng Bai, Shiguang Shan
Vision Transformers have achieved great success in computer vision, delivering exceptional performance across various tasks.
no code implementations • 19 Aug 2023 • Jie Zhang, Pengcheng Shi, Zaiwang Gu, Yiyang Zhou, Zhi Wang
In this paper, we present Semantic-Human, a novel method that achieves both photorealistic details and viewpoint-consistent human parsing for the neural rendering of humans.
no code implementations • 18 Aug 2023 • Pengcheng Shi, Jie Zhang, Haozhe Cheng, Junyang Wang, Yiyang Zhou, Chenlin Zhao, Jihua Zhu
Specifically, we propose a plug-and-play Overlap Bias Matching Module (OBMM) comprising two integral components, overlap sampling module and bias prediction module.
no code implementations • 7 Aug 2023 • Zhenhao Jiang, Biao Zeng, Hao Feng, Jin Liu, Jie Zhang, Jia Jia, Ning Hu
In order to address the problem of pagination trigger mechanism, we propose a completely new module in the pipeline of recommender system named Mobile Supply.
no code implementations • 31 Jul 2023 • Yuzheng Wang, Zhaoyu Chen, Jie Zhang, Dingkang Yang, Zuhao Ge, Yang Liu, Siao Liu, Yunquan Sun, Wenqiang Zhang, Lizhe Qi
Data-Free Knowledge Distillation (DFKD) is a novel task that aims to train high-performance student models using only the pre-trained teacher network without original training data.
1 code implementation • 27 Jul 2023 • Lingdong Kong, Yaru Niu, Shaoyuan Xie, Hanjiang Hu, Lai Xing Ng, Benoit R. Cottereau, Liangjun Zhang, Hesheng Wang, Wei Tsang Ooi, Ruijie Zhu, Ziyang Song, Li Liu, Tianzhu Zhang, Jun Yu, Mohan Jing, Pengwei Li, Xiaohua Qi, Cheng Jin, Yingfeng Chen, Jie Hou, Jie Zhang, Zhen Kan, Qiang Ling, Liang Peng, Minglei Li, Di Xu, Changpeng Yang, Yuanqi Yao, Gang Wu, Jian Kuai, Xianming Liu, Junjun Jiang, Jiamian Huang, Baojun Li, Jiale Chen, Shuang Zhang, Sun Ao, Zhenyu Li, Runze Chen, Haiyong Luo, Fang Zhao, Jingze Yu
In this paper, we summarize the winning solutions from the RoboDepth Challenge -- an academic competition designed to facilitate and advance robust OoD depth estimation.
1 code implementation • ICCV 2023 • Dongyao Zhu, Bowen Lei, Jie Zhang, Yanbo Fang, Ruqi Zhang, Yiqun Xie, Dongkuan Xu
Neural networks trained on distilled data often produce over-confident output and require correction by calibration methods.
no code implementations • 20 Jul 2023 • Yingpeng Du, Di Luo, Rui Yan, Hongzhi Liu, Yang song, HengShu Zhu, Jie Zhang
However, directly leveraging LLMs to enhance recommendation results is not a one-size-fits-all solution, as LLMs may suffer from fabricated generation and few-shot problems, which degrade the quality of resume completion.
1 code implementation • 19 Jul 2023 • Yu-chen Fan, Yitong Ji, Jie Zhang, Aixin Sun
First, there are significant differences in user interactions at the different stages when a user interacts with the MovieLens platform.
no code implementations • 18 Jul 2023 • Zhenhao Jiang, Biao Zeng, Hao Feng, Jin Liu, Jicong Fan, Jie Zhang, Jia Jia, Ning Hu, Xingyu Chen, Xuguang Lan
We propose a novel Entire Space Multi-Task Model for Post-Click Conversion Rate via Parameter Constraint (ESMC) and two alternatives: Entire Space Multi-Task Model with Siamese Network (ESMS) and Entire Space Multi-Task Model in Global Domain (ESMG) to address the PSC issue.
no code implementations • 15 Jul 2023 • Wenxin Xu, Hexin Jiang, Xuefeng Liang, Ying Zhou, Yin Zhao, Jie Zhang
In this work, we propose Utopia Label Distribution Approximation (ULDA) for time-series data, which makes the training label distribution closer to the real-world but unknown (utopia) label distribution.
no code implementations • 11 Jul 2023 • Sikai Bai, Shuaicheng Li, Weiming Zhuang, Jie Zhang, Song Guo, Kunlin Yang, Jun Hou, Shuai Zhang, Junyu Gao, Shuai Yi
Theoretically, we show the convergence guarantee of the dual regulators.
no code implementations • 7 Jul 2023 • Min Yu, Jie Zhang, Arndt Joedicke, Tom Reddyhoff
Overall, the proposed method is applicable to general lubricated interfaces for the identification of equivalent circuit models, which in turn facilitates in-situ electric impedance measurement of oil film thickness in tribo-contacts.