Search Results for author: Feng Liu

Found 297 papers, 115 papers with code

BCRNet: Enhancing Landmark Detection in Laparoscopic Liver Surgery via Bezier Curve Refinement

no code implementations18 Jun 2025 Qian Li, Feng Liu, Shuojue Yang, Daiyun Shen, Yueming Jin

In this paper, we propose BCRNet (Bezier Curve Refinement Net), a novel framework that significantly enhances landmark detection in laparoscopic liver surgery primarily via the Bezier curve refinement strategy.

CALM: Consensus-Aware Localized Merging for Multi-Task Learning

no code implementations16 Jun 2025 Kunda Yan, Min Zhang, Sen Cui, Zikun Qu, Bo Jiang, Feng Liu, ChangShui Zhang

Model merging aims to integrate the strengths of multiple fine-tuned models into a unified model while preserving task-specific capabilities.

Task Arithmetic

SUSEP-Net: Simulation-Supervised and Contrastive Learning-based Deep Neural Networks for Susceptibility Source Separation

no code implementations16 Jun 2025 Min Li, Chen Chen, Zhenghao Li, Yin Liu, Shanshan Shan, Peng Wu, Pengfei Rong, Feng Liu, G. Bruce Pike, Alan H. Wilman, Hongfu Sun, Yang Gao

Comprehensive experiments were carried out on both simulated and in vivo data, including healthy subjects and patients with pathological conditions, to compare SUSEP-Net with three state-of-the-art susceptibility source separation methods (i.e., APART-QSM, χ-separation, and χ-sepnet).

Contrastive Learning

Compositional and Equilibrium-Free Conditions for Power System Stability -- Part II: Method and Application

no code implementations13 Jun 2025 Peng Yang, Yifan Su, Xiaoyu Peng, Hua Geng, Feng Liu

In Part I, we have established the stability theory and proposed stability conditions based on the delta dissipativity.

Distributed Computing

Compositional and Equilibrium-Free Conditions for Power System Stability -- Part I: Theory

no code implementations13 Jun 2025 Peng Yang, Xiaoyu Peng, Xi Ru, Hua Geng, Feng Liu

This two-part paper proposes a compositional and equilibrium-free approach to analyzing power system stability.

Neural Network Reprogrammability: A Unified Theme on Model Reprogramming, Prompt Tuning, and Prompt Instruction

no code implementations5 Jun 2025 Zesheng Ye, Chengyi Cai, Ruijiang Dong, Jianzhong Qi, Lei Feng, Pin-Yu Chen, Feng Liu

As large-scale pre-trained foundation models continue to expand in size and capability, efficiently adapting them to specific downstream tasks has become increasingly critical.

In-Context Learning

OmniEarth-Bench: Towards Holistic Evaluation of Earth's Six Spheres and Cross-Spheres Interactions with Multimodal Observational Earth Data

no code implementations29 May 2025 Fengxiang Wang, Mingshuo Chen, Xuming He, Yifan Zhang, Feng Liu, Zijie Guo, Zhenghao Hu, Jiong Wang, Jingyi Xu, Zhangrui Li, Fenghua Ling, Ben Fei, Weijia Li, Long Lan, Wenjing Yang, Wenlong Zhang, Lei Bai

Existing benchmarks for Earth science multimodal learning exhibit critical limitations in systematic coverage of geosystem components and cross-sphere interactions, often constrained to isolated subsystems (only in Human-activities sphere or atmosphere) with limited evaluation dimensions (less than 16 tasks).

scientific discovery

EFIM: Efficient Serving of LLMs for Infilling Tasks with Improved KV Cache Reuse

1 code implementation28 May 2025 Tianyu Guo, Hande Dong, Yichong Leng, Feng Liu, Cheater Lin, Nong Xiao, Xianwei Zhang

However, in infilling tasks, the KV cache reuse is often hindered by the structure of the prompt format, which typically consists of a prefix and suffix relative to the insertion point.

Bridging the Gap: Self-Optimized Fine-Tuning for LLM-based Recommender Systems

no code implementations27 May 2025 Heng Tang, Feng Liu, Xinbo Chen, Jiawei Chen, Bohao Wang, Changwang Zhang, Jun Wang, Yuegang Sun, Bingde Hu, Can Wang

Then it further utilizes a self-adaptive curriculum scheduler to enable LLMs to gradually learn from simpler data (self-distilled data) to more challenging data (real RS data).

In-Context Learning Recommendation Systems

Flow Matching based Sequential Recommender Model

1 code implementation22 May 2025 Feng Liu, Lixin Zou, Xiangyu Zhao, Min Tang, Liming Dong, Dan Luo, Xiangyang Luo, Chenliang Li

Generative models, particularly diffusion models, have emerged as powerful tools for sequential recommendation.

model Sequential Recommendation

No Black Boxes: Interpretable and Interactable Predictive Healthcare with Knowledge-Enhanced Agentic Causal Discovery

no code implementations22 May 2025 Xiaoxue Han, Pengfei Hu, Jun-En Ding, Chang Lu, Feng Liu, Yue Ning

Deep learning models trained on extensive Electronic Health Records (EHR) data have achieved high accuracy in diagnosis prediction, offering the potential to assist clinicians in decision-making and treatment planning.

Causal Discovery Decision Making

Temporal-Spectral-Spatial Unified Remote Sensing Dense Prediction

1 code implementation18 May 2025 Sijie Zhao, Feng Liu, Xueliang Zhang, Hao Chen, Pengfeng Xiao, Lei Bai

Consequently, variations in data dimensionality or task requirements often lead to significant performance degradation or model incompatibility, necessitating costly retraining or fine-tuning efforts for different application scenarios.

Change Detection Prediction +1

ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models

no code implementations CVPR 2025 Ozgur Kara, Krishna Kumar Singh, Feng Liu, Duygu Ceylan, James M. Rehg, Tobias Hinz

Current diffusion-based text-to-video methods are limited to producing short video clips of a single shot and lack the capability to generate multi-shot videos with discrete transitions where the same character performs distinct activities across the same or different backgrounds.

Video Generation

TACFN: Transformer-based Adaptive Cross-modal Fusion Network for Multimodal Emotion Recognition

1 code implementation10 May 2025 Feng Liu, Ziwang Fu, Yunlong Wang, Qijian Zheng

Specifically, for the redundant features, we make one modality perform intra-modal feature selection through a self-attention mechanism, so that the selected features can adaptively and efficiently interact with another modality.

feature selection Multimodal Emotion Recognition

Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions

no code implementations16 Apr 2025 Yifei Dong, Fengyi Wu, Sanjian Zhang, Guangyu Chen, Yuzhi Hu, Masumi Yano, Jingdong Sun, Siyu Huang, Feng Liu, Qi Dai, Zhi-Qi Cheng

Unmanned Aerial Vehicles (UAVs) are indispensable for infrastructure inspection, surveillance, and related tasks, yet they also introduce critical security challenges.

Benchmarking Language Modeling +2

MSL: Not All Tokens Are What You Need for Tuning LLM as a Recommender

1 code implementation5 Apr 2025 Bohao Wang, Feng Liu, Jiawei Chen, Xingyu Lou, Changwang Zhang, Jun Wang, Yuegang Sun, Yan Feng, Chun Chen, Can Wang

Large language models (LLMs), known for their comprehension capabilities and extensive knowledge, have been increasingly applied to recommendation systems (RS).

All Language Modeling +3

Could AI Trace and Explain the Origins of AI-Generated Images and Text?

1 code implementation5 Apr 2025 Hongchao Fang, Yixin Liu, Jiangshu Du, Can Qin, Ran Xu, Feng Liu, Lichao Sun, Dongwon Lee, Lifu Huang, Wenpeng Yin

AI-generated content is becoming increasingly prevalent in the real world, leading to serious ethical and societal concerns.

Ancestral Mamba: Enhancing Selective Discriminant Space Model with Online Visual Prototype Learning for Efficient and Robust Discriminant Approach

no code implementations26 Mar 2025 Jiahao Qin, Feng Liu, Lu Zong

In the realm of computer graphics, the ability to learn continuously from non-stationary data streams while adapting to new visual patterns and mitigating catastrophic forgetting is of paramount importance.

Continual Learning Mamba

Analyzable Chain-of-Musical-Thought Prompting for High-Fidelity Music Generation

no code implementations25 Mar 2025 Max W. Y. Lam, Yijin Xing, Weiya You, Jingcheng Wu, Zongyu Yin, Fuqiang Jiang, Hangyu Liu, Feng Liu, Xingda Li, Wei-Tsung Lu, HanYu Chen, Tong Feng, Tianwei Zhao, Chien-Hung Liu, Xuchen Song, Yang Li, Yahui Zhou

However, the conventional next-token prediction paradigm in AR models does not align with the human creative process in music composition, potentially compromising the musicality of generated samples.

Music Generation

Hierarchical Adaptive Expert for Multimodal Sentiment Analysis

no code implementations25 Mar 2025 Jiahao Qin, Feng Liu, Lu Zong

HAEMSA employs a hierarchical structure of adaptive experts to capture both global and local modality representations, enabling more nuanced sentiment analysis.

Emotion Recognition Evolutionary Algorithms +2

OThink-MR1: Stimulating multimodal generalized reasoning capabilities via dynamic reinforcement learning

no code implementations20 Mar 2025 Zhiyuan Liu, Yuting Zhang, Feng Liu, Changwang Zhang, Ying Sun, Jun Wang

Multimodal Large Language Models (MLLMs) have gained significant traction for their ability to process diverse input data types and generate coherent, contextually relevant outputs across various applications.

Reinforcement Learning (RL)

Visual Persona: Foundation Model for Full-Body Human Customization

no code implementations CVPR 2025 Jisu Nam, Soowon Son, Zhan Xu, Jing Shi, Difan Liu, Feng Liu, Aashish Misraa, Seungryong Kim, Yang Zhou

We introduce Visual Persona, a foundation model for text-to-image full-body human customization that, given a single in-the-wild human image, generates diverse images of the individual guided by text descriptions.

Appearance Transfer

Mamba-VA: A Mamba-based Approach for Continuous Emotion Recognition in Valence-Arousal Space

1 code implementation13 Mar 2025 Yuheng Liang, Zheyu Wang, Feng Liu, Mingzhou Liu, Yu Yao

Experimental results on the Valence-Arousal (VA) Estimation task of the 8th competition on Affective Behavior Analysis in-the-wild (ABAW) demonstrate that the proposed model achieves valence and arousal scores of 0.5362 (0.5036) and 0.4310 (0.4119) on the validation (test) set, respectively, outperforming the baseline.

Autonomous Driving Emotion Recognition +1

GRU: Mitigating the Trade-off between Unlearning and Retention for Large Language Models

no code implementations12 Mar 2025 Yue Wang, Qizhou Wang, Feng Liu, Wei Huang, Yali Du, Xiaojiang Du, Bo Han

Specifically, GRU derives a closed-form solution to project the unlearning gradient onto the orthogonal space of that gradient harmful for retention, ensuring minimal deviation from its original direction under the condition that overall performance is retained.

Large Language Model

MF-VITON: High-Fidelity Mask-Free Virtual Try-On with Minimal Input

no code implementations11 Mar 2025 Zhenchen Wan, Yanwu Xu, Dongting Hu, Weilun Cheng, Tianxi Chen, Zhaoqing Wang, Feng Liu, Tongliang Liu, Mingming Gong

To address this, we propose a Mask-Free VITON (MF-VITON) framework that achieves realistic VITON using only a single person image and a target garment, eliminating the requirement for auxiliary masks.

Virtual Try-on

AG-VPReID: A Challenging Large-Scale Benchmark for Aerial-Ground Video-based Person Re-Identification

1 code implementation CVPR 2025 Huy Nguyen, Kien Nguyen, Akila Pemasiri, Feng Liu, Sridha Sridharan, Clinton Fookes

The relatively lower performance of all state-of-the-art approaches, including our proposed approach, on our new dataset highlights its challenging nature.

Video-Based Person Re-Identification

Transforming Weather Data from Pixel to Latent Space

no code implementations9 Mar 2025 Sijie Zhao, Feng Liu, Xueliang Zhang, Hao Chen, Tao Han, Junchao Gong, Ran Tao, Pengfeng Xiao, Lei Bai, Wanli Ouyang

The downstream task further demonstrates that task models can apply to multiple PVS with low data costs in latent space and achieve superior performance compared to models in pixel space.

One Stone, Two Birds: Enhancing Adversarial Defense Through the Lens of Distributional Discrepancy

1 code implementation4 Mar 2025 Jiacheng Zhang, Benjamin I. P. Rubinstein, Jingfeng Zhang, Feng Liu

MMD-OPT first serves as a guiding signal to minimize the distributional discrepancy between CEs and AEs to train a denoiser.

Adversarial Defense

SeisMoLLM: Advancing Seismic Monitoring via Cross-modal Transfer with Pre-trained Large Language Model

1 code implementation27 Feb 2025 Xinghao Wang, Feng Liu, Rui Su, Zhihui Wang, Lei Bai, Wanli Ouyang

Recent advances in deep learning have revolutionized seismic monitoring, yet developing a foundation model that performs well across multiple complex tasks remains challenging, particularly when dealing with degraded signals or data scarcity.

Language Modeling Language Modelling +1

Robust Dynamic Facial Expression Recognition

1 code implementation22 Feb 2025 Feng Liu, HanYang Wang, Siyuan Shen

Moreover, to identify the principal expression in a video and enhance the model's capacity for representation learning, a method comprising a key expression re-sampling framework and a dual-stream hierarchical network is proposed, namely Robust Dynamic Facial Expression Recognition (RDFER).

Dynamic Facial Expression Recognition Facial Expression Recognition

Exploring the Role of Explicit Temporal Modeling in Multimodal Large Language Models for Video Understanding

no code implementations28 Jan 2025 Yun Li, Zhe Liu, Yajing Kong, Guangrui Li, Jiyuan Zhang, Chao Bian, Feng Liu, Lina Yao, Zhenbang Sun

Using STE, we systematically compare implicit and explicit temporal modeling across dimensions such as overall performance, token compression effectiveness, and temporal-specific understanding.

Decoder Video Understanding

Attribute-based Visual Reprogramming for Image Classification with CLIP

1 code implementation23 Jan 2025 Chengyi Cai, Zesheng Ye, Lei Feng, Jianzhong Qi, Feng Liu

Besides, as images of the same class may reflect different attributes after VR, AttrVR iteratively refines patterns using the $k$-nearest DesAttrs and DistAttrs for each image sample, enabling more dynamic and sample-specific optimization.

Attribute Descriptive +2

Lossy Compression with Pretrained Diffusion Models

1 code implementation16 Jan 2025 Jeremy Vonderfecht, Feng Liu

We apply the DiffC algorithm (Theis et al. 2022) to Stable Diffusion 1.5, 2.1, XL, and Flux-dev, and demonstrate that these pretrained models are remarkably capable lossy image compressors.

Progressive Growing of Video Tokenizers for Highly Compressed Latent Spaces

no code implementations9 Jan 2025 Aniruddha Mahapatra, Long Mai, Yitian Zhang, David Bourgin, Feng Liu

Evaluation on video benchmarks shows that our method significantly improves reconstruction quality while increasing temporal compression compared to direct extensions of existing video tokenizers.

Video Generation

DispFormer: Pretrained Transformer for Flexible Dispersion Curve Inversion from Global Synthesis to Regional Applications

1 code implementation8 Jan 2025 Feng Liu, Bao Deng, Rui Su, Lei Bai, Wanli Ouyang

Surface wave dispersion curve inversion is essential for estimating subsurface Shear-wave velocity ($v_s$), yet traditional methods often struggle to balance computational efficiency with inversion accuracy.

Computational Efficiency

SapiensID: Foundation for Human Recognition

no code implementations CVPR 2025 Minchul Kim, Dingqiang Ye, Yiyang Su, Feng Liu, Xiaoming Liu

Existing human recognition systems often rely on separate, specialized models for face and body analysis, limiting their effectiveness in real-world scenarios where pose, visibility, and context vary widely.

Face Recognition

DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution

1 code implementation CVPR 2025 Yuzhong Zhao, Feng Liu, Yue Liu, Mingxiang Liao, Chen Gong, Qixiang Ye, Fang Wan

During inference, DynRefer selectively performs multimodal referring by sampling proper region representations for tasks from the set of views based on image and task priors.

Attribute

FloNa: Floor Plan Guided Embodied Visual Navigation

no code implementations24 Dec 2024 Jiaxin Li, Weiqi Huang, Zan Wang, Wei Liang, Huijun Di, Feng Liu

To eliminate this gap, we introduce a novel navigation task: Floor Plan Visual Navigation (FloNa), the first attempt to incorporate floor plan into embodied visual navigation.

Navigate Visual Navigation

Move-in-2D: 2D-Conditioned Human Motion Generation

no code implementations CVPR 2025 Hsin-Ping Huang, Yang Zhou, Jui-Hsien Wang, Difan Liu, Feng Liu, Ming-Hsuan Yang, Zhan Xu

Generating realistic human videos remains a challenging task, with the most effective methods currently relying on a human motion sequence as a control signal.

Motion Generation

Adversarial Purification by Consistency-aware Latent Space Optimization on Data Manifolds

no code implementations11 Dec 2024 Shuhai Zhang, Jiahao Yang, Hui Luo, Jie Chen, Li Wang, Feng Liu, Bo Han, Mingkui Tan

Leveraging this insight, we propose Consistency Model-based Adversarial Purification (CMAP), which optimizes vectors within the latent space of a pre-trained consistency model to generate samples for restoring clean data.

Adversarial Purification

GAF-FusionNet: Multimodal ECG Analysis via Gramian Angular Fields and Split Attention

1 code implementation7 Dec 2024 Jiahao Qin, Feng Liu

Electrocardiogram (ECG) analysis plays a crucial role in diagnosing cardiovascular diseases, but accurate interpretation of these complex signals remains challenging.

ECG Classification Time Series +1

STEAM-EEG: Spatiotemporal EEG Analysis with Markov Transfer Fields and Attentive CNNs

no code implementations7 Dec 2024 Jiahao Qin, Feng Liu

Electroencephalogram (EEG) signals play a pivotal role in biomedical research and clinical applications, including epilepsy diagnosis, sleep disorder analysis, and brain-computer interfaces.

Decision Making EEG +1

Semantic Retrieval at Walmart

no code implementations5 Dec 2024 Alessandro Magnani, Feng Liu, Suthee Chaidaroon, Sachin Yadav, Praveen Reddy Suram, Ajit Puthenputhussery, Sijie Chen, Min Xie, Anirudh Kashi, Tony Lee, Ciya Liao

In this paper, we present a hybrid system for e-commerce search deployed at Walmart that combines traditional inverted index and embedding-based neural retrieval to better answer user tail queries.

Re-Ranking Retrieval +1

NLPrompt: Noise-Label Prompt Learning for Vision-Language Models

1 code implementation CVPR 2025 Bikang Pan, Qun Li, Xiaoying Tang, Wei Huang, Zhen Fang, Feng Liu, Jingya Wang, Jingyi Yu, Ye Shi

This matrix effectively partitions datasets into clean and noisy subsets, allowing for the application of cross-entropy loss to the clean subset and MAE loss to the noisy subset.

Learning Theory Learning with noisy labels +1

Automatic Differentiation-based Full Waveform Inversion with Flexible Workflows

1 code implementation30 Nov 2024 Feng Liu, Haipeng Li, Guangyuan Zou, Junlun Li

The AD-based framework not only includes forward modeling and associated gradient computations for wave equations in various types of media from isotropic acoustic to vertically or horizontally transverse isotropic elastic, but also incorporates a suite of objective functions, regularization techniques, and optimization algorithms.

Dynamic Time Warping

A Unified Data Representation Learning for Non-parametric Two-sample Testing

no code implementations30 Nov 2024 Xunye Tian, Liuhua Peng, Zhijian Zhou, Mingming Gong, Arthur Gretton, Feng Liu

Common approaches will first split data into training and test sets and then learn data representations purely on the training set.

Representation Learning Two-sample testing

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

2 code implementations CVPR 2025 Feng Liu, Shiwei Zhang, XiaoFeng Wang, Yujie Wei, Haonan Qiu, Yuzhong Zhao, Yingya Zhang, Qixiang Ye, Fang Wan

As a fundamental backbone for video generation, diffusion models are challenged by low inference speed due to the sequential nature of denoising.

Denoising Video Generation

DualCast: Disentangling Aperiodic Events from Traffic Series with a Dual-Branch Model

no code implementations27 Nov 2024 Xinyu Su, Feng Liu, Yanchuan Chang, Egemen Tanin, Majid Sarvi, Jianzhong Qi

Traffic forecasting is an important problem in the operation and optimisation of transportation systems.

TED-VITON: Transformer-Empowered Diffusion Models for Virtual Try-On

1 code implementation26 Nov 2024 Zhenchen Wan, Yanwu Xu, Zhaoqing Wang, Feng Liu, Tongliang Liu, Mingming Gong

Recent advancements in Virtual Try-On (VTO) have demonstrated exceptional efficacy in generating realistic images and preserving garment details, largely attributed to the robust generative capabilities of text-to-image (T2I) diffusion backbones.

Large Language Model Text Generation +1

EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation

no code implementations13 Nov 2024 XiaoFeng Wang, Kang Zhao, Feng Liu, Jiayu Wang, Guosheng Zhao, Xiaoyi Bao, Zheng Zhu, Yingya Zhang, Xingang Wang

Video generation has emerged as a promising tool for world simulation, leveraging visual data to replicate real-world environments.

Video Generation

World Models: The Safety Perspective

no code implementations12 Nov 2024 Zifan Zeng, Chongzhe Zhang, Feng Liu, Joseph Sifakis, Qunli Zhang, Shiming Liu, Peng Wang

With the proliferation of the Large Language Model (LLM), the concept of World Models (WM) has recently attracted a great deal of attention in the AI research community, especially in the context of AI agents.

AI Agent Language Modeling +1

Adaptive Conditional Expert Selection Network for Multi-domain Recommendation

no code implementations11 Nov 2024 Kuiyao Dong, Xingyu Lou, Feng Liu, Ruian Wang, Wenyi Yu, Ping Wang, Jun Wang

However, such MoE-based methods typically employ all experts for each instance, leading to scalability issues and low discriminability between domains and experts.

Computational Efficiency Mixture-of-Experts

'No' Matters: Out-of-Distribution Detection in Multimodality Long Dialogue

no code implementations31 Oct 2024 Rena Gao, Xuetong Wu, Siwen Luo, Caren Han, Feng Liu

Out-of-distribution (OOD) detection in multimodal contexts is essential for identifying deviations in combined inputs from different modalities, particularly in applications like open-domain dialogue systems or real-life dialogue interactions.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

Bayesian-guided Label Mapping for Visual Reprogramming

1 code implementation31 Oct 2024 Chengyi Cai, Zesheng Ye, Lei Feng, Jianzhong Qi, Feng Liu

When adapting the output interface, label mapping methods transform the pretrained labels to downstream labels by establishing a gradient-free one-to-one correspondence between the two sets of labels.

Predicting the Encoding Error of SIRENs

no code implementations29 Oct 2024 Jeremy Vonderfecht, Feng Liu

Towards this goal, we present a method which predicts the encoding error that a popular INR network (SIREN) will reach, given its network hyperparameters and the signal to encode.

Fingerprints of Super Resolution Networks

no code implementations29 Oct 2024 Jeremy Vonderfecht, Feng Liu

Several recent studies have demonstrated that deep-learning based image generation models, such as GANs, can be uniquely identified, and possibly even reverse-engineered, by the fingerprints they leave on their output images.

Image Generation Image Super-Resolution +1

Enhancing Learned Image Compression via Cross Window-based Attention

1 code implementation28 Oct 2024 Priyanka Mudgal, Feng Liu

Therefore, to leverage global features along with local redundancy, we propose a CNN-based solution integrated with a feature encoding module.

Image Compression

Enhancing Multimodal Medical Image Classification using Cross-Graph Modal Contrastive Learning

1 code implementation23 Oct 2024 Jun-En Ding, Chien-Chin Hsu, Chi-Hsiang Chu, Shuqiang Wang, Feng Liu

However, traditional approaches typically focus on unimodal medical image data, neglecting the integration of diverse non-image patient data.

Contrastive Learning Disease Prediction +4

DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control

no code implementations17 Oct 2024 Yujie Wei, Shiwei Zhang, Hangjie Yuan, Xiang Wang, Haonan Qiu, Rui Zhao, Yutong Feng, Feng Liu, Zhizhong Huang, Jiaxin Ye, Yingya Zhang, Hongming Shan

In this paper, we present DreamVideo-2, a zero-shot video customization framework capable of generating videos with a specific subject and motion trajectory, guided by a single image and a bounding box sequence, respectively, and without the need for test-time fine-tuning.

Video Generation

Mind the Gap Between Prototypes and Images in Cross-domain Finetuning

1 code implementation16 Oct 2024 Hongduan Tian, Feng Liu, Zhanke Zhou, Tongliang Liu, Chengqi Zhang, Bo Han

However, in this paper, we find that there naturally exists a gap, which resembles the modality gap, between the prototype and image instance embeddings extracted from the frozen pre-trained backbone, and simply applying the same transformation during the adaptation phase constrains exploring the optimal representations and shrinks the gap between prototype and image representations.

Cross-Domain Few-Shot

Progressive Autoregressive Video Diffusion Models

1 code implementation10 Oct 2024 Desai Xie, Zhan Xu, Yicong Hong, Hao Tan, Difan Liu, Feng Liu, Arie Kaufman, Yang Zhou

In this work, we introduce a more natural formulation of autoregressive long video generation by revisiting the noise level assumption in video diffusion models.

Denoising Video Denoising +1

Optimal Control in Both Steady State and Transient Process with Unknown Disturbances

no code implementations4 Oct 2024 Ming Li, Zhaojian Wang, Feng Liu, Ming Cao, Bo Yang

The scheme of online optimization as a feedback controller is widely used to steer the states of a physical system to the optimal solution of a predefined optimization problem.

AdaPPA: Adaptive Position Pre-Fill Jailbreak Attack Approach Targeting LLMs

1 code implementation11 Sep 2024 Lijia Lv, Weigang Zhang, Xuehai Tang, Jie Wen, Feng Liu, Jizhong Han, Songlin Hu

Jailbreak vulnerabilities in Large Language Models (LLMs) refer to methods that extract malicious content from the model by carefully crafting prompts or suffixes, which has garnered significant attention from the research community.

Instruction Following Position

Learning Deep Kernels for Non-Parametric Independence Testing

no code implementations10 Sep 2024 Nathaniel Xu, Feng Liu, Danica J. Sutherland

The Hilbert-Schmidt Independence Criterion (HSIC) is a powerful tool for nonparametric detection of dependence between random variables.

An End-to-End Approach for Chord-Conditioned Song Generation

no code implementations10 Sep 2024 Shuochen Gao, Shun Lei, Fan Zhuo, Hangyu Liu, Feng Liu, Boshi Tang, Qiaochu Huang, Shiyin Kang, Zhiyong Wu

The Song Generation task aims to synthesize music composed of vocals and accompaniment from given lyrics.

SongCreator: Lyrics-based Universal Song Generation

no code implementations9 Sep 2024 Shun Lei, Yixuan Zhou, Boshi Tang, Max W. Y. Lam, Feng Liu, Hangyu Liu, Jingcheng Wu, Shiyin Kang, Zhiyong Wu, Helen Meng

While various aspects of song generation have been explored by previous works, such as singing voice, vocal composition and instrumental arrangement, etc., generating songs with both vocals and accompaniment given lyrics remains a significant challenge, hindering the application of music generation models in the real world.

Language Modelling Music Generation

LLM4DSR: Leveraing Large Language Model for Denoising Sequential Recommendation

no code implementations15 Aug 2024 Bohao Wang, Feng Liu, Changwang Zhang, Jiawei Chen, Yudi Wu, Sheng Zhou, Xingyu Lou, Jun Wang, Yan Feng, Chun Chen, Can Wang

However, employing LLMs for denoising in sequential recommendation presents notable challenges: 1) Direct application of pretrained LLMs may not be competent for the denoising task, frequently generating nonsensical responses; 2) Even after fine-tuning, the reliability of LLM outputs remains questionable, especially given the complexity of the denoising task and the inherent hallucinatory issue of LLMs.

Denoising Language Modeling +3

Fast Information Streaming Handler (FisH): A Unified Seismic Neural Network for Single Station Real-Time Earthquake Early Warning

no code implementations13 Aug 2024 Tianning Zhang, Feng Liu, Yuming Yuan, Rui Su, Wanli Ouyang, Lei Bai

FisH is designed to process real-time streaming seismic data and generate simultaneous results for phase picking, location estimation, and magnitude estimation in an end-to-end fashion.

Event Detection

Enhancing Relevance of Embedding-based Retrieval at Walmart

no code implementations9 Aug 2024 Juexin Lin, Sachin Yadav, Feng Liu, Nicholas Rossi, Praveen R. Suram, Satya Chembolu, Prijith Chandran, Hrushikesh Mohapatra, Tony Lee, Alessandro Magnani, Ciya Liao

In addition, we present the techniques to increase the performance of our EBR model, such as typo-aware training, and semi-positive generation.

Reranking Retrieval +1

Relevance Filtering for Embedding-based Retrieval

1 code implementation9 Aug 2024 Nicholas Rossi, Juexin Lin, Feng Liu, Zhen Yang, Tony Lee, Alessandro Magnani, Ciya Liao

This issue is prominent in product search, where the number of relevant products is often small.

Retrieval

Mamba-Spike: Enhancing the Mamba Architecture with a Spiking Front-End for Efficient Temporal Data Processing

1 code implementation4 Aug 2024 Jiahao Qin, Feng Liu

This paper introduces Mamba-Spike, a novel neuromorphic architecture that integrates a spiking front-end with the Mamba backbone to achieve efficient and robust temporal data processing.

Mamba

Open-Set Biometrics: Beyond Good Closed-Set Models

2 code implementations23 Jul 2024 Yiyang Su, Minchul Kim, Feng Liu, Anil Jain, Xiaoming Liu

Biometric recognition has primarily addressed closed-set identification, assuming all probe subjects are in the gallery.

Face Recognition Gait Recognition +1

Optimal Kernel Choice for Score Function-based Causal Discovery

no code implementations14 Jul 2024 Wenjie Wang, Biwei Huang, Feng Liu, Xinge You, Tongliang Liu, Kun Zhang, Mingming Gong

In this paper, we propose a kernel selection method within the generalized score function that automatically selects the optimal kernel that best fits the data.

Causal Discovery

Emotion Loss Attacking: Adversarial Attack Perception for Skeleton based on Multi-dimensional Features

no code implementations28 Jun 2024 Feng Liu, Qing Xu, Qijian Zheng

What's more, we are the first to prove the effectiveness of emotional features, and provide a new idea for measuring the distance between skeletal motions.

Adversarial Attack

Exclusive Style Removal for Cross Domain Novel Class Discovery

no code implementations26 Jun 2024 Yicheng Wang, Feng Liu, Junmin Liu, Kai Sun

In this paper, we explore and establish the solvability of NCD in cross domain setting with the necessary condition that style information must be removed.

Novel Class Discovery

IR2QSM: Quantitative Susceptibility Mapping via Deep Neural Networks with Iterative Reverse Concatenations and Recurrent Modules

no code implementations18 Jun 2024 Min Li, Chen Chen, Zhuang Xiong, Ying Liu, Pengfei Rong, Shanshan Shan, Feng Liu, Hongfu Sun, Yang Gao

Quantitative susceptibility mapping (QSM) is an MRI phase-based post-processing technique to extract the distribution of tissue susceptibilities, demonstrating significant potential in studying neurological diseases.

Candidate Pseudolabel Learning: Enhancing Vision-Language Models by Prompt Tuning with Unlabeled Data

1 code implementation15 Jun 2024 Jiahan Zhang, Qi Wei, Feng Liu, Lei Feng

To alleviate this issue, we propose a Candidate Pseudolabel Learning method, termed CPL, to fine-tune VLMs with suitable candidate pseudolabels of unlabeled data in downstream tasks.

Capacity Credit Evaluation of Generalized Energy Storage Considering Strategic Capacity Withholding and Decision-Dependent Uncertainty

no code implementations11 Jun 2024 Ning Qi, Pierre Pinson, Mads R. Almassalkhi, Yingrui Zhuang, Yifan Su, Feng Liu

This paper proposes a novel capacity credit evaluation framework to accurately quantify the contribution of generalized energy storage (GES) to resource adequacy, considering both strategic capacity withholding and decision-dependent uncertainty (DDU).

Decision Making Scheduling

Sample-specific Masks for Visual Reprogramming-based Prompting

1 code implementation5 Jun 2024 Chengyi Cai, Zesheng Ye, Lei Feng, Jianzhong Qi, Feng Liu

Since we generate different masks for individual samples, SMM is theoretically shown to reduce approximation error for the target tasks compared with existing state-of-the-art VR methods.

Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models

1 code implementation5 Jun 2024 Jinhao Li, Haopeng Li, Sarah Erfani, Lei Feng, James Bailey, Feng Liu

The local visual areas are then cross-aligned with the finer descriptions by creating a similarity matrix using the pre-trained VLM.

Few-Shot Learning Language Modeling +5

Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training

1 code implementation2 Jun 2024 Jiacheng Zhang, Feng Liu, Dawei Zhou, Jingfeng Zhang, Tongliang Liu

However, in this paper, we discover that not all pixels contribute equally to the accuracy on AEs (i.e., robustness) and accuracy on natural images (i.e., accuracy).

Robust classification

MOKD: Cross-domain Finetuning for Few-shot Classification via Maximizing Optimized Kernel Dependence

1 code implementation29 May 2024 Hongduan Tian, Feng Liu, Tongliang Liu, Bo Du, Yiu-ming Cheung, Bo Han

In cross-domain few-shot classification, nearest centroid classifier (NCC) aims to learn representations to construct a metric space where few-shot classification can be performed by measuring the similarities between samples and the prototype of each class.

Cross-Domain Few-Shot

DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution

1 code implementation25 May 2024 Yuzhong Zhao, Feng Liu, Yue Liu, Mingxiang Liao, Chen Gong, Qixiang Ye, Fang Wan

Unfortunately, most existing methods that use fixed visual inputs lack the resolution adaptability needed to find precise language descriptions.

Attribute

BDetCLIP: Multimodal Prompting Contrastive Test-Time Backdoor Detection

no code implementations24 May 2024 Yuwei Niu, Shuo He, Qi Wei, Zongyu Wu, Feng Liu, Lei Feng

In this paper, we provide the first attempt at a computationally efficient backdoor detection method to defend against backdoored CLIP in the inference stage.

Contrastive Learning Language Modelling +2

Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Model

1 code implementation1 May 2024 Yu Cui, Feng Liu, Pengbo Wang, Bohao Wang, Heng Tang, Yi Wan, Jun Wang, Jiawei Chen

Owing to their powerful semantic reasoning capabilities, Large Language Models (LLMs) have been effectively utilized as recommenders, achieving impressive performance.

Knowledge Distillation Language Modeling +2

VideoGigaGAN: Towards Detail-rich Video Super-Resolution

no code implementations CVPR 2025 Yiran Xu, Taesung Park, Richard Zhang, Yang Zhou, Eli Shechtman, Feng Liu, Jia-Bin Huang, Difan Liu

We introduce VideoGigaGAN, a new generative VSR model that can produce videos with high-frequency details and temporal consistency.

Video Super-Resolution

On the Learnability of Out-of-distribution Detection

no code implementations7 Apr 2024 Zhen Fang, Yixuan Li, Feng Liu, Bo Han, Jie Lu

Based on this observation, we next give several necessary and sufficient conditions to characterize the learnability of OOD detection in some practical scenarios.

Learning Theory Out-of-Distribution Detection +2

Learn to Disguise: Avoid Refusal Responses in LLM's Defense via a Multi-agent Attacker-Disguiser Game

no code implementations3 Apr 2024 Qianqiao Xu, Zhiliang Tian, Hongyan Wu, Zhen Huang, Yiping Song, Feng Liu, Dongsheng Li

In this paper, we propose a multi-agent attacker-disguiser game approach to achieve a weak defense mechanism that allows the large model to both safely reply to the attacker and hide the defense intent.

Prompt Engineering Safety Alignment

Negative Label Guided OOD Detection with Pretrained Vision-Language Models

2 code implementations29 Mar 2024 Xue Jiang, Feng Liu, Zhen Fang, Hong Chen, Tongliang Liu, Feng Zheng, Bo Han

In this paper, we propose a novel post hoc OOD detection method, called NegLabel, which takes a vast number of negative labels from extensive corpus databases.

Out of Distribution (OOD) Detection

Benchmarking Video Frame Interpolation

no code implementations25 Mar 2024 Simon Kiefhaber, Simon Niklaus, Feng Liu, Simone Schaub-Meyer

Video frame interpolation, the task of synthesizing new frames in between two or more given ones, is becoming an increasingly popular research target.

Benchmarking Computational Efficiency +1

KeyPoint Relative Position Encoding for Face Recognition

3 code implementations CVPR 2024 Minchul Kim, Yiyang Su, Feng Liu, Anil Jain, Xiaoming Liu

By anchoring the significance of pixels around keypoints, the model can more effectively retain spatial relationships, even when those relationships are disrupted by affine transformations.

Face Recognition Gait Recognition +1

QSMDiff: Unsupervised 3D Diffusion Models for Quantitative Susceptibility Mapping

no code implementations21 Mar 2024 Zhuang Xiong, Wei Jiang, Yang Gao, Feng Liu, Hongfu Sun

In this work, we developed a 3D image patch-based diffusion model, namely QSMDiff, for robust QSM reconstruction across different scan parameters, alongside simultaneous super-resolution and image-denoising tasks.

Image Denoising Image Generation +1

IIDM: Image-to-Image Diffusion Model for Semantic Image Synthesis

1 code implementation20 Mar 2024 Feng Liu, Xiaobin Chang

Semantic image synthesis aims to generate high-quality images given semantic conditions, i.e., segmentation masks and style reference images.

Image Denoising Image Generation

Large Language Multimodal Models for 5-Year Chronic Disease Cohort Prediction Using EHR Data

no code implementations2 Mar 2024 Jun-En Ding, Phan Nguyen Minh Thao, Wen-Chih Peng, Jian-Zhe Wang, Chun-Cheng Chug, Min-Chen Hsieh, Yun-Chien Tseng, Ling Chen, Dongsheng Luo, Chi-Te Wang, Pei-fu Chen, Feng Liu, Fang-Ming Hung

In our experiments, we observe that clinicalBERT and PubMed-BERT, when combined with attention fusion, can achieve an accuracy of 73% in multiclass chronic diseases and diabetes prediction.

Diabetes Prediction

Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models

no code implementations22 Feb 2024 Yixuan Ren, Yang Zhou, Jimei Yang, Jing Shi, Difan Liu, Feng Liu, Mingi Kwon, Abhinav Shrivastava

To address the challenge of one-shot video motion customization, we propose Customize-A-Video that models the motion from a single reference video and adapts it to new subjects and scenes with both spatial and temporal varieties.

Video Generation

Fingerprint Presentation Attack Detector Using Global-Local Model

no code implementations20 Feb 2024 Haozhe Liu, Wentian Zhang, Feng Liu, Haoqian Wu, Linlin Shen

By using the texture in-painting-based local module, a local spoofness score predicted from fingerprint patches is obtained.

model

Privacy-Preserving Low-Rank Adaptation against Membership Inference Attacks for Latent Diffusion Models

1 code implementation19 Feb 2024 Zihao Luo, Xilie Xu, Feng Liu, Yun Sing Koh, Di Wang, Jingfeng Zhang

To mitigate this issue, we further propose a Stable Membership-Privacy-preserving LoRA (SMP-LoRA) that adapts the LDM by minimizing the ratio of the adaptation loss to the MI gain.

Privacy Preserving

Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection

2 code implementations6 Feb 2024 Feng Liu, Tengteng Huang, Qianjing Zhang, Haotian Yao, Chi Zhang, Fang Wan, Qixiang Ye, Yanzhao Zhou

Multi-view 3D object detection systems often struggle with generating precise predictions due to the challenges in estimating depth from images, increasing redundant and incorrect detections.

3D Object Detection Denoising +1

From Data to Insights: A Comprehensive Survey on Advanced Applications in Thyroid Cancer Research

no code implementations8 Jan 2024 Xinyu Zhang, Vincent CS Lee, Feng Liu

Thyroid cancer, the most prevalent endocrine cancer, has gained significant global attention due to its impact on public health.

Prognosis

TIGER: Time-Varying Denoising Model for 3D Point Cloud Generation with Diffusion Process

2 code implementations CVPR 2024 Zhiyuan Ren, Minchul Kim, Feng Liu, Xiaoming Liu

However, few works study the effect of the diffusion model's architecture on 3D point clouds, resorting instead to the typical UNet model developed for 2D images.

Denoising Diversity +1

SNED: Superposition Network Architecture Search for Efficient Video Diffusion Model

no code implementations CVPR 2024 Zhengang Li, Yan Kang, Yuchen Liu, Difan Liu, Tobias Hinz, Feng Liu, Yanzhi Wang

Our method employs a supernet training paradigm that targets various model cost and resolution options using a weight-sharing method.

Video Generation

INFAMOUS-NeRF: ImproviNg FAce MOdeling Using Semantically-Aligned Hypernetworks with Neural Radiance Fields

no code implementations23 Dec 2023 Andrew Hou, Feng Liu, Zhiyuan Ren, Michel Sarkis, Ning Bi, Yiying Tong, Xiaoming Liu

We propose INFAMOUS-NeRF, an implicit morphable face model that introduces hypernetworks to NeRF to improve the representation power in the presence of many training subjects.

Face Model NeRF

Fast View Synthesis of Casual Videos with Soup-of-Planes

no code implementations4 Dec 2023 Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen, Simon Niklaus, Jianming Zhang, Jia-Bin Huang, Feng Liu

Specifically, we build a global static scene model using an extended plane-based scene representation to synthesize temporally coherent novel video.

Novel View Synthesis

Parkinson's Disease Classification Using Contrastive Graph Cross-View Learning with Multimodal Fusion of SPECT Images and Clinical Features

no code implementations25 Nov 2023 Jun-En Ding, Chien-Chin Hsu, Feng Liu

This work proposes a multimodal approach encompassing both image and non-image features, leveraging contrastive cross-view graph fusion for PD classification.

Single Image Compressed Sensing MRI via a Self-Supervised Deep Denoising Approach

no code implementations22 Nov 2023 Marlon Bran Lorenzana, Feng Liu, Shekhar S. Chandra

Popular methods in compressed sensing (CS) are dependent on deep learning (DL), where large amounts of data are used to train non-linear reconstruction models.

compressed sensing Denoising +1

Fast Controllable Diffusion Models for Undersampled MRI Reconstruction

1 code implementation20 Nov 2023 Wei Jiang, Zhuang Xiong, Feng Liu, Nan Ye, Hongfu Sun

Supervised deep learning methods have shown promise in undersampled Magnetic Resonance Imaging (MRI) reconstruction, but their requirement for paired data limits their generalizability to the diverse MRI acquisition parameters.

MRI Reconstruction

Plug-and-Play Latent Feature Editing for Orientation-Adaptive Quantitative Susceptibility Mapping Neural Networks

1 code implementation14 Nov 2023 Yang Gao, Zhuang Xiong, Shanshan Shan, Yin Liu, Pengfei Rong, Min Li, Alan H Wilman, G. Bruce Pike, Feng Liu, Hongfu Sun

The proposed OA-LFE-empowered iQSM, which we refer to as iQSM+, is trained in a self-supervised manner on a specially-designed simulation brain dataset.

LRM: Large Reconstruction Model for Single Image to 3D

1 code implementation8 Nov 2023 Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, Hao Tan

We propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an object from a single input image within just 5 seconds.

Image to 3D NeRF

Out-of-distribution Detection Learning with Unreliable Out-of-distribution Sources

1 code implementation NeurIPS 2023 Haotian Zheng, Qizhou Wang, Zhen Fang, Xiaobo Xia, Feng Liu, Tongliang Liu, Bo Han

To this end, we suggest that generated data (with mistaken OOD generation) can be used to devise an auxiliary OOD detection task to facilitate real OOD detection.

Out-of-Distribution Detection Out of Distribution (OOD) Detection +1

Learning to Augment Distributions for Out-of-Distribution Detection

1 code implementation NeurIPS 2023 Qizhou Wang, Zhen Fang, Yonggang Zhang, Feng Liu, Yixuan Li, Bo Han

Accordingly, we propose Distributional-Augmented OOD Learning (DAL), alleviating the OOD distribution discrepancy by crafting an OOD distribution set that contains all distributions in a Wasserstein ball centered on the auxiliary OOD distribution.

Learning Theory Out-of-Distribution Detection

Partition Speeds Up Learning Implicit Neural Representations Based on Exponential-Increase Hypothesis

1 code implementation ICCV 2023 Ke Liu, Feng Liu, Haishuai Wang, Ning Ma, Jiajun Bu, Bo Han

Based on this fact, we introduce a simple partition mechanism to boost the performance of two INR methods for image reconstruction: one for learning INRs, and the other for learning-to-learn INRs.

Image Reconstruction Semantic Segmentation

Auxiliary Features-Guided Super Resolution for Monte Carlo Rendering

no code implementations20 Oct 2023 Qiqi Hou, Feng Liu

This paper investigates super resolution to reduce the number of pixels to render and thus speed up Monte Carlo rendering algorithms.

Denoising Super-Resolution

PINF: Continuous Normalizing Flows for Physics-Constrained Deep Learning

no code implementations26 Sep 2023 Feng Liu, Faguo Wu, Xiao Zhang

The normalization constraint on probability density poses a significant challenge for solving the Fokker-Planck equation.

Deep Learning

Multi-Stage Expansion Planning for Decarbonizing Thermal Generation Supported Renewable Power Systems Using Hydrogen and Ammonia Storage

no code implementations31 Aug 2023 Zhipeng Yu, Jin Lin, Feng Liu, Jiarong Li, Yingtian Chi, Yonghua Song, Zhengwei Ren

Large-scale centralized development of wind and solar energy and peer-to-grid transmission of renewable energy source (RES) via high voltage direct current (HVDC) has been regarded as one of the most promising ways to achieve goals of peak carbon and carbon neutrality in China.

Quantitative Susceptibility Mapping through Model-based Deep Image Prior (MoDIP)

no code implementations18 Aug 2023 Zhuang Xiong, Yang Gao, Yin Liu, Amir Fazlollahi, Peter Nestor, Feng Liu, Hongfu Sun

The data-driven approach of supervised learning methods has limited applicability in solving dipole inversion in Quantitative Susceptibility Mapping (QSM) with varying scan parameters across different objects.

Image Reconstruction

Diversity-enhancing Generative Network for Few-shot Hypothesis Adaptation

no code implementations12 Jul 2023 Ruijiang Dong, Feng Liu, Haoang Chi, Tongliang Liu, Mingming Gong, Gang Niu, Masashi Sugiyama, Bo Han

In this paper, we propose a diversity-enhancing generative network (DEG-Net) for the FHA problem, which can generate diverse unlabeled data with the help of a kernel independence measure: the Hilbert-Schmidt independence criterion (HSIC).

Diversity

Learning Constrained Corner Node Trajectories of a Tether Net System for Space Debris Capture

no code implementations6 Jul 2023 Feng Liu, Achira Boonrath, Prajit KrisshnaKumar, Elenora M. Botta, Souma Chowdhury

The earth's orbit is becoming increasingly crowded with debris that poses significant safety risks to the operation of existing and new spacecraft and satellites.

CoverHunter: Cover Song Identification with Refined Attention and Alignments

1 code implementation15 Jun 2023 Feng Liu, Deyi Tuo, Yinan Xu, Xintong Han

Cover song identification (CSI) focuses on finding the same music with different versions in reference anchors given a query track.

Cover song identification

Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score

1 code implementation25 May 2023 Shuhai Zhang, Feng Liu, Jiahao Yang, Yifan Yang, Changsheng Li, Bo Han, Mingkui Tan

Last, we propose an EPS-based adversarial detection (EPS-AD) method, in which we develop EPS-based maximum mean discrepancy (MMD) as a metric to measure the discrepancy between the test sample and natural samples.

DADIN: Domain Adversarial Deep Interest Network for Cross Domain Recommender Systems

no code implementations20 May 2023 Menglin Kong, Muzhou Hou, Shaojie Zhao, Feng Liu, Ri Su, Yinghao Chen

Click-Through Rate (CTR) prediction is one of the main tasks of the recommendation system, which is conducted by a user for different items to give the recommendation results.

Click-Through Rate Prediction Domain Adaptation +3

Attacking Perceptual Similarity Metrics

no code implementations15 May 2023 Abhijay Ghildyal, Feng Liu

In our study, we systematically examine the robustness of these metrics to imperceptible adversarial perturbations.

Adversarial Attack Experimental Design

Enhancing Adversarial Contrastive Learning via Adversarial Invariant Regularization

1 code implementation NeurIPS 2023 Xilie Xu, Jingfeng Zhang, Feng Liu, Masashi Sugiyama, Mohan Kankanhalli

To improve transferability, the existing work introduced the standard invariant regularization (SIR) to impose style-independence property to SCL, which can exempt the impact of nuisance style factors in the standard representation.

Contrastive Learning

Detecting Out-of-distribution Data through In-distribution Class Prior

1 code implementation ICML 2023 Xue Jiang, Feng Liu, Zhen Fang, Hong Chen, Tongliang Liu, Feng Zheng, Bo Han

In this paper, we show that this assumption makes the above methods incapable when the ID model is trained with class-imbalanced data. Fortunately, by analyzing the causal relations between ID/OOD classes and features, we identify several common scenarios where the OOD-to-ID probabilities should be the ID-class-prior distribution and propose two strategies to modify existing inference-time detection methods: 1) replace the uniform distribution with the ID-class-prior distribution if they explicitly use the uniform distribution; 2) otherwise, reweight their scores according to the similarity between the ID-class-prior distribution and the softmax outputs of the pre-trained model.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

DCFace: Synthetic Face Generation with Dual Condition Diffusion Model

1 code implementation CVPR 2023 Minchul Kim, Feng Liu, Anil Jain, Xiaoming Liu

Our novel Patch-wise style extractor and Time-step dependent ID loss enables DCFace to consistently produce face images of the same subject under different styles with precise control.

Dataset Generation Face Generation +1

AnyFlow: Arbitrary Scale Optical Flow with Implicit Neural Representation

no code implementations CVPR 2023 Hyunyoung Jung, Zhuo Hui, Lei Luo, Haitao Yang, Feng Liu, Sungjoo Yoo, Rakesh Ranjan, Denis Demandolx

To apply optical flow in practice, it is often necessary to resize the input to smaller dimensions in order to reduce computational costs.

Optical Flow Estimation

Optimal Sizing of Isolated Renewable Power Systems with Ammonia Synthesis: Model and Solution Approach

no code implementations10 Mar 2023 Zhipeng Yu, Jin Lin, Feng Liu, Jiarong Li, Yuxuan Zhao, Yonghua Song

However, multi-timescale electricity, hydrogen, and ammonia storages, minimum power supply for system safety, and the multi-year uncertainty of renewable generation lead to difficulties in planning.

Out-of-distribution Detection with Implicit Outlier Transformation

1 code implementation9 Mar 2023 Qizhou Wang, Junjie Ye, Feng Liu, Quanyu Dai, Marcus Kalander, Tongliang Liu, Jianye Hao, Bo Han

It leads to a min-max learning scheme -- searching to synthesize OOD data that leads to worst judgments and learning from such OOD data for uniform performance in OOD detection.

Out-of-Distribution Detection

AliasNet: Alias Artefact Suppression Network for Accelerated Phase-Encode MRI

no code implementations17 Feb 2023 Marlon E. Bran Lorenzana, Shekhar S. Chandra, Feng Liu

Sparse reconstruction is an important aspect of MRI, helping to reduce acquisition time and improve spatial-temporal resolution.

compressed sensing Denoising +1

Efficient Adversarial Contrastive Learning via Robustness-Aware Coreset Selection

1 code implementation NeurIPS 2023 Xilie Xu, Jingfeng Zhang, Feng Liu, Masashi Sugiyama, Mohan Kankanhalli

Adversarial contrastive learning (ACL) does not require expensive data annotations but outputs a robust representation that withstands adversarial attacks and also generalizes to a wide range of downstream tasks.

Contrastive Learning

Shift from Texture-bias to Shape-bias: Edge Deformation-based Augmentation for Robust Object Recognition

no code implementations ICCV 2023 Xilin He, Qinliang Lin, Cheng Luo, Weicheng Xie, Siyang Song, Feng Liu, Linlin Shen

Recent studies have shown the vulnerability of CNNs under perturbation noises, which is partly because well-trained CNNs are too biased toward the object texture, i.e., they make predictions mainly based on texture cues.

Object Recognition

Learning Implicit Functions for Dense 3D Shape Correspondence of Generic Objects

no code implementations29 Dec 2022 Feng Liu, Xiaoming Liu

The objective of this paper is to learn dense 3D shape correspondence for topology-varying generic objects in an unsupervised manner.

Semantic correspondence

Continuous Semi-Supervised Nonnegative Matrix Factorization

no code implementations19 Dec 2022 Michael R. Lindstrom, Xiaofu Ding, Feng Liu, Anand Somayajula, Deanna Needell

Nonnegative matrix factorization can be used to automatically detect topics within a corpus in an unsupervised fashion.

regression

3D-EPI Blip-Up/Down Acquisition (BUDA) with CAIPI and Joint Hankel Structured Low-Rank Reconstruction for Rapid Distortion-Free High-Resolution T2* Mapping

no code implementations1 Dec 2022 Zhifeng Chen, Congyu Liao, Xiaozhi Cao, Benedikt A. Poser, Zhongbiao Xu, Wei-Ching Lo, Manyi Wen, Jaejin Cho, Qiyuan Tian, Yaohui Wang, Yanqiu Feng, Ling Xia, Wufan Chen, Feng Liu, Berkin Bilgic

Purpose: This work aims to develop a novel distortion-free 3D-EPI acquisition and image reconstruction technique for fast and robust, high-resolution, whole-brain imaging as well as quantitative T2* mapping.

Image Reconstruction

Affine Transformation Edited and Refined Deep Neural Network for Quantitative Susceptibility Mapping

no code implementations25 Nov 2022 Zhuang Xiong, Yang Gao, Feng Liu, Hongfu Sun

We propose an end-to-end AFfine Transformation Edited and Refined (AFTER) deep neural network for QSM, which is robust against arbitrary acquisition orientation and spatial resolution up to 0.6 mm isotropic at the finest.

Watermarking for Out-of-distribution Detection

1 code implementation27 Oct 2022 Qizhou Wang, Feng Liu, Yonggang Zhang, Jing Zhang, Chen Gong, Tongliang Liu, Bo Han

Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models.

Out-of-Distribution Detection

Is Out-of-Distribution Detection Learnable?

no code implementations26 Oct 2022 Zhen Fang, Yixuan Li, Jie Lu, Jiahua Dong, Bo Han, Feng Liu

Based on this observation, we next give several necessary and sufficient conditions to characterize the learnability of OOD detection in some practical scenarios.

Diversity Learning Theory +3

Cluster and Aggregate: Face Recognition with Large Probe Set

1 code implementation19 Oct 2022 Minchul Kim, Feng Liu, Anil Jain, Xiaoming Liu

Advances in attention and recurrent modules have led to feature fusion that can model the relationship among the images in the input set.

Ranked #1 on Face Verification on IJB-B (TAR @ FAR=0.001 metric)

Face Recognition Face Verification +4

Mix and Reason: Reasoning over Semantic Topology with Data Mixing for Domain Generalization

no code implementations14 Oct 2022 Chaoqi Chen, Luyao Tang, Feng Liu, Gangming Zhao, Yue Huang, Yizhou Yu

Domain generalization (DG) enables generalizing a learning machine from multiple seen source domains to an unseen target one.

Domain Generalization Relational Reasoning

A Perceptual Quality Metric for Video Frame Interpolation

1 code implementation4 Oct 2022 Qiqi Hou, Abhijay Ghildyal, Feng Liu

In this paper, we present a dedicated perceptual quality metric for measuring video frame interpolation results.

Video Frame Interpolation

A Uniform Representation Learning Method for OCT-based Fingerprint Presentation Attack Detection and Reconstruction

no code implementations25 Sep 2022 Wentian Zhang, Haozhe Liu, Feng Liu, Raghavendra Ramachandra

For reconstruction performance, our method achieves the best performance with 0.834 mIOU and 0.937 PA. By comparing with the recognition performance on surface 2D fingerprints, the effectiveness of our proposed method on high quality subsurface fingerprint reconstruction is further proved.

Representation Learning Semantic Segmentation

Two-Stage Submodular Optimization of Dynamic Thermal Rating for Risk Mitigation Considering Placement and Operation Schedule

no code implementations20 Sep 2022 Qinfei Long, Junhong Liu, Chenhao Ren, Wenqian Yin, Feng Liu, Yunhe Hou

From the perspectives of service life and Braess paradox, it is important and challenging to jointly optimize the DTR placement and operation schedule for changing system state, which is a two-stage combinatorial problem with only discrete variables, suffering from no approximation guarantee and dimension curse only based on traditional models.

Neighborhood Collective Estimation for Noisy Label Identification and Correction

1 code implementation5 Aug 2022 Jichang Li, Guanbin Li, Feng Liu, Yizhou Yu

Specifically, our method is divided into two steps: 1) Neighborhood Collective Noise Verification to separate all training samples into a clean or noisy subset, 2) Neighborhood Collective Label Correction to relabel noisy samples, and then auxiliary techniques are used to assist further model optimization.

Learning with noisy labels Model Optimization

Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels

1 code implementation29 Jul 2022 Ganlong Zhao, Guanbin Li, Yipeng Qin, Feng Liu, Yizhou Yu

In this paper, we propose a two-stage clean samples identification method to address the aforementioned challenge.

Ranked #4 on Image Classification on Clothing1M (using extra training data)

Image Classification

Shift-tolerant Perceptual Similarity Metric

2 code implementations27 Jul 2022 Abhijay Ghildyal, Feng Liu

This paper studies the effect of small misalignment, specifically a small shift between the input and reference image, on existing metrics, and accordingly develops a shift-tolerant similarity metric.

Video Quality Assessment

Controllable and Guided Face Synthesis for Unconstrained Face Recognition

3 code implementations20 Jul 2022 Feng Liu, Minchul Kim, Anil Jain, Xiaoming Liu

To address this problem, we propose a controllable face synthesis model (CFSM) that can mimic the distribution of target datasets in a style latent space.

Diversity Face Generation +2

2D GANs Meet Unsupervised Single-view 3D Reconstruction

no code implementations20 Jul 2022 Feng Liu, Xiaoming Liu

In light of this, we propose a novel image-conditioned neural implicit field, which can leverage 2D supervisions from GAN-generated multi-view images and perform the single-view reconstruction of generic objects.

3D geometry 3D Reconstruction +2

A simple normalization technique using window statistics to improve the out-of-distribution generalization on medical images

1 code implementation7 Jul 2022 Chengfeng Zhou, Songchang Chen, Chenming Xu, Jun Wang, Feng Liu, Chun Zhang, Juan Ye, Hefeng Huang, Dahong Qian

In this study, we present a novel normalization technique called window normalization (WIN) to improve the model generalization on heterogeneous medical images, which is a simple yet effective alternative to existing normalization methods.

Breast Cancer Detection Out-of-Distribution Generalization

Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack

1 code implementation15 Jun 2022 Ruize Gao, Jiongxiao Wang, Kaiwen Zhou, Feng Liu, Binghui Xie, Gang Niu, Bo Han, James Cheng

The AutoAttack (AA) has been the most reliable method to evaluate adversarial robustness when considerable computational resources are available.

Adversarial Robustness Computational Efficiency

Bilateral Dependency Optimization: Defending Against Model-inversion Attacks

2 code implementations11 Jun 2022 Xiong Peng, Feng Liu, Jingfen Zhang, Long Lan, Junjie Ye, Tongliang Liu, Bo Han

To defend against MI attacks, previous work utilizes a unilateral dependency optimization strategy, i.e., minimizing the dependency between inputs (i.e., features) and outputs (i.e., labels) during training the classifier.

model

Multi-class Classification with Fuzzy-feature Observations: Theory and Algorithms

1 code implementation9 Jun 2022 Guangzhi Ma, Jie Lu, Feng Liu, Zhen Fang, Guangquan Zhang

Hence, in this paper, we propose a novel framework to address a new realistic problem called multi-class classification with imprecise observations (MCIMO), where we need to train a classifier with fuzzy-feature observations.

Multi-class Classification

Incentive Mechanism Design for Emergency Frequency Control in Multi-Infeed Hybrid AC-DC System

no code implementations28 May 2022 Ye Liu, Chen Shen, Zhaojian Wang, Feng Liu

In multi-infeed hybrid AC-DC (MIDC) systems, the emergency frequency control (EFC) with LCC-HVDC systems participating is of vital importance for system frequency stability.

Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection

3 code implementations ICCV 2023 Feng Liu, Xiaosong Zhang, Zhiliang Peng, Zonghao Guo, Fang Wan, Xiangyang Ji, Qixiang Ye

Except for the backbone networks, however, other components such as the detector head and the feature pyramid network (FPN) remain trained from scratch, which hinders fully tapping the potential of representation models.

Decoder Few-Shot Object Detection +3

Robust Representation via Dynamic Feature Aggregation

1 code implementation16 May 2022 Haozhe Liu, Haoqin Ji, Yuexiang Li, Nanjun He, Haoqian Wu, Feng Liu, Linlin Shen, Yefeng Zheng

With the regularization and orthogonal classifier, a more compact embedding space can be obtained, which accordingly improves the model robustness against adversarial attacks.

Out of Distribution (OOD) Detection

A novel stereo matching pipeline with robustness and unfixed disparity search range

no code implementations11 Apr 2022 Jiazhi Liu, Feng Liu

The new stereo matching pipeline has the following advantages: it 1) has better generalization performance than most of the current stereo matching methods; 2) relaxes the limitation of a fixed disparity search range; 3) can handle scenes that involve both positive and negative disparities, which has more potential applications, such as view synthesis in 3D multimedia and VR/AR.

Stereo Matching

BFRnet: A deep learning-based MR background field removal method for QSM of the brain containing significant pathological susceptibility sources

1 code implementation6 Apr 2022 Xuanyu Zhu, Yang Gao, Feng Liu, Stuart Crozier, Hongfu Sun

The BFRnet method is compared with three conventional BFR methods and one previous deep learning method using simulated and in vivo brains from 4 healthy and 2 hemorrhagic subjects.

Undersampled MRI Reconstruction with Side Information-Guided Normalisation

no code implementations 7 Mar 2022 Xinwen Liu, Jing Wang, Cheng Peng, Shekhar S. Chandra, Feng Liu, S. Kevin Zhou

In this paper, we investigate the use of such side information as normalisation parameters in a convolutional neural network (CNN) to improve undersampled MRI reconstruction.

MRI Reconstruction

Multi-channel deep convolutional neural networks for multi-classifying thyroid disease

no code implementations 6 Mar 2022 Xinyu Zhang, Vincent CS. Lee, Jia Rong, James C. Lee, Jiangning Song, Feng Liu

Therefore, this study proposed a novel multi-channel convolutional neural network (CNN) architecture to address the multi-class classification task of thyroid disease.

Benchmarking Binary Classification +3

Adversarial Attack and Defense for Non-Parametric Two-Sample Tests

1 code implementation 7 Feb 2022 Xilie Xu, Jingfeng Zhang, Feng Liu, Masashi Sugiyama, Mohan Kankanhalli

Furthermore, we theoretically find that the adversary can also degrade the lower bound of a TST's test power, which enables us to iteratively minimize the test criterion in order to search for adversarial pairs.

Adversarial Attack Vocal Bursts Valence Prediction

Balanced Graph Structure Learning for Multivariate Time Series Forecasting

1 code implementation 24 Jan 2022 Weijun Chen, Yanze Wang, Chengshuo Du, Zhenglong Jia, Feng Liu, Ran Chen

However, current models do not incorporate the trade-off between efficiency and flexibility and lack the guidance of domain knowledge in the design of graph structure learning algorithms.

Graph Generation Graph Learning +3

The State of Aerial Surveillance: A Survey

no code implementations 9 Jan 2022 Kien Nguyen, Clinton Fookes, Sridha Sridharan, YingLi Tian, Feng Liu, Xiaoming Liu, Arun Ross

The rapid emergence of airborne platforms and imaging sensors is enabling new forms of aerial surveillance due to their unprecedented advantages in scale, mobility, deployment and covert observation capabilities.

Survey

Motion-Adjustable Neural Implicit Video Representation

no code implementations CVPR 2022 Long Mai, Feng Liu

The model is trained end-to-end on a video to jointly determine the phase-shift values at each time with the mapping from the phase-shifted sinusoidal functions to the corresponding frame, enabling an implicit video representation.

Motion Magnification

Achieving an Accurate Random Process Model for PV Power using Cheap Data: Leveraging the SDE and Public Weather Reports

no code implementations 27 Nov 2021 Yiwei Qiu, Jin Lin, Zhipeng Zhou, Ningyi Dai, Feng Liu, Yonghua Song

To fill this gap, this article finds that an accurate SDE model for PV power can be constructed using only cheap data from low-resolution public weather reports.

Time Series Time Series Analysis +1

FRT-PAD: Effective Presentation Attack Detection Driven by Face Related Task

2 code implementations 22 Nov 2021 Wentian Zhang, Haozhe Liu, Feng Liu, Raghavendra Ramachandra, Christoph Busch

The proposed method first introduces task-specific features from other face-related tasks; we then design a Cross-Modal Adapter using a Graph Attention Network (GAT) to re-map these features to the PAD task.

Attribute Face Presentation Attack Detection +2

Fingerprint Presentation Attack Detection by Channel-wise Feature Denoising

1 code implementation 15 Nov 2021 Feng Liu, Zhe Kong, Haozhe Liu, Wentian Zhang, Linlin Shen

The proposed method learns important features of fingerprint images by weighing the importance of each channel and identifying discriminative channels and "noise" channels.

Denoising

Instant tissue field and magnetic susceptibility mapping from MR raw phase using Laplacian enabled deep neural networks

2 code implementations 15 Nov 2021 Yang Gao, Zhuang Xiong, Amir Fazlollahi, Peter J Nestor, Viktor Vegh, Fatima Nasrallah, Craig Winter, G. Bruce Pike, Stuart Crozier, Feng Liu, Hongfu Sun

In addition, experiments on patients with intracranial hemorrhage and multiple sclerosis were also performed to test the generalization of the novel neural networks.

Voxel-based 3D Detection and Reconstruction of Multiple Objects from a Single Image

no code implementations NeurIPS 2021 Feng Liu, Xiaoming Liu

With complementary supervision from both 3D detection and reconstruction, the 3D voxel features become geometry- and context-preserving, benefiting both tasks. The effectiveness of our approach is demonstrated through 3D detection and reconstruction in single-object and multiple-object scenarios.

Keypoint Detection Object

A cross-modal fusion network based on self-attention and residual structure for multimodal emotion recognition

1 code implementation 3 Nov 2021 Ziwang Fu, Feng Liu, HanYang Wang, Jiayin Qi, Xiangling Fu, Aimin Zhou, Zhibin Li

First, we perform representation learning on the audio and video modalities, obtaining their semantic features with an efficient ResNeXt and a 1D CNN, respectively.

Multimodal Emotion Recognition Representation Learning

EvoGAN: An Evolutionary Computation Assisted GAN

1 code implementation 22 Oct 2021 Feng Liu, HanYang Wang, Jiahao Zhang, Ziwang Fu, Aimin Zhou, Jiayin Qi, Zhibin Li

Quantitative and qualitative results are presented on several compound expressions, and the experimental results demonstrate the feasibility and potential of EvoGAN.

Image Generation

Approaching the Transient Stability Boundary of a Power System: Theory and Applications

no code implementations 26 Sep 2021 Peng Yang, Feng Liu, Wei Wei, Zhaojian Wang

Estimating the stability boundary is a fundamental and challenging problem in transient stability studies.

Manifold-preserved GANs

no code implementations 18 Sep 2021 Haozhe Liu, Hanbang Liang, Xianxu Hou, Haoqian Wu, Feng Liu, Linlin Shen

Generative Adversarial Networks (GANs) have been widely adopted in various fields.
