Search Results for author: XiaoFeng Wang

Found 68 papers, 20 papers with code

WonderTurbo: Generating Interactive 3D World in 0.72 Seconds

no code implementations3 Apr 2025 Chaojun Ni, XiaoFeng Wang, Zheng Zhu, Weijie Wang, Haoyun Li, Guosheng Zhao, Jie Li, Wenkang Qin, Guan Huang, Wenjun Mei

Interactive 3D generation is gaining momentum and capturing extensive attention for its potential to create immersive virtual experiences.

3D Generation Depth Completion +1

HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation

no code implementations31 Mar 2025 Boyuan Wang, XiaoFeng Wang, Chaojun Ni, Guosheng Zhao, Zhiqin Yang, Zheng Zhu, Muyang Zhang, Yukun Zhou, Xinze Chen, Guan Huang, Lihong Liu, Xingang Wang

To address this, we propose HumanDreamer, a decoupled human video generation framework that first generates diverse poses from text prompts and then leverages these poses to generate human-motion videos.

Video Generation

ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation

no code implementations24 Mar 2025 Guosheng Zhao, XiaoFeng Wang, Chaojun Ni, Zheng Zhu, Wenkang Qin, Guan Huang, Xingang Wang

Specifically, on Waymo, ReconDreamer++ achieves performance comparable to Street Gaussians for the original trajectory while significantly outperforming ReconDreamer on novel trajectories.

Autonomous Driving

Rethinking Lanes and Points in Complex Scenarios for Monocular 3D Lane Detection

no code implementations8 Mar 2025 Yifan Chang, JunJie Huang, XiaoFeng Wang, Yun Ye, Zhujin Liang, Yi Shan, Dalong Du, Xingang Wang

Although sparse-point methods lower computational load and maintain high accuracy in complex lane geometries, current methods fail to fully leverage the geometric structure of lanes in both lane geometry representations and model design.

3D Lane Detection Autonomous Driving

Topic-FlipRAG: Topic-Orientated Adversarial Opinion Manipulation Attacks to Retrieval-Augmented Generation Models

no code implementations3 Feb 2025 Yuyang Gong, Zhuo Chen, Miaokun Chen, Fengchang Yu, Wei Lu, XiaoFeng Wang, Xiaozhong Liu, Jiawei Liu

Retrieval-Augmented Generation (RAG) systems based on Large Language Models (LLMs) have become essential for tasks such as question answering and content generation.

Question Answering RAG +1

Do Large Language Models Truly Understand Geometric Structures?

1 code implementation23 Jan 2025 XiaoFeng Wang, Yiming Wang, Wenhong Zhu, Rui Wang

Geometric ability is a significant challenge for large language models (LLMs) due to the need for advanced spatial comprehension and abstract thinking.

RAG-WM: An Efficient Black-Box Watermarking Approach for Retrieval-Augmented Generation of Large Language Models

no code implementations9 Jan 2025 Peizhuo Lv, Mengjie Sun, Hao Wang, XiaoFeng Wang, Shengzhi Zhang, Yuxuan Chen, Kai Chen, Limin Sun

To address those problems, we propose a novel black-box "knowledge watermark" approach, named RAG-WM, to detect IP infringement of RAGs.

RAG

OccScene: Semantic Occupancy-based Cross-task Mutual Learning for 3D Scene Generation

no code implementations15 Dec 2024 Bohan Li, Xin Jin, Jianan Wang, Yukai Shi, Yasheng Sun, XiaoFeng Wang, Zhuang Ma, Baao Xie, Chao Ma, Xiaokang Yang, Wenjun Zeng

Within OccScene, the perception module can be effectively improved with customized and diverse generated scenes, while the perception priors in return enhance the generation performance for mutual benefits.

Mamba Scene Generation

Buster: Implanting Semantic Backdoor into Text Encoder to Mitigate NSFW Content Generation

no code implementations10 Dec 2024 Xin Zhao, Xiaojun Chen, Yuexin Xuan, Zhendong Zhao, Xiaojun Jia, Xinfeng Li, XiaoFeng Wang

The rise of deep learning models in the digital era has raised substantial concerns regarding the generation of Not-Safe-for-Work (NSFW) content.

Image Generation

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

no code implementations28 Nov 2024 Feng Liu, Shiwei Zhang, XiaoFeng Wang, Yujie Wei, Haonan Qiu, Yuzhong Zhao, Yingya Zhang, Qixiang Ye, Fang Wan

As a fundamental backbone for video generation, diffusion models are challenged by low inference speed due to the sequential nature of denoising.

Denoising Video Generation

EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation

no code implementations13 Nov 2024 XiaoFeng Wang, Kang Zhao, Feng Liu, Jiayu Wang, Guosheng Zhao, Xiaoyi Bao, Zheng Zhu, Yingya Zhang, Xingang Wang

Video generation has emerged as a promising tool for world simulation, leveraging visual data to replicate real-world environments.

Video Generation

Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model

no code implementations24 Oct 2024 Wenhong Zhu, Zhiwei He, XiaoFeng Wang, PengFei Liu, Rui Wang

Aligning language models (LMs) with human preferences has become a key area of research, enabling these models to meet diverse user needs better.

DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation

no code implementations17 Oct 2024 Guosheng Zhao, Chaojun Ni, XiaoFeng Wang, Zheng Zhu, Xueyang Zhang, Yida Wang, Guan Huang, Xinze Chen, Boyuan Wang, Youyi Zhang, Wenjun Mei, Xingang Wang

Contemporary sensor simulation methods, such as NeRF and 3DGS, rely predominantly on conditions closely aligned with training data distributions, which are largely confined to forward-driving scenarios.

3DGS 4D reconstruction +3

Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models

1 code implementation15 Oct 2024 Kai Yao, Penglei Gao, Lichun Li, Yuan Zhao, XiaoFeng Wang, Wei Wang, Jianke Zhu

Extensive experiments on a range of LLMs, PEFTs, and downstream tasks substantiate the effectiveness of our proposed method, showcasing IST's capacity to enhance existing layer-based PEFT methods.

parameter-efficient fine-tuning

Revising the Problem of Partial Labels from the Perspective of CNNs' Robustness

no code implementations24 Jul 2024 Xin Zhang, Yuqi Song, Wyatt McCurdy, XiaoFeng Wang, Fei Zuo

These remarkable achievements are greatly attributed to the support of extensive datasets with precise labels.

OPa-Ma: Text Guided Mamba for 360-degree Image Out-painting

no code implementations15 Jul 2024 Penglei Gao, Kai Yao, Tiandi Ye, Steven Wang, Yuan YAO, XiaoFeng Wang

In this paper, we tackle the recently popular topic of generating 360-degree images given the conventional narrow field of view (NFoV) images that could be taken from a single camera or cellphone.

Image Generation Mamba

Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond

1 code implementation6 May 2024 Zheng Zhu, XiaoFeng Wang, Wangbo Zhao, Chen Min, Nianchen Deng, Min Dou, Yuqi Wang, Botian Shi, Kai Wang, Chi Zhang, Yang You, Zhaoxiang Zhang, Dawei Zhao, Liang Xiao, Jian Zhao, Jiwen Lu, Guan Huang

General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI), serving as the cornerstone for various applications ranging from virtual environments to decision-making systems.

Autonomous Driving Decision Making +2

Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering

no code implementations26 Apr 2024 Zhentao Xu, Mark Jerome Cruz, Matthew Guevara, Tie Wang, Manasi Deshpande, XiaoFeng Wang, Zheng Li

In customer service technical support, swiftly and accurately retrieving relevant past issues is critical for efficiently resolving customer inquiries.

Knowledge Graphs Question Answering +4

DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation

no code implementations11 Mar 2024 Guosheng Zhao, XiaoFeng Wang, Zheng Zhu, Xinze Chen, Guan Huang, Xiaoyi Bao, Xingang Wang

DriveDreamer-2 is the first world model to generate customized driving videos; it can generate uncommon driving videos (e.g., vehicles abruptly cutting in) in a user-friendly manner.

Autonomous Driving Language Modeling +3

DPAdapter: Improving Differentially Private Deep Learning through Noise Tolerance Pre-training

no code implementations5 Mar 2024 ZiHao Wang, Rui Zhu, Dongruo Zhou, Zhikun Zhang, John Mitchell, Haixu Tang, XiaoFeng Wang

DPAdapter modifies and enhances the sharpness-aware minimization (SAM) technique, utilizing a two-batch strategy to provide a more accurate perturbation estimate and an efficient gradient descent, thereby improving parameter robustness against noise.

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens

no code implementations18 Jan 2024 XiaoFeng Wang, Zheng Zhu, Guan Huang, Boyuan Wang, Xinze Chen, Jiwen Lu

World models play a crucial role in understanding and predicting the dynamics of the world, which is essential for video generation.

Video Editing Video Generation

Malla: Demystifying Real-world Large Language Model Integrated Malicious Services

1 code implementation6 Jan 2024 Zilong Lin, Jian Cui, Xiaojing Liao, XiaoFeng Wang

The underground exploitation of large language models (LLMs) for malicious services (i.e., Malla) is witnessing an uptick, amplifying the cyber threat landscape and posing questions about the trustworthiness of LLM technologies.

Language Modeling Language Modelling +1

Nighttime Person Re-Identification via Collaborative Enhancement Network with Multi-domain Learning

1 code implementation25 Dec 2023 Andong Lu, Chenglong Li, Tianrui Zha, Jin Tang, XiaoFeng Wang, Bin Luo

Prevalent nighttime person re-identification (ReID) methods typically combine image relighting and ReID networks in a sequential manner.

Image Relighting Person Re-Identification

On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving

1 code implementation9 Nov 2023 Licheng Wen, Xuemeng Yang, Daocheng Fu, XiaoFeng Wang, Pinlong Cai, Xin Li, Tao Ma, Yingxuan Li, Linran Xu, Dengke Shang, Zheng Zhu, Shaoyan Sun, Yeqi Bai, Xinyu Cai, Min Dou, Shuanglu Hu, Botian Shi, Yu Qiao

This has been a significant bottleneck, particularly in the development of common sense reasoning and nuanced scene understanding necessary for safe and reliable autonomous driving.

Autonomous Driving Common Sense Reasoning +6

The Janus Interface: How Fine-Tuning in Large Language Models Amplifies the Privacy Risks

1 code implementation24 Oct 2023 Xiaoyi Chen, Siyuan Tang, Rui Zhu, Shijun Yan, Lei Jin, ZiHao Wang, Liya Su, Zhikun Zhang, XiaoFeng Wang, Haixu Tang

In our research, we propose a novel attack, Janus, which exploits the fine-tuning interface to recover forgotten PIIs from the pre-training data in LLMs.

In-Context Learning

Large Language Model Soft Ideologization via AI-Self-Consciousness

no code implementations28 Sep 2023 Xiaotian Zhou, Qian Wang, XiaoFeng Wang, Haixu Tang, Xiaozhong Liu

Large language models (LLMs) have demonstrated human-level performance on a vast spectrum of natural language tasks.

Language Modeling Language Modelling +1

Reliable Majority Vote Computation with Complementary Sequences for UAV Waypoint Flight Control

no code implementations26 Sep 2023 Alphan Sahin, XiaoFeng Wang

In this study, we propose a non-coherent over-the-air computation scheme to calculate the majority vote (MV) reliably in fading channels.

DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving

no code implementations18 Sep 2023 XiaoFeng Wang, Zheng Zhu, Guan Huang, Xinze Chen, Jiagang Zhu, Jiwen Lu

The established world model holds immense potential for the generation of high-quality driving videos, and driving policies for safe maneuvering.

Autonomous Driving Video Generation

CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification

1 code implementation ICCV 2023 Rabab Abdelfattah, Qing Guo, Xiaoguang Li, XiaoFeng Wang, Song Wang

Using the aggregated similarity scores as the initial pseudo labels at the training stage, we propose an optimization framework to train the parameters of the classification network and refine pseudo labels for unobserved labels.

Classification Multi-Label Image Classification +2

Towards Imbalanced Large Scale Multi-label Classification with Partially Annotated Labels

no code implementations31 Jul 2023 Xin Zhang, Yuqi Song, Fei Zuo, XiaoFeng Wang

In this work, we address the issue of label imbalance and investigate how to train classifiers using partial labels in large labeling spaces.

Multi-Label Classification

Prompt Injection attack against LLM-integrated Applications

1 code implementation8 Jun 2023 Yi Liu, Gelei Deng, Yuekang Li, Kailong Wang, ZiHao Wang, XiaoFeng Wang, Tianwei Zhang, Yepang Liu, Haoyu Wang, Yan Zheng, Yang Liu

We deploy HouYi on 36 actual LLM-integrated applications and discern 31 applications susceptible to prompt injection.

MAWSEO: Adversarial Wiki Search Poisoning for Illicit Online Promotion

no code implementations22 Apr 2023 Zilong Lin, Zhengyi Li, Xiaojing Liao, XiaoFeng Wang, Xiaozhong Liu

As a prominent instance of vandalism edits, Wiki search poisoning for illicit promotion is a cybercrime in which the adversary aims at editing Wiki articles to promote illicit businesses through Wiki search results of relevant queries.

D-Score: A White-Box Diagnosis Score for CNNs Based on Mutation Operators

no code implementations3 Apr 2023 Xin Zhang, Yuqi Song, XiaoFeng Wang, Fei Zuo

However, concerns have been raised with respect to the trustworthiness of these models: The standard testing method evaluates the performance of a model on a test set, while low-quality and insufficient test sets can lead to unreliable evaluation results, which can have unforeseeable consequences.

Autonomous Driving Data Augmentation +2

Bridging Stereo Geometry and BEV Representation with Reliable Mutual Interaction for Semantic Scene Completion

1 code implementation24 Mar 2023 Bohan Li, Yasheng Sun, Zhujin Liang, Dalong Du, Zhuanghui Zhang, XiaoFeng Wang, Yunnan Wang, Xin Jin, Wenjun Zeng

However, due to the inherent representation gap between stereo geometry and BEV features, it is non-trivial to bridge them for the dense prediction task of SSC.

3D Semantic Scene Completion Hallucination +2

SSL-Cleanse: Trojan Detection and Mitigation in Self-Supervised Learning

1 code implementation16 Mar 2023 Mengxin Zheng, Jiaqi Xue, ZiHao Wang, Xun Chen, Qian Lou, Lei Jiang, XiaoFeng Wang

Using a pre-trained SSL image encoder and subsequently training a downstream classifier, impressive performance can be achieved on various tasks with very little labeled data.

Self-Supervised Learning

OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception

1 code implementation ICCV 2023 XiaoFeng Wang, Zheng Zhu, Wenbo Xu, Yunpeng Zhang, Yi Wei, Xu Chi, Yun Ye, Dalong Du, Jiwen Lu, Xingang Wang

Towards a comprehensive benchmarking of surrounding perception algorithms, we propose OpenOccupancy, which is the first surrounding semantic occupancy perception benchmark.

Autonomous Driving Benchmarking +1

CSDR-BERT: a pre-trained scientific dataset match model for Chinese Scientific Dataset Retrieval

no code implementations30 Jan 2023 Xintao Chu, Jianping Liu, Jian Wang, XiaoFeng Wang, Yingfei Wang, Meng Wang, Xunxun Gu

As the number of open and shared scientific datasets on the Internet increases under the open science movement, efficiently retrieving these datasets is a crucial task in information retrieval (IR) research.

Information Retrieval Retrieval +2

Gradient Shaping: Enhancing Backdoor Attack Against Reverse Engineering

no code implementations29 Jan 2023 Rui Zhu, Di Tang, Siyuan Tang, Guanhong Tao, Shiqing Ma, XiaoFeng Wang, Haixu Tang

Finally, we perform both theoretical and experimental analysis, showing that the GRASP enhancement does not reduce the effectiveness of the stealthy attacks against the backdoor detection methods based on weight analysis, as well as other backdoor mitigation methods without using detection.

Backdoor Attack

FE-TCM: Filter-Enhanced Transformer Click Model for Web Search

no code implementations19 Jan 2023 Yingfei Wang, Jianping Liu, Jian Wang, XiaoFeng Wang, Meng Wang, Xintao Chu

In this paper, we use a Transformer as the backbone network for feature extraction, add an innovative filter layer, and propose a new Filter-Enhanced Transformer Click Model (FE-TCM) for web search.

Are We Ready for Vision-Centric Driving Streaming Perception? The ASAP Benchmark

1 code implementation CVPR 2023 XiaoFeng Wang, Zheng Zhu, Yunpeng Zhang, Guan Huang, Yun Ye, Wenbo Xu, Ziwei Chen, Xingang Wang

To mitigate the problem, we propose the Autonomous-driving StreAming Perception (ASAP) benchmark, which is the first benchmark to evaluate the online performance of vision-centric perception in autonomous driving.

Depth Estimation Motion Forecasting

Selective Amnesia: On Efficient, High-Fidelity and Blind Suppression of Backdoor Effects in Trojaned Machine Learning Models

no code implementations9 Dec 2022 Rui Zhu, Di Tang, Siyuan Tang, XiaoFeng Wang, Haixu Tang

Our idea is to retrain a given DNN model on randomly labeled clean data to induce catastrophic forgetting (CF) in the model, leading to sudden forgetting of both the primary and backdoor tasks; we then recover the primary task by retraining the randomized model on correctly labeled clean data.

Continual Learning

Depth Monocular Estimation with Attention-based Encoder-Decoder Network from Single Image

no code implementations24 Oct 2022 Xin Zhang, Rabab Abdelfattah, Yuqi Song, Samuel A. Dauchert, XiaoFeng Wang

Depth information is the foundation of perception, essential for autonomous driving, robotics, and other resource-constrained applications.

Autonomous Driving Decoder +1

An Effective Approach for Multi-label Classification with Missing Labels

no code implementations24 Oct 2022 Xin Zhang, Rabab Abdelfattah, Yuqi Song, XiaoFeng Wang

Through comprehensive experiments on three large-scale multi-label image datasets, i.e., MS-COCO, NUS-WIDE, and Pascal VOC12, we show that our method can handle the imbalance between positive labels and negative labels, while still outperforming existing missing-label learning approaches in most cases, and in some cases even approaches with fully labeled datasets.

Classification Missing Labels +3

G2NetPL: Generic Game-Theoretic Network for Partial-Label Image Classification

no code implementations20 Oct 2022 Rabab Abdelfattah, Xin Zhang, Mostafa M. Fouda, XiaoFeng Wang, Song Wang

To effectively address partial-label classification, this paper proposes an end-to-end Generic Game-theoretic Network (G2NetPL) for partial-label learning, which can be applied to most partial-label settings, including a very challenging, but annotation-efficient case where only a subset of the training images are labeled, each with only one positive label, while the rest of the training images remain unlabeled.

Multi-Label Classification +3

Understanding Impacts of Task Similarity on Backdoor Attack and Detection

no code implementations12 Oct 2022 Di Tang, Rui Zhu, XiaoFeng Wang, Haixu Tang, Yi Chen

Despite extensive studies on backdoor attack and detection, fundamental questions remain unanswered regarding the limits of the adversary's capability to attack and the defender's capability to detect.

Backdoor Attack Multi-Task Learning

Scenario-Adaptive and Self-Supervised Model for Multi-Scenario Personalized Recommendation

no code implementations24 Aug 2022 Yuanliang Zhang, XiaoFeng Wang, Jinxin Hu, Ke Gao, Chenyi Lei, Fei Fang

We summarize three practical challenges that are not well solved in multi-scenario modeling: (1) a lack of fine-grained and decoupled information transfer controls among multiple scenarios.

Contrastive Learning Disentanglement +1

Crafting Monocular Cues and Velocity Guidance for Self-Supervised Multi-Frame Depth Learning

1 code implementation19 Aug 2022 XiaoFeng Wang, Zheng Zhu, Guan Huang, Xu Chi, Yun Ye, Ziwei Chen, Xingang Wang

In contrast, multi-frame depth estimation methods improve the depth accuracy thanks to the success of Multi-View Stereo (MVS), which directly makes use of geometric constraints.

Depth Estimation

MVSTER: Epipolar Transformer for Efficient Multi-View Stereo

1 code implementation15 Apr 2022 XiaoFeng Wang, Zheng Zhu, Fangbo Qin, Yun Ye, Guan Huang, Xu Chi, Yijia He, Xingang Wang

Therefore, we present MVSTER, which leverages the proposed epipolar Transformer to learn both 2D semantics and 3D spatial associations efficiently.

New Benchmark for Household Garbage Image Recognition

no code implementations24 Feb 2022 Zhize Wu, Huanyi Li, XiaoFeng Wang, Zijun Wu, Le Zou, Lixiang Xu, Ming Tan

Household garbage images usually feature complex backgrounds, variable illumination, diverse angles, and changeable shapes, which make garbage image classification very difficult.

Classification Image Classification +1

Context-aware Heterogeneous Graph Attention Network for User Behavior Prediction in Local Consumer Service Platform

no code implementations24 Jun 2021 Peiyuan Zhu, XiaoFeng Wang, Zisen Sang, Aiquan Yuan, Guodong Cao

Hence, in this paper, we propose a context-aware heterogeneous graph attention network (CHGAT) to dynamically generate the representation of the user and to estimate the probability for future behavior.

Graph Attention

SoK: A Modularized Approach to Study the Security of Automatic Speech Recognition Systems

1 code implementation19 Mar 2021 Yuxuan Chen, Jiangshan Zhang, Xuejing Yuan, Shengzhi Zhang, Kai Chen, XiaoFeng Wang, Shanqing Guo

In this paper, we present our systematization of knowledge for ASR security and provide a comprehensive taxonomy for existing work based on a modularized workflow.

Adversarial Attack Automatic Speech Recognition +3

The effect of aspherical stellar wind of giant stars on the symbiotic channel of type Ia supernovae

no code implementations18 Feb 2021 Chengyuan Wu, Dongdong Liu, XiaoFeng Wang, Bo Wang

The progenitor systems accounting for explosions of type Ia supernovae (SNe Ia) are still under debate.

Solar and Stellar Astrophysics

HyMap: eliciting hypotheses in early-stage software startups using cognitive mapping

no code implementations18 Feb 2021 Jorge Melegati, Eduardo Guerra, XiaoFeng Wang

Regarding the first, it provides a better understanding of the guidance founders use to develop their startups and, for the latter, a technique to identify hypotheses in early-stage software startups.

Computers and Society

Towards Dark Jargon Interpretation in Underground Forums

no code implementations5 Nov 2020 Dominic Seyler, Wei Liu, XiaoFeng Wang, ChengXiang Zhai

Dark jargons are benign-looking words that have hidden, sinister meanings and are used by participants of underground forums for illicit behavior.

TTPLA: An Aerial-Image Dataset for Detection and Segmentation of Transmission Towers and Power Lines

1 code implementation20 Oct 2020 Rabab Abdelfattah, XiaoFeng Wang, Song Wang

Accurate detection and segmentation of transmission towers (TTs) and power lines (PLs) from aerial images plays a key role in protecting power-grid security and low-altitude UAV safety.

Instance Segmentation object-detection +3

Query-Free Attacks on Industry-Grade Face Recognition Systems under Resource Constraints

no code implementations13 Feb 2018 Di Tang, XiaoFeng Wang, Kehuan Zhang

To launch black-box attacks against a Deep Neural Network (DNN) based Face Recognition (FR) system, one needs to build substitute models to simulate the target model, so the adversarial examples discovered from substitute models could also mislead the target model.

Face Recognition
