Search Results for author: Kai Wang

Found 302 papers, 147 papers with code

Empowering Economic Simulation for Massively Multiplayer Online Games through Generative Agent-Based Modeling

no code implementations5 Jun 2025 Bihan Xu, Shiwei Zhao, Runze Wu, Zhenya Huang, Jiawei Wang, Zhipeng Hu, Kai Wang, Haoyu Liu, Tangjie Lv, Le Li, Changjie Fan, Xin Tong, Jiangze Han

Within the domain of Massively Multiplayer Online (MMO) economy research, Agent-Based Modeling (ABM) has emerged as a robust tool for analyzing game economics, evolving from rule-based agents to decision-making agents enhanced by reinforcement learning.

Decision Making

When Thinking LLMs Lie: Unveiling the Strategic Deception in Representations of Reasoning Models

no code implementations5 Jun 2025 Kai Wang, Yihao Zhang, Meng Sun

The honesty of large language models (LLMs) is a critical alignment challenge, especially as advanced systems with chain-of-thought (CoT) reasoning may strategically deceive humans.

Hallucination Misinformation

MVP-Shapley: Feature-based Modeling for Evaluating the Most Valuable Player in Basketball

no code implementations5 Jun 2025 Haifeng Sun, Yu Xiong, Runze Wu, Kai Wang, Lan Zhang, Changjie Fan, Shaojie Tang, Xiang-Yang Li

The burgeoning growth of the esports and multiplayer online gaming community has highlighted the critical importance of evaluating the Most Valuable Player (MVP).

Solving Euler equations with Multiple Discontinuities via Separation-Transfer Physics-Informed Neural Networks

no code implementations26 May 2025 Chuanxing Wang, Hui Luo, Kai Wang, Guohuai Zhu, Mingxing Luo

Despite the remarkable progress of physics-informed neural networks (PINNs) in scientific computing, they continue to face challenges when solving hydrodynamic problems with multiple discontinuities.

Transfer Learning

REPA Works Until It Doesn't: Early-Stopped, Holistic Alignment Supercharges Diffusion Training

1 code implementation22 May 2025 Ziqiao Wang, Wangbo Zhao, Yuhao Zhou, Zekai Li, Zhiyuan Liang, Mingjia Shi, Xuanlei Zhao, Pengfei Zhou, Kaipeng Zhang, Zhangyang Wang, Kai Wang, Yang You

Phase II then performs a one-shot termination that deactivates the alignment loss once a simple trigger, such as a fixed iteration count, is hit, freeing the DiT to focus on denoising and exploit its generative capacity.

Denoising
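
The excerpt above describes a simple scheduling mechanism. Below is a minimal sketch of that one-shot trigger; the names (denoise_loss, align_loss, trigger_iteration, align_weight) are illustrative placeholders, not the paper's actual API or hyperparameters.

    # Minimal sketch: the alignment term is active only until a fixed iteration,
    # after which it is deactivated for good and only the denoising loss remains.
    def total_loss(denoise_loss, align_loss, step, trigger_iteration=50_000, align_weight=0.5):
        if step < trigger_iteration:            # Phase I: holistic alignment active
            return denoise_loss + align_weight * align_loss
        return denoise_loss                     # Phase II: one-shot termination of alignment

    print(total_loss(1.0, 0.4, step=10_000))    # before the trigger -> 1.2
    print(total_loss(1.0, 0.4, step=60_000))    # after the trigger  -> 1.0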

Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

no code implementations21 May 2025 Ao Liu, Botong Zhou, Can Xu, Chayse Zhou, Chenchen Zhang, Chengcheng Xu, Chenhao Wang, Decheng Wu, Dengpeng Wu, Dian Jiao, Dong Du, Dong Wang, Feng Zhang, Fengzong Lian, Guanghui Xu, Guanwei Zhang, Hai Wang, Haipeng Luo, Han Hu, Huilin Xu, Jiajia Wu, Jianchen Zhu, Jianfeng Yan, Jiaqi Zhu, Jinbao Xue, Jun Xia, Junqiang Zheng, Kai Liu, Kai Zhang, Kai Zheng, Kejiao Li, Keyao Wang, Lan Jiang, Lixin Liu, Lulu Wu, Mengyuan Huang, Peijie Yu, Peiqi Wang, Qian Wang, Qianbiao Xiang, Qibin Liu, Qingfeng Sun, Richard Guo, Ruobing Xie, Saiyong Yang, Shaohua Chen, Shihui Hu, Shuai Li, Shuaipeng Li, Shuang Chen, Suncong Zheng, Tao Yang, Tian Zhang, TingHao Yu, Weidong Han, Weijie Liu, Weijin Zhou, Weikang Wang, Wesleye Chen, Xiao Feng, Xiaoqin Ren, Xingwu Sun, Xiong Kuang, Xuemeng Huang, Xun Cao, Yanfeng Chen, Yang Du, Yang Zhen, Yaping Deng, Yi Shen, Yigeng Hong, Yiqi Chen, Yiqing Huang, Yuchi Deng, Yue Mao, Yulong Wang, Yuyuan Zeng, Zenan Xu, Zhanhui Kang, Zhenxiang Yan, Zheng Fang, Zhichao Hu, Zhongzhi Chen, Zhuoyu Li, Zongwei Li, Alex Yan, Ande Liang, Baitong Liu, Beiping Pan, Bin Xing, Binghong Wu, Bingxin Qu, Bolin Ni, Boyu Wu, Chen Li, Cheng Jiang, Cheng Zhang, Chengjun Liu, Chengxu Yang, Chiyu Wang, Chong Zha, Daisy Yi, Di Wang, Fanyang Lu, Fei Chen, Feifei Liu, Feng Zheng, Guanghua Yu, Guiyang Li, Guohua Wang, Haisheng Lin, Han Liu, Han Wang, Hao Fei, Hao Lu, Haoqing Jiang, Haoran Sun, Haotian Zhu, Huangjin Dai, Huankui Chen, Huawen Feng, Huihui Cai, Huxin Peng, Jackson Lv, Jiacheng Shi, Jiahao Bu, Jianbo Li, Jianglu Hu, Jiangtao Guan, Jianing Xu, Jianwei Cai, Jiarong Zhang, Jiawei Song, Jie Jiang, Jie Liu, Jieneng Yang, Jihong Zhang, Jin lv, Jing Zhao, Jinjian Li, JinXing Liu, Jun Zhao, Juntao Guo, Kai Wang, Kan Wu, Lei Fu, Lei He, Lei Wang, Li Liu, Liang Dong, Liya Zhan, Long Cheng, Long Xu, Mao Zheng, Meng Liu, Mengkang Hu, Nanli Chen, Peirui Chen, Peng He, Pengju Pan, Pengzhi Wei, Qi Yang, Qi Yi, Roberts Wang, Rongpeng Chen, Rui Sun, Rui Yang, Ruibin Chen, Ruixu Zhou, Shaofeng Zhang, Sheng Zhang, Shihao Xu, Shuaishuai Chang, Shulin Liu, Siqi Wang, Songjia Feng, Songling Yuan, Tao Zhang, Tianjiao Lang, Tongkai Li, Wei Deng, Wei Li, Weichao Wang, Weigang Zhang, Weixuan Sun, Wen Ouyang, Wenxiang Jiao, Wenzhi Sun, Wenzhuo Jia, Xiang Zhang, Xiangyu He, Xianshun Ren, Xiaoying Zhu, Xiaolong Guo, Xiaoxue Li, Xiaoyu Ma, Xican Lu, Xinhua Feng, Xinting Huang, Xinyu Guan, Xirui Li, Xu Zhang, Xudong Gao, Xun Luo, Xuxiang Qi, Yangkun Chen, Yangyu Tao, Yanling Xiao, Yantao Mai, Yanze Chen, Yao Ding, Yeting Yang, YiFan Song, Yifan Yang, Yijiao Zhu, Yinhe Wu, Yixian Liu, Yong Yang, Yuanjun Cai, Yuanlin Tu, Yue Zhang, Yufei Huang, YuHang Zhou, Yuhao Jiang, Yuhong Liu, Yuhui Hu, YuJin Lin, Yun Yang, Yunhao Wang, Yusong Zhang, Zekun Wu, Zelong Zhang, Zhan Yu, Zhaoliang Yang, Zhe Zhao, Zheng Li, Zhenyu Huang, Zhiguang Liu, Zhiqing Kui, Zhiyin Zeng, Zhiyuan Xiong, Zhuo Han, Zifan Wu, Zigang Geng, Zilong Zhao, Ziyan Tang, Ziyuan Zhu, Zonglei Zhu, Zhijiang Xu

As Large Language Models (LLMs) rapidly advance, we introduce Hunyuan-TurboS, a novel large hybrid Transformer-Mamba Mixture of Experts (MoE) model.

Chatbot Instruction Following +2

Unsupervised Learning for Class Distribution Mismatch

1 code implementation11 May 2025 Pan Du, Wangbo Zhao, Xinai Lu, Nian Liu, Zhikai Li, Chaoyu Gong, Suyun Zhao, Hong Chen, Cuiping Li, Kai Wang, Yang You

Specifically, with a 60% mismatch proportion on the Tiny-ImageNet dataset, our approach, without relying on labeled data, surpasses OpenMatch (with 40 labels per class) by 35.1%, 63.7%, and 72.5% in classifying known, unknown, and new classes.

Multimodal Integrated Knowledge Transfer to Large Language Models through Preference Optimization with Biomedical Applications

1 code implementation9 May 2025 Da Wu, Zhanliang Wang, Quan Nguyen, Zhuoran Xu, Kai Wang

To address this challenge, we introduce MINT (Multimodal Integrated kNowledge Transfer), a framework that aligns unimodal large decoder models with domain-specific decision patterns from multimodal biomedical data through preference optimization.

Disease Prediction RAG +1

FilterTS: Comprehensive Frequency Filtering for Multivariate Time Series Forecasting

1 code implementation7 May 2025 Yulong Wang, YuShuo Liu, Xiaoyi Duan, Kai Wang

Multivariate time series forecasting is crucial across various industries, where accurate extraction of complex periodic and trend components can significantly enhance prediction performance.

Computational Efficiency Multivariate Time Series Forecasting +1

STRGCN: Capturing Asynchronous Spatio-Temporal Dependencies for Irregular Multivariate Time Series Forecasting

no code implementations7 May 2025 Yulong Wang, Xiaofeng Hu, Xiaojian Cui, Kai Wang

Irregular multivariate time series (IMTS) are prevalent in real-world applications across many fields, where varying sensor frequencies and asynchronous measurements pose significant modeling challenges.

Multivariate Time Series Forecasting Time Series

Quantitative Analysis of Performance Drop in DeepSeek Model Quantization

1 code implementation5 May 2025 Enbo Zhao, Yi Shen, Shuming Shi, Jieyun Huang, Zhihao Chen, Ning Wang, Siqi Xiao, Jian Zhang, Kai Wang, Shiguo Lian

Recently, there has been high demand for deploying DeepSeek-R1 and V3 locally, possibly because the official service often suffers from being busy and some organizations have data privacy concerns.

Quantization

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

no code implementations22 Apr 2025 Kun Wang, Guibin Zhang, Zhenhong Zhou, Jiahao Wu, Miao Yu, Shiqian Zhao, Chenlong Yin, Jinhu Fu, Yibo Yan, Hanjun Luo, Liang Lin, Zhihao Xu, Haolang Lu, Xinye Cao, Xinyun Zhou, Weifei Jin, Fanci Meng, Shicheng Xu, Junyuan Mao, Yu Wang, Hao Wu, Minghe Wang, Fan Zhang, Junfeng Fang, Wenjie Qu, Yue Liu, Chengwei Liu, Yifan Zhang, Qiankun Li, Chongye Guo, Yalan Qin, Zhaoxin Fan, Kai Wang, Yi Ding, Donghai Hong, Jiaming Ji, Yingxin Lai, Zitong Yu, Xinfeng Li, Yifan Jiang, Yanhui Li, Xinyu Deng, Junlin Wu, Dongxia Wang, Yihao Huang, Yufei Guo, Jen-tse Huang, Qiufeng Wang, Xiaolong Jin, Wenxuan Wang, Dongrui Liu, Yanwei Yue, Wenke Huang, Guancheng Wan, Heng Chang, Tianlin Li, Yi Yu, Chenghao Li, Jiawei Li, Lei Bai, Jie Zhang, Qing Guo, Jingyi Wang, Tianlong Chen, Joey Tianyi Zhou, Xiaojun Jia, Weisong Sun, Cong Wu, Jing Chen, Xuming Hu, Yiming Li, Xiao Wang, Ningyu Zhang, Luu Anh Tuan, Guowen Xu, Jiaheng Zhang, Tianwei Zhang, Xingjun Ma, Jindong Gu, Liang Pang, Xiang Wang, Bo An, Jun Sun, Mohit Bansal, Shirui Pan, Lingjuan Lyu, Yuval Elovici, Bhavya Kailkhura, Yaodong Yang, Hongwei Li, Wenyuan Xu, Yizhou Sun, Wei Wang, Qing Li, Ke Tang, Yu-Gang Jiang, Felix Juefei-Xu, Hui Xiong, XiaoFeng Wang, DaCheng Tao, Philip S. Yu, Qingsong Wen, Yang Liu

Currently, existing surveys on LLM safety primarily focus on specific stages of the LLM lifecycle, e.g., the deployment or fine-tuning phase, lacking a comprehensive understanding of the entire "lifechain" of LLMs.

Model Editing

Landmark-Free Preoperative-to-Intraoperative Registration in Laparoscopic Liver Resection

no code implementations21 Apr 2025 Jun Zhou, Bingchen Gao, Kai Wang, Jialun Pei, Pheng-Ann Heng, Jing Qin

Further, a structure-regularized deformation network is designed to adjust the preoperative model to align with the intraoperative liver surface.

Anatomy Self-Supervised Learning

Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing

1 code implementation14 Apr 2025 Taihang Hu, Linxuan Li, Kai Wang, Yaxing Wang, Jian Yang, Ming-Ming Cheng

However, existing editing techniques designed for diffusion models fail to translate directly to AR models due to fundamental differences in structural control.

Text to Image Generation Text-to-Image Generation

DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation

1 code implementation9 Apr 2025 Wangbo Zhao, Yizeng Han, Jiasheng Tang, Kai Wang, Hao Luo, Yibing Song, Gao Huang, Fan Wang, Yang You

Our investigations reveal that these costs primarily stem from the \emph{static} inference paradigm, which inevitably introduces redundant computation in certain \emph{diffusion timesteps} and \emph{spatial regions}.

Text to Image Generation Text-to-Image Generation +1

Graph-based Approaches and Functionalities in Retrieval-Augmented Generation: A Comprehensive Survey

no code implementations8 Apr 2025 Zulun Zhu, Tiancheng Huang, Kai Wang, Junda Ye, Xinghe Chen, Siqiang Luo

Large language models (LLMs) struggle with factual errors during inference due to the lack of sufficient training data and up-to-date knowledge, leading to the hallucination problem.

Graph Learning Hallucination +4

Dynamic Vision Mamba

1 code implementation7 Apr 2025 Mengxuan Wu, Zekai Li, Zhiyuan Liang, Moyang Li, Xuanlei Zhao, Samir Khaki, Zheng Zhu, Xiaojiang Peng, Konstantinos N. Plataniotis, Kai Wang, Wangbo Zhao, Yang You

For block redundancy, we allow each image to select SSM blocks dynamically based on an empirical observation that the inference speed of Mamba-based vision models is largely affected by the number of SSM blocks.

Mamba
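
The block-selection idea in the excerpt can be illustrated with a small routing sketch. This is not the paper's implementation; the router, the block stand-ins, and the dimensions are assumptions made purely for illustration.

    import torch
    from torch import nn

    # Illustrative sketch: a per-image router decides which blocks to execute;
    # skipped blocks act as identity. Real speedups require actually skipping the
    # block call, which this batch-friendly toy version does not do.
    class DynamicBlockStack(nn.Module):
        def __init__(self, blocks, dim):
            super().__init__()
            self.blocks = nn.ModuleList(blocks)          # stand-ins for SSM blocks
            self.router = nn.Linear(dim, len(blocks))    # per-block keep logits

        def forward(self, x):                            # x: [batch, tokens, dim]
            keep = (self.router(x.mean(dim=1)) > 0).float()
            for i, block in enumerate(self.blocks):
                m = keep[:, i].view(-1, 1, 1)
                x = m * block(x) + (1 - m) * x           # skipped block == identity
            return x

    stack = DynamicBlockStack([nn.Linear(32, 32) for _ in range(4)], dim=32)
    out = stack(torch.randn(2, 16, 32))                  # toy forward pass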

Slow-Fast Architecture for Video Multi-Modal Large Language Models

1 code implementation2 Apr 2025 Min Shi, Shihao Wang, Chieh-Yun Chen, Jitesh Jain, Kai Wang, Junjun Xiong, Guilin Liu, Zhiding Yu, Humphrey Shi

Balancing temporal resolution and spatial detail under limited compute budget remains a key challenge for video-based multi-modal large language models (MLLMs).

Video Understanding

ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion

no code implementations31 Mar 2025 Rana Muhammad Shahroz Khan, Dongwen Tang, Pingzhi Li, Kai Wang, Tianlong Chen

Parameter generation has emerged as a novel paradigm for neural network development, offering an alternative to traditional neural network training by synthesizing high-quality model weights directly.

Free-Lunch Color-Texture Disentanglement for Stylized Image Generation

no code implementations18 Mar 2025 Jiang Qin, Senmao Li, Alexandra Gomez-Villa, Shiqi Yang, Yaxing Wang, Kai Wang, Joost Van de Weijer

This paper introduces the first tuning-free approach to achieve free-lunch color-texture disentanglement in stylized T2I generation, addressing the need for independently controlled style elements for the Disentangled Stylized Image Generation (DisIG) problem.

Disentanglement Image Generation

Safety Evaluation and Enhancement of DeepSeek Models in Chinese Contexts

1 code implementation18 Mar 2025 Wenjing Zhang, Xuejiao Lei, Zhaoxiang Liu, Limin Han, Jiaojiao Zhao, Junting Guo, Zhenhong Long, Shu Yang, Meijuan An, Beibei Huang, Rongjia Du, Ning Wang, Kai Wang, Shiguo Lian

The objective is to assess the safety capabilities of these models in Chinese contexts both before and after distillation, and to further elucidate the adverse effects of distillation on model safety.

MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification

no code implementations16 Mar 2025 Zhaopan Xu, Pengfei Zhou, Jiaxin Ai, Wangbo Zhao, Kai Wang, Xiaojiang Peng, Wenqi Shao, Hongxun Yao, Kaipeng Zhang

Reasoning is an essential capacity for large language models (LLMs) to address complex tasks, where the identification of process errors is vital for improving this ability.

Multimodal Reasoning

PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models

no code implementations16 Mar 2025 Zhaopan Xu, Pengfei Zhou, Weidong Tang, Jiaxin Ai, Wangbo Zhao, Xiaojiang Peng, Kai Wang, Yang You, Wenqi Shao, Hongxun Yao, Kaipeng Zhang

In recent years, Multimodal Large Language Models (MLLMs) have demonstrated remarkable advancements in tasks such as visual question answering, visual understanding, and reasoning.

Machine Unlearning Privacy Preserving +2

Integrating Chain-of-Thought and Retrieval Augmented Generation Enhances Rare Disease Diagnosis from Clinical Notes

no code implementations15 Mar 2025 Da Wu, Zhanliang Wang, Quan Nguyen, Kai Wang

We also showed that RAG-driven CoT and CoT-driven RAG both outperform foundation models in candidate gene prioritization from clinical notes; in particular, both methods with DeepSeek backbone resulted in a top-10 gene accuracy of over 40% on Phenopacket-derived clinical notes.

RAG Retrieval +1

Leveraging Semantic Attribute Binding for Free-Lunch Color Control in Diffusion Models

no code implementations12 Mar 2025 Héctor Laria, Alexandra Gomez-Villa, Jiang Qin, Muhammad Atif Butt, Bogdan Raducanu, Javier Vazquez-Corral, Joost Van de Weijer, Kai Wang

In this work, we introduce ColorWave, a novel training-free approach that achieves exact RGB-level color control in diffusion models without fine-tuning.

Attribute Diversity +1

A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis

1 code implementation10 Mar 2025 Xiang Liu, Zhaoxiang Liu, Huan Hu, Zezhou Chen, Kohou Wang, Kai Wang, Shiguo Lian

We demonstrate the utility of the dataset by finetuning state-of-the-art multimodal models, showcasing significant improvements in crop disease diagnosis.

Question Answering

DAST: Difficulty-Adaptive Slow-Thinking for Large Reasoning Models

no code implementations6 Mar 2025 Yi Shen, Jian Zhang, Jieyun Huang, Shuming Shi, Wenjing Zhang, Jiangze Yan, Ning Wang, Kai Wang, Shiguo Lian

Recent advancements in slow-thinking reasoning models have shown exceptional performance in complex reasoning tasks.

Optimizing for the Shortest Path in Denoising Diffusion Model

1 code implementation CVPR 2025 Ping Chen, Xingpeng Zhang, Zhaoxiang Liu, Huan Hu, Xiang Liu, Kai Wang, Min Wang, Yanlin Qian, Shiguo Lian

In this research, we propose a novel denoising diffusion model based on shortest-path modeling that optimizes residual propagation to enhance both denoising efficiency and quality. Drawing on Denoising Diffusion Implicit Models (DDIM) and insights from graph theory, our model, termed the Shortest Path Diffusion Model (ShortDF), treats the denoising process as a shortest-path problem aimed at minimizing reconstruction error.

Denoising

Digital Player: Evaluating Large Language Models based Human-like Agent in Games

1 code implementation28 Feb 2025 Jiawei Wang, Kai Wang, Shaojie Lin, Runze Wu, Bihan Xu, Lingeng Jiang, Shiwei Zhao, Renyu Zhu, Haoyu Liu, Zhipeng Hu, Zhong Fan, Le Li, Tangjie Lyu, Changjie Fan

With the rapid advancement of Large Language Models (LLMs), LLM-based autonomous agents have shown the potential to function as digital employees, such as digital analysts, teachers, and programmers.

Decision Making

External Large Foundation Model: How to Efficiently Serve Trillions of Parameters for Online Ads Recommendation

no code implementations20 Feb 2025 Mingfu Liang, Xi Liu, Rong Jin, Boyang Liu, Qiuling Suo, Qinghai Zhou, Song Zhou, Laming Chen, Hua Zheng, Zhiyuan Li, Shali Jiang, Jiyan Yang, Xiaozhen Xia, Fan Yang, Yasmine Badr, Ellie Wen, Shuyu Xu, Hansey Chen, Zhengyu Zhang, Jade Nie, Chunzhi Yang, Zhichen Zeng, Weilin Zhang, Xingliang Huang, Qianru Li, Shiquan Wang, Evelyn Lyu, Wenjing Lu, Rui Zhang, Wenjun Wang, Jason Rudy, Mengyue Hang, Kai Wang, Yinbin Ma, Shuaiwen Wang, Sihan Zeng, Tongyi Tang, Xiaohan Wei, Longhao Jin, Jamey Zhang, Marcus Chen, Jiayi Xu, Angie Huang, Xihuan Zeng, Chi Zhang, Zhengli Zhao, Jared Yang, Qiang Jin, Xian Chen, Amit Anand Amlesahwaram, Lexi Song, Liang Luo, Yuchen Hao, Nan Xiao, Yavuz Yetim, Luoshang Pan, Gaoxiang Liu, Yuxi Hu, Yuzhen Huang, Jackie Xu, Rich Zhu, Xin Zhang, Yiqun Liu, Hang Yin, Yuxin Chen, Buyun Zhang, Xiaoyi Liu, Xingyuan Wang, Wenguang Mao, Zhijing Li, Zhehui Zhou, Feifan Gu, Qin Huang, Chonglin Sun, Nancy Yu, Shuo Gu, Shupin Mao, Benjamin Au, Jingzheng Qin, Peggy Yao, Jae-Woo Choi, Bin Gao, Ernest Wang, Lei Zhang, Wen-Yen Chen, Ted Lee, Jay Zha, Yi Meng, Alex Gong, Edison Gao, Alireza Vahdatpour, Yiping Han, Yantao Yao, Toshinari Kureha, Shuo Chang, Musharaf Sultan, John Bocharov, Sagar Chordia, Xiaorui Gan, Peng Sun, Rocky Liu, Bo Long, Wenlin Chen, Santanu Kolay, Huayu Li

Second, large-volume data arrive in a streaming mode with data distributions dynamically shifting, as new users/ads join and existing users/ads leave the system.

Data Augmentation

What's in a Query: Polarity-Aware Distribution-Based Fair Ranking

no code implementations17 Feb 2025 Aparna Balagopalan, Kai Wang, Olawale Salaudeen, Asia Biega, Marzyeh Ghassemi

Unlike prior methods that operate on expected amortized attention for each individual, we define new divergence-based measures for attention distribution-based fairness in ranking (DistFaiR), characterizing unfairness as the divergence between the distribution of attention and relevance corresponding to an individual over time.

Fairness
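
A toy rendering of the distribution-based view in the excerpt, assuming KL divergence as the divergence measure; the paper's actual divergence choice and normalization may differ, and the numbers below are invented.

    import numpy as np

    # Toy sketch: per-individual unfairness as the divergence between the distribution
    # of attention received over time and the distribution of relevance over time.
    def kl_divergence(p, q, eps=1e-12):
        p = np.asarray(p, dtype=float) + eps
        q = np.asarray(q, dtype=float) + eps
        p, q = p / p.sum(), q / q.sum()
        return float(np.sum(p * np.log(p / q)))

    attention_over_time = [0.10, 0.05, 0.40]   # attention received in each ranking round
    relevance_over_time = [0.20, 0.20, 0.20]   # relevance in the corresponding rounds
    unfairness = kl_divergence(attention_over_time, relevance_over_time)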

Safety Evaluation of DeepSeek Models in Chinese Contexts

no code implementations16 Feb 2025 Wenjing Zhang, Xuejiao Lei, Zhaoxiang Liu, Ning Wang, Zhenhong Long, Peijun Yang, Jiaojiao Zhao, Minjie Hua, Chaoyang Ma, Kai Wang, Shiguo Lian

In response to this gap, this study introduces CHiSafetyBench, a Chinese-specific safety evaluation benchmark.

Enhance-A-Video: Better Generated Video for Free

1 code implementation11 Feb 2025 Yang Luo, Xuanlei Zhao, Mengzhao Chen, Kaipeng Zhang, Wenqi Shao, Kai Wang, Zhangyang Wang, Yang You

DiT-based video generation has achieved remarkable results, but enhancing existing models remains relatively unexplored.

Video Generation

One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt

1 code implementation23 Jan 2025 Tao Liu, Kai Wang, Senmao Li, Joost Van de Weijer, Fahad Shahbaz Khan, Shiqi Yang, Yaxing Wang, Jian Yang, Ming-Ming Cheng

Drawing inspiration from the inherent context consistency, we propose a novel training-free method for consistent text-to-image (T2I) generation, termed "One-Prompt-One-Story" (1Prompt1Story).

Story Generation Text to Image Generation +1

Recurrent Diffusion for Large-Scale Parameter Generation

1 code implementation20 Jan 2025 Kai Wang, Dongwen Tang, Wangbo Zhao, Yang You

The recurrent model's outputs, as conditions, are then fed into a diffusion model to generate the neural network parameters.
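
The conditioning pattern in the excerpt, with recurrent outputs feeding a diffusion model, can be sketched as follows. Everything here, including the GRU, the token layout, and the placeholder sampler, is an assumption made for illustration rather than the paper's architecture.

    import torch
    from torch import nn

    rnn = nn.GRU(input_size=64, hidden_size=128, batch_first=True)  # stand-in recurrent model

    def diffusion_sample(conditions, steps=50):
        # placeholder for a conditional diffusion sampler that would denoise
        # parameter tokens guided by the recurrent conditions
        return torch.zeros(conditions.shape[0], conditions.shape[1], 64)

    param_tokens = torch.randn(1, 256, 64)   # tokenized network parameters (toy)
    conditions, _ = rnn(param_tokens)        # recurrent outputs become the conditions
    generated_tokens = diffusion_sample(conditions)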

Matching Free Depth Recovery from Structured Light

no code implementations13 Jan 2025 Zhuohang Yu, Kai Wang, Juyong Zhang

We present a novel approach for depth estimation from images captured by structured light systems.

Depth Estimation

MedCT: A Clinical Terminology Graph for Generative AI Applications in Healthcare

no code implementations11 Jan 2025 Ye Chen, Dongdong Huang, Haoyun Xu, Cong Fu, Lin Sheng, Qingli Zhou, Yuqiang Shen, Kai Wang

We introduce the world's first clinical terminology for the Chinese healthcare community, namely MedCT, accompanied by a clinical foundation model MedBERT and an entity linking model MedLink.

Diagnostic Entity Linking +1

InpDiffusion: Image Inpainting Localization via Conditional Diffusion Models

no code implementations6 Jan 2025 Kai Wang, Shaozhang Niu, Qixian Hao, Jiwei Zhang

Balancing the diffusion model's stochastic sampling with edge supervision of tampered image regions mitigates the risk of incorrect predictions from overconfidence and prevents the loss of subtle boundaries that can result from overly stochastic processes.

Denoising Image Inpainting

On LLM-Enhanced Mixed-Type Data Imputation with High-Order Message Passing

1 code implementation4 Jan 2025 Jianwei Wang, Kai Wang, Ying Zhang, Wenjie Zhang, Xiwei Xu, Xuemin Lin

Missing data imputation, which aims to impute the missing values in the raw datasets to achieve the completeness of datasets, is crucial for modern data-driven models like large language models (LLMs) and has attracted increasing interest over the past decades.

Chunking Imputation +1

One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models

no code implementations CVPR 2025 Senmao Li, Lei Wang, Kai Wang, Tao Liu, Jiehang Xie, Joost Van de Weijer, Fahad Shahbaz Khan, Shiqi Yang, Yaxing Wang, Jian Yang

Our findings suggest that, for T2I diffusion models, decoders are more adept at capturing richer and more explicit semantic information, while encoders can be effectively shared across decoders from diverse time steps. Based on these observations, we introduce the first Time-independent Unified Encoder (TiUE) for the student model UNet architecture, which is a loop-free image generation approach for distilling T2I diffusion models.

Computational Efficiency Diversity +1

PowerRadio: Manipulate Sensor Measurement via Power GND Radiation

no code implementations24 Dec 2024 Yan Jiang, Xiaoyu Ji, Yancheng Jiang, Kai Wang, Chenren Xu, Wenyuan Xu

Sensors are key components enabling various applications, e.g., home intrusion detection and environmental monitoring.

Intrusion Detection

CALLIC: Content Adaptive Learning for Lossless Image Compression

no code implementations23 Dec 2024 Daxin Li, Yuanchao Bai, Kai Wang, Junjun Jiang, Xianming Liu, Wen Gao

To address this challenge, we explore the connection between the Minimum Description Length (MDL) principle and Parameter-Efficient Transfer Learning (PETL), leading to the development of a novel content-adaptive approach for learned lossless image compression, dubbed CALLIC.

Image Compression Transfer Learning

TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction

no code implementations22 Dec 2024 Xuying Zhang, Yutong Liu, Yangguang Li, Renrui Zhang, Yufei Liu, Kai Wang, Wanli Ouyang, Zhiwei Xiong, Peng Gao, Qibin Hou, Ming-Ming Cheng

We present TAR3D, a novel framework that consists of a 3D-aware Vector Quantized-Variational AutoEncoder (VQ-VAE) and a Generative Pre-trained Transformer (GPT) to generate high-quality 3D assets.

Image to 3D Text to 3D

Covariances for Free: Exploiting Mean Distributions for Federated Learning with Pre-Trained Models

no code implementations18 Dec 2024 Dipam Goswami, Simone Magistri, Kai Wang, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost Van de Weijer

Our method, which only uses first-order statistics in the form of class means communicated by clients to the server, incurs only a fraction of the communication costs required by methods based on communicating second-order statistics.

Federated Learning
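
A minimal sketch of the first-order-statistics idea in the excerpt: clients send only per-class feature means (and counts), and the server merges them into global class prototypes. Feature extraction by the pre-trained model is abstracted away, and the function names are illustrative rather than the paper's.

    import numpy as np

    def client_class_means(features, labels):
        # per-class mean and sample count computed locally on a client
        return {c: (features[labels == c].mean(axis=0), int((labels == c).sum()))
                for c in np.unique(labels)}

    def server_merge(per_client_stats):
        # weighted merge of client class means into global class prototypes
        merged = {}
        for stats in per_client_stats:
            for c, (mean, n) in stats.items():
                s, k = merged.get(c, (np.zeros_like(mean), 0))
                merged[c] = (s + mean * n, k + n)
        return {c: s / n for c, (s, n) in merged.items()}

    feats, labels = np.random.randn(10, 4), np.array([0] * 5 + [1] * 5)
    global_means = server_merge([client_class_means(feats, labels)])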

Single-View Graph Contrastive Learning with Soft Neighborhood Awareness

1 code implementation12 Dec 2024 Qingqiang Sun, Chaoqi Chen, Ziyue Qiao, Xubin Zheng, Kai Wang

Most graph contrastive learning (GCL) methods heavily rely on cross-view contrast, thus facing several concomitant challenges, such as the complexity of designing effective augmentations, the potential for information loss between views, and increased computational costs.

Contrastive Learning Semantic Similarity +2

Multi-Modal Environmental Sensing Based Path Loss Prediction for V2I Communications

no code implementations10 Dec 2024 Kai Wang, Li Yu, Jianhua Zhang, Yixuan Tian, Eryu Guo, Guangyi Liu

The stability and reliability of wireless data transmission in vehicular networks face significant challenges due to the high dynamics of path loss caused by the complexity of rapidly changing environments.

A multimodal ensemble approach for clear cell renal cell carcinoma treatment outcome prediction

no code implementations10 Dec 2024 Meixu Chen, Kai Wang, Payal Kapur, James Brugarolas, Raquibul Hannan, Jing Wang

Using predicted risk medians to stratify high- and low-risk groups, log-rank tests showed improved performance in both OS and DFS compared to single-modality models.

feature selection Prognosis

A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs

1 code implementation CVPR 2025 Wangbo Zhao, Yizeng Han, Jiasheng Tang, Zhikai Li, Yibing Song, Kai Wang, Zhangyang Wang, Yang You

Vision-language models (VLMs) have shown remarkable success across various multi-modal tasks, yet large VLMs encounter significant efficiency challenges due to processing numerous visual tokens.

Visual Question Answering

Deep Sparse Latent Feature Models for Knowledge Graph Completion

no code implementations24 Nov 2024 Haotian Li, Rui Zhang, Lingzhi Wang, Bin Yu, Youwei Wang, Yuliang Wei, Kai Wang, Richard Yi Da Xu, Bailing Wang

Recent progress in knowledge graph completion (KGC) has focused on text-based approaches to address the challenges of large-scale knowledge graphs (KGs).

Link Prediction

Computational metaoptics for imaging

no code implementations14 Nov 2024 Charles Roques-Carmes, Kai Wang, Yuanmu Yang, Arka Majumdar, Zin Lin

Advanced applications enabled by computational metaoptics are highlighted, including phase imaging and quantum state measurement, which benefit from the metasurfaces' ability to manipulate complex light fields and the computational algorithms' capacity to reconstruct high-dimensional information.

Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis

1 code implementation11 Nov 2024 Taihang Hu, Linxuan Li, Joost Van de Weijer, Hongcheng Gao, Fahad Shahbaz Khan, Jian Yang, Ming-Ming Cheng, Kai Wang, Yaxing Wang

In this paper, we define semantic binding as the task of associating a given object with its attribute, termed attribute binding, or linking it to other related sub-objects, referred to as object binding.

Attribute Image Generation +1

Multi-Class Textual-Inversion Secretly Yields a Semantic-Agnostic Classifier

1 code implementation29 Oct 2024 Kai Wang, Fei Yang, Bogdan Raducanu, Joost Van de Weijer

However, in many realistic scenarios, we only have access to a few samples and knowledge of the class names (e.g., when considering instances of classes).

Prompt Learning

Primal-Dual Spectral Representation for Off-policy Evaluation

no code implementations23 Oct 2024 Yang Hu, Tianyi Chen, Na Li, Kai Wang, Bo Dai

We highlight that our algorithm, SpectralDICE, is the first to leverage the linear representation of primal-dual variables that is both computation and sample efficient, the performance of which is supported by a rigorous theoretical sample complexity guarantee and a thorough empirical evaluation on various benchmarks.

Off-policy evaluation

Learning Lossless Compression for High Bit-Depth Volumetric Medical Image

no code implementations23 Oct 2024 Kai Wang, Yuanchao Bai, Daxin Li, Deming Zhai, Junjun Jiang, Xianming Liu

The BD-LVIC framework skillfully divides the high bit-depth volume into two lower bit-depth segments: the Most Significant Bit-Volume (MSBV) and the Least Significant Bit-Volume (LSBV).

Image Compression
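
The bit-depth split described in the excerpt amounts to simple bit arithmetic. The 8-bit split point below is an assumption for illustration; the actual split and the downstream coding pipeline in BD-LVIC may differ.

    import numpy as np

    def split_bit_depth(volume_u16, low_bits=8):
        msbv = (volume_u16 >> low_bits).astype(np.uint8)               # Most Significant Bit-Volume
        lsbv = (volume_u16 & ((1 << low_bits) - 1)).astype(np.uint8)   # Least Significant Bit-Volume
        return msbv, lsbv

    def merge_bit_depth(msbv, lsbv, low_bits=8):
        # lossless inverse of the split
        return (msbv.astype(np.uint16) << low_bits) | lsbv.astype(np.uint16)

    vol = np.random.randint(0, 2**12, size=(4, 4, 4), dtype=np.uint16)  # e.g. 12-bit data in uint16
    msbv, lsbv = split_bit_depth(vol)
    assert np.array_equal(merge_bit_depth(msbv, lsbv), vol)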

Assessing Open-world Forgetting in Generative Image Model Customization

no code implementations18 Oct 2024 Héctor Laria, Alex Gomez-Villa, Kai Wang, Bogdan Raducanu, Joost Van de Weijer

Our work presents the first systematic investigation into open-world forgetting in diffusion models, focusing on semantic and appearance drift of representations.

Image Generation Zero-Shot Learning

GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning

1 code implementation17 Oct 2024 Guibin Zhang, Haonan Dong, Yuchen Zhang, ZHIXUN LI, Dingshuo Chen, Kai Wang, Tianlong Chen, Yuxuan Liang, Dawei Cheng, Kun Wang

Training high-quality deep models necessitates vast amounts of data, resulting in overwhelming computational and memory demands.

Graph Embedding

Towards Graph Foundation Models: Training on Knowledge Graphs Enables Transferability to General Graphs

no code implementations16 Oct 2024 Kai Wang, Siqiang Luo, Caihua Shan, Yifei Shen

In this paper, we introduce SCR, a unified graph reasoning framework designed to train on knowledge graphs and effectively generalize across a wide range of graph tasks and domains.

Knowledge Graphs Zero-Shot Learning

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

1 code implementation14 Oct 2024 Jiacheng Chen, Tianhao Liang, Sherman Siu, Zhengqing Wang, Kai Wang, YuBo Wang, Yuansheng Ni, Wang Zhu, Ziyan Jiang, Bohan Lyu, Dongfu Jiang, Xuan He, YuAn Liu, Hexiang Hu, Xiang Yue, Wenhu Chen

We evaluate a wide variety of frontier vision-language models on MEGA-Bench to understand their capabilities across these dimensions.

Dynamic Diffusion Transformer

2 code implementations4 Oct 2024 Wangbo Zhao, Yizeng Han, Jiasheng Tang, Kai Wang, Yibing Song, Gao Huang, Fan Wang, Yang You

In addition, we design a Spatial-wise Dynamic Token (SDT) strategy to avoid redundant computation at unnecessary spatial locations.

Image Generation

Causal Deciphering and Inpainting in Spatio-Temporal Dynamics via Diffusion Model

no code implementations29 Sep 2024 Yifan Duan, Jian Zhao, Pengcheng, Junyuan Mao, Hao Wu, Jingyu Xu, Shilong Wang, Caoyuan Ma, Kai Wang, Kun Wang, Xuelong Li

To this end, we establish a causal framework for ST predictions, termed CaPaint, which aims to identify causal regions in data and endow the model with causal reasoning ability in a two-stage process.

Causal Discovery Image Inpainting

What is the Right Notion of Distance between Predict-then-Optimize Tasks?

no code implementations11 Sep 2024 Paula Rodriguez-Diaz, Lingkai Kong, Kai Wang, David Alvarez-Melis, Milind Tambe

Comparing datasets is a fundamental task in machine learning, essential for various learning paradigms, from evaluating train and test datasets for model generalization to using dataset similarity for detecting data drift.

Informativeness

Distilling Long-tailed Datasets

1 code implementation CVPR 2025 Zhenghao Zhao, Haoxuan Wang, Yuzhang Shang, Kai Wang, Yan Yan

It reduces the distance between the student and the biased expert trajectories and prevents the tail class bias from being distilled to the synthetic dataset.

Dataset Distillation Efficient Neural Network

Real-Time Video Generation with Pyramid Attention Broadcast

1 code implementation22 Aug 2024 Xuanlei Zhao, Xiaolong Jin, Kai Wang, Yang You

We present Pyramid Attention Broadcast (PAB), a real-time, high-quality, and training-free approach for DiT-based video generation.

Video Generation

GRIF-DM: Generation of Rich Impression Fonts using Diffusion Models

1 code implementation14 Aug 2024 Lei Kang, Fei Yang, Kai Wang, Mohamed Ali Souibgui, Lluis Gomez, Alicia Fornés, Ernest Valveny, Dimosthenis Karatzas

In this paper, we introduce a diffusion-based method, termed GRIF-DM, to generate fonts that vividly embody specific impressions, utilizing an input consisting of a single letter and a set of descriptive impression keywords.

Descriptive Font Generation

The Dial-a-Ride Problem with Limited Pickups per Trip

no code implementations14 Aug 2024 Boshuai Zhao, Kai Wang, Wenchao Wei, Roel Leus

This results in the Dial-a-Ride Problem with Limited Pickups per Trip (DARP-LPT).

ARC

LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library

no code implementations12 Aug 2024 Tianhao Yu, Cai Yao, Zhuorui Sun, Feng Shi, Lin Zhang, Kangjie Lyu, Xuan Bai, Andong Liu, Xicheng Zhang, Jiali Zou, Wenshou Wang, Chris Lai, Kai Wang

To the best of our knowledge, this is the first successful demonstration of the capability of a pre-trained language model on virtual lipids and its effectiveness in downstream tasks using web-lab data.

Language Modeling Language Modelling +3

Prioritize Alignment in Dataset Distillation

1 code implementation6 Aug 2024 Zekai Li, Ziyao Guo, Wangbo Zhao, Tianle Zhang, Zhi-Qi Cheng, Samir Khaki, Kaipeng Zhang, Ahmad Sajedi, Konstantinos N Plataniotis, Kai Wang, Yang You

To achieve this, existing methods use the agent model to extract information from the target dataset and embed it into the distilled dataset.

Dataset Distillation

More Than Positive and Negative: Communicating Fine Granularity in Medical Diagnosis

no code implementations5 Aug 2024 Xiangyu Peng, Kai Wang, Jianfei Yang, Yingying Zhu, Yang You

Specifically, we devise a division rule based on medical knowledge to divide positive cases into two subcategories, namely atypical positive and typical positive.

Medical Diagnosis

TS-SAM: Fine-Tuning Segment-Anything Model for Downstream Tasks

1 code implementation3 Aug 2024 Yang Yu, Chen Xu, Kai Wang

Adapter-based fine-tuning has been studied for improving the performance of SAM on downstream tasks.

Decoder parameter-efficient fine-tuning

Piculet: Specialized Models-Guided Hallucination Decrease for MultiModal Large Language Models

no code implementations2 Aug 2024 Kohou Wang, Xiang Liu, Zhaoxiang Liu, Kai Wang, Shiguo Lian

Multimodal Large Language Models (MLLMs) have made significant progress in bridging the gap between visual and language modalities.

Hallucination

MVPbev: Multi-view Perspective Image Generation from BEV with Test-time Controllability and Generalizability

1 code implementation28 Jul 2024 Buyu Liu, Kai Wang, Yansong Liu, Jun Bao, Tingting Han, Jun Yu

Unlike prior methods that neglect layout consistency, lack the ability to handle detailed text prompts, or are incapable of generalizing to unseen viewpoints, MVPbev simultaneously generates cross-view consistent images of different perspective views with a two-stage design, allowing object-level control and novel view generation at test-time.

Image Generation

Exemplar-free Continual Representation Learning via Learnable Drift Compensation

1 code implementation11 Jul 2024 Alex Gomez-Villa, Dipam Goswami, Kai Wang, Andrew D. Bagdanov, Bartlomiej Twardowski, Joost Van de Weijer

Prototype-based approaches, when continually updated, face the critical issue of semantic drift, whereby the old class prototypes drift to different positions in the new feature space.

class-incremental learning Class Incremental Learning +3

ColorPeel: Color Prompt Learning with Diffusion Models via Color and Shape Disentanglement

1 code implementation9 Jul 2024 Muhammad Atif Butt, Kai Wang, Javier Vazquez-Corral, Joost Van de Weijer

To overcome this, we generate several basic geometric objects in the target color, allowing for color and shape disentanglement during the color prompt learning.

Attribute Disentanglement +1

CUPID: Improving Battle Fairness and Position Satisfaction in Online MOBA Games with a Re-matchmaking System

no code implementations28 Jun 2024 Ge Fan, Chaoyun Zhang, Kai Wang, Yingjie Li, Junyang Chen, Zenglin Xu

Enhancing the gaming experience requires a deep understanding of player behavior, and a crucial aspect of MOBA games is matchmaking, which aims to assemble teams of comparable skill levels.

Fairness Position

Methodology of Adapting Large English Language Models for Specific Cultural Contexts

no code implementations26 Jun 2024 Wenjing Zhang, Siqi Xiao, Xuejiao Lei, Ning Wang, Huazheng Zhang, Meijuan An, Bikun Yang, Zhaoxiang Liu, Kai Wang, Shiguo Lian

The rapid growth of large language models (LLMs) has emerged as a prominent trend in the field of artificial intelligence.

OTCE: Hybrid SSM and Attention with Cross Domain Mixture of Experts to construct Observer-Thinker-Conceiver-Expresser

1 code implementation24 Jun 2024 Jingze Shi, Ting Xie, Bingheng Wu, Chunjun Zheng, Kai Wang

Recent research has shown that combining Mamba, with its selective state space, and the Transformer architecture, with its quadratic self-attention mechanism, outperforms using either architecture alone in language modeling tasks.

Language Modeling Language Modelling +3

First-Order Methods for Linearly Constrained Bilevel Optimization

no code implementations18 Jun 2024 Guy Kornowski, Swati Padmanabhan, Kai Wang, Zhe Zhang, Suvrit Sra

For linear equality constraints, we attain $\epsilon$-stationarity in $\widetilde{O}(\epsilon^{-2})$ gradient oracle calls, which is nearly optimal.

Bilevel Optimization
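
For context, a generic linearly constrained bilevel problem of the kind referenced in the excerpt can be written as follows; the notation is illustrative rather than taken from the paper:

$$\min_{x} \; f\bigl(x, y^{*}(x)\bigr) \quad \text{s.t.} \quad y^{*}(x) \in \arg\min_{y} \bigl\{\, g(x, y) \;:\; A y = b \,\bigr\},$$

where the quoted result states that an $\epsilon$-stationary point of the outer (hyper-)objective is reached within $\widetilde{O}(\epsilon^{-2})$ gradient oracle calls in the linear equality-constrained case.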

Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality

1 code implementation13 Jun 2024 Tianle Zhang, Langtian Ma, Yuchen Yan, Yuchen Zhang, Kai Wang, Yue Yang, Ziyao Guo, Wenqi Shao, Yang You, Yu Qiao, Ping Luo, Kaipeng Zhang

To address these challenges, this paper introduces the Text-to-Video Human Evaluation (T2VHE) protocol, a comprehensive and standardized protocol for T2V models.

Aligning Large Language Models with Representation Editing: A Control Perspective

1 code implementation10 Jun 2024 Lingkai Kong, Haorui Wang, Wenhao Mu, Yuanqi Du, Yuchen Zhuang, Yifei Zhou, Yue Song, Rongzhi Zhang, Kai Wang, Chao Zhang

To achieve alignment for specific objectives, we introduce external control signals into the state space of this language dynamical system.

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

2 code implementations3 Jun 2024 YuBo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, Tianle Li, Max Ku, Kai Wang, Alex Zhuang, Rongqi Fan, Xiang Yue, Wenhu Chen

In the age of large-scale language models, benchmarks like the Massive Multitask Language Understanding (MMLU) have been pivotal in pushing the boundaries of what AI can achieve in language comprehension and reasoning across diverse domains.

MMLU Multi-task Language Understanding

Dataset Growth

1 code implementation28 May 2024 Ziheng Qin, Zhaopan Xu, Yukun Zhou, Zangwei Zheng, Zebang Cheng, Hao Tang, Lei Shang, Baigui Sun, Xiaojiang Peng, Radu Timofte, Hongxun Yao, Kai Wang, Yang You

To tackle this challenge, we propose InfoGrowth, an efficient online algorithm for data cleaning and selection, resulting in a growing dataset that keeps up to date with awareness of cleanliness and diversity.

Diversity

FAITH: Frequency-domain Attention In Two Horizons for Time Series Forecasting

1 code implementation22 May 2024 RuiQi Li, Maowei Jiang, Kai Wang, Kaiduo Feng, Quangao Liu, Yue Sun, Xiufang Zhou

Time Series Forecasting plays a crucial role in various fields such as industrial equipment maintenance, meteorology, energy consumption, traffic flow and financial investment.

Time Series Time Series Forecasting

Advancing Head and Neck Cancer Survival Prediction via Multi-Label Learning and Deep Model Interpretation

no code implementations9 May 2024 Meixu Chen, Kai Wang, Jing Wang

We also present Grad-TEAM, a Gradient-weighted Time-Event Activation Mapping approach specifically developed for deep survival model visual explanation, to generate patient-specific time-to-event activation maps.

Decision Making Multi-Label Learning +3

TransAnaNet: Transformer-based Anatomy Change Prediction Network for Head and Neck Cancer Patient Radiotherapy

no code implementations9 May 2024 Meixu Chen, Kai Wang, Michael Dohopolski, Howard Morgan, David Sher, Jing Wang

The predicted image from the proposed method yielded the best similarity to the real image (CBCT21) over pCT, CBCT01, and predicted CBCTs from other comparison models.

Anatomy SSIM

Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond

1 code implementation6 May 2024 Zheng Zhu, XiaoFeng Wang, Wangbo Zhao, Chen Min, Nianchen Deng, Min Dou, Yuqi Wang, Botian Shi, Kai Wang, Chi Zhang, Yang You, Zhaoxiang Zhang, Dawei Zhao, Liang Xiao, Jian Zhao, Jiwen Lu, Guan Huang

General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI), serving as the cornerstone for various applications ranging from virtual environments to decision-making systems.

Autonomous Driving Decision Making +2

ATOM: Attention Mixer for Efficient Dataset Distillation

1 code implementation2 May 2024 Samir Khaki, Ahmad Sajedi, Kai Wang, Lucy Z. Liu, Yuri A. Lawryshyn, Konstantinos N. Plataniotis

To address these challenges in dataset distillation, we propose the ATtentiOn Mixer (ATOM) module to efficiently distill large datasets using a mixture of channel and spatial-wise attention in the feature matching process.

Dataset Distillation Neural Architecture Search

GroupedMixer: An Entropy Model with Group-wise Token-Mixers for Learned Image Compression

no code implementations2 May 2024 Daxin Li, Yuanchao Bai, Kai Wang, Junjun Jiang, Xianming Liu, Wen Gao

To further expedite the network inference, we introduce context cache optimization to GroupedMixer, which caches attention activation values in cross-group token-mixers and avoids complex and duplicated computation.

Image Compression

LocInv: Localization-aware Inversion for Text-Guided Image Editing

1 code implementation2 May 2024 Chuanming Tang, Kai Wang, Fei Yang, Joost Van de Weijer

Large-scale Text-to-Image (T2I) diffusion models demonstrate significant generation capabilities based on textual prompts.

Denoising text-guided-image-editing

$\nu$-DBA: Neural Implicit Dense Bundle Adjustment Enables Image-Only Driving Scene Reconstruction

no code implementations29 Apr 2024 Yunxuan Mao, Bingqi Shen, Yifei Yang, Kai Wang, Rong Xiong, Yiyi Liao, Yue Wang

The joint optimization of the sensor trajectory and 3D map is a crucial characteristic of bundle adjustment (BA), essential for autonomous driving.

Autonomous Driving Novel View Synthesis +2

EasyTrack: Efficient and Compact One-stream 3D Point Clouds Tracker

no code implementations9 Apr 2024 Baojie Fan, Wuyang Zhou, Kai Wang, Shijun Zhou, Fengyu Xu, Jiandong Tian

Most 3D single object trackers (SOT) in point clouds follow the two-stream multi-stage 3D Siamese or motion tracking paradigms, which process the template and search area point clouds with two parallel branches, built on supervised point cloud backbones.

A dataset of primary nasopharyngeal carcinoma MRI with multi-modalities segmentation

no code implementations4 Apr 2024 Yin Li, Qi Chen, Kai Wang, Meige Li, Liping Si, Yingwei Guo, Yu Xiong, Qixing Wang, Yang Qin, Ling Xu, Patrick van der Smagt, Jun Tang, Nutan Chen

Multi-modality magnetic resonance imaging data with various sequences facilitate the early diagnosis, tumor segmentation, and disease staging in the management of nasopharyngeal carcinoma (NPC).

Management Tumor Segmentation

GNSS Spoofing Detection by Crowdsourcing Double Differential Pseudorange Spatial Distribution

no code implementations3 Apr 2024 Xin Chen, Kai Wang

It is widely known that spoofing is a major threat that adversely impacts the reliability and accuracy of GNSS applications.

Assessing the Utility of Large Language Models for Phenotype-Driven Gene Prioritization in Rare Genetic Disorder Diagnosis

no code implementations21 Mar 2024 Junyoung Kim, Jingye Yang, Kai Wang, Chunhua Weng, Cong Liu

A similar increasing trend was observed for the task completion rate, with complicated prompts more likely to increase task completeness in models smaller than GPT-4.

Knowledge Graphs

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation

1 code implementation18 Mar 2024 Wangbo Zhao, Jiasheng Tang, Yizeng Han, Yibing Song, Kai Wang, Gao Huang, Fan Wang, Yang You

Existing parameter-efficient fine-tuning (PEFT) methods have achieved significant success on vision transformers (ViTs) adaptation by improving parameter efficiency.

Mixture-of-Experts parameter-efficient fine-tuning +2

Difference Learning for Air Quality Forecasting Transport Emulation

no code implementations22 Feb 2024 Reed River Chen, Christopher Ribaudo, Jennifer Sleeman, Chace Ashcraft, Collin Kofroth, Marisa Hughes, Ivanka Stajner, Kevin Viner, Kai Wang

Due to a recent increase in extreme air quality events, both globally and locally in the United States, finer resolution air quality forecasting guidance is needed to effectively adapt to these events.

Neural Network Diffusion

2 code implementations20 Feb 2024 Kai Wang, Dongwen Tang, Boya Zeng, Yida Yin, Zhaopan Xu, Yukun Zhou, Zelin Zang, Trevor Darrell, Zhuang Liu, Yang You

The autoencoder extracts latent representations of a subset of the trained neural network parameters.

Decoder

LLM as Prompter: Low-resource Inductive Reasoning on Arbitrary Knowledge Graphs

1 code implementation19 Feb 2024 Kai Wang, Yuwei Xu, Zhiyong Wu, Siqiang Luo

Knowledge Graph (KG) inductive reasoning, which aims to infer missing facts from new KGs that are not seen during training, has been widely adopted in various applications.

Knowledge Graphs

Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching

2 code implementations7 Feb 2024 Yuchen Zhang, Tianle Zhang, Kai Wang, Ziyao Guo, Yuxuan Liang, Xavier Bresson, Wei Jin, Yang You

Specifically, we employ a curriculum learning strategy to train expert trajectories with more diverse supervision signals from the original graph, and then effectively transfer the information into the condensed graph with expanding window matching.

Two Trades is not Baffled: Condensing Graph via Crafting Rational Gradient Matching

1 code implementation7 Feb 2024 Tianle Zhang, Yuchen Zhang, Kun Wang, Kai Wang, Beining Yang, Kaipeng Zhang, Wenqi Shao, Ping Liu, Joey Tianyi Zhou, Yang You

Training on large-scale graphs has achieved remarkable results in graph representation learning, but its cost and storage have raised growing concerns.

Graph Representation Learning

Two Heads Are Better Than One: Boosting Graph Sparse Training via Semantic and Topological Awareness

no code implementations2 Feb 2024 Guibin Zhang, Yanwei Yue, Kun Wang, Junfeng Fang, Yongduo Sui, Kai Wang, Yuxuan Liang, Dawei Cheng, Shirui Pan, Tianlong Chen

Specifically, GST initially constructs a topology & semantic anchor at a low training cost, followed by performing dynamic sparse training to align the sparse graph with the anchor.

Adversarial Defense Graph Learning

Sequential Model for Predicting Patient Adherence in Subcutaneous Immunotherapy for Allergic Rhinitis

1 code implementation21 Jan 2024 Yin Li, Yu Xiong, Wenxin Fan, Kai Wang, Qingqing Yu, Liping Si, Patrick van der Smagt, Jun Tang, Nutan Chen

Enhancing patient adherence to maximize the benefit of allergen immunotherapy (AIT) plays a crucial role in the management of AIT.

Management Prediction

Advancing Large Multi-modal Models with Explicit Chain-of-Reasoning and Visual Question Generation

no code implementations18 Jan 2024 Kohei Uehara, Nabarun Goswami, Hanqin Wang, Toshiaki Baba, Kohtaro Tanaka, Tomohiro Hashimoto, Kai Wang, Rei Ito, Takagi Naoya, Ryo Umagami, Yingyi Wen, Tanachai Anakewat, Tatsuya Harada

The increasing demand for intelligent systems capable of interpreting and reasoning about visual content requires the development of large Vision-and-Language Models (VLMs) that are not only accurate but also have explicit reasoning capabilities.

Caption Generation Language Modeling +5

MuST: Maximizing Latent Capacity of Spatial Transcriptomics Data

1 code implementation15 Jan 2024 Zelin Zang, Liangyu Li, Yongjie Xu, Chenrui Duan, Kai Wang, Yang You, Yi Sun, Stan Z. Li

MuST integrates the multi-modality information contained in the ST data effectively into a uniform latent space to provide a foundation for all the downstream tasks.

Understanding YTHDF2-mediated mRNA Degradation By m6A-BERT-Deg

1 code implementation15 Jan 2024 Ting-He Zhang, Sumin Jo, Michelle Zhang, Kai Wang, Shou-Jiang Gao, Yufei Huang

N6-methyladenosine (m6A) is the most abundant mRNA modification within mammalian cells, holding pivotal significance in the regulation of mRNA stability, translation, and splicing.

GestaltMML: Enhancing Rare Genetic Disease Diagnosis through Multimodal Machine Learning Combining Facial Images and Clinical Texts

2 code implementations23 Dec 2023 Da Wu, Jingye Yang, Cong Liu, Tzung-Chien Hsieh, Elaine Marchi, Justin Blair, Peter Krawitz, Chunhua Weng, Wendy Chung, Gholson J. Lyon, Ian D. Krantz, Jennifer M. Kalish, Kai Wang

Many rare genetic diseases have distinctive facial features, which can be used by artificial intelligence algorithms to facilitate clinical diagnosis, in prioritizing candidate diseases to be further examined by lab tests or genetic assays, or in helping the phenotype-driven reinterpretation of genome/exome sequencing data.

Diagnostic

Mutual Information as Intrinsic Reward of Reinforcement Learning Agents for On-demand Ride Pooling

no code implementations23 Dec 2023 Xianjie Zhang, Jiahao Sun, Chen Gong, Kai Wang, Yifei Cao, Hao Chen, Yu Liu

The emergence of on-demand ride pooling services allows each vehicle to serve multiple passengers at a time, thus increasing drivers' income and enabling passengers to travel at lower prices than taxi/car on-demand services (only one passenger can be assigned to a car at a time like UberX and Lyft).

Reinforcement Learning (RL)

Exploring the Reversal Curse and Other Deductive Logical Reasoning in BERT and GPT-Based Large Language Models

1 code implementation6 Dec 2023 Da Wu, Jingye Yang, Kai Wang

The term "Reversal Curse" refers to the scenario where auto-regressive decoder large language models (LLMs), such as ChatGPT, trained on "A is B" fail to learn "B is A," assuming that B and A are distinct and can be uniquely identified from each other, demonstrating a basic failure of logical deduction.

Decoder Knowledge Graphs +1
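
A toy probe for the phenomenon described above, with invented facts used purely as placeholders (they are not the paper's evaluation data):

    # Train/evaluate on "A is B" statements, then test whether the reversed
    # "B is A" direction can be answered; a model exhibiting the Reversal Curse
    # typically fails to produce A for the reverse query.
    facts = [("Valkyrie", "the third moon of Kepler-42")]   # fictional (A, B) pairs

    forward_prompts = [f"{a} is {b}." for a, b in facts]    # seen during training
    reverse_queries = [f"{b} is" for a, b in facts]         # expected answer: A

    for query, (a, _) in zip(reverse_queries, facts):
        print(query, "->", a)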

MLLMs-Augmented Visual-Language Representation Learning

1 code implementation30 Nov 2023 Yanqing Liu, Kai Wang, Wenqi Shao, Ping Luo, Yu Qiao, Mike Zheng Shou, Kaipeng Zhang, Yang You

Visual-language pre-training has achieved remarkable success in many multi-modal tasks, largely attributed to the availability of large-scale image-text datasets.

Image-text Retrieval Representation Learning +1

Scaling User Modeling: Large-scale Online User Representations for Ads Personalization in Meta

no code implementations16 Nov 2023 Wei zhang, Dai Li, Chen Liang, Fang Zhou, Zhongke Zhang, Xuewei Wang, Ru Li, Yi Zhou, Yaning Huang, Dong Liang, Kai Wang, Zhangyuan Wang, Zhengxing Chen, Fenggang Wu, Minghai Chen, Huayu Li, Yunnan Wu, Zhan Shu, Mindi Yuan, Sri Reddy

To address these challenges, we present Scaling User Modeling (SUM), a framework widely deployed in Meta's ads ranking system, designed to facilitate efficient and scalable sharing of online user representation across hundreds of ads models.

Representation Learning

Towards Long-term Annotators: A Supervised Label Aggregation Baseline

no code implementations15 Nov 2023 Haoyu Liu, Fei Wang, Minmin Lin, Runze Wu, Renyu Zhu, Shiwei Zhao, Kai Wang, Tangjie Lv, Changjie Fan

These annotators could leave substantial historical annotation records on the crowdsourcing platforms, which can benefit label aggregation, but are ignored by previous works.

LiPar: A Lightweight Parallel Learning Model for Practical In-Vehicle Network Intrusion Detection

1 code implementation14 Nov 2023 Aiheng Zhang, Qiguang Jiang, Kai Wang, Ming Li

As the main network of in-vehicle networks, the controller area network (CAN) has many potential security hazards, which place higher demands on the generalization capability and lightweight design of intrusion detection systems to ensure safety.

Cloud Computing Network Intrusion Detection

STATGRAPH: Effective In-vehicle Intrusion Detection via Multi-view Statistical Graph Learning

1 code implementation13 Nov 2023 Kai Wang, Qiguang Jiang, Bailing Wang, Yulei Wu, Hongke Zhang

In-vehicle networks (IVNs) face complex external cyber-attacks, especially emerging masquerade attacks, which are extremely difficult to detect yet cause serious damage.

Graph Learning Intrusion Detection

AViTMP: A Tracking-Specific Transformer for Single-Branch Visual Tracking

1 code implementation30 Oct 2023 Chuanming Tang, Kai Wang, Joost Van de Weijer, Jianlin Zhang, YongMei Huang

Specifically, in the proposed AViT encoder, we introduce a tracking-tailored Adaptor module for the vanilla ViT and a joint target state embedding to enrich the target-prior embedding paradigm.

Decoder Visual Object Tracking +1

IterInv: Iterative Inversion for Pixel-Level T2I Models

1 code implementation30 Oct 2023 Chuanming Tang, Kai Wang, Joost Van de Weijer

Based on this observation, we develop an iterative inversion (IterInv) technique for this category of T2I models and verify IterInv with the open-source DeepFloyd-IF model. Specifically, IterInv employs NTI as the inversion and reconstruction of low-resolution image generation.

Image Generation Super-Resolution

DREAM+: Efficient Dataset Distillation by Bidirectional Representative Matching

1 code implementation23 Oct 2023 Yanqing Liu, Jianyang Gu, Kai Wang, Zheng Zhu, Kaipeng Zhang, Wei Jiang, Yang You

Dataset distillation plays a crucial role in creating compact datasets with similar training performance compared with original large-scale ones.

Dataset Distillation Transfer Learning

Does Graph Distillation See Like Vision Dataset Counterpart?

2 code implementations NeurIPS 2023 Beining Yang, Kai Wang, Qingyun Sun, Cheng Ji, Xingcheng Fu, Hao Tang, Yang You, JianXin Li

We validate the proposed SGDD across 9 datasets and achieve state-of-the-art results on all of them: for example, on the YelpChi dataset, our approach maintains 98.6% of the test accuracy of training on the original graph dataset with a 1,000-fold saving on the scale of the graph.

Anomaly Detection Dataset Distillation +2

PRIOR: Personalized Prior for Reactivating the Information Overlooked in Federated Learning

1 code implementation13 Oct 2023 Mingjia Shi, Yuhao Zhou, Kai Wang, Huaizheng Zhang, Shudong Huang, Qing Ye, Jiangcheng Lv

Personalized FL (PFL) addresses this by synthesizing personalized models from a global model via training on local data.

Federated Learning

Towards Lossless Dataset Distillation via Difficulty-Aligned Trajectory Matching

1 code implementation9 Oct 2023 Ziyao Guo, Kai Wang, George Cazenavette, Hui Li, Kaipeng Zhang, Yang You

The ultimate goal of Dataset Distillation is to synthesize a small synthetic dataset such that a model trained on this synthetic set will perform equally well as a model trained on the full, real dataset.

Dataset Distillation

Can pre-trained models assist in dataset distillation?

1 code implementation5 Oct 2023 Yao Lu, Xuguang Chen, Yuchen Zhang, Jianyang Gu, Tianle Zhang, Yifan Zhang, Xiaoniu Yang, Qi Xuan, Kai Wang, Yang You

Dataset Distillation (DD) is a prominent technique that encapsulates knowledge from a large-scale original dataset into a small synthetic dataset for efficient training.

Dataset Distillation Diversity

Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing

1 code implementation NeurIPS 2023 Kai Wang, Fei Yang, Shiqi Yang, Muhammad Atif Butt, Joost Van de Weijer

Large-scale text-to-image generative models have been a ground-breaking development in generative AI, with diffusion models showing their astounding ability to synthesize convincing images following an input text prompt.

Prompt Learning Text-based Image Editing

KERMIT: Knowledge Graph Completion of Enhanced Relation Modeling with Inverse Transformation

1 code implementation26 Sep 2023 Haotian Li, Bin Yu, Yuliang Wei, Kai Wang, Richard Yi Da Xu, Bailing Wang

Knowledge graph completion (KGC) revolves around populating missing triples in a knowledge graph using available information.

Diversity Link Prediction +1

Multi-user passive beamforming in RIS-aided communications and experimental validations

no code implementations18 Sep 2023 Zhibo Zhou, Haifan Yin, Li Tan, Ruikun Zhang, Kai Wang, Yingzhuang Liu

To generate the reflection coefficients with the aim of maximizing the spectral efficiency, we propose a quadratic transform-based low-rank multi-user beamforming (QTLM) algorithm.

Plasticity-Optimized Complementary Networks for Unsupervised Continual Learning

1 code implementation12 Sep 2023 Alex Gomez-Villa, Bartlomiej Twardowski, Kai Wang, Joost Van de Weijer

In the second phase, we combine this new knowledge with the previous network in an adaptation-retrospection phase to avoid forgetting and initialize a new expert with the knowledge of the old network.

Exemplar-Free Representation Learning +2

DiffAug: Enhance Unsupervised Contrastive Learning with Domain-Knowledge-Free Diffusion-based Data Augmentation

1 code implementation10 Sep 2023 Zelin Zang, Hao Luo, Kai Wang, Panpan Zhang, Fan Wang, Stan Z. Li, Yang You

With the help of iterative training of the semantic encoder and diffusion model, DiffAug improves the representation ability in an uninterrupted and unsupervised manner.

Contrastive Learning Data Augmentation +2

Region Generation and Assessment Network for Occluded Person Re-Identification

no code implementations7 Sep 2023 Shuting He, Weihua Chen, Kai Wang, Hao Luo, Fan Wang, Wei Jiang, Henghui Ding

Then, to measure the importance of each generated region, we introduce a Region Assessment Module (RAM) that assigns confidence scores to different regions and reduces the negative impact of occluded regions by assigning them lower scores.

Occluded Person Re-Identification

Recurrence-Free Survival Prediction for Anal Squamous Cell Carcinoma Chemoradiotherapy using Planning CT-based Radiomics Model

no code implementations5 Sep 2023 Shanshan Tang, Kai Wang, David Hein, Gloria Lin, Nina N. Sanford, Jing Wang

Conclusions: A combined model using treatment-planning-CT-based radiomics and clinical features showed improved prognostic performance in predicting RFS for ASCC patients treated with CRT, compared with a model using clinical features only.

feature selection Survival Prediction

ScrollNet: Dynamic Weight Importance for Continual Learning

1 code implementation31 Aug 2023 Fei Yang, Kai Wang, Joost Van de Weijer

The importance of weights for each task can be determined either explicitly through learning a task-specific mask during training (e.g., parameter isolation-based approaches) or implicitly by introducing a regularization term (e.g., regularization-based approaches).

Continual Learning
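
As a generic illustration of the regularization-based route mentioned in the snippet above (not ScrollNet's own dynamic weight-importance scheme), the penalty below keeps parameters that were important for earlier tasks close to their old values; the per-parameter importance tensors are assumed to come from some prior estimation step.

```python
import torch

def importance_weighted_penalty(params, old_params, importance):
    """Quadratic penalty that discourages changing weights in proportion to
    their per-parameter importance for previously learned tasks."""
    return sum((imp * (p - p_old).pow(2)).sum()
               for p, p_old, imp in zip(params, old_params, importance))

# toy usage: one parameter tensor with non-uniform importance
p = [torch.tensor([1.0, 2.0], requires_grad=True)]
p_old = [torch.tensor([0.5, 2.0])]
imp = [torch.tensor([1.0, 0.1])]
print(importance_weighted_penalty(p, p_old, imp))
```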

Dataset Quantization

1 code implementation ICCV 2023 Daquan Zhou, Kai Wang, Jianyang Gu, Xiangyu Peng, Dongze Lian, Yifan Zhang, Yang You, Jiashi Feng

Extensive experiments demonstrate that DQ is able to generate condensed small datasets for training unseen network architectures with state-of-the-art compression ratios for lossless model training.

Dataset Distillation object-detection +3

The Snowflake Hypothesis: Training Deep GNN with One Node One Receptive field

no code implementations19 Aug 2023 Kun Wang, Guohao Li, Shilong Wang, Guibin Zhang, Kai Wang, Yang You, Xiaojiang Peng, Yuxuan Liang, Yang Wang

Despite Graph Neural Networks (GNNs) demonstrating considerable promise in graph representation learning tasks, they face significant issues with over-fitting and over-smoothing when stacked as deep as models in the computer vision realm.

Graph Representation Learning

Enhancing Phenotype Recognition in Clinical Notes Using Large Language Models: PhenoBCBERT and PhenoGPT

1 code implementation11 Aug 2023 Jingye Yang, Cong Liu, Wendy Deng, Da Wu, Chunhua Weng, Yunyun Zhou, Kai Wang

We hypothesize that large language models (LLMs) based on the transformer architecture can enable automated detection of clinical phenotype terms, including terms not documented in the HPO.

Negation

DVPT: Dynamic Visual Prompt Tuning of Large Pre-trained Models for Medical Image Analysis

1 code implementation19 Jul 2023 Along He, Kai Wang, Zhihong Wang, Tao Li, Huazhu Fu

First, the frozen features are transformed by a lightweight bottleneck layer to learn the domain-specific distribution of downstream medical tasks; then a few learnable visual prompts are used as dynamic queries that conduct cross-attention with the transformed features, acquiring sample-specific knowledge suitable for each sample.

Medical Image Analysis parameter-efficient fine-tuning +1
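
The pipeline described above (frozen features, lightweight bottleneck, learnable prompts as queries, cross-attention) maps naturally onto a small module like the one below. This is a minimal sketch under assumed dimensions and a single attention head, not DVPT's actual architecture.

```python
import torch
import torch.nn as nn

class DynamicVisualPrompt(nn.Module):
    """Sketch: project frozen backbone features through a bottleneck, then let
    a few learnable prompt tokens query them via cross-attention."""

    def __init__(self, feat_dim=768, bottleneck_dim=64, num_prompts=4):
        super().__init__()
        self.bottleneck = nn.Sequential(
            nn.Linear(feat_dim, bottleneck_dim),
            nn.GELU(),
            nn.Linear(bottleneck_dim, feat_dim),
        )
        self.prompts = nn.Parameter(torch.randn(num_prompts, feat_dim) * 0.02)
        self.cross_attn = nn.MultiheadAttention(feat_dim, num_heads=1, batch_first=True)

    def forward(self, frozen_feats):                          # (B, N, feat_dim)
        feats = self.bottleneck(frozen_feats)                 # domain-adapted features
        queries = self.prompts.unsqueeze(0).expand(frozen_feats.size(0), -1, -1)
        out, _ = self.cross_attn(queries, feats, feats)       # sample-specific prompts
        return out                                            # (B, num_prompts, feat_dim)

# toy usage
x = torch.randn(2, 196, 768)
print(DynamicVisualPrompt()(x).shape)
```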

Evidential Detection and Tracking Collaboration: New Problem, Benchmark and Algorithm for Robust Anti-UAV System

1 code implementation27 Jun 2023 Xue-Feng Zhu, Tianyang Xu, Jian Zhao, Jia-Wei Liu, Kai Wang, Gang Wang, Jianan Li, Qiang Wang, Lei Jin, Zheng Zhu, Junliang Xing, Xiao-Jun Wu

Still, previous works have simplified the anti-UAV task as a tracking problem in which prior information about the UAV is always provided; such a scheme fails in real-world anti-UAV settings (i.e., complex scenes, UAVs that appear and reappear unpredictably, and real-time UAV surveillance).

Can We Evaluate Domain Adaptation Models Without Target-Domain Labels?

no code implementations30 May 2023 Jianfei Yang, Hanjie Qian, Yuecong Xu, Kai Wang, Lihua Xie

Unsupervised domain adaptation (UDA) involves adapting a model trained on a label-rich source domain to an unlabeled target domain.

Unsupervised Domain Adaptation

Generating Driving Scenes with Diffusion

no code implementations29 May 2023 Ethan Pronovost, Kai Wang, Nick Roy

In this paper we describe a learned method of traffic scene generation designed to simulate the output of the perception system of a self-driving car.

object-detection Object Detection +1

Summarizing Stream Data for Memory-Constrained Online Continual Learning

2 code implementations26 May 2023 Jianyang Gu, Kai Wang, Wei Jiang, Yang You

By maintaining the consistency of training gradients and the relationship to past tasks, the summarized samples are more representative of the stream data than the original images.

Continual Learning Informativeness
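
"Maintaining the consistency of training gradients" suggests a gradient-matching objective between the summarized memory samples and the incoming stream batch. The distance below is a common choice for such objectives and is shown only as an illustration; the gradients are assumed to be computed on the same network for both sets of samples.

```python
import torch

def gradient_matching_distance(grads_summary, grads_stream, eps=1e-12):
    """Sum of (1 - cosine similarity) between corresponding gradient tensors
    computed on the summarized samples and on the current stream batch."""
    dist = 0.0
    for gs, gr in zip(grads_summary, grads_stream):
        gs, gr = gs.flatten(), gr.flatten()
        dist = dist + 1.0 - torch.dot(gs, gr) / (gs.norm() * gr.norm() + eps)
    return dist

# toy usage with two gradient tensors per side
g_summary = [torch.randn(3, 3), torch.randn(5)]
g_stream = [torch.randn(3, 3), torch.randn(5)]
print(gradient_matching_distance(g_summary, g_stream))
```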

SAMScore: A Content Structural Similarity Metric for Image Translation Evaluation

1 code implementation24 May 2023 Yunxiang Li, Meixu Chen, Kai Wang, Jun Ma, Alan C. Bovik, You Zhang

Image translation has wide applications, such as style transfer and modality conversion, usually aiming to generate images having both high degrees of realism and faithfulness.

Semantic Similarity Semantic Textual Similarity +2

River of No Return: Graph Percolation Embeddings for Efficient Knowledge Graph Reasoning

no code implementations17 May 2023 Kai Wang, Siqiang Luo, Dan Lin

We study Graph Neural Networks (GNNs)-based embedding techniques for knowledge graph (KG) reasoning.

A Soft Coordination Method of Heterogeneous Devices in Distribution System Voltage Control

no code implementations4 May 2023 Licheng Wang, Tao Wang, Gang Huang, Ruifeng Yan, Kai Wang, Youbing Zhang, Shijie Cheng

The proposed method achieves soft coordination by employing a modified actor-critic algorithm to train a proxy model of the inverters.

Decision Making

Zero-shot Generative Model Adaptation via Image-specific Prompt Learning

1 code implementation CVPR 2023 Jiayi Guo, Chaofei Wang, You Wu, Eric Zhang, Kai Wang, Xingqian Xu, Shiji Song, Humphrey Shi, Gao Huang

Recently, CLIP-guided image synthesis has shown appealing performance on adapting a pre-trained source-domain generator to an unseen target domain.

Diversity Image Generation +1

Motion-R3: Fast and Accurate Motion Annotation via Representation-based Representativeness Ranking

no code implementations4 Apr 2023 Jubo Yu, Tianxiang Ren, Shihui Guo, Fengyi Fang, Kai Wang, Zijiao Zeng, Yazhan Zhang, Andreas Aristidou, Yipeng Qin

In this paper, we follow a data-centric philosophy and propose a novel motion annotation method based on the inherent representativeness of motion data in a given dataset.

Philosophy

Classification of integers based on residue classes via modern deep learning algorithms

1 code implementation3 Apr 2023 Da Wu, Jingye Yang, Mian Umair Ahsan, Kai Wang

Judging whether an integer can be divided by prime numbers such as 2 or 3 may appear trivial to human beings, but can be less straightforward for computers.

AutoML Feature Engineering
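
The divisibility task described above is straightforward to reproduce as data: each integer is encoded (for example, as its digit sequence) and labeled by its residue class. The snippet below is a minimal example of constructing such a dataset; the fixed-length digit encoding and parameter choices are assumptions, not the paper's setup.

```python
import numpy as np

def make_residue_dataset(n_samples=1000, divisor=3, max_digits=7, seed=0):
    """Encode random integers as fixed-length digit sequences and label each
    with 1 if it is divisible by `divisor`, else 0."""
    rng = np.random.default_rng(seed)
    ints = rng.integers(0, 10 ** max_digits, size=n_samples)
    digits = np.array([[int(d) for d in str(v).zfill(max_digits)] for v in ints])
    labels = (ints % divisor == 0).astype(int)
    return digits, labels

X, y = make_residue_dataset()
print(X.shape, y[:10])
```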

Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models

1 code implementation30 Mar 2023 Eric Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, Humphrey Shi

The unlearning problem of deep learning models, once primarily an academic concern, has become a prevalent issue in the industry.

Disentanglement Memorization +2

CSSL-MHTR: Continual Self-Supervised Learning for Scalable Multi-script Handwritten Text Recognition

no code implementations16 Mar 2023 Marwa Dhiaf, Mohamed Ali Souibgui, Kai Wang, Yuyang Liu, Yousri Kessentini, Alicia Fornés, Ahmed Cheikh Rouhou

In this paper, we explore the potential of continual self-supervised learning to alleviate the catastrophic forgetting problem in handwritten text recognition, as an example of sequence recognition.

Continual Self-Supervised Learning Handwritten Text Recognition +1

MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID

1 code implementation CVPR 2023 Jianyang Gu, Kai Wang, Hao Luo, Chen Chen, Wei Jiang, Yuqiang Fang, Shanghang Zhang, Yang You, Jian Zhao

Neural Architecture Search (NAS) has become increasingly appealing to the object Re-Identification (ReID) community, since task-specific architectures significantly improve retrieval performance.

image-classification Image Classification +4

Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models

2 code implementations ICCV 2023 Zangwei Zheng, Mingyuan Ma, Kai Wang, Ziheng Qin, Xiangyu Yue, Yang You

To address this challenge, we propose a novel method ZSCL to prevent zero-shot transfer degradation in the continual learning of vision-language models in both feature and parameter space.

class-incremental learning Class Incremental Learning +1
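
The entry says ZSCL operates in both feature and parameter space. The sketch below shows two generic ingredients consistent with that description: feature distillation toward a frozen reference model, and weight-space interpolation with the original weights. The coefficient, reference features, and function names are placeholders rather than the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def feature_distillation_loss(student_feats, reference_feats):
    """Keep the fine-tuned encoder's features close to those of a frozen
    reference model (e.g., the original pre-trained weights)."""
    return 1.0 - F.cosine_similarity(student_feats, reference_feats, dim=-1).mean()

@torch.no_grad()
def weight_space_average(model, reference_model, alpha=0.5):
    """Parameter-space interpolation between the fine-tuned model and the
    original weights, to help retain zero-shot behaviour."""
    for p, p_ref in zip(model.parameters(), reference_model.parameters()):
        p.mul_(alpha).add_(p_ref, alpha=1.0 - alpha)

# toy usage of the feature term
s = torch.randn(8, 512)
r = torch.randn(8, 512)
print(feature_distillation_loss(s, r))
```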

InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning

1 code implementation8 Mar 2023 Ziheng Qin, Kai Wang, Zangwei Zheng, Jianyang Gu, Xiangyu Peng, Zhaopan Xu, Daquan Zhou, Lei Shang, Baigui Sun, Xuansong Xie, Yang You

To solve this problem, we propose InfoBatch, a novel framework aiming to achieve lossless training acceleration by unbiased dynamic data pruning.

Semantic Segmentation
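
A minimal sketch of loss-based dynamic pruning with gradient rescaling follows: well-learned (low-loss) samples are dropped with some probability, and the low-loss samples that survive are up-weighted so the expected gradient stays unbiased. The threshold choice and pruning probability below are assumptions for illustration, not InfoBatch's exact recipe.

```python
import numpy as np

def prune_and_rescale(losses, prune_prob=0.5):
    """Soft-prune low-loss samples and rescale the survivors' weights.

    Samples whose loss falls below the batch mean are dropped with probability
    `prune_prob`; surviving low-loss samples are up-weighted by
    1 / (1 - prune_prob) so the expected gradient is unchanged."""
    losses = np.asarray(losses, dtype=float)
    weights = np.ones_like(losses)
    well_learned = losses < losses.mean()           # candidate set for pruning
    drop = well_learned & (np.random.rand(len(losses)) < prune_prob)
    weights[well_learned & ~drop] = 1.0 / (1.0 - prune_prob)
    keep_idx = np.where(~drop)[0]
    return keep_idx, weights[keep_idx]

# toy usage: 8 samples; low-loss ones are either dropped or up-weighted
idx, w = prune_and_rescale([0.1, 0.2, 1.5, 0.05, 2.0, 0.3, 0.8, 0.02])
print(idx, w)
```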

DiM: Distilling Dataset into Generative Model

2 code implementations8 Mar 2023 Kai Wang, Jianyang Gu, Daquan Zhou, Zheng Zhu, Wei Jiang, Yang You

To the best of our knowledge, we are the first to achieve higher accuracy on complex architectures than simple ones, such as 75.1% with ResNet-18 and 72.6% with ConvNet-3 on ten images per class of CIFAR-10.

Dataset Distillation model

DREAM: Efficient Dataset Distillation by Representative Matching

2 code implementations ICCV 2023 Yanqing Liu, Jianyang Gu, Kai Wang, Zheng Zhu, Wei Jiang, Yang You

Although there are various matching objectives, currently the strategy for selecting original images is limited to naive random sampling.

Dataset Distillation Diversity

Bioformer: an efficient transformer language model for biomedical text mining

1 code implementation3 Feb 2023 Li Fang, Qingyu Chen, Chih-Hsuan Wei, Zhiyong Lu, Kai Wang

We thoroughly evaluated the performance of Bioformer as well as existing biomedical BERT models including BioBERT and PubMedBERT on 15 benchmark datasets of four different biomedical NLP tasks: named entity recognition, relation extraction, question answering and document classification.

Articles Document Classification +7

Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models To Learn Any Unseen Style

no code implementations CVPR 2023 Haoming Lu, Hazarapet Tunanyan, Kai Wang, Shant Navasardyan, Zhangyang Wang, Humphrey Shi

Diffusion models have demonstrated impressive capability of text-conditioned image synthesis, and broader application horizons are emerging by personalizing those pretrained diffusion models toward generating some specialized target object or style.

Disentanglement Image Generation

CORE: Co-planarity Regularized Monocular Geometry Estimation with Weak Supervision

no code implementations ICCV 2023 Yuguang Li, Kai Wang, Hui Li, Seon-Min Rhee, Seungju Han, JiHye Kim, Min Yang, Ran Yang, Feng Zhu

Meanwhile, SANE easily establishes multi-task learning with CORE loss functions on both depth and surface normal estimation, leading to an overall performance leap.

3D geometry Depth Estimation +3

Expanding Small-Scale Datasets with Guided Imagination

1 code implementation NeurIPS 2023 Yifan Zhang, Daquan Zhou, Bryan Hooi, Kai Wang, Jiashi Feng

Specifically, GIF conducts data imagination by optimizing the latent features of the seed data in the semantically meaningful space of the prior model, resulting in the creation of photo-realistic images with new content.

Self adaptive global-local feature enhancement for radiology report generation

no code implementations21 Nov 2022 Yuhao Wang, Kai Wang, Xiaohong Liu, Tianrun Gao, Jingyue Zhang, Guangyu Wang

Automated radiology report generation aims at automatically generating a detailed description of medical images, which can greatly alleviate the workload of radiologists and provide better medical services to remote areas.

Anatomy

Versatile Diffusion: Text, Images and Variations All in One Diffusion Model

3 code implementations ICCV 2023 Xingqian Xu, Zhangyang Wang, Eric Zhang, Kai Wang, Humphrey Shi

In this work, we expand the existing single-flow diffusion pipeline into a multi-task multimodal network, dubbed Versatile Diffusion (VD), that handles multiple flows of text-to-image, image-to-text, and variations in one unified model.

All Disentanglement +7

Dataset Factorization for Condensation

1 code implementation NIPS 2022 Songhua Liu, Kai Wang, Xingyi Yang, Jingwen Ye, Xinchao Wang

In this paper, we study dataset distillation (DD) from a novel perspective and introduce a dataset factorization approach, termed HaBa, which is a plug-and-play strategy portable to any existing DD baseline.

Dataset Distillation Diversity +2
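
A factorized distilled dataset typically stores a small set of shared bases plus one or more lightweight "hallucinator" networks that expand them into training images, so B bases and H hallucinators yield B x H samples. The composition is sketched below; the layer sizes and image resolution are illustrative assumptions, not HaBa's actual architecture.

```python
import torch
import torch.nn as nn

class Hallucinator(nn.Module):
    """A lightweight decoder that turns a shared basis into a synthetic image."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1),
        )

    def forward(self, basis):
        return self.net(basis)

# a factorized synthetic set: 10 bases x 3 hallucinators -> 30 training images
bases = nn.Parameter(torch.randn(10, 3, 32, 32))           # learnable shared bases
hallucinators = nn.ModuleList([Hallucinator() for _ in range(3)])
images = torch.cat([h(bases) for h in hallucinators], dim=0)
print(images.shape)                                         # torch.Size([30, 3, 32, 32])
```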

Dataset Distillation via Factorization

3 code implementations30 Oct 2022 Songhua Liu, Kai Wang, Xingyi Yang, Jingwen Ye, Xinchao Wang

In this paper, we study dataset distillation (DD) from a novel perspective and introduce a dataset factorization approach, termed HaBa, which is a plug-and-play strategy portable to any existing DD baseline.

Dataset Distillation Hallucination +1

MV-HAN: A Hybrid Attentive Networks based Multi-View Learning Model for Large-scale Contents Recommendation

no code implementations14 Oct 2022 Ge Fan, Chaoyun Zhang, Kai Wang, Junyang Chen

In this paper, we introduce a novel Multi-View Approach with Hybrid Attentive Networks (MV-HAN) for contents retrieval at the matching stage of recommender systems.

MULTI-VIEW LEARNING Recommendation Systems +1

Vision-Based Defect Classification and Weight Estimation of Rice Kernels

no code implementations6 Oct 2022 Xiang Wang, Kai Wang, Xiaohong Li, Shiguo Lian

To compensate for the imbalance of different kernel numbers and classify kernels with multiple flaws accurately, we propose a multi-stage workflow which is able to locate the kernels in the captured image and classify their properties.

Attention Distillation: self-supervised vision transformer students need more guidance

1 code implementation3 Oct 2022 Kai Wang, Fei Yang, Joost Van de Weijer

In experiments on ImageNet-Subset and ImageNet-1K, we show that our method AttnDistill outperforms existing self-supervised knowledge distillation (SSKD) methods and achieves state-of-the-art k-NN accuracy compared with self-supervised learning (SSL) methods learning from scratch (with the ViT-S model).

Knowledge Distillation Self-Supervised Learning
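
Attention distillation, as named above, can be written as a simple alignment loss between teacher and student attention maps; the sketch below uses a KL divergence over the attention distributions and is an illustration rather than AttnDistill's exact objective.

```python
import torch
import torch.nn.functional as F

def attention_distillation_loss(student_attn, teacher_attn, eps=1e-8):
    """KL divergence from the teacher's attention distribution to the
    student's, averaged over the batch. Both tensors have shape
    (batch, heads, queries, keys), with each row summing to 1."""
    log_student = torch.log(student_attn + eps)
    return F.kl_div(log_student, teacher_attn + eps, reduction="batchmean")

# toy usage with random attention maps
make_attn = lambda: torch.softmax(torch.randn(2, 6, 10, 10), dim=-1)
print(attention_distillation_loss(make_attn(), make_attn()))
```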

Uncertainty estimations methods for a deep learning model to aid in clinical decision-making -- a clinician's perspective

no code implementations2 Oct 2022 Michael Dohopolski, Kai Wang, Biling Wang, Ti Bai, Dan Nguyen, David Sher, Steve Jiang, Jing Wang

Especially for smaller, single-institution datasets, it may be important to evaluate multiple estimation techniques before incorporating a model into clinical practice.

Decision Making Specificity +1

RIGA: Rotation-Invariant and Globally-Aware Descriptors for Point Cloud Registration

1 code implementation27 Sep 2022 Hao Yu, Ji Hou, Zheng Qin, Mahdi Saleh, Ivan Shugurov, Kai Wang, Benjamin Busam, Slobodan Ilic

More specifically, 3D structures of the whole frame are first represented by our global PPF signatures, from which structural descriptors are learned to help geometric descriptors sense the 3D world beyond local regions.

Point Cloud Registration

Recurrence-free Survival Prediction under the Guidance of Automatic Gross Tumor Volume Segmentation for Head and Neck Cancers

1 code implementation22 Sep 2022 Kai Wang, Yunxiang Li, Michael Dohopolski, Tao Peng, Weiguo Lu, You Zhang, Jing Wang

For Head and Neck Cancers (HNC) patient management, automatic gross tumor volume (GTV) segmentation and accurate pre-treatment cancer recurrence prediction are of great importance to assist physicians in designing personalized management plans, which have the potential to improve the treatment outcome and quality of life for HNC patients.

Management Prediction +3

Deep Lossy Plus Residual Coding for Lossless and Near-lossless Image Compression

1 code implementation11 Sep 2022 Yuanchao Bai, Xianming Liu, Kai Wang, Xiangyang Ji, Xiaolin Wu, Wen Gao

In the lossless mode, the DLPR coding system first performs lossy compression and then lossless coding of residuals.

Image Compression
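
The lossless mode described above (lossy compression followed by lossless coding of the residual) can be demonstrated end-to-end with off-the-shelf components; the sketch below swaps the learned codec for coarse quantization and zlib, purely to show that reconstruction is exact. The quantization step and codec choice are assumptions for illustration, not the DLPR system.

```python
import numpy as np
import zlib

def lossy_plus_residual_encode(img, step=16):
    """Toy stand-in for lossy-plus-residual coding: quantize coarsely (the
    'lossy' layer), then losslessly compress the integer residual."""
    lossy = (img // step) * step
    residual = img.astype(np.int16) - lossy.astype(np.int16)
    payload = zlib.compress(residual.tobytes())
    return lossy, payload

def lossless_decode(lossy, payload):
    residual = np.frombuffer(zlib.decompress(payload), dtype=np.int16)
    return (lossy.astype(np.int16) + residual.reshape(lossy.shape)).astype(np.uint8)

img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
lossy, payload = lossy_plus_residual_encode(img)
assert np.array_equal(lossless_decode(lossy, payload), img)   # exact reconstruction
```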

Prompt Vision Transformer for Domain Generalization

1 code implementation18 Aug 2022 Zangwei Zheng, Xiangyu Yue, Kai Wang, Yang You

In this paper, we propose a novel approach DoPrompt based on prompt learning to embed the knowledge of source domains in domain prompts for target domain prediction.

Domain Generalization Prompt Learning +1
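
Embedding source-domain knowledge in domain prompts amounts to keeping a small table of learnable prompt tokens, one row per source domain, and prepending the appropriate row to the patch embeddings. The sketch below shows that mechanism under assumed dimensions; the prompt-adapter component used for unseen target domains is omitted.

```python
import torch
import torch.nn as nn

class DomainPrompts(nn.Module):
    """One set of learnable prompt tokens per source domain, prepended to the
    patch-embedding sequence of a ViT."""
    def __init__(self, num_domains=3, prompt_len=4, embed_dim=768):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(num_domains, prompt_len, embed_dim) * 0.02)

    def forward(self, patch_tokens, domain_ids):    # (B, N, D), (B,)
        prompts = self.prompts[domain_ids]           # each sample's domain prompt
        return torch.cat([prompts, patch_tokens], dim=1)

# toy usage: a batch of 2 samples from source domains 0 and 2
tokens = torch.randn(2, 196, 768)
out = DomainPrompts()(tokens, torch.tensor([0, 2]))
print(out.shape)   # torch.Size([2, 200, 768])
```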

QuickSkill: Novice Skill Estimation in Online Multiplayer Games

no code implementations15 Aug 2022 Chaoyun Zhang, Kai Wang, Hao Chen, Ge Fan, Yingjie Li, Lifang Wu, Bingchao Zheng

However, the skill rating of a novice is usually inaccurate, as current matchmaking rating algorithms require a considerable number of games to learn the true skill of a new player.

Fairness
