Search Results for author: Xin Wang

Found 530 papers, 189 papers with code

OIE@OIA: an Adaptable and Efficient Open Information Extraction Framework

no code implementations ACL 2022 Xin Wang, Minlong Peng, Mingming Sun, Ping Li

OIE@OIA follows the methodology of Open Information eXpression (OIX): parsing a sentence to an Open Information Annotation (OIA) Graph and then adapting the OIA graph to different OIE tasks with simple rules.

Open Information Extraction Sentence

A Predicate-Function-Argument Annotation of Natural Language for Open-Domain Information eXpression

no code implementations EMNLP 2020 Mingming Sun, Wenyue Hua, Zoey Liu, Xin Wang, Kangjie Zheng, Ping Li

Based on the same platform of OIX, the OIE strategies are reusable, and people can select a set of strategies to assemble their algorithm for a specific task so that the adaptability may be significantly increased.

Open Information Extraction Sentence

Dependency Position Encoding for Relation Extraction

no code implementations Findings (NAACL) 2022 Qiushi Guo, Xin Wang, Dehong Gao

Leveraging the dependency tree of the input sentence can improve model performance for relation extraction.

Position Relation +2

NESTFUL: A Benchmark for Evaluating LLMs on Nested Sequences of API Calls

no code implementations 4 Sep 2024 Kinjal Basu, Ibrahim Abdelaziz, Kelsey Bradford, Maxwell Crouse, Kiran Kate, Sadhana Kumaravel, Saurabh Goyal, Asim Munawar, Yara Rizk, Xin Wang, Luis Lastras, Pavan Kapanipathi

In this paper, we present NESTFUL, a benchmark to evaluate LLMs on nested sequences of API calls, i.e., sequences where the output of one API call is passed as input to a subsequent call.

Malacopula: adversarial automatic speaker verification attacks using a neural-based generalised Hammerstein model

1 code implementation 17 Aug 2024 Massimiliano Todisco, Michele Panariello, Xin Wang, Héctor Delgado, Kong Aik Lee, Nicholas Evans

We present Malacopula, a neural-based generalised Hammerstein model designed to introduce adversarial perturbations to spoofed speech utterances so that they better deceive automatic speaker verification (ASV) systems.

Speaker Verification

An Explainable Non-local Network for COVID-19 Diagnosis

no code implementations 8 Aug 2024 Jingfu Yang, Peng Huang, Jing Hu, Shu Hu, Siwei Lyu, Xin Wang, Jun Guo, Xi Wu

The network is embedded with a nonlocal module to capture global information, while a 3D attention module is embedded to focus on the details of the lesion so that it can directly analyze the 3D lung CT and output the classification results.

COVID-19 Diagnosis

MGFs: Masked Gaussian Fields for Meshing Building based on Multi-View Images

no code implementations 6 Aug 2024 Tengfei Wang, Zongqian Zhan, Rui Xia, Linxia Ji, Xin Wang

Compared to the traditional photogrammetric and NeRF-based solutions, recently, Gaussian fields-based methods have exhibited significant potential in generating surface meshes due to their time-efficient training and detailed 3D information preservation.

Novel View Synthesis Surface Reconstruction

Multi-weather Cross-view Geo-localization Using Denoising Diffusion Models

no code implementations 5 Aug 2024 Tongtong Feng, Qing Li, Xin Wang, Mingzi Wang, Guangyao Li, Wenwu Zhu

For image restoration, MCGF incorporates a shared encoder and a lightweight restoration module to help the backbone eliminate weather-specific information.

Denoising Image Restoration

AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning

1 code implementation 4 Aug 2024 Xin Wang, Kai Chen, Xingjun Ma, Zhineng Chen, Jingjing Chen, Yu-Gang Jiang

During this process, the queries made to the target model are intermediate adversarial examples crafted at the previous attack step, which share high similarities in the pixel space.

U-MedSAM: Uncertainty-aware MedSAM for Medical Image Segmentation

no code implementations 3 Aug 2024 Xin Wang, Xiaoyu Liu, Peng Huang, Pu Huang, Shu Hu, Hongtu Zhu

Medical Image Foundation Models have proven to be powerful tools for mask prediction across various datasets.

Image Segmentation Medical Image Segmentation +1

DFE-IANet: A Method for Polyp Image Classification Based on Dual-domain Feature Extraction and Interaction Attention

no code implementations 30 Jul 2024 Wei Wang, Jixing He, Xin Wang

This challenge is mainly attributed to the fact that polyps are similar to other pathologies and have complex features influenced by texture, color, and morphology.

Image Classification

RRAM-Based Bio-Inspired Circuits for Mobile Epileptic Correlation Extraction and Seizure Prediction

no code implementations 29 Jul 2024 Hao Wang, Lingfeng Zhang, Erjia Xiao, Xin Wang, Zhongrui Wang, Renjing Xu

Non-invasive mobile electroencephalography (EEG) acquisition systems have been utilized for long-term monitoring of seizures, yet they suffer from limited battery life.

EEG Seizure prediction

Apple Intelligence Foundation Language Models

no code implementations 29 Jul 2024 Tom Gunter, ZiRui Wang, Chong Wang, Ruoming Pang, Aonan Zhang, BoWen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Ke Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek, Sam Wiseman, Syd Evans, Tao Lei, Vivek Rathod, Xiang Kong, Xianzhi Du, Yanghao Li, Yongqiang Wang, Yuan Gao, Zaid Ahmed, Zhaoyang Xu, Zhiyun Lu, Al Rashid, Albin Madappally Jose, Alec Doane, Alfredo Bencomo, Allison Vanderby, Andrew Hansen, Ankur Jain, Anupama Mann Anupama, Areeba Kamal, Bugu Wu, Carolina Brum, Charlie Maalouf, Chinguun Erdenebileg, Chris Dulhanty, Dominik Moritz, Doug Kang, Eduardo Jimenez, Evan Ladd, Fangping Shi, Felix Bai, Frank Chu, Fred Hohman, Hadas Kotek, Hannah Gillis Coleman, Jane Li, Jeffrey Bigham, Jeffery Cao, Jeff Lai, Jessica Cheung, Jiulong Shan, Joe Zhou, John Li, Jun Qin, Karanjeet Singh, Karla Vega, Kelvin Zou, Laura Heckman, Lauren Gardiner, Margit Bowler, Maria Cordell, Meng Cao, Nicole Hay, Nilesh Shahdadpuri, Otto Godwin, Pranay Dighe, Pushyami Rachapudi, Ramsey Tantawi, Roman Frigg, Sam Davarnia, Sanskruti Shah, Saptarshi Guha, Sasha Sirovica, Shen Ma, Shuang Ma, Simon Wang, Sulgi Kim, Suma Jayaram, Vaishaal Shankar, Varsha Paidi, Vivek Kumar, Xin Wang, Xin Zheng, Walker Cheng, Yael Shrager, Yang Ye, Yasu Tanaka, Yihao Guo, Yunsong Meng, Zhao Tang Luo, Zhi Ouyang, Alp Aygar, Alvin Wan, Andrew Walkingshaw, Andy Narayanan, Antonie Lin, Arsalan Farooq, Brent Ramerth, Colorado Reed, Chris Bartels, Chris Chaney, David Riazati, Eric Liang Yang, Erin Feldman, Gabriel Hochstrasser, Guillaume Seguin, Irina Belousova, Joris Pelemans, Karen Yang, Keivan Alizadeh Vahid, Liangliang Cao, Mahyar Najibi, Marco Zuliani, Max Horton, Minsik Cho, Nikhil Bhendawade, Patrick Dong, Piotr Maj, Pulkit Agrawal, Qi Shan, Qichen Fu, Regan Poston, Sam Xu, Shuangning Liu, Sushma Rao, Tashweena Heeramun, Thomas Merth, Uday Rayala, Victor Cui, Vivek Rangarajan Sridhar, Wencong Zhang, Wenqi Zhang, Wentao Wu, Xingyu Zhou, Xinwen Liu, Yang Zhao, Yin Xia, Zhile Ren, Zhongzheng Ren

We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute.

Language Modelling

A Multimodal Knowledge-enhanced Whole-slide Pathology Foundation Model

no code implementations 22 Jul 2024 Yingxue Xu, Yihui Wang, Fengtao Zhou, Jiabo Ma, Shu Yang, Huangjing Lin, Xin Wang, Jiguang Wang, Li Liang, Anjia Han, Ronald Cheong Kin Chan, Hao Chen

To our knowledge, this is the first attempt to incorporate multimodal knowledge at the slide level for enhancing pathology FMs, expanding the modelling context from unimodal to multimodal knowledge and from patch-level to slide-level.

whole slide images

Multi-sentence Video Grounding for Long Video Generation

no code implementations 18 Jul 2024 Wei Feng, Xin Wang, Hong Chen, Zeyang Zhang, Wenwu Zhu

(iii) We also attempt video morphing and personalized generation methods to improve the subject consistency of long video generation, providing ablation experimental results for the subtasks of long video generation.

Moment Retrieval Retrieval +4

A Benchmark for Multi-speaker Anonymization

no code implementations 8 Jul 2024 Xiaoxiao Miao, Ruijie Tao, Chang Zeng, Xin Wang

To achieve that, a cascaded system uses speaker diarization to aggregate the speech of each speaker and speaker anonymization to conceal speaker privacy and preserve speech content.

Benchmarking Privacy Preserving +2

SfM on-the-fly: Get better 3D from What You Capture

no code implementations 4 Jul 2024 Zongqian Zhan, Yifei Yu, Rui Xia, Wentian Gan, Hong Xie, Giulio Perda, Luca Morelli, Fabio Remondino, Xin Wang

In the last twenty years, Structure from Motion (SfM) has been a constant research hotspot in the fields of photogrammetry, computer vision, robotics etc., whereas real-time performance is just a recent topic of growing interest.

Non-Adversarial Learning: Vector-Quantized Common Latent Space for Multi-Sequence MRI

1 code implementation 3 Jul 2024 Luyi Han, Tao Tan, Tianyu Zhang, Xin Wang, Yuan Gao, Chunyao Lu, Xinglong Liang, Haoran Dou, Yunzhi Huang, Ritse Mann

We propose a generative model that compresses discrete representations of each sequence to estimate the Gaussian distribution of vector-quantized common (VQC) latent space between multiple sequences.

Contrastive Learning One-Shot Segmentation

Large Language Model Enhanced Knowledge Representation Learning: A Survey

no code implementations 1 Jul 2024 Xin Wang, Zirui Chen, Haofen Wang, Leong Hou U, Zhao Li, Wenbin Guo

The integration of Large Language Models (LLMs) with Knowledge Representation Learning (KRL) marks a significant advancement in the field of artificial intelligence (AI), enhancing the ability to capture and utilize both structural and textual information.

Language Modelling Large Language Model +1

PM-VIS+: High-Performance Video Instance Segmentation without Video Annotation

1 code implementation 28 Jun 2024 Zhangjing Yang, Dun Liu, Xin Wang, Zhe Li, Barathwaj Anandan, Yi Wu

This method achieves high video instance segmentation performance without manual video annotations, offering a cost-effective solution and new perspectives for video instance segmentation applications.

Instance Segmentation Segmentation +2

Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers

1 code implementation 25 Jun 2024 Lei Chen, Yuan Meng, Chen Tang, Xinzhu Ma, Jingyan Jiang, Xin Wang, Zhi Wang, Wenwu Zhu

Specifically, when quantizing DiT-XL/2 to W8A8 on ImageNet 256x256, Q-DiT achieves a remarkable reduction in FID by 1.26 compared to the baseline.

Image Generation Quantization

Robustly Optimized Deep Feature Decoupling Network for Fatty Liver Diseases Detection

1 code implementation 25 Jun 2024 Peng Huang, Shu Hu, Bo Peng, Jiashu Zhang, Xi Wu, Xin Wang

This can lead to significant differences in recognition accuracy between classes and obvious recognition weaknesses.

Image Classification Medical Image Classification

Towards Lightweight Graph Neural Network Search with Curriculum Graph Sparsification

no code implementations 24 Jun 2024 Beini Xie, Heng Chang, Ziwei Zhang, Zeyang Zhang, Simin Wu, Xin Wang, Yuan Meng, Wenwu Zhu

To search for optimal lightweight Graph Neural Networks (GNNs), we propose a Lightweight Graph Neural Architecture Search with Graph SparsIfication and Network Pruning (GASSIP) method.

Graph Neural Network Network Pruning +2

Meta-GCN: A Dynamically Weighted Loss Minimization Method for Dealing with the Data Imbalance in Graph Neural Networks

no code implementations 24 Jun 2024 Mahdi Mohammadizadeh, Arash Mozhdehi, Yani Ioannou, Xin Wang

Although many real-world applications, such as disease prediction and fault detection, suffer from class imbalance, most existing graph-based classification methods ignore the skewness of the class distribution and therefore tend to be biased towards the majority class(es).

Disease Prediction Fault Detection +1

Belief Information based Deep Channel Estimation for Massive MIMO Systems

no code implementations 23 Jun 2024 Jialong Xu, Liu Liu, Xin Wang, Lan Chen

In the next generation wireless communication system, transmission rates should continue to rise to support emerging scenarios, e.g., immersive communications.

Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark

1 code implementation 21 Jun 2024 Yili Wang, Yixin Liu, Xu Shen, Chenyu Li, Kaize Ding, Rui Miao, Ying Wang, Shirui Pan, Xin Wang

To bridge the gap, in this work, we present a Unified Benchmark for unsupervised Graph-level OOD and anomaly Detection (our method), a comprehensive evaluation framework that unifies GLAD and GLOD under the concept of generalized graph-level OOD detection.

Anomaly Detection Out-of-Distribution Detection +1

Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models

no code implementations 21 Jun 2024 Jiayu Wang, Yifei Ming, Zhenmei Shi, Vibhav Vineet, Xin Wang, Neel Joshi

Large language models (LLMs) and vision-language models (VLMs) have demonstrated remarkable performance across a wide range of tasks and domains.

Trustworthy Enhanced Multi-view Multi-modal Alzheimer's Disease Prediction with Brain-wide Imaging Transcriptomics Data

1 code implementation 21 Jun 2024 Shan Cong, Zhoujie Fan, Hongwei Liu, Yinghan Zhang, Xin Wang, Haoran Luo, Xiaohui Yao

Here, we propose TMM, a trusted multiview multimodal graph attention framework for AD diagnosis, using extensive brain-wide transcriptomics and imaging data.

Disease Prediction Graph Attention +1

Efficient Sharpness-Aware Minimization for Molecular Graph Transformer Models

1 code implementation ICLR 2024 Yili Wang, Kaixiong Zhou, Ninghao Liu, Ying Wang, Xin Wang

Sharpness-aware minimization (SAM) has received increasing attention in computer vision since it can effectively eliminate the sharp local minima from the training trajectory and mitigate generalization degradation.

D2O: Dynamic Discriminative Operations for Efficient Generative Inference of Large Language Models

no code implementations 18 Jun 2024 Zhongwei Wan, Xinjian Wu, Yu Zhang, Yi Xin, Chaofan Tao, Zhihong Zhu, Xin Wang, Siqi Luo, Jing Xiong, Mi Zhang

Efficient inference in Large Language Models (LLMs) is impeded by the growing memory demands of key-value (KV) caching, especially for longer sequences.

Text Generation

Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis

1 code implementation 16 Jun 2024 Xin Wang, Tomi Kinnunen, Kong Aik Lee, Paul-Gauthier Noé, Junichi Yamagishi

The outcomes of these findings, namely, the score calibration before fusion, improved linear fusion, and better non-linear fusion, were found to be effective on the SASV challenge database.

Speaker Verification

Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio

1 code implementation 12 Jun 2024 Lin Zhang, Xin Wang, Erica Cooper, Mireia Diez, Federico Landini, Nicholas Evans, Junichi Yamagishi

As a pioneering study in spoof diarization, we focus on defining the task, establishing evaluation metrics, and proposing a benchmark model, namely the Countermeasure-Condition Clustering (3C) model.

Clustering

To what extent can ASV systems naturally defend against spoofing attacks?

no code implementations 8 Jun 2024 Jee-weon Jung, Xin Wang, Nicholas Evans, Shinji Watanabe, Hye-jin Shim, Hemlata Tak, Sidhhant Arora, Junichi Yamagishi, Joon Son Chung

The current automatic speaker verification (ASV) task involves making binary decisions on two types of trials: target and non-target.

Speaker Verification

Attribute-Aware Implicit Modality Alignment for Text Attribute Person Search

no code implementations 6 Jun 2024 Xin Wang, Fangfang Liu, Zheng Li, Caili Guo

Text attribute person search aims to find specific pedestrians from given textual attributes, which is valuable when searching for designated pedestrians based on witness descriptions.

Attribute Person Search

Precise Analysis of Covariance Identifiability for Activity Detection in Grant-Free Random Access

no code implementations 3 Jun 2024 Shengsong Luo, Junjie Ma, Chongbin Xu, Xin Wang

We consider the identifiability issue of maximum likelihood based activity detection in massive MIMO based grant-free random access.

Action Detection Activity Detection

AI-Face: A Million-Scale Demographically Annotated AI-Generated Face Dataset and Fairness Benchmark

1 code implementation 2 Jun 2024 Li Lin, Santosh, Xin Wang, Shu Hu

However, no existing dataset comprehensively encompasses both demographic attributes and diverse generative methods, which hinders the development of fair detectors for AI-generated faces.

Face Swapping Fairness

Augmenting Textual Generation via Topology Aware Retrieval

no code implementations 27 May 2024 Yu Wang, Nedim Lipka, Ruiyi Zhang, Alexa Siu, Yuying Zhao, Bo Ni, Xin Wang, Ryan Rossi, Tyler Derr

This framework includes a retrieval module that selects texts based on their topological relationships and an aggregation module that integrates these texts into prompts to stimulate LLMs for text generation.

RAG Retrieval +1

Balancing User Preferences by Social Networks: A Condition-Guided Social Recommendation Model for Mitigating Popularity Bias

1 code implementation 27 May 2024 Xin He, Wenqi Fan, Ruobing Wang, Yili Wang, Ying Wang, Shirui Pan, Xin Wang

More specifically, CGSoRec first includes a Condition-Guided Social Denoising Model (CSD) to remove redundant social relations in the social network for capturing users' social preferences with items more precisely.

Denoising

Causal-Aware Graph Neural Architecture Search under Distribution Shifts

no code implementations 26 May 2024 Peiwen Li, Xin Wang, Zeyang Zhang, Yijian Qin, Ziwei Zhang, Jialong Wang, Yang Li, Wenwu Zhu

We propose to handle the distribution shifts in the graph architecture search process by discovering and exploiting the causal relationship between graphs and architectures to search for the optimal architectures that can generalize under distribution shifts.

Graph Embedding Neural Architecture Search +1

UU-Mamba: Uncertainty-aware U-Mamba for Cardiac Image Segmentation

1 code implementation 25 May 2024 Ting Yu Tsai, Li Lin, Shu Hu, Ming-Ching Chang, Hongtu Zhu, Xin Wang

Biomedical image segmentation is critical for accurate identification and analysis of anatomical structures in medical imaging, particularly in cardiac MRI.

Image Segmentation MRI segmentation +2

Rethinking Independent Cross-Entropy Loss For Graph-Structured Data

1 code implementation 24 May 2024 Rui Miao, Kaixiong Zhou, Yili Wang, Ninghao Liu, Ying Wang, Xin Wang

We learn the joint distribution of node and cluster labels conditioned on their representations, and train GNNs with the obtained joint loss.

Adversarial Attack Node Classification

DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control

no code implementations 21 May 2024 Hong Chen, Xin Wang, YiPeng Zhang, Yuwei Zhou, Zeyang Zhang, Siao Tang, Wenwu Zhu

To tackle the problems, in this paper, we propose DisenStudio, a novel framework that can generate text-guided videos for customized multiple subjects, given few images for each subject.

Attribute Text-to-Video Generation +1

Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances

1 code implementation 21 May 2024 Hanlei Zhang, Hua Xu, Fei Long, Xin Wang, Kai Gao

UMC shows remarkable improvements of 2-6% scores in clustering metrics over state-of-the-art methods, marking the first successful endeavor in this domain.

Clustering Representation Learning

EPPS: Advanced Polyp Segmentation via Edge Information Injection and Selective Feature Decoupling

1 code implementation 20 May 2024 Mengqi Lei, Xin Wang

Furthermore, we introduce a component called Selective Feature Decoupler (SFD) to suppress the influence of noise and extraneous features on the model.

Decoder Management +1

Combining multiple post-training techniques to achieve most efficient quantized LLMs

no code implementations 12 May 2024 Sayeh Sharify, Zifei Xu, Wanzin Yazar, Xin Wang

Large Language Models (LLMs) have distinguished themselves with outstanding performance in complex language modeling tasks, yet they come with significant computational and storage challenges.

Language Modelling Quantization

PrivSGP-VR: Differentially Private Variance-Reduced Stochastic Gradient Push with Tight Utility Bounds

no code implementations 4 May 2024 Zehan Zhu, Yan Huang, Xin Wang, Jinming Xu

In this paper, we propose a differentially private decentralized learning method (termed PrivSGP-VR) which employs stochastic gradient push with variance reduction and guarantees $(\epsilon, \delta)$-differential privacy (DP) for each node.

Tele-FLM Technical Report

no code implementations 25 Apr 2024 Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang

Large language models (LLMs) have showcased profound capabilities in language understanding and generation, facilitating a wide array of applications.

Language Modelling Large Language Model

Optimizing OOD Detection in Molecular Graphs: A Novel Approach with Diffusion Models

no code implementations 24 Apr 2024 Xu Shen, Yili Wang, Kaixiong Zhou, Shirui Pan, Xin Wang

In this work, we propose to detect OOD molecules by adopting an auxiliary diffusion model-based framework, which compares similarities between input molecules and reconstructed graphs.

Denoising Graph Reconstruction +1

FedGreen: Carbon-aware Federated Learning with Model Size Adaptation

no code implementations 23 Apr 2024 Ali Abbasi, Fan Dong, Xin Wang, Henry Leung, Jiayu Zhou, Steve Drew

Federated learning (FL) provides a promising collaborative framework to build a model from distributed clients, and this work investigates the carbon emission of the FL process.

Federated Learning Model Compression

Mechanisms promoting biodiversity in ecosystems

no code implementations 23 Apr 2024 Ju Kang, Yiyuan Niu, Xin Wang

Explaining biodiversity is a central focus in theoretical ecology.

RealTCD: Temporal Causal Discovery from Interventional Data with Large Language Model

no code implementations 23 Apr 2024 Peiwen Li, Xin Wang, Zeyang Zhang, Yuan Meng, Fang Shen, Yue Li, Jialong Wang, Yang Li, Wenwu Zhu

In the field of Artificial Intelligence for Information Technology Operations, causal discovery is pivotal for operation and maintenance of graph construction, facilitating downstream industrial tasks such as root cause analysis.

Causal Discovery graph construction +2

FLARE: A New Federated Learning Framework with Adjustable Learning Rates over Resource-Constrained Wireless Networks

no code implementations 23 Apr 2024 Bingnan Xiao, Jingjing Zhang, Wei Ni, Xin Wang

Wireless federated learning (WFL) suffers from heterogeneity prevailing in the data distributions, computing powers, and channel conditions of participating devices.

Federated Learning Scheduling

ID-Animator: Zero-Shot Identity-Preserving Human Video Generation

1 code implementation 23 Apr 2024 Xuanhua He, Quande Liu, Shengju Qian, Xin Wang, Tao Hu, Ke Cao, Keyu Yan, Jie Zhang

In this study, we present ID-Animator, a zero-shot human-video generation approach that can perform personalized video generation given a single reference facial image without further training.

Attribute Video Generation

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

no code implementations 22 Apr 2024 Marah Abdin, Jyoti Aneja, Hany Awadalla, Ahmed Awadallah, Ammar Ahmad Awan, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Qin Cai, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Weizhu Chen, Yen-Chun Chen, Yi-Ling Chen, Hao Cheng, Parul Chopra, Xiyang Dai, Matthew Dixon, Ronen Eldan, Victor Fragoso, Jianfeng Gao, Mei Gao, Min Gao, Amit Garg, Allie Del Giorno, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Wenxiang Hu, Jamie Huynh, Dan Iter, Sam Ade Jacobs, Mojan Javaheripi, Xin Jin, Nikos Karampatziakis, Piero Kauffmann, Mahoud Khademi, Dongwoo Kim, Young Jin Kim, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Yunsheng Li, Chen Liang, Lars Liden, Xihui Lin, Zeqi Lin, Ce Liu, Liyuan Liu, Mengchen Liu, Weishung Liu, Xiaodong Liu, Chong Luo, Piyush Madan, Ali Mahmoudzadeh, David Majercak, Matt Mazzola, Caio César Teodoro Mendes, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Liliang Ren, Gustavo de Rosa, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Yelong Shen, Swadheen Shukla, Xia Song, Masahiro Tanaka, Andrea Tupini, Praneetha Vaddamanu, Chunyu Wang, Guanhua Wang, Lijuan Wang, Shuohang Wang, Xin Wang, Yu Wang, Rachel Ward, Wen Wen, Philipp Witte, Haiping Wu, Xiaoxia Wu, Michael Wyatt, Bin Xiao, Can Xu, Jiahang Xu, Weijian Xu, Jilong Xue, Sonali Yadav, Fan Yang, Jianwei Yang, Yifan Yang, ZiYi Yang, Donghan Yu, Lu Yuan, Chenruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.

Ranked #5 on MMR total on MRR-Benchmark (using extra training data)

Language Modelling Math +2

Texture-aware and Shape-guided Transformer for Sequential DeepFake Detection

no code implementations 22 Apr 2024 Yunfei Li, Yuezun Li, Xin Wang, Jiaran Zhou, Junyu Dong

In this paper, we propose a novel Texture-aware and Shape-guided Transformer to enhance detection performance.

DeepFake Detection Face Swapping

Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images

1 code implementation 19 Apr 2024 Santosh, Li Lin, Irene Amerini, Xin Wang, Shu Hu

Diffusion models (DMs) have revolutionized image generation, producing high-quality images with applications spanning various fields.

Image Generation

Fast Diffeomorphic Image Registration using Patch based Fully Convolutional Networks

no code implementations 5 Apr 2024 Jiong Wu, Shuang Zhou, Li Lin, Xin Wang, Wenxue Tan

Diffeomorphic image registration is a fundamental step in medical image analysis, owing to its capability to ensure the invertibility of transformations and preservation of topology.

Image Registration

The VoicePrivacy 2024 Challenge Evaluation Plan

1 code implementation 3 Apr 2024 Natalia Tomashenko, Xiaoxiao Miao, Pierre Champion, Sarina Meyer, Xin Wang, Emmanuel Vincent, Michele Panariello, Nicholas Evans, Junichi Yamagishi, Massimiliano Todisco

The task of the challenge is to develop a voice anonymization system for speech data which conceals the speaker's voice identity while protecting linguistic content and emotional states.

Qibo: A Large Language Model for Traditional Chinese Medicine

no code implementations 24 Mar 2024 Heyi Zhang, Xin Wang, Zhaopeng Meng, Zhe Chen, Pengwei Zhuang, Yongzhe Jia, Dawei Xu, Wenbin Guo

Large Language Models (LLMs) have made significant progress in a number of professional fields, including medicine, law, and finance.

Language Modelling Large Language Model

Exploring the Potential of Large Language Models in Graph Generation

no code implementations 21 Mar 2024 Yang Yao, Xin Wang, Zeyang Zhang, Yijian Qin, Ziwei Zhang, Xu Chu, Yuekui Yang, Wenwu Zhu, Hong Mei

In this paper, we propose LLM4GraphGen to explore the ability of LLMs for graph generation with systematic task designs and extensive experiments.

Drug Discovery Graph Generation +1

When Do We Not Need Larger Vision Models?

2 code implementations 19 Mar 2024 Baifeng Shi, Ziyang Wu, Maolin Mao, Xin Wang, Trevor Darrell

Our results show that a multi-scale smaller model has comparable learning capacity to a larger model, and pre-training smaller models with S$^2$ can match or even exceed the advantage of larger models.

Depth Estimation

MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations

1 code implementation 16 Mar 2024 Hanlei Zhang, Xin Wang, Hua Xu, Qianrui Zhou, Kai Gao, Jianhua Su, jinyue Zhao, Wenrui Li, Yanting Chen

We believe that MIntRec2.0 will serve as a valuable resource, providing a pioneering foundation for research in human-machine conversational interactions, and significantly facilitating related applications.

Multimodal Intent Recognition

Robust Light-Weight Facial Affective Behavior Recognition with CLIP

1 code implementation 14 Mar 2024 Li Lin, Sarah Papabathini, Xin Wang, Shu Hu

Human affective behavior analysis aims to delve into human expressions and behaviors to deepen our understanding of human emotions.

Robust COVID-19 Detection in CT Images with CLIP

1 code implementation 13 Mar 2024 Li Lin, Yamini Sri Krubha, Zhenhuan Yang, Cheng Ren, Thuc Duy Le, Irene Amerini, Xin Wang, Shu Hu

In the realm of medical imaging, particularly for COVID-19 detection, deep learning models face substantial challenges such as the necessity for extensive computational resources, the paucity of well-annotated datasets, and a significant amount of unlabeled data.

SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression

1 code implementation 12 Mar 2024 Xin Wang, Yu Zheng, Zhongwei Wan, Mi Zhang

The advancements in Large Language Models (LLMs) have been hindered by their substantial sizes, which necessitate LLM compression methods for practical deployment.

Language Modelling Large Language Model +1

Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer

1 code implementation 11 Mar 2024 Siddhant Satyanaik, Aditya Murali, Deepak Alapatt, Xin Wang, Pietro Mascagni, Nicolas Padoy

Purpose: Advances in deep learning have resulted in effective models for surgical video analysis; however, these models often fail to generalize across medical centers due to domain shift caused by variations in surgical workflow, camera setups, and patient demographics.

Anatomy Disentanglement +3

UAV-Enabled Asynchronous Federated Learning

no code implementations 11 Mar 2024 Zhiyuan Zhai, Xiaojun Yuan, Xin Wang, Huiyuan Yang

To exploit unprecedented data generation in mobile edge networks, federated learning (FL) has emerged as a promising alternative to the conventional centralized machine learning (ML).

Federated Learning

Spectral Invariant Learning for Dynamic Graphs under Distribution Shifts

1 code implementation NeurIPS 2023 Zeyang Zhang, Xin Wang, Ziwei Zhang, Zhou Qin, Weigao Wen, Hui Xue, Haoyang Li, Wenwu Zhu

In this paper, we discover that there exist cases with distribution shifts unobservable in the time domain while observable in the spectral domain, and propose to study distribution shifts on dynamic graphs in the spectral domain for the first time.

Link Prediction Node Classification

Unsupervised Graph Neural Architecture Search with Disentangled Self-supervision

no code implementations NeurIPS 2023 Zeyang Zhang, Xin Wang, Ziwei Zhang, Guangyao Shen, Shiqi Shen, Wenwu Zhu

To address the challenge, we propose a novel Disentangled Self-supervised Graph Neural Architecture Search (DSGAS) model, which is able to discover the optimal architectures capturing various latent graph factors in a self-supervised fashion based on unlabeled graph data.

Disentanglement Neural Architecture Search

MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation

no code implementations7 Mar 2024 Zhongwei Wan, Che Liu, Xin Wang, Chaofan Tao, Hui Shen, Zhenwu Peng, Jie Fu, Rossella Arcucci, Huaxiu Yao, Mi Zhang

Electrocardiogram (ECG) is the primary non-invasive diagnostic tool for monitoring cardiac conditions and is crucial in assisting clinicians.

Parameterized quantum comb and simpler circuits for reversing unknown qubit-unitary operations

no code implementations6 Mar 2024 Yin Mo, Lei Zhang, Yu-Ao Chen, Yingjian Liu, Tengxiang Lin, Xin Wang

Quantum comb is an essential tool for characterizing complex quantum protocols in quantum information processing.

Quantum Machine Learning

Neural Radiance Fields in Medical Imaging: Challenges and Next Steps

no code implementations26 Feb 2024 Xin Wang, Shu Hu, Heng Fan, Hongtu Zhu, Xin Li

Neural Radiance Fields (NeRF), as a pioneering technique in computer vision, offer great potential to revolutionize medical imaging by synthesizing three-dimensional representations from the projected two-dimensional image data.

Two-stage Cytopathological Image Synthesis for Augmenting Cervical Abnormality Screening

no code implementations22 Feb 2024 Zhenrong Shen, Manman Fei, Xin Wang, Jiangdong Cai, Sheng Wang, Lichi Zhang, Qian Wang

In the first Global Image Generation stage, a Normal Image Generator is designed to generate cytopathological images full of normal cervical cells.

Cell Detection Data Augmentation +2

HyCubE: Efficient Knowledge Hypergraph 3D Circular Convolutional Embedding

no code implementations14 Feb 2024 Zhao Li, Xin Wang, Jun Zhao, Wenbin Guo, JianXin Li

It is desirable and challenging for knowledge hypergraph embedding to reach a trade-off between model effectiveness and efficiency.

hypergraph embedding

Rethinking Propagation for Unsupervised Graph Domain Adaptation

1 code implementation8 Feb 2024 Meihan Liu, Zeyu Fang, Zhen Zhang, Ming Gu, Sheng Zhou, Xin Wang, Jiajun Bu

Motivated by our empirical analysis, we reevaluate the role of GNNs in graph domain adaptation and uncover the pivotal role of the propagation process in GNNs for adapting to different graph domains.

Domain Adaptation GRAPH DOMAIN ADAPTATION

Failure Analysis in Next-Generation Critical Cellular Communication Infrastructures

no code implementations6 Feb 2024 Siguo Bi, Xin Yuan, Shuyan Hu, Kai Li, Wei Ni, Ekram Hossain, Xin Wang

The advent of communication technologies marks a transformative phase in critical infrastructure construction, where the meticulous analysis of failures becomes paramount in achieving the fundamental objectives of continuity, security, and availability.

Revisiting VAE for Unsupervised Time Series Anomaly Detection: A Frequency Perspective

1 code implementation5 Feb 2024 Zexin Wang, Changhua Pei, Minghua Ma, Xin Wang, Zhihan Li, Dan Pei, Saravan Rajmohan, Dongmei Zhang, Qingwei Lin, Haiming Zhang, Jianhui Li, Gaogang Xie

To ensure an accurate AD, FCVAE exploits an innovative approach to concurrently integrate both the global and local frequency features into the condition of Conditional Variational Autoencoder (CVAE) to significantly increase the accuracy of reconstructing the normal data.

Anomaly Detection Time Series +1

Artificial Intelligence in Image-based Cardiovascular Disease Analysis: A Comprehensive Survey and Future Outlook

no code implementations4 Feb 2024 Xin Wang, Hongtu Zhu

Our review encompasses these modalities, giving a broad perspective on the diverse imaging techniques integrated with AI for CVD analysis.

Masked Conditional Diffusion Model for Enhancing Deepfake Detection

no code implementations1 Feb 2024 Tiewen Chen, Shanmin Yang, Shu Hu, Zhenghan Fang, Ying Fu, Xi Wu, Xin Wang

In this paper, we offer a new insight into diffusion model-based data augmentation and propose a Masked Conditional Diffusion Model (MCDM) for enhancing deepfake detection.

Data Augmentation DeepFake Detection +1

Uncertainty-Aware Explainable Recommendation with Large Language Models

no code implementations31 Jan 2024 Yicui Peng, Hao Chen, ChingSheng Lin, Guo Huang, Jinrong Hu, Hui Guo, Bin Kong, Shu Hu, Xi Wu, Xin Wang

Providing explanations within the recommendation system would boost user satisfaction and foster trust, especially by elaborating on the reasons for selecting recommended items tailored to the user.

Explainable Recommendation Multi-Task Learning

Active Generation Network of Human Skeleton for Action Recognition

no code implementations30 Jan 2024 Long Liu, Xin Wang, Fangming Li, Jiayu Chen

To solve those problems, we propose a novel active generative network (AGN), which can adaptively learn various action categories by motion style transfer to generate new actions when the data for a particular action is only a single sample or a few samples.

Action Generation Action Recognition +5

Detecting Multimedia Generated by Large AI Models: A Survey

1 code implementation22 Jan 2024 Li Lin, Neeraj Gupta, Yue Zhang, Hainan Ren, Chun-Hao Liu, Feng Ding, Xin Wang, Xin Li, Luisa Verdoliva, Shu Hu

The rapid advancement of Large AI Models (LAIMs), particularly diffusion models and large language models, has marked a new era where AI-generated multimedia is increasingly integrated into various aspects of daily life.

Efficient Image Super-Resolution via Symmetric Visual Attention Network

no code implementations17 Jan 2024 Chengxu Wu, Qinrui Fan, Shu Hu, Xi Wu, Xin Wang, Jing Hu

An important development direction in the Single-Image Super-Resolution (SISR) algorithms is to improve the efficiency of the algorithms.

Image Super-Resolution

Continuous Optical Zooming: A Benchmark for Arbitrary-Scale Image Super-Resolution in Real World

1 code implementation CVPR 2024 Huiyuan Fu, Fei Peng, Xianwei Li, Yejun Li, Xin Wang, Huadong Ma

The extensive experiments demonstrate the superior performance of the arbitrary-scale SR models trained on the COZ dataset compared to models trained on simulated data.

Image Super-Resolution Meta-Learning

Molecular Data Programming: Towards Molecule Pseudo-labeling with Systematic Weak Supervision

no code implementations CVPR 2024 Xin Juan, Kaixiong Zhou, Ninghao Liu, Tianlong Chen, Xin Wang

The premise for the great advancement of molecular machine learning is dependent on a considerable amount of labeled data.

Grounding-Prompter: Prompting LLM with Multimodal Information for Temporal Sentence Grounding in Long Videos

no code implementations28 Dec 2023 Houlun Chen, Xin Wang, Hong Chen, Zihan Song, Jia Jia, Wenwu Zhu

To tackle these challenges, in this work we propose a Grounding-Prompter method, which is capable of conducting TSG in long videos through prompting LLM with multimodal information.

Denoising In-Context Learning +3

PokeMQA: Programmable knowledge editing for Multi-hop Question Answering

1 code implementation23 Dec 2023 Hengrui Gu, Kaixiong Zhou, Xiaotian Han, Ninghao Liu, Ruobing Wang, Xin Wang

Multi-hop question answering (MQA) is a challenging task for evaluating a machine's comprehension and reasoning abilities, on which large language models (LLMs) have widely achieved human-comparable performance.

Answer Generation knowledge editing +3

LLM4VG: Large Language Models Evaluation for Video Grounding

no code implementations21 Dec 2023 Wei Feng, Xin Wang, Hong Chen, Zeyang Zhang, Zihan Song, Yuwei Zhou, Wenwu Zhu

Recently, researchers have attempted to investigate the capability of LLMs in handling videos and proposed several video LLM models.

Image Captioning Video Grounding +1

In2SET: Intra-Inter Similarity Exploiting Transformer for Dual-Camera Compressive Hyperspectral Imaging

1 code implementation CVPR 2024 Xin Wang, Lizhi Wang, Xiangtian Ma, Maoqing Zhang, Lin Zhu, Hua Huang

Dual-Camera Compressed Hyperspectral Imaging (DCCHI) offers the capability to reconstruct a 3D Hyperspectral Image (HSI) by fusing compressive and Panchromatic (PAN) images, which has shown great potential for snapshot hyperspectral imaging in practice.

ConvD: Attention Enhanced Dynamic Convolutional Embeddings for Knowledge Graph Completion

no code implementations11 Dec 2023 Wenbin Guo, Zhao Li, Xin Wang, Zirui Chen

In this paper, we propose a novel dynamic convolutional embedding model ConvD for knowledge graph completion, which directly reshapes the relation embeddings into multiple internal convolution kernels to improve the external convolution kernels of the traditional convolutional embedding model.

Entity Embeddings Relation

Detection and Mitigation of Position Spoofing Attacks on Cooperative UAV Swarm Formations

no code implementations6 Dec 2023 Siguo Bi, Kai Li, Shuyan Hu, Wei Ni, Cong Wang, Xin Wang

Detecting spoofing attacks on the positions of unmanned aerial vehicles (UAVs) within a swarm is challenging.

Position

Efficient Large Language Models: A Survey

3 code implementations6 Dec 2023 Zhongwei Wan, Xin Wang, Che Liu, Samiul Alam, Yu Zheng, Jiachen Liu, Zhongnan Qu, Shen Yan, Yi Zhu, Quanlu Zhang, Mosharaf Chowdhury, Mi Zhang

We hope our survey can serve as a valuable resource to help researchers and practitioners gain a systematic understanding of efficient LLMs research and inspire them to contribute to this important and exciting field.

Natural Language Understanding Text Generation

Virtual Quantum Markov Chains

no code implementations4 Dec 2023 Yu-Ao Chen, Chengkai Zhu, Keming He, Mingrui Jing, Xin Wang

In this work, we propose the concept of virtual quantum Markov chains (VQMCs), focusing on scenarios where subsystems retain classical information about global systems from measurement statistics.

VTimeLLM: Empower LLM to Grasp Video Moments

1 code implementation CVPR 2024 Bin Huang, Xin Wang, Hong Chen, Zihan Song, Wenwu Zhu

Large language models (LLMs) have shown remarkable text understanding capabilities, which have been extended as Video LLMs to handle video data for comprehending visual details.

Dense Video Captioning VCGBench-Diverse +6

OFDMA-F$^2$L: Federated Learning With Flexible Aggregation Over an OFDMA Air Interface

no code implementations25 Nov 2023 Shuyan Hu, Xin Yuan, Wei Ni, Xin Wang, Ekram Hossain, H. Vincent Poor

Federated learning (FL) can suffer from a communication bottleneck when deployed in mobile networks, limiting participating clients and deterring FL convergence.

Federated Learning

Out-of-Distribution Generalized Dynamic Graph Neural Network with Disentangled Intervention and Invariance Promotion

no code implementations24 Nov 2023 Zeyang Zhang, Xin Wang, Ziwei Zhang, Haoyang Li, Wenwu Zhu

In this paper, we propose Disentangled Intervention-based Dynamic graph Attention networks with Invariance Promotion (I-DIDA) to handle spatio-temporal distribution shifts in dynamic graphs by discovering and utilizing invariant patterns, i.e., structures and features whose predictive abilities are stable across distribution shifts.

Graph Attention Graph Neural Network

Self-organized biodiversity in biotic resource systems

no code implementations23 Nov 2023 Ju Kang, Shijie Zhang, Yiyuan Niu, Xin Wang

What determines biodiversity in nature is a prominent issue in ecology, especially in biotic resource systems that are typically devoid of cross-feeding.

Adversarial Prompt Tuning for Vision-Language Models

1 code implementation19 Nov 2023 Jiaming Zhang, Xingjun Ma, Xin Wang, Lingyu Qiu, Jiaqi Wang, Yu-Gang Jiang, Jitao Sang

With the rapid advancement of multimodal learning, pre-trained Vision-Language Models (VLMs) such as CLIP have demonstrated remarkable capacities in bridging the gap between visual and language modalities.

Adversarial Robustness

MeLo: Low-rank Adaptation is Better than Fine-tuning for Medical Image Diagnosis

1 code implementation14 Nov 2023 Yitao Zhu, Zhenrong Shen, Zihao Zhao, Sheng Wang, Xin Wang, Xiangyu Zhao, Dinggang Shen, Qian Wang

By fixing the weight of ViT models and only adding small low-rank plug-ins, we achieve competitive results on various diagnosis tasks across different imaging modalities using only a few trainable parameters.

UMedNeRF: Uncertainty-aware Single View Volumetric Rendering for Medical Neural Radiance Fields

no code implementations10 Nov 2023 Jing Hu, Qinrui Fan, Shu Hu, Siwei Lyu, Xi Wu, Xin Wang

In the field of clinical medicine, computed tomography (CT) is an effective medical imaging modality for the diagnosis of various pathologies.

Computed Tomography (CT)

Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing

no code implementations10 Nov 2023 Siao Tang, Xin Wang, Hong Chen, Chaoyu Guan, Zewen Wu, Yansong Tang, Wenwu Zhu

In this paper, we propose a novel post-training quantization method PCR (Progressive Calibration and Relaxing) for text-to-image diffusion models, which consists of a progressive calibration strategy that considers the accumulated quantization error across timesteps, and an activation relaxing strategy that improves the performance with negligible cost.

Quantization

Lightweight Diffusion Models with Distillation-Based Block Neural Architecture Search

no code implementations8 Nov 2023 Siao Tang, Xin Wang, Hong Chen, Chaoyu Guan, Yansong Tang, Wenwu Zhu

When retraining the searched architecture, we adopt a dynamic joint loss to maintain the consistency between supernet training and subnet retraining, which also provides informative objectives for each block and shortens the paths of gradient propagation.

Neural Architecture Search

3D Pose Estimation of Tomato Peduncle Nodes using Deep Keypoint Detection and Point Cloud

no code implementations8 Nov 2023 Jianchao Ci, Xin Wang, David Rapado-Rincón, Akshay K. Burusa, Gert Kootstra

A comprehensive evaluation was conducted in a commercial greenhouse to gain insight into the performance of different parts of the method.

3D Pose Estimation Keypoint Detection

Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models

no code implementations7 Nov 2023 Longteng Zhang, Xiang Liu, Zeyu Li, Xinglin Pan, Peijie Dong, Ruibo Fan, Rui Guo, Xin Wang, Qiong Luo, Shaohuai Shi, Xiaowen Chu

For end users, our benchmark and findings help better understand different optimization techniques, training and inference frameworks, together with hardware platforms in choosing configurations for deploying LLMs.

Quantization

VideoDreamer: Customized Multi-Subject Text-to-Video Generation with Disen-Mix Finetuning

no code implementations2 Nov 2023 Hong Chen, Xin Wang, Guanning Zeng, YiPeng Zhang, Yuwei Zhou, Feilin Han, Wenwu Zhu

The video generator is further customized for the given multiple subjects by the proposed Disen-Mix Finetuning and Human-in-the-Loop Re-finetuning strategy, which can tackle the attribute binding problem of multi-subject generation.

Attribute Text-to-Video Generation +1

A Systematic Review for Transformer-based Long-term Series Forecasting

no code implementations31 Oct 2023 Liyilei Su, Xumin Zuo, Rui Li, Xin Wang, Heng Zhao, Bingding Huang

Various variants have enabled transformer architecture to effectively handle long-term time series forecasting (LTSF) tasks.

Time Series Time Series Forecasting

Towards Generalized Multi-stage Clustering: Multi-view Self-distillation

no code implementations29 Oct 2023 Jiatai Wang, Zhiwei Xu, Xin Wang, Tao Li

MVC aims at exploring common semantics and pseudo-labels from multiple views and clustering in a self-supervised manner.

Clustering Contrastive Learning +1

Hierarchical Mutual Information Analysis: Towards Multi-view Clustering in The Wild

no code implementations28 Oct 2023 Jiatai Wang, Zhiwei Xu, Xuewen Yang, Xin Wang

Multi-view clustering (MVC) can explore common semantics from unsupervised views generated by different sources, and thus has been extensively used in applications of practical computer vision.

Clustering

Disentangled Representation Learning with Large Language Models for Text-Attributed Graphs

no code implementations27 Oct 2023 Yijian Qin, Xin Wang, Ziwei Zhang, Wenwu Zhu

Text-attributed graphs (TAGs) are prevalent on the web and research over TAGs such as citation networks, e-commerce networks and social networks has attracted considerable attention in the web community.

Graph Neural Network Representation Learning

LLM4DyG: Can Large Language Models Solve Spatial-Temporal Problems on Dynamic Graphs?

1 code implementation26 Oct 2023 Zeyang Zhang, Xin Wang, Ziwei Zhang, Haoyang Li, Yijian Qin, Wenwu Zhu

Our main observations are: 1) LLMs have preliminary spatial-temporal understanding abilities on dynamic graphs; 2) dynamic graph tasks become increasingly difficult for LLMs as the graph size and density increase, while difficulty is not sensitive to the time span or the data generation mechanism; 3) the proposed DST2 prompting method can help to improve LLMs' spatial-temporal understanding abilities on dynamic graphs for most tasks.

Self-triggered Consensus Control of Multi-agent Systems from Data

no code implementations19 Oct 2023 Yifei Li, Xin Wang, Jian Sun, Gang Wang, Jie Chen

In the presence of external disturbances, a model-based STC scheme is put forth for $\mathcal{H}_{\infty}$-consensus of MASs, serving as a baseline for the data-driven STC.

Provable Advantage of Parameterized Quantum Circuit in Function Approximation

no code implementations11 Oct 2023 Zhan Yu, Qiuhao Chen, Yuling Jiao, Yinan Li, Xiliang Lu, Xin Wang, Jerry Zhijian Yang

To achieve this, we utilize techniques from quantum signal processing and linear combinations of unitaries to construct PQCs that implement multivariate polynomials.

Quantum Machine Learning

Decentralized Federated Learning via MIMO Over-the-Air Computation: Consensus Analysis and Performance Optimization

no code implementations8 Oct 2023 Zhiyuan Zhai, Xiaojun Yuan, Xin Wang

We conduct a general convergence analysis to quantitatively capture the influence of aggregation weight and communication error on the MIMO OA-DFL performance in ad hoc networks.

Distributed Optimization Federated Learning

RAC-BERT: Character Radical Enhanced BERT for Ancient Chinese

no code implementations journal 2023 Lifan Han, Xin Wang, Meng Wang, Zhao Li, Heyi Zhang, Zirui Chen, Xiaowang Zhang

The results show that our model significantly outperforms the state-of-the-art models on most tasks.