Search Results for author: Xi Zhang

Found 100 papers, 27 papers with code

Navigating the Black Box: Leveraging LLMs for Effective Text-Level Graph Injection Attacks

no code implementations 16 Jun 2025 Yuefei Lyu, Chaozhuo Li, Xi Zhang, Tianle Zhang

This method efficiently perturbs the target node with minimal training costs in a strict black-box setting, ensuring a text-level graph injection attack for TAGs.

Recommendation Systems TAG +1

Structured Labeling Enables Faster Vision-Language Models for End-to-End Autonomous Driving

no code implementations 5 Jun 2025 Hao Jiang, Chuan Hu, Yukang Shi, Yuan He, Ke Wang, Xi Zhang, Zhipeng Zhang

In contrast to existing VLMs with over 7B parameters and unstructured language processing (e.g., LLaVA-1.5), FastDrive understands structured and concise descriptions and generates machine-friendly driving decisions with high efficiency.

Autonomous Driving Decision Making

Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation

no code implementations 5 Jun 2025 Yuyang Wanyan, Xi Zhang, Haiyang Xu, Haowei Liu, Junyang Wang, Jiabo Ye, Yutong Kou, Ming Yan, Fei Huang, Xiaoshan Yang, WeiMing Dong, Changsheng Xu

In recent years, Multimodal Large Language Models (MLLMs) have been extensively utilized for multimodal reasoning tasks, including Graphical User Interface (GUI) automation.

Decision Making Multimodal Reasoning

Hadaptive-Net: Efficient Vision Models via Adaptive Cross-Hadamard Synergy

no code implementations 28 May 2025 Xuyang Zhang, Xi Zhang, Liang Chen, Hao Shi, Qingshan Guo

Recent studies have revealed the immense potential of the Hadamard product in enhancing network representational capacity and dimensional compression.

3DGS Compression with Sparsity-guided Hierarchical Transform Coding

no code implementations 28 May 2025 Hao Xu, Xiaolin Wu, Xi Zhang

3D Gaussian Splatting (3DGS) has gained popularity for its fast and high-quality rendering, but it has a very large memory footprint, incurring high transmission and storage overhead.

3DGS

Mobile-Agent-V: A Video-Guided Approach for Effortless and Efficient Operational Knowledge Injection in Mobile Automation

no code implementations 20 May 2025 Junyang Wang, Haiyang Xu, Xi Zhang, Ming Yan, Ji Zhang, Fei Huang, Jitao Sang

The exponential rise in mobile device usage necessitates streamlined automation for effective task management, yet many AI frameworks fall short due to inadequate operational expertise.

Management

MirrorShield: Towards Universal Defense Against Jailbreaks via Entropy-Guided Mirror Crafting

no code implementations 17 Mar 2025 Rui Pu, Chaozhuo Li, Rui Ha, Litian Zhang, Lirong Qiu, Xi Zhang

Defending large language models (LLMs) against jailbreak attacks is crucial for ensuring their safe deployment.

Composite Nonlinear Trajectory Tracking Control of Co-Driving Vehicles Using Self-Triggered Adaptive Dynamic Programming

no code implementations 5 Mar 2025 Chuan Hu, Sicheng Ge, Yingkui Shi, Weinan Gao, Wenfeng Guo, Xi Zhang

This article presents a composite nonlinear feedback (CNF) control method using a self-triggered (ST) adaptive dynamic programming (ADP) algorithm in a human-machine shared steering framework.

PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC

2 code implementations 20 Feb 2025 Haowei Liu, Xi Zhang, Haiyang Xu, Yuyang Wanyan, Junyang Wang, Ming Yan, Ji Zhang, Chunfeng Yuan, Changsheng Xu, Weiming Hu, Fei Huang

From the decision-making perspective, to handle complex user instructions and interdependent subtasks more effectively, we propose a hierarchical multi-agent collaboration architecture that decomposes decision-making processes into Instruction-Subtask-Action levels.

Decision Making

SCCD: A Session-based Dataset for Chinese Cyberbullying Detection

1 code implementation 25 Jan 2025 Qingpo Yang, Yakai Chen, Zihui Xu, Yu-Ming Shang, Sanchuan Guo, Xi Zhang

The rampant spread of cyberbullying content poses a growing threat to societal well-being.

Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks

1 code implementation 20 Jan 2025 Zhenhailong Wang, Haiyang Xu, Junyang Wang, Xi Zhang, Ming Yan, Ji Zhang, Fei Huang, Heng Ji

By hierarchical, we mean an explicit separation of high-level planning and low-level action execution.

Analyzing the Role of the DSO in Electricity Trading of VPPs via a Stackelberg Game Model

1 code implementation 13 Jan 2025 Peng Wang, Xi Zhang, Luis Badesa

In order to study the role of the DSO as a stakeholder, a Stackelberg game is represented via a bi-level model: the DSO maximizes profits at the upper level, while the VPPs minimize operating costs at the lower level.

Bilevel Optimization

TSUBF-Net: Trans-Spatial UNet-like Network with Bi-direction Fusion for Segmentation of Adenoid Hypertrophy in CT

no code implementations 1 Dec 2024 Rulin Zhou, Yingjie Feng, Guankun Wang, Xiaopin Zhong, Zongze Wu, Qiang Wu, Xi Zhang

The results on the other two public datasets also demonstrate that our methods can robustly and effectively address the challenges of 3D segmentation in CT scans.

Computed Tomography (CT) Image Segmentation +3

Libra: Leveraging Temporal Images for Biomedical Radiology Analysis

1 code implementation 28 Nov 2024 Xi Zhang, Zaiqiao Meng, Jake Lever, Edmond S. L. Ho

Radiology report generation (RRG) is a challenging task, as it requires a thorough understanding of medical images, integration of multiple temporal inputs, and accurate report generation.

Learning Optimal Lattice Vector Quantizers for End-to-end Neural Image Compression

no code implementations 25 Nov 2024 Xi Zhang, Xiaolin Wu

It is customary to deploy uniform scalar quantization in end-to-end optimized neural image compression methods, instead of the more powerful vector quantization, due to the high complexity of the latter.

Computational Efficiency Image Compression +1

Development of a Comprehensive Physics-Based Battery Model and Its Multidimensional Comparison with an Equivalent-Circuit Model: Accuracy, Complexity, and Real-World Performance under Varying Conditions

no code implementations 19 Nov 2024 Guodong Fan, Boru Zhou, Chengwen Meng, Tengwei Pang, Xi Zhang, Mingshu Du, Wei Zhao

This paper develops a comprehensive physics-based model (PBM) that spans a wide operational range, including varying temperatures, charge/discharge conditions, and real-world field data cycles.

Trajectory Flow Matching with Applications to Clinical Time Series Modeling

1 code implementation 28 Oct 2024 Xi Zhang, Yuan Pu, Yuki Kawamura, Andrew Loza, Yoshua Bengio, Dennis L. Shung, Alexander Tong

To address this, we propose Trajectory Flow Matching (TFM), which trains a Neural SDE in a simulation-free manner, bypassing backpropagation through the dynamics.

Time Series

CCDepth: A Lightweight Self-supervised Depth Estimation Network with Enhanced Interpretability

no code implementations 30 Sep 2024 Xi Zhang, Yaru Xue, Shaocheng Jia, Xin Pei

To mitigate these issues, this study proposes a novel hybrid self-supervised depth estimation network, CCDepth, comprising convolutional neural networks (CNNs) and the white-box CRATE (Coding RAte reduction TransformEr) network.

Depth Estimation

Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold

no code implementations 26 Aug 2024 Lazar Atanackovic, Xi Zhang, Brandon Amos, Mathieu Blanchette, Leo J. Lee, Yoshua Bengio, Alexander Tong, Kirill Neklyudov

Flow-based models allow for learning these dynamics at the population level: they model the evolution of the entire distribution of samples.

Graph Neural Network

Fast Point Cloud Geometry Compression with Context-based Residual Coding and INR-based Refinement

1 code implementation 6 Aug 2024 Hao Xu, Xi Zhang, Xiaolin Wu

This gives us a means to determine the spatial context in which the latent features of 3D points are compressed by arithmetic coding.

Quantization

Beyond Entity Alignment: Towards Complete Knowledge Graph Alignment via Entity-Relation Synergy

no code implementations 25 Jul 2024 Xiaohan Fang, Chaozhuo Li, Yi Zhao, Qian Zang, Litian Zhang, Jiquan Peng, Xi Zhang, Jibing Gong

Knowledge Graph Alignment (KGA) aims to integrate knowledge from multiple sources to address the limitations of individual Knowledge Graphs (KGs) in terms of coverage and depth.

Entity Alignment Knowledge Graphs +1

MIBench: Evaluating Multimodal Large Language Models over Multiple Images

no code implementations 21 Jul 2024 Haowei Liu, Xi Zhang, Haiyang Xu, Yaya Shi, Chaoya Jiang, Ming Yan, Ji Zhang, Fei Huang, Chunfeng Yuan, Bing Li, Weiming Hu

However, most existing MLLMs and benchmarks primarily focus on single-image input scenarios, leaving the performance of MLLMs when handling realistic multiple images underexplored.

In-Context Learning Multiple-choice

GPT4Rec: Graph Prompt Tuning for Streaming Recommendation

no code implementations 12 Jun 2024 Peiyan Zhang, Yuchen Yan, Xi Zhang, Liying Kang, Chaozhuo Li, Feiran Huang, Senzhang Wang, Sunghun Kim

Secondly, structure-level prompts guide the model in adapting to broader patterns of connectivity and relationships within the graph.

Graph Learning Recommendation Systems

Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration

2 code implementations 3 Jun 2024 Junyang Wang, Haiyang Xu, Haitao Jia, Xi Zhang, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, Jitao Sang

However, the two major navigation challenges in mobile device operation tasks, task progress navigation and focus content navigation, are significantly complicated under the single-agent architecture of existing work.

Uncertainty Quantification on Graph Learning: A Survey

no code implementations 23 Apr 2024 Chao Chen, Chenghua Guo, Rui Xu, Xiangwen Liao, Xi Zhang, Sihong Xie, Hui Xiong, Philip Yu

Graphical models, including Graph Neural Networks (GNNs) and Probabilistic Graphical Models (PGMs), have demonstrated their exceptional capabilities across numerous fields.

Decision Making Graph Learning +2

Explainable Survival Analysis with Uncertainty using Convolution-Involved Vision Transformer

no code implementations journal 2024 Zhihao Tang, Li Liu, Yifan Shen, Zongyi Chen, Guixiang Ma, Jiyan Dong, Xujie Sun, Xi Zhang, Chaozhuo Li, Qingfeng Zheng, Lin Yang

Highlights: Without patching WSIs, a novel ViT-based model is proposed for survival predictions. We first introduce aleatoric uncertainty into the survival loss function. We explain survival prediction using a post-hoc explainable method. Our method outperforms baselines in accuracy, explainability, and reliability.

Survival Analysis Survival Prediction

Design as Desired: Utilizing Visual Question Answering for Multimodal Pre-training

2 code implementations 30 Mar 2024 Tongkun Su, Jun Li, Xi Zhang, Haibo Jin, Hao Chen, Qiong Wang, Faqin Lv, Baoliang Zhao, Yin Hu

We leverage descriptions in medical reports to design multi-granular question-answer pairs associated with different diseases, which assist the framework in pre-training without requiring extra annotations from experts.

Contrastive Learning Question Answering +1

FineFake: A Knowledge-Enriched Dataset for Fine-Grained Multi-Domain Fake News Detection

1 code implementation 30 Mar 2024 Ziyi Zhou, XiaoMing Zhang, Litian Zhang, Jiacheng Liu, Senzhang Wang, Zheng Liu, Xi Zhang, Chaozhuo Li, Philip S. Yu

Existing benchmarks for fake news detection have significantly contributed to the advancement of models in assessing the authenticity of news content.

Domain Adaptation Fake News Detection

A Differential Geometric View and Explainability of GNN on Evolving Graphs

no code implementations 11 Mar 2024 Yazheng Liu, Xi Zhang, Sihong Xie

Graphs are ubiquitous in social networks and biochemistry, where Graph Neural Networks (GNNs) are the state-of-the-art models for prediction.

Graph Classification Link Prediction +1

FLLIC: Functionally Lossless Image Compression

no code implementations 24 Jan 2024 Xi Zhang, Xiaolin Wu

Extensive experiments show that FLLIC achieves state-of-the-art performance in joint denoising and compression of noisy images and does so at a lower computational cost.

Denoising Image Compression

A Survey of Text Watermarking in the Era of Large Language Models

no code implementations 13 Dec 2023 Aiwei Liu, Leyi Pan, Yijian Lu, Jingjing Li, Xuming Hu, Xi Zhang, Lijie Wen, Irwin King, Hui Xiong, Philip S. Yu

This paper conducts a comprehensive survey of the current state of text watermarking technology, covering four main aspects: (1) an overview and comparison of different text watermarking techniques; (2) evaluation methods for text watermarking algorithms, including their detectability, impact on text or LLM quality, and robustness under targeted or untargeted attacks; (3) potential application scenarios for text watermarking technology; and (4) current challenges and future directions for text watermarking.

Dialogue Generation Survey

Visual Commonsense based Heterogeneous Graph Contrastive Learning

no code implementations 11 Nov 2023 Zongzhao Li, Xiangyu Zhu, Xi Zhang, Zhaoxiang Zhang, Zhen Lei

Specifically, our model contains two key components: the Commonsense-based Contrastive Learning and the Graph Relation Network.

Contrastive Learning Question Answering +4

TransGNN: Harnessing the Collaborative Power of Transformers and Graph Neural Networks for Recommender Systems

1 code implementation 28 Aug 2023 Peiyan Zhang, Yuchen Yan, Xi Zhang, Chaozhuo Li, Senzhang Wang, Feiran Huang, Sunghun Kim

Graph Neural Networks (GNNs) have emerged as promising solutions for collaborative filtering (CF) through the modeling of user-item interaction graphs.

Collaborative Filtering Graph Classification +1

Robust Ranking Explanations

no code implementations 8 Jul 2023 Chao Chen, Chenghua Guo, Guixiang Ma, Ming Zeng, Xi Zhang, Sihong Xie

Robust explanations of machine learning models are critical to establish human trust in the models.

Inconsistent Matters: A Knowledge-guided Dual-consistency Network for Multi-modal Rumor Detection

1 code implementation 3 Jun 2023 Mengzhu Sun, Xi Zhang, Jianqiang Ma, Sihong Xie, Yazheng Liu, Philip S. Yu

Rumor spreaders are increasingly utilizing multimedia content to attract the attention and trust of news consumers.

Representation Learning

LVQAC: Lattice Vector Quantization Coupled with Spatially Adaptive Companding for Efficient Learned Image Compression

no code implementations CVPR 2023 Xi Zhang, Xiaolin Wu

Recently, numerous end-to-end optimized image compression neural networks have been developed and proved themselves as leaders in rate-distortion performance.

Image Compression Quantization

HR-NeuS: Recovering High-Frequency Surface Geometry via Neural Implicit Surfaces

no code implementations 14 Feb 2023 Erich Liang, Kenan Deng, Xi Zhang, Chun-Kai Wang

Recent advances in neural implicit surfaces for multi-view 3D reconstruction primarily focus on improving large-scale surface reconstruction accuracy, but often produce over-smoothed geometries that lack fine surface details.

3D Reconstruction Multi-View 3D Reconstruction +2

Dual-layer Image Compression via Adaptive Downsampling and Spatially Varying Upconversion

no code implementations 13 Feb 2023 Xi Zhang, Xiaolin Wu

In the ADDL compression system, an image is reduced in resolution by learned content-adaptive downsampling kernels and compressed to form a coded base layer.

Decoder Image Compression

Word-Graph2vec: An efficient word embedding approach on word co-occurrence graph using random walk technique

no code implementations 11 Jan 2023 Wenting Li, Jiahong Xue, Xi Zhang, Huacan Chen, Zeyu Chen, Feijuan Huang, Yuanzhe Cai

Word embedding has become ubiquitous and is widely used in various natural language processing (NLP) tasks, such as web retrieval, web semantic analysis, and machine translation.

Information Retrieval Machine Translation +1

VQACL: A Novel Visual Question Answering Continual Learning Setting

1 code implementation CVPR 2023 Xi Zhang, Feifei Zhang, Changsheng Xu

Research on continual learning has recently led to a variety of work in the unimodal community; however, little attention has been paid to multimodal tasks like visual question answering (VQA).

Continual Learning Question Answering +2

Provable Robust Saliency-based Explanations

no code implementations 28 Dec 2022 Chao Chen, Chenghua Guo, Rufeng Chen, Guixiang Ma, Ming Zeng, Xiangwen Liao, Xi Zhang, Sihong Xie

To foster trust in machine learning models, explanations must be faithful and stable for consistent insights.

Unsupervised Scene Sketch to Photo Synthesis

1 code implementation 6 Sep 2022 Jiayun Wang, Sangryul Jeon, Stella X. Yu, Xi Zhang, Himanshu Arora, Yu Lou

Taking this advantage, we synthesize a photo-realistic image by combining the structure of a sketch and the visual style of a reference photo.

MFAN: Multi-modal Feature-enhanced Attention Networks for Rumor Detection

1 code implementation 2022 Jiaqi Zheng, Xi Zhang, Sanchuan Guo, Quan Wang, Wenyu Zang, Yongdong Zhang

Rumor spreaders are increasingly taking advantage of multimedia content to attract and mislead news consumers on social media.

Heterogeneous Information Network based Default Analysis on Banking Micro and Small Enterprise Users

no code implementations 24 Apr 2022 Zheng Zhang, Yingsheng Ji, Jiachen Shen, Xi Zhang, Guangwen Yang

Risk assessment is a substantial problem for financial institutions that has been extensively studied both for its methodological richness and its various practical applications.

Feature Engineering Implicit Relations

Structured Graph Variational Autoencoders for Indoor Furniture layout Generation

no code implementations 11 Apr 2022 Aditya Chattopadhyay, Xi Zhang, David Paul Wipf, Himanshu Arora, Rene Vidal

The architecture consists of a graph encoder that maps the input graph to a structured latent space, and a graph decoder that generates a furniture graph, given a latent code and the room graph.

Decoder Layout Generation

Values of Coordinated Residential Space Heating in Demand Response Provision

no code implementations 23 Mar 2022 Zihang Dong, Xi Zhang, Goran Strbac

Demand-side response from space heating in residential buildings can potentially provide a huge amount of flexibility for the power system, particularly with deep electrification of the heat sector.

Deep Decoding of $\ell_\infty$-coded Light Field Images

no code implementations 24 Jan 2022 Muhammad Umair Mukati, Xi Zhang, Xiaolin Wu, Søren Forchhammer

To enrich the functionalities of traditional cameras, light field cameras record both the intensity and direction of light rays, so that images can be rendered with user-defined camera parameters via computations.

Decoder Image Compression

Interpretable and Effective Reinforcement Learning for Attacking against Graph-based Rumor Detection

no code implementations 15 Jan 2022 Yuefei Lyu, Xiaoyu Yang, Jiaxin Liu, Philip S. Yu, Sihong Xie, Xi Zhang

To discover subtle vulnerabilities, we design a powerful attacking algorithm to camouflage rumors in social networks based on reinforcement learning that can interact with and attack any black-box detectors.

reinforcement-learning Reinforcement Learning (RL)

Multi-objective Explanations of GNN Predictions

no code implementations 29 Nov 2021 Yifei Liu, Chao Chen, Yazheng Liu, Xi Zhang, Sihong Xie

We design a user study to investigate such joint effects and use the findings to design a multi-objective optimization (MOO) algorithm to find Pareto optimal explanations that are well-balanced in simulatability and counterfactual.

counterfactual Decision Making +2

Explaining GNN over Evolving Graphs using Information Flow

no code implementations 19 Nov 2021 Yazheng Liu, Xi Zhang, Sihong Xie

We define the problem of explaining evolving GNN predictions and propose an axiomatic attribution method to uniquely decompose the change in a prediction to paths on computation graphs.

Knowledge Graphs

RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View

no code implementations 1 Oct 2021 Xi Zhang, Chun-Kai Wang, Kenan Deng, Tomas Yago-Vicente, Himanshu Arora

In addition to using learnt robust features, our approach learns an additional ranking function to estimate the final layout instead of using optimization.

Learning-To-Rank

Self-learn to Explain Siamese Networks Robustly

no code implementations 15 Sep 2021 Chao Chen, Yifan Shen, Guixiang Ma, Xiangnan Kong, Srinivas Rangarajan, Xi Zhang, Sihong Xie

Learning to compare two objects is essential in applications such as digital forensics, face recognition, and brain network analysis, especially when labeled data is scarce and imbalanced.

Face Recognition Fairness +1

Multi-modality Deep Restoration of Extremely Compressed Face Videos

no code implementations 5 Jul 2021 Xi Zhang, Xiaolin Wu

Arguably the most common and salient object in daily video communications is the talking head, as encountered in social media, virtual classrooms, teleconferences, news broadcasting, talk shows, etc.

Quantization

THP: Topological Hawkes Processes for Learning Causal Structure on Event Sequences

2 code implementations 23 May 2021 Ruichu Cai, Siyu Wu, Jie Qiao, Zhifeng Hao, Keli Zhang, Xi Zhang

We further propose a causal structure learning method on THP in a likelihood framework.

An Influence-based Approach for Root Cause Alarm Discovery in Telecom Networks

1 code implementation 7 May 2021 Keli Zhang, Marcus Kalander, Min Zhou, Xi Zhang, Junjian Ye

Alarm root cause analysis is a significant component in the day-to-day telecommunication network maintenance, and it is critical for efficient and accurate fault localization and failure recovery.

Causal Inference Fault localization +2

Probing quasi-long-range ordering by magnetostriction in monolayer CoPS3

no code implementations 4 Jan 2021 Qiye Liu, Le Wang, Ying Fu, Xi Zhang, Lianglong Huang, Huimin Su, Junhao Lin, Xiaobin Chen, Dapeng Yu, Xiaodong Cui, Jia-Wei Mei, Jun-Feng Dai

The Mermin-Wagner-Coleman theorem predicts no long-range magnetic order at finite temperature in two-dimensional (2D) isotropic systems, but a quasi-long-range order with a divergent correlation length at the Kosterlitz-Thouless (KT) transition for planar magnets.

Mesoscale and Nanoscale Physics

Haze Formation on Triton

no code implementations 22 Dec 2020 Kazumasa Ohno, Xi Zhang, Ryo Tazaki, Satoshi Okuzumi

We simulated the formation of sphere and aggregate hazes with and without condensation of the C$_2$H$_4$ ice.

Earth and Planetary Astrophysics

On Numerosity of Deep Neural Networks

no code implementations NeurIPS 2020 Xi Zhang, Xiaolin Wu

With the above critique, we ask: what if a deep convolutional neural network is carefully trained for numerosity?

Object Recognition

Deep Multi-modality Soft-decoding of Very Low Bit-rate Face Videos

no code implementations 2 Aug 2020 Yanhui Guo, Xi Zhang, Xiaolin Wu

We propose a novel deep multi-modality neural network for restoring very low bit rate videos of talking heads.

Quantization Video Compression +1

Local Causal Structure Learning and its Discovery Between Type 2 Diabetes and Bone Mineral Density

no code implementations 27 Jun 2020 Wei Wang, Gangqiang Hu, Bo Yuan, Shandong Ye, Chao Chen, YaYun Cui, Xi Zhang, Liting Qian

To illustrate the importance of prior knowledge, the result of the algorithm without prior knowledge is also investigated.

DAVD-Net: Deep Audio-Aided Video Decompression of Talking Heads

no code implementations CVPR 2020 Xi Zhang, Xiaolin Wu, Xinliang Zhai, Xianye Ben, Chengjie Tu

Close-up talking heads are among the most common and salient objects in video content, such as face-to-face conversations in social media, teleconferences, news broadcasting, talk shows, etc.

Video Compression Video Reconstruction

Rigorous Explanation of Inference on Probabilistic Graphical Models

no code implementations 21 Apr 2020 Yifei Liu, Chao Chen, Xi Zhang, Sihong Xie

There is no existing method to rigorously attribute the inference outcomes to the contributing factors of the graphical models.

Attribute Decision Making

Ultra High Fidelity Image Compression with $\ell_\infty$-constrained Encoding and Deep Decoding

no code implementations 10 Feb 2020 Xi Zhang, Xiaolin Wu

We make major progress in $\ell_\infty$-constrained image coding after two decades, by developing a novel CNN-based soft $\ell_\infty$-constrained decoding method.

Decoder Image Compression

MDLdroid: a ChainSGD-reduce Approach to Mobile Deep Learning for Personal Mobile Sensing

no code implementations 7 Feb 2020 Yu Zhang, Tao Gu, Xi Zhang

Towards pushing deep learning on devices, we present MDLdroid, a novel decentralized mobile deep learning framework to enable resource-aware on-device collaborative learning for personal mobile sensing applications.

Deep Learning Federated Learning +2

Scalable Explanation of Inferences on Large Graphs

no code implementations 13 Aug 2019 Chao Chen, Yifei Liu, Xi Zhang, Sihong Xie

Probabilistic inferences distill knowledge from graphs to help humans make important decisions.

Challenge of Spatial Cognition for Deep Learning

no code implementations 30 Jul 2019 Xi Zhang, Xiaolin Wu, Jun Du

Given the success of the deep convolutional neural networks (DCNNs) in applications of visual recognition and classification, it would be tantalizing to test if DCNNs can also learn spatial concepts, such as straightness, convexity, left/right, front/back, relative size, aspect ratio, polygons, etc., from varied visual examples of these concepts that are simple and yet vital for spatial reasoning.

Deep Learning Spatial Reasoning

Nonlinear Prediction of Multidimensional Signals via Deep Regression with Applications to Image Coding

no code implementations 30 Oct 2018 Xi Zhang, Xiaolin Wu

Deep convolutional neural networks (DCNN) have enjoyed great successes in many signal processing applications because they can learn complex, non-linear causal relationships from input to output.

Prediction regression

Attention-aware Deep Adversarial Hashing for Cross-Modal Retrieval

no code implementations ECCV 2018 Xi Zhang, Hanjiang Lai, Jiashi Feng

The proposed new deep adversarial network consists of three building blocks: 1) the feature learning module to obtain the feature representations; 2) the attention module to generate an attention mask, which is used to divide the feature representations into the attended and unattended feature representations; and 3) the hashing module to learn hash functions that preserve the similarities between different modalities.

Cross-Modal Retrieval Retrieval

Multi-region segmentation of bladder cancer structures in MRI with progressive dilated convolutional networks

no code implementations 28 May 2018 Jose Dolz, Xiaopan Xu, Jerome Rony, Jing Yuan, Yang Liu, Eric Granger, Christian Desrosiers, Xi Zhang, Ismail Ben Ayed, Hongbing Lu

Precise segmentation of bladder walls and tumor regions is an essential step towards non-invasive identification of tumor stage and grade, which is critical for treatment decision and prognosis of patients with bladder cancer (BC).

Prognosis Segmentation

A Tensor-Based Sub-Mode Coordinate Algorithm for Stock Prediction

no code implementations 21 May 2018 Jieyun Huang, Yunjia Zhang, Jialai Zhang, Xi Zhang

The results demonstrate the improvement in prediction accuracy and the effectiveness of the proposed model.

Prediction Stock Prediction +1

Layered Optical Flow Estimation Using a Deep Neural Network with a Soft Mask

no code implementations 9 May 2018 Xi Zhang, Di Ma, Xu Ouyang, Shanshan Jiang, Lin Gan, Gady Agam

We show that by using masks the motion estimate results in a quadratic function of input features in the output layer.

Motion Estimation Optical Flow Estimation

Cognitive Deficit of Deep Learning in Numerosity

no code implementations 9 Feb 2018 Xiaolin Wu, Xi Zhang, Xiao Shu

Subitizing, or the sense of small natural numbers, is an innate cognitive function of humans and primates; it responds to visual stimuli prior to the development of any symbolic skills, language or arithmetic.

Deep Learning

Lecture video indexing using boosted margin maximizing neural networks

no code implementations 2 Dec 2017 Di Ma, Xi Zhang, Xu Ouyang, Gady Agam

This paper presents a novel approach for lecture video indexing using a boosted deep convolutional neural network system.

HashGAN: Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval

no code implementations 26 Nov 2017 Xi Zhang, Siyu Zhou, Jiashi Feng, Hanjiang Lai, Bo Li, Yan Pan, Jian Yin, Shuicheng Yan

The proposed new adversarial network, HashGAN, consists of three building blocks: 1) the feature learning module to obtain feature representations, 2) the generative attention module to generate an attention mask, which is used to obtain the attended (foreground) and the unattended (background) feature representations, 3) the discriminative hash coding module to learn hash functions that preserve the similarities between different modalities.

Cross-Modal Retrieval Retrieval

Patient Subtyping via Time-Aware LSTM Networks

1 code implementation KDD '17 Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2017 Inci M. Baytas, Cao Xiao, Xi Zhang, Fei Wang, Anil K. Jain, Jiayu Zhou

We propose a patient subtyping model that leverages the proposed T-LSTM in an auto-encoder to learn a powerful single representation for sequential records of patients, which is then used to cluster patients into clinical subtypes.

Multivariate Time Series Forecasting

Responses to Critiques on Machine Learning of Criminality Perceptions (Addendum of arXiv:1611.04135)

no code implementations 13 Nov 2016 Xiaolin Wu, Xi Zhang

In November 2016 we submitted to arXiv our paper "Automated Inference on Criminality Using Face Images".

BIG-bench Machine Learning

CGMOS: Certainty Guided Minority OverSampling

1 code implementation 21 Jul 2016 Xi Zhang, Di Ma, Lin Gan, Shanshan Jiang, Gady Agam

In this paper we propose a novel extension to the SMOTE algorithm with a theoretical guarantee for improved classification performance.

Classification General Classification

Modular Decomposition and Analysis of Registration based Trackers

no code implementations 3 Mar 2016 Abhineet Singh, Ankush Roy, Xi Zhang, Martin Jagersand

We show how existing trackers can be broken down using the suggested methodology and compare the performance of the default configuration chosen by the authors against other possible combinations to demonstrate the new insights that can be gained by such an approach.

Learning from Synthetic Data Using a Stacked Multichannel Autoencoder

no code implementations 17 Sep 2015 Xi Zhang, Yanwei Fu, Shanshan Jiang, Leonid Sigal, Gady Agam

In this paper, we investigate and formalize a general framework, the Stacked Multichannel Autoencoder (SMCAE), that enables bridging the synthetic gap and learning from synthetic data more efficiently.

Sketch Recognition

Learning Classifiers from Synthetic Data Using a Multichannel Autoencoder

no code implementations 11 Mar 2015 Xi Zhang, Yanwei Fu, Andi Zang, Leonid Sigal, Gady Agam

Experimental results on two datasets validate the efficiency of our MCAE model and our methodology of generating synthetic data.

General Classification
