no code implementations • 19 Apr 2025 • Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Michael Felsberg, DaCheng Tao, Xuelong Li
Additionally, diffusion models typically rely on global learned distributions rather than localized features, leading to inconsistencies between the generated and existing image parts.
no code implementations • 17 Apr 2025 • Yuan Zhou, Xinli Shi, Xuelong Li, Jiachen Zhong, Guanghui Wen, Jinde Cao
Employing DFL methods to solve such general optimization problems leads to the formulation of Decentralized Nonconvex Composite Federated Learning (DNCFL), a topic that remains largely underexplored.
no code implementations • 17 Apr 2025 • Qianqian Sun, Jixiang Luo, Dell Zhang, Xuelong Li
The key innovations of SmartFreeEdit include: (1) the introduction of region aware tokens and a mask embedding paradigm that enhance the spatial understanding of complex scenes; (2) a reasoning segmentation pipeline designed to optimize the generation of editing masks based on natural language instructions; and (3) a hypergraph-augmented inpainting module that ensures the preservation of both structural integrity and semantic coherence during complex edits, overcoming the limitations of local-based image generation.
no code implementations • 16 Apr 2025 • Tao Wen, Jiepeng Wang, Yabo Chen, Shugong Xu, Chi Zhang, Xuelong Li
Our design enables a unified and adaptive depth representation across diverse environments.
no code implementations • 15 Apr 2025 • Dianbing Xi, Jiepeng Wang, Yuanzhi Liang, Xi Qiu, Yuchi Huo, Rui Wang, Chi Zhang, Xuelong Li
In this paper, we propose a novel framework for controllable video diffusion, OmniVDiff, aiming to synthesize and comprehend multiple types of video visual content in a single diffusion model.
no code implementations • 10 Apr 2025 • Junli Liu, Qizhi Chen, Zhigang Wang, Yiwen Tang, Yiting Zhang, Chi Yan, Dong Wang, Xuelong Li, Bin Zhao
Furthermore, we propose an innovative model especially for the AerialVG task, where a Hierarchical Cross-Attention is devised to focus on target regions, and a Relation-Aware Grounding module is designed to infer positional relations.
no code implementations • 7 Apr 2025 • Xueqing Li, Zehan Li, Boyu Zhu, Ruihao Jing, Jian Kang, Jie Li, Xiao-Lei Zhang, Xuelong Li
Its quantization error is lower-bounded by the product of rho and epsilon-kms, where epsilon-kms denotes the quantization error of a single K-means quantizer.
no code implementations • 1 Apr 2025 • Yuanqi Yao, Siao Liu, Haoming Song, Delin Qu, Qizhi Chen, Yan Ding, Bin Zhao, Zhigang Wang, Xuelong Li, Dong Wang
Building a lifelong robot that can effectively leverage prior knowledge for continuous skill acquisition remains significantly challenging.
1 code implementation • 26 Mar 2025 • Jinghui Yuan, Fangyuan Xie, Feiping Nie, Xuelong Li
The indicator matrix plays an important role in machine learning, but optimizing it is an NP-hard problem.
no code implementations • 19 Mar 2025 • Zhong Ji, Ci Liu, Jingren Liu, Chen Tang, Yanwei Pang, Xuelong Li
Central to this approach is the Optimal Transport Adapter (OTA), which employs a cross-modal attention mechanism to enrich textual representations and facilitate better subsequent information interaction.
no code implementations • 17 Mar 2025 • Cheng Yuan, Zhening Liu, Jiashu Lv, Jiawei Shao, Yufei Jiang, Jun Zhang, Xuelong Li
To address this challenge, we propose a task-oriented feature compression (TOFC) method for multimodal understanding in a device-edge co-inference framework, where visual features are merged by clustering and encoded by a learnable and selective entropy model before feature projection.
no code implementations • 14 Mar 2025 • Pingrui Zhang, Xianqiang Gao, Yuhan Wu, Kehui Liu, Dong Wang, Zhigang Wang, Bin Zhao, Yan Ding, Xuelong Li
Our approach enables models to learn affordance-based final positioning that accommodates different arm types and platform heights, thereby paving the way for more robust and generalizable integration of navigation and manipulation in embodied AI.
no code implementations • 12 Mar 2025 • Xiaozhen Qiao, Peng Huang, Jiakang Yuan, Xianda Guo, Bowen Ye, Zhe Sun, Xuelong Li
BPRE first employs a Multi-Dimensional Quality-Aware Reward Module to evaluate feature quality and guide prototype refinement precisely.
no code implementations • 10 Mar 2025 • Zhihao Huang, Xi Qiu, Yukuo Ma, Yifu Zhou, Chi Zhang, Xuelong Li
Autoregressive models have achieved promising results in natural language processing.
1 code implementation • 10 Mar 2025 • Yifan Chen, Hongjun An, Zhe Sun, Tong Tian, Mingliang Chen, Christian Spielmann, Xuelong Li
Ghost imaging (GI) achieves 2D image reconstruction through high-order correlation of 1D bucket signals and 2D light field information, particularly demonstrating enhanced detection sensitivity and high-quality image reconstruction via efficient photon collection in scattering media.
no code implementations • 8 Mar 2025 • Muzhi Dai, Jiashuo Sun, Zhiyuan Zhao, Shixuan Liu, Rui Li, Junyu Gao, Xuelong Li
Aligning large vision-language models (LVLMs) with human preferences is challenging due to the scarcity of fine-grained, high-quality, and multimodal preference data without human annotations.
no code implementations • 26 Feb 2025 • Lei Zhao, Sizhou Chen, Linfeng Feng, Xiao-Lei Zhang, Xuelong Li
Particularly, to improve the synthesis quality and azimuth accuracy of the spatial sound events simultaneously, we propose to use two kinds of acoustic features.
no code implementations • 25 Feb 2025 • Yunpeng Gao, Chenhui Li, Zhongrui You, Junli Liu, Zhen Li, Pengan Chen, Qizhi Chen, Zhonghan Tang, Liansheng Wang, Penghui Yang, Yiwen Tang, YuHang Tang, Shuai Liang, Songyi Zhu, Ziqin Xiong, Yifei Su, Xinyi Ye, Jianan Li, Yan Ding, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li
Particularly, 3D GS supports real-to-sim rendering, further enhancing the realism of the dataset.
no code implementations • 24 Feb 2025 • Weiji Xie, Chenjia Bai, Jiyuan Shi, Junkai Yang, Yunfei Ge, Weinan Zhang, Xuelong Li
Humans possess delicate dynamic balance mechanisms that enable them to maintain stability across diverse terrains and under extreme conditions.
no code implementations • 17 Feb 2025 • Jiachen Yu, Shaoning Sun, Xiaohui Hu, Jiaxu Yan, Kaidong Yu, Xuelong Li
Furthermore, our training method enhances the general capabilities of the model by constructing complicated judge tasks, and in our tests of optimizing a policy model with the Judge Model, the judge signals provided by our model significantly enhanced the downstream DPO training performance of our internal models.
no code implementations • 17 Feb 2025 • Runqi Wang, Caoyuan Ma, Jian Zhao, Hanrui Xu, Dongfang Sun, Haoyang Chen, Lin Xiong, Zheng Wang, Xuelong Li
To generate interactive motion following specified trajectories, this paper decouples complex motion into a Leader-Follower dynamic, inspired by role allocation in partner dancing.
no code implementations • 16 Feb 2025 • Linfeng Feng, Lei Zhao, Boyu Zhu, Xiao-Lei Zhang, Xuelong Li
Additionally, we propose a binaural source localization model to assess the quality of the generated audio.
1 code implementation • 13 Feb 2025 • Yiwen Tang, Zoey Guo, Zhuhao Wang, Ray Zhang, Qizhi Chen, Junli Liu, Delin Qu, Zhigang Wang, Dong Wang, Xuelong Li, Bin Zhao
In this paper, we present the first comprehensive investigation into the potential of encoder-free architectures to overcome the challenges of encoder-based 3D Large Multimodal Models (LMMs).
no code implementations • 6 Feb 2025 • Lei Zhao, Linfeng Feng, Dongxu Ge, Rujin Chen, Fangqiu Yi, Chi Zhang, Xiao-Lei Zhang, Xuelong Li
With the rise of diffusion models, audio-video generation has been revolutionized.
no code implementations • 30 Jan 2025 • Fangyuan Xie, Jinghui Yuan, Feiping Nie, Xuelong Li
Min cut is an important graph partitioning method.
no code implementations • 27 Jan 2025 • Delin Qu, Haoming Song, Qizhi Chen, Yuanqi Yao, Xinyi Ye, Yan Ding, Zhigang Wang, Jiayuan Gu, Bin Zhao, Dong Wang, Xuelong Li
Specifically, we introduce Ego3D Position Encoding to inject 3D information into the input observations of the visual-language-action model, and propose Adaptive Action Grids to represent spatial robot movement actions with adaptive discretized action grids, facilitating learning generalizable and transferrable spatial action knowledge for cross-robot control.
Ranked #2 on Robot Manipulation on SimplerEnv-Widow X (using extra training data)
1 code implementation • 24 Jan 2025 • Han Qi, Fei Guo, Li Zhu, Qiaosheng Zhang, Xuelong Li
In this paper, we study the stochastic multi-armed bandit problem with graph feedback.
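For readers unfamiliar with the setting, the following is a minimal sketch of a bandit with graph feedback, where pulling one arm also reveals the rewards of its neighbors in a feedback graph. It is a generic UCB-style illustration, not the algorithm from the listed paper. The function name `ucb_with_graph_feedback`, the Bernoulli reward model, and the example graph are all assumptions made for illustration.

```python
import math
import random


def ucb_with_graph_feedback(means, neighbors, horizon, seed=0):
    """UCB-style play where pulling arm i also reveals rewards of neighbors[i].

    means     -- hypothetical Bernoulli mean of each arm (hidden from the learner)
    neighbors -- neighbors[i] is the set of arms observed when arm i is pulled
    Returns (pulls, obs): pull counts and observation counts per arm.
    """
    rng = random.Random(seed)
    k = len(means)
    obs = [0] * k        # how many times each arm's reward has been observed
    total = [0.0] * k    # sum of observed rewards per arm
    pulls = [0] * k      # how many times each arm was actually pulled

    for t in range(1, horizon + 1):
        unobserved = [i for i in range(k) if obs[i] == 0]
        if unobserved:
            # Make sure every arm has at least one observation first.
            arm = unobserved[0]
        else:
            # Confidence bonus shrinks with observations, not only with pulls.
            arm = max(
                range(k),
                key=lambda i: total[i] / obs[i]
                + math.sqrt(2.0 * math.log(t) / obs[i]),
            )
        pulls[arm] += 1
        # Graph feedback: the pulled arm and all its neighbors are observed.
        for j in set(neighbors[arm]) | {arm}:
            reward = 1.0 if rng.random() < means[j] else 0.0
            obs[j] += 1
            total[j] += reward
    return pulls, obs


if __name__ == "__main__":
    # Path graph 0 - 1 - 2; arm 2 has the highest mean.
    pulls, obs = ucb_with_graph_feedback(
        [0.2, 0.5, 0.9], [{0, 1}, {0, 1, 2}, {1, 2}], horizon=2000
    )
    print("pulls:", pulls)
    print("observations:", obs)
```

Because side observations are free, each arm is observed at least as often as it is pulled, which is why the confidence bonus in this sketch is based on observation counts.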
no code implementations • 24 Jan 2025 • Hao Ma, Rujin Chen, Ruihao Jing, Xiao-Lei Zhang, Ju Liu, Xuelong Li
Nevertheless, these methods often overlook speech intelligibility, leading to alterations or loss of semantic content in the re-synthesized speech.
1 code implementation • 22 Jan 2025 • Chenjia Bai, Yang Zhang, Shuang Qiu, Qiaosheng Zhang, Kang Xu, Xuelong Li
Then, we reformulate our objective to direct preference optimization with an exploration term, where the UCB-term can be converted to a count-based exploration bonus.
1 code implementation • 1 Jan 2025 • Bingyu Li, Da Zhang, Zhiyuan Zhao, Junyu Gao, Xuelong Li
The core of FGAseg is a Pixel-Level Alignment module that employs a cross-modal attention mechanism and a text-pixel alignment loss to refine the coarse-grained alignment from CLIP, achieving finer-grained pixel-text semantic alignment.
Open-Vocabulary Semantic Segmentation
no code implementations • 28 Dec 2024 • Feiping Nie, Shenfei Pei, Zengwei Zheng, Rong Wang, Xuelong Li
To reduce the computational complexity of GGC, only mergers between clusters and their neighbors are considered.
no code implementations • 21 Dec 2024 • Chi Zhang, Yuanzhi Liang, Xi Qiu, Fangqiu Yi, Xuelong Li
Generating high-quality videos from textual descriptions poses challenges in maintaining temporal coherence and control over subject motion.
1 code implementation • 14 Dec 2024 • Sida Huang, Hongyuan Zhang, Xuelong Li
It therefore implies a new scheme to learn beneficial noise distribution that can be employed to fine-tune VL models.
1 code implementation • 11 Dec 2024 • Yanchen Xu, Siqi Huang, Hongyuan Zhang, Xuelong Li
Inspired by the theoretical conclusions and the idea of positive-incentive noise, we propose a novel GCL algorithm, Error-PAssing-based Graph Contrastive Learning (EPAGCL), which uses both edge adding and edge dropping as its augmentations.
1 code implementation • 27 Nov 2024 • Tianxing Chen, Yao Mu, Zhixuan Liang, Zanxin Chen, Shijia Peng, Qiangyu Chen, Mingkun Xu, Ruizhen Hu, Hongyuan Zhang, Xuelong Li, Ping Luo
Our results demonstrate the effectiveness of G3Flow in enhancing real-time dynamic semantic feature understanding for robotic manipulation policies.
no code implementations • 25 Nov 2024 • Zhigang Wang, Yifei Su, Chenhui Li, Dong Wang, Yan Huang, Bin Zhao, Xuelong Li
Open-vocabulary 3D scene understanding is indispensable for embodied agents.
no code implementations • 25 Nov 2024 • Guangzhao Dai, Jian Zhao, Yuantao Chen, Yusen Qin, Hao Zhao, GuoSen Xie, Yazhou Yao, Xiangbo Shu, Xuelong Li
Vision-and-Language Navigation (VLN), where an agent follows instructions to reach a target destination, has recently seen significant advancements.
no code implementations • 21 Nov 2024 • Guanzhou Lan, YuQi Yang, Zhigang Wang, Dong Wang, Bin Zhao, Xuelong Li
Specifically, our method comprises a degradation disentanglement module and a degradation-aware contrastive learning module.
no code implementations • 19 Nov 2024 • Jiawei Shao, Xuelong Li
This article presents AI Flow, a framework that streamlines the inference process by jointly leveraging the heterogeneous resources available across devices, edge nodes, and cloud servers, making intelligence flow across networks.
no code implementations • 5 Nov 2024 • Yang Zhao, Zidong Nie, Kangsheng Dong, Qinghua Huang, Xuelong Li
This paper proposes a deep reinforcement learning-based model for decision-making in multi-role UAV cooperative pursuit-evasion games, to address the challenge of enabling UAVs to autonomously make decisions in complex game environments.
no code implementations • 4 Nov 2024 • Feiping Nie, Yitao Song, Wei Chang, Rong Wang, Xuelong Li
In graph-based semi-supervised learning, the Green-function method is a classical method that works by computing the Green's function in the graph space.
no code implementations • 4 Nov 2024 • Feiping Nie, Yitao Song, Jingjing Xue, Rong Wang, Xuelong Li
We propose the DPSM method, a density-based node clustering approach that automatically determines the number of clusters and can be applied in both data space and graph space.
no code implementations • 1 Nov 2024 • Hongjun An, Yiliang Song, Xuelong Li
We discovered the underlying physics in Next-token Prediction (NTP).
no code implementations • 29 Oct 2024 • Qizhi Chen, Delin Qu, Yiwen Tang, Haoming Song, Yiting Zhang, Dong Wang, Bin Zhao, Xuelong Li
Reconstructing controllable Gaussian splats from monocular video is a challenging task due to its inherently insufficient constraints.
no code implementations • 16 Oct 2024 • Guanzhou Lan, Qianli Ma, YuQi Yang, Zhigang Wang, Dong Wang, Xuelong Li, Bin Zhao
In this paper, we identify two primary factors contributing to performance degradation: fitting errors and the inference gap.
no code implementations • 11 Oct 2024 • Yunpeng Gao, Zhigang Wang, Linglin Jing, Dong Wang, Xuelong Li, Bin Zhao
Aerial Vision-and-Language Navigation (VLN) is a novel task enabling Unmanned Aerial Vehicles (UAVs) to navigate in outdoor environments through natural language instructions and visual cues.
1 code implementation • 9 Oct 2024 • Yuhan Kang, Qingpeng Li, Leyuan Fang, Jian Zhao, Xuelong Li
In this paper, considering that the surrounding environment information can be well utilized to identify concealed objects, we propose a novel deep Surrounding-Aware Network, namely SurANet, for COD tasks, which introduces surrounding information into feature extraction and the loss function to improve discrimination.
no code implementations • 3 Oct 2024 • Jinjing Shi, Tian Chen, Shichao Zhang, Xuelong Li
A personalized quantum federated learning algorithm for privacy image classification is proposed to enhance the personality of the client model in the case of an imbalanced distribution of images.
no code implementations • 29 Sep 2024 • Yifan Duan, Jian Zhao, Pengcheng, Junyuan Mao, Hao Wu, Jingyu Xu, Shilong Wang, Caoyuan Ma, Kai Wang, Kun Wang, Xuelong Li
To this end, we establish a causal framework for ST predictions, termed CaPaint, which aims to identify causal regions in data and endow the model with causal reasoning ability in a two-stage process.
1 code implementation • 25 Sep 2024 • Guanlin Li, Ke Zhang, Ting Wang, Ming Li, Bin Zhao, Xuelong Li
Despite the impressive advancements made in recent low-light image enhancement techniques, the scarcity of paired data has emerged as a significant obstacle to further advancements.
1 code implementation • 23 Sep 2024 • Kehui Liu, Zixin Tang, Dong Wang, Zhigang Wang, Xuelong Li, Bin Zhao
Specifically, a Proposal-Execution-Feedback-Adjustment (PEFA) mechanism is designed to decompose and assign actions for individual robots, where a centralized task assigner makes a task planning proposal to decompose the complex task into subtasks, and then assigns subtasks to robot executors.
no code implementations • 18 Sep 2024 • Zhaxizhuoma Zhaxizhuoma, Pengan Chen, Ziniu Wu, Jiawei Sun, Dong Wang, Peng Zhou, Nieqing Cao, Yan Ding, Bin Zhao, Xuelong Li
To validate the effectiveness of AlignBot, experiments are conducted in real-world household environments, which are constructed within the laboratory to replicate typical household settings.
1 code implementation • 29 Aug 2024 • Kaijing Ma, Haojian Huang, Jin Chen, Haodong Chen, Pengliang Ji, Xianghao Zang, Han Fang, Chao Ban, Hao Sun, Mulin Chen, Xuelong Li
To the best of our knowledge, this marks the first successful attempt of DER in VTG.
1 code implementation • 20 Aug 2024 • Yongbo Yu, Weizhong Yu, Feiping Nie, Xuelong Li
The self-attention mechanism in Transformer architecture, invariant to sequence order, necessitates positional embeddings to encode temporal order in time series prediction.
Ranked #1 on Time Series Forecasting on Traffic (720)
no code implementations • 19 Aug 2024 • Hongyuan Zhang, Yanchen Xu, Sida Huang, Xuelong Li
Inspired by the theoretical study, a framework that develops a $\pi$-noise generator to learn the beneficial noise (instead of estimation) as data augmentations for contrast is proposed.
no code implementations • 8 Aug 2024 • Yifan Chen, Xiaozhen Qiao, Zhe Sun, Xuelong Li
In this paper, we propose a novel approach, ComKD-CLIP: Comprehensive Knowledge Distillation for Contrastive Language-Image Pre-training Model, which aims to comprehensively distill the knowledge from a large teacher CLIP model into a smaller student model, ensuring comparable performance with significantly reduced parameters.
no code implementations • 6 Aug 2024 • Jingxian Lu, Wenke Xia, Dong Wang, Zhigang Wang, Bin Zhao, Di Hu, Xuelong Li
Within the intervals between semantic key states, optical flow is employed to capture motion key states to understand the mechanisms of "how to do".
no code implementations • 2 Aug 2024 • Ruoxuan Feng, Di Hu, Wenke Ma, Xuelong Li
Humans possess a remarkable talent for flexibly alternating to different senses when interacting with the environment.
no code implementations • 1 Aug 2024 • Wenzhe Tian, Haijin Zeng, Yin-Ping Zhao, Yongyong Chen, Zhen Wang, Xuelong Li
Current CNN-based methods are limited in modeling long-range dependencies, while Transformer-based models face high computational complexity.
1 code implementation • 1 Aug 2024 • Hongjun An, Yifan Chen, Zhe Sun, Xuelong Li
Current large language models (LLMs) primarily utilize the next-token prediction method for inference, which significantly impedes their processing speed.
no code implementations • 26 Jul 2024 • Zhaoqing Chen, Jiawei Sun, Xibin Yang, Xinyi Ye, Bin Zhao, Xuelong Li, Juergen Czarske
The lensless fiber endomicroscope is an emerging tool for in-vivo microscopic imaging, where quantitative phase imaging (QPI) can be utilized as a label-free method to enhance image contrast.
no code implementations • 24 Jul 2024 • Jingren Liu, Zhong Ji, Yunlong Yu, Jiale Cao, Yanwei Pang, Jungong Han, Xuelong Li
This work provides a theoretical foundation for understanding and improving PEFT-CL models, offering insights into the interplay between feature representation, task orthogonality, and generalization, contributing to the development of more efficient continual learning systems.
no code implementations • 11 Jul 2024 • Zhenhe Wu, Zhongqiu Li, Jie Zhang, Mengxiang Li, Yu Zhao, Ruiyu Fang, Zhongjiang He, Xuelong Li, Zhoujun Li, Shuangyong Song
Large language models (LLMs) with in-context learning have significantly improved the performance of the text-to-SQL task.
no code implementations • 3 Jul 2024 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang
Large Language Models (LLMs) represent a significant stride toward Artificial General Intelligence.
no code implementations • 23 Jun 2024 • Delin Qu, Qizhi Chen, Pingrui Zhang, Xianqiang Gao, Bin Zhao, Dong Wang, Xuelong Li
This paper scales object-level reconstruction to complex scenes, advancing interactive scene reconstruction.
1 code implementation • 22 Jun 2024 • Yang Zhang, Chenjia Bai, Bin Zhao, Junchi Yan, Xiu Li, Xuelong Li
We cast the dynamics learning as an auto-regressive sequence modeling problem over discrete tokens by leveraging the expressive Transformer architecture, in order to model complex local dynamics across different agents and provide accurate and consistent long-term imaginations.
no code implementations • 15 Jun 2024 • Ying Fu, Yu Li, ShaoDi You, Boxin Shi, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, Cong Li, Senyan Xu, Yunkang Zhang, Siyuan Jiang, Xiaoqiang Lu, Licheng Jiao, Fang Liu, Xu Liu, Lingling Li, Wenping Ma, Shuyuan Yang, Haiyang Xie, Jian Zhao, Shihua Huang, Peng Cheng, Xi Shen, Zheng Wang, Shuai An, Caizhi Zhu, Xuelong Li, Tao Zhang, Liang Li, Yu Liu, Chenggang Yan, Gengchen Zhang, Linyan Jiang, Bingyi Song, Zhuoyu An, Haibo Lei, Qing Luo, Jie Song, YuAn Liu, Haoyuan Zhang, Lingfeng Wang, Wei Chen, Aling Luo, Cheng Li, Jun Cao, Shu Chen, Zifei Dou, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Xuejian Gou, Qinliang Wang, Yang Liu, Shizhan Zhao, Yanzhao Zhang, Libo Yan, Yuwei Guo, Guoxin Li, Qiong Gao, Chenyue Che, Long Sun, Xiang Chen, Hao Li, Jinshan Pan, Chuanlong Xie, Hongming Chen, Mingrui Li, Tianchen Deng, Jingwei Huang, Yufeng Li, Fei Wan, Bingxin Xu, Jian Cheng, Hongzhe Liu, Cheng Xu, Yuxiang Zou, Weiguo Pan, Songyin Dai, Sen Jia, Junpei Zhang, Puhua Chen, Qihang Li
The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies.
1 code implementation • 1 Jun 2024 • Jia Zeng, Qingwen Bu, Bangjun Wang, Wenke Xia, Li Chen, Hao Dong, Haoming Song, Dong Wang, Di Hu, Ping Luo, Heming Cui, Bin Zhao, Xuelong Li, Yu Qiao, Hongyang Li
To this end, we propose a general pre-training pipeline that learns Manipulation by Predicting the Interaction (MPI) and enhances the visual representation. Given a pair of keyframes representing the initial and final states, along with language instructions, our algorithm predicts the transition frame and detects the interaction object, respectively.
no code implementations • 30 May 2024 • Junjie Zhang, Chenjia Bai, Haoran He, Wenke Xia, Zhigang Wang, Bin Zhao, Xiu Li, Xuelong Li
In this paper, we propose SAM-E, a novel architecture for robot manipulation by leveraging a vision-foundation model for generalizable scene understanding and sequence imitation for long-term action reasoning.
1 code implementation • 24 May 2024 • Bingyu Li, Da Zhang, Zhiyuan Zhao, Junyu Gao, Xuelong Li
To address this issue, we leverage the inherent capabilities of the model itself to discover the optimal equilibrium in multimodal fusion and introduce U3M: An Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation.
no code implementations • 23 May 2024 • Yang Zhang, Shixin Yang, Chenjia Bai, Fei Wu, Xiu Li, Zhen Wang, Xuelong Li
In this paper, we propose a novel framework for multi-agent collaboration that introduces Reinforced Advantage feedback (ReAd) for efficient self-refinement of plans.
no code implementations • 22 May 2024 • Qiang Chen, Weizhong Yu, Feiping Nie, Xuelong Li
Fuzzy clustering algorithms can be roughly categorized into two main groups: Fuzzy C-Means (FCM) based methods and mixture model based methods.
1 code implementation • 10 May 2024 • Xiaoyu Wen, Chenjia Bai, Kang Xu, Xudong Yu, Yang Zhang, Xuelong Li, Zhen Wang
In this paper, we propose a novel representation-based approach to measure the domain gap, where the representation is learned through a contrastive objective by sampling transitions from different domains.
no code implementations • 30 Apr 2024 • Qiaosheng Zhang, Chenjia Bai, Shuyue Hu, Zhen Wang, Xuelong Li
Finally, we extend Reg-MAIDS to multi-player general-sum MGs and prove that it can learn either the Nash equilibrium or coarse correlated equilibrium in a sample efficient manner.
1 code implementation • 30 Apr 2024 • Chenjia Bai, Lingxiao Wang, Jianye Hao, Zhuoran Yang, Bin Zhao, Zhen Wang, Xuelong Li
We further provide theoretical analysis, which shows that the optimality gap of our method is only related to the expected data coverage of the shared dataset, thus resolving the distribution shift issue in data sharing.
no code implementations • 26 Apr 2024 • Xinpeng Li, Teng Wang, Jian Zhao, Shuyi Mao, Jinbao Wang, Feng Zheng, Xiaojiang Peng, Xuelong Li
Emotion recognition aims to discern the emotional state of subjects within an image, relying on subject-centric and contextual visual cues.
no code implementations • 25 Apr 2024 • Haorui Xiang, Zhichang Wu, Guoxu Li, Rong Wang, Feiping Nie, Xuelong Li
Adhering to this concept, we introduce a new model, Capped $\ell_{p}$-Norm Support Vector Ordinal Regression (CSVOR), that is robust to outliers.
no code implementations • 25 Apr 2024 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang
Large language models (LLMs) have showcased profound capabilities in language understanding and generation, facilitating a wide array of applications.
no code implementations • 23 Apr 2024 • Ziheng Jiao, Hongyuan Zhang, Xuelong Li
Notably, due to extracting the intra-sample representation of a single instance and the topological relationship among the datasets simultaneously, the performance of the distilled "boosted" two-layer GNN on Mini-ImageNet is much higher than that of CNNs containing dozens of layers, such as ResNet152.
1 code implementation • 23 Apr 2024 • Fan Zhang, Zhi-Qi Cheng, Jian Zhao, Xiaojiang Peng, Xuelong Li
LEAF introduces a hierarchical expression-aware aggregation strategy that operates at three levels: semantic, instance, and category.
Facial Expression Recognition (FER)
1 code implementation • 22 Apr 2024 • Junyu Gao, Da Zhang, Xuelong Li
Then, based on the theory, we design a DPD algorithm, which is composed of a training paradigm and a proxy domain generator, to enhance the domain generalization of the confidence-threshold learner.
1 code implementation • 15 Apr 2024 • Haojian Huang, Xiaozhen Qiao, Zhuo Chen, Haodong Chen, Bingyu Li, Zhe Sun, Mulin Chen, Xuelong Li
Zero-shot learning (ZSL) enables the recognition of novel classes by leveraging semantic knowledge transfer from known to unknown categories.
no code implementations • 15 Apr 2024 • Bonan Ding, Jin Xie, Jing Nie, Jiale Cao, Xuelong Li, Yanwei Pang
Therefore, an effective solution involves transforming monocular images into LiDAR-like representations and employing a LiDAR-based 3D object detector to predict the 3D coordinates of objects.
1 code implementation • 14 Apr 2024 • Xuelong Li, Hongjun An, Guangying Li, Xing Wang, Guanghua Cheng, Zhe Sun
In this paper, we introduce StreakNet-Arch, a novel signal processing architecture designed for Underwater Carrier LiDAR-Radar (UCLR) imaging systems, to address the limitations in scatter suppression and real-time imaging.
7 code implementations • 11 Apr 2024 • Yiwen Tang, Ray Zhang, Jiaming Liu, Zoey Guo, Dong Wang, Zhigang Wang, Bin Zhao, Shanghang Zhang, Peng Gao, Hongsheng Li, Xuelong Li
The adapter incorporates prior spatial knowledge from the source modality to guide the local feature aggregation of 3D tokens, compelling the semantic adaption of any-modality transformers.
no code implementations • 7 Apr 2024 • Xudong Yu, Chenjia Bai, Haoran He, Changhong Wang, Xuelong Li
Sequential decision-making is desired to align with human intents and exhibit versatility across various tasks.
no code implementations • CVPR 2024 • Linglin Jing, Yiming Ding, Yunpeng Gao, Zhigang Wang, Xu Yan, Dong Wang, Gerald Schaefer, Hui Fang, Bin Zhao, Xuelong Li
In this paper, we propose a novel hybrid pseudo-labeling framework for unsupervised event-based semantic segmentation, HPL-ESS, to alleviate the influence of noisy pseudo labels.
no code implementations • 25 Feb 2024 • Mulin Chen, Bocheng Wang, Xuelong Li
Graph Convolutional Network (GCN) has exhibited remarkable potential in improving graph-based clustering.
1 code implementation • 22 Feb 2024 • Haoran He, Chenjia Bai, Ling Pan, Weinan Zhang, Bin Zhao, Xuelong Li
In the pre-training stage, we employ a discrete diffusion model with a mask-and-replace diffusion strategy to predict future video tokens in the latent space.
1 code implementation • 20 Feb 2024 • Jinjing Shi, Zimeng Xiao, Heyuan Shi, Yu Jiang, Xuelong Li
Subsequently, QuanTest formulates the problem of generating test inputs that maximize the quantum entanglement adequacy and capture incorrect behaviors of the QNN system as a joint optimization problem and solves it in a gradient-based manner to generate quantum adversarial examples.
no code implementations • 5 Feb 2024 • Pengfei Han, Fuhua Zhang, Bin Zhao, Xuelong Li
Subsequently, a cross-scale motion structure is presented to estimate and refine intermediate flow maps by the extracted features.
1 code implementation • 1 Feb 2024 • Bin Zhao, Pengfei Han, Xuelong Li
Satellites are capable of capturing high-resolution videos.
no code implementations • 25 Jan 2024 • Ren-xin Zhao, Jinjing Shi, Xuelong Li
In response to the dilemma of HAM and QML, a Grover-inspired Quantum Hard Attention Mechanism (GQHAM) consisting of a Flexible Oracle (FO) and an Adaptive Diffusion Operator (ADO) is proposed.
1 code implementation • 19 Jan 2024 • Junyu Gao, Liangliang Zhao, Xuelong Li
Considering the absence of a dataset for this task, a large-scale dataset (NWPU-MOC) is collected, consisting of 3,416 scenes with a resolution of 1024 $\times$ 1024 pixels, and well-annotated using 14 fine-grained object categories.
no code implementations • 17 Jan 2024 • Yexin Zhang, Zhongtian Ma, Qiaosheng Zhang, Zhen Wang, Xuelong Li
This paper considers the problem of community detection on multiple potentially correlated graphs from an information-theoretical perspective.
no code implementations • 8 Jan 2024 • Zhongjiang He, Zihan Wang, Xinzhang Liu, Shixuan Liu, Yitong Yao, Yuyao Huang, Xuelong Li, Yongxiang Li, Zhonghao Che, Zhaoxi Zhang, Yan Wang, Xin Wang, Luwen Pu, Huinan Xu, Ruiyu Fang, Yu Zhao, Jie Zhang, Xiaomeng Huang, Zhilong Lu, Jiaxin Peng, Wenjun Zheng, Shiquan Wang, Bingkai Yang, Xuewei he, Zhuoru Jiang, Qiyi Xie, Yanhan Zhang, Zhongqiu Li, Lingling Shi, Weiwei Fu, Yin Zhang, Zilu Huang, Sishi Xiong, Yuxiang Zhang, Chao Wang, Shuangyong Song
Subsequently, the model undergoes fine-tuning to align with human preferences, following a detailed methodology that we describe.
no code implementations • 4 Jan 2024 • Yukang Zhang, Yang Lu, Yan Yan, Hanzi Wang, Xuelong Li
Specifically, we propose a novel Frequency Domain Nuances Mining (FDNM) method to explore the cross-modality frequency domain information, which mainly includes an amplitude guided phase (AGP) module and an amplitude nuances mining (ANM) module.
1 code implementation • 19 Dec 2023 • Feixiang Zhou, Zheheng Jiang, Huiyu Zhou, Xuelong Li
However, learning the representation of each frame by unsupervised contrastive learning for action segmentation remains an open and challenging problem.
no code implementations • 12 Dec 2023 • Jiawei Sun, Bin Zhao, Dong Wang, Zhigang Wang, Jie Zhang, Nektarios Koukourakis, Juergen W. Czarske, Xuelong Li
Quantitative phase imaging (QPI) through multi-core fibers (MCFs) has been an emerging in vivo label-free endoscopic imaging modality with minimal invasiveness.
no code implementations • 12 Dec 2023 • Jingchun Zhou, Zongxin He, Qiuping Jiang, Kui Jiang, Xianping Fu, Xuelong Li
To solve this issue, previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features, limiting the generalization and adaptability of the model.
no code implementations • 26 Nov 2023 • Feiping Nie, Jitao Lu, Danyang Wu, Rong Wang, Xuelong Li
To address the problems, we propose a novel N-Cut solver designed based on the famous coordinate descent method.
1 code implementation • 21 Nov 2023 • Linfeng Feng, Xiao-Lei Zhang, Xuelong Li
To address this, we propose an Unbiased Label Distribution (ULD) to eliminate quantization error in training targets.
no code implementations • CVPR 2024 • Chi Yan, Delin Qu, Dan Xu, Bin Zhao, Zhigang Wang, Dong Wang, Xuelong Li
This strategy is essential for extending the 3D Gaussian representation to reconstruct a whole scene, rather than synthesizing a single static object as existing methods do.
no code implementations • CVPR 2024 • Delin Qu, Chi Yan, Dong Wang, Jie Yin, Dan Xu, Bin Zhao, Xuelong Li
To address these challenges, we propose EN-SLAM, the first event-RGBD implicit neural SLAM framework, which effectively leverages the high rate and high dynamic range advantages of event data for tracking and mapping.
2 code implementations • 6 Nov 2023 • Wenke Xia, Dong Wang, Xincheng Pang, Zhigang Wang, Bin Zhao, Di Hu, Xuelong Li
Generalizable articulated object manipulation is essential for home-assistant robots.
no code implementations • 22 Oct 2023 • Yibo Bai, Xiao-Lei Zhang, Xuelong Li
Automatic speaker verification (ASV) based on deep learning is vulnerable to adversarial attacks, a new type of attack that injects imperceptible perturbations into audio signals so that the ASV system produces wrong decisions.
1 code implementation • 19 Oct 2023 • Hongyuan Zhang, Xuelong Li
Unfortunately, the goal of the existing methods is not to find a discrete solution that minimizes the original objective.
no code implementations • 11 Oct 2023 • Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Xuelong Li, Yue Lu
The challenge of image generation has been effectively modeled as a problem of structure priors or transformation.
7 code implementations • 4 Oct 2023 • Yiwen Tang, Ray Zhang, Zoey Guo, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li
To this end, we introduce Point-PEFT, a novel framework for adapting point cloud pre-trained models with minimal learnable parameters.
no code implementations • 25 Aug 2023 • Ren-xin Zhao, Jinjing Shi, Xuelong Li
Self-Attention Mechanism (SAM) excels at distilling important information from the interior of data to improve the computational efficiency of models.
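For reference, the classical (non-quantum) form of self-attention can be sketched in a few lines. This is a minimal illustration, assuming scaled dot-product attention with queries, keys and values all equal to the input and no learned projections. It is not the model studied in this paper, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens.

    tokens: list of n vectors, each a list of d floats.
    Returns (outputs, weights); weights[i][j] is how much token i attends to token j.
    """
    d = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in tokens]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted average of all token vectors.
        outputs.append([sum(w[j] * tokens[j][c] for j in range(len(tokens)))
                        for c in range(d)])
    return outputs, weights


if __name__ == "__main__":
    outs, attn = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    for row in attn:
        print([round(a, 3) for a in row])
```

Each row of the weight matrix sums to 1, so every output vector is a convex combination of the inputs.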
no code implementations • 11 Jul 2023 • Guanzhou Lan, Bin Zhao, Xuelong Li
Targeting the surveillance scenes, we develop a disentangled representation, which is an auxiliary pretext task that separates surveillance scenes into the foreground and background with contrastive learning.
no code implementations • 28 Jun 2023 • Dongpeng Hou, Zhen Wang, Chao GAO, Xuelong Li
Snapshot observation based source localization has been widely studied due to its accessibility and low cost.
1 code implementation • 26 Jun 2023 • Zhong Ji, Zhihao LI, Yan Zhang, Haoran Wang, Yanwei Pang, Xuelong Li
Afterwards, the VR module is developed to excavate the potential semantic correlations among multiple region-query pairs, which further explores the high-level reasoning similarity.
no code implementations • 13 Jun 2023 • Hongyuan Zhang, Sida Huang, Xuelong Li
Experiments show that the proposed VPN generator can improve the base models.
2 code implementations • International Journal of Computer Vision 2023 • Shengping Zhang, Xianzhu Liu, Haozhe Xie, Liqiang Nie, Huiyu Zhou, DaCheng Tao, Xuelong Li
It exploits the repetitive geometric structures in common 3D objects to recover the complete shapes, which contains three sub-networks: geometric patch network, structure transformation network, and detail refinement network.
Ranked #4 on Point Cloud Completion on ShapeNet
no code implementations • 5 Jun 2023 • Xiaohan Liu, Yanwei Pang, Xuebin Sun, Yiming Liu, Yonghong Hou, ZhenChang Wang, Xuelong Li
To address this problem, we propose the following: (1) a novel convolutional operator called Faster Fourier Convolution (FasterFC) to replace the two consecutive convolution operations typically used in convolutional neural networks (e.g., U-Net, ResNet).
1 code implementation • NeurIPS 2023 • Haoran He, Chenjia Bai, Kang Xu, Zhuoran Yang, Weinan Zhang, Dong Wang, Bin Zhao, Xuelong Li
Specifically, we propose Multi-Task Diffusion Model (MTDiff), a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis in multi-task offline settings.
no code implementations • 28 May 2023 • Kang Xu, Chenjia Bai, Shuang Qiu, Haoran He, Bin Zhao, Zhen Wang, Wei Li, Xuelong Li
Leveraging learned strategies in unfamiliar scenarios is fundamental to human intelligence.
no code implementations • 17 May 2023 • Hao Yang, Junyu Gao, Yuan Yuan, Xuelong Li
Anomaly detection in temporal sensor data under aviation scenarios is a practical but challenging task: 1) it is difficult to extract contextual information with temporal correlation from long temporal data; 2) anomalous data are rare in time series, causing a normal/abnormal imbalance that makes the detector's classification degenerate or even fail.
1 code implementation • 8 May 2023 • Rushuai Yang, Chenjia Bai, Hongyi Guo, Siyuan Li, Bin Zhao, Zhen Wang, Peng Liu, Xuelong Li
Under mild assumptions, our objective maximizes the MI between different behaviors based on the same skill, which serves as an upper bound of the previous MI objective.
no code implementations • 24 Apr 2023 • Hanqing Sun, Yanwei Pang, Jiale Cao, Jin Xie, Xuelong Li
In this paper, we explore the model design of Transformers in binocular 3D object detection, focusing particularly on extracting and encoding task-specific image correspondence information.
1 code implementation • 20 Apr 2023 • Hongyuan Zhang, Yanan Zhu, Xuelong Li
This severely limits the applicability of stochastic optimization algorithms, so training GNNs is usually time-consuming.
no code implementations • CVPR 2023 • Weichuang Li, Longhao Zhang, Dong Wang, Bin Zhao, Zhigang Wang, Mulin Chen, Bang Zhang, Zhongjian Wang, Liefeng Bo, Xuelong Li
Talking head generation aims to generate faces that maintain the identity information of the source image and imitate the motion of the driving image.
1 code implementation • 3 Apr 2023 • Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, DaCheng Tao, Xuelong Li
This weak projection, however, can be addressed by a Riemannian metric, and we show that geodesics computation and accurate interpolations between data samples on the Riemannian manifold can substantially improve the performance of deep generative models.
1 code implementation • ICCV 2023 • Delin Qu, Yizhen Lao, Zhigang Wang, Dong Wang, Bin Zhao, Xuelong Li
This paper addresses the problem of rolling shutter correction in complex nonlinear and dynamic scenes with extreme occlusion.
Ranked #2 on Rolling Shutter Correction on BS-RSC
7 code implementations • 29 Mar 2023 • Zoey Guo, Yiwen Tang, Ray Zhang, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li
In this paper, we propose ViewRefer, a multi-view framework for 3D visual grounding exploring how to grasp the view knowledge from both text and 3D modalities.
no code implementations • CVPR 2023 • Yihao Wang, Zhigang Wang, Bin Zhao, Dong Wang, Mulin Chen, Xuelong Li
In contrast, we propose a purely passive method to track a person walking in an invisible room by only observing a relay wall, which is more in line with real application scenarios, e.g., security.
1 code implementation • CVPR 2023 • Haozhe Si, Bin Zhao, Dong Wang, Yunpeng Gao, Mulin Chen, Zhigang Wang, Xuelong Li
We show that our framework circumvents the need for depth and AIF image ground truth and yields superior predictions, thus closing the gap between the theoretical success of DFD works and their applications in the real world.
1 code implementation • 9 Feb 2023 • Jiabei Wang, Yanwei Pang, Jiale Cao, Hanqing Sun, Zhuang Shao, Xuelong Li
We hope that our simple intra-image contrastive learning can provide more paradigms on weakly supervised person search.
1 code implementation • 17 Jan 2023 • Yan Zhang, Zhong Ji, Di Wang, Yanwei Pang, Xuelong Li
(2) It limits the scale of negative sample pairs by employing the mini-batch based end-to-end training mechanism.
no code implementations • ICCV 2023 • Zoey Guo, Yiwen Tang, Ray Zhang, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li
In this paper, we propose ViewRefer, a multi-view framework for 3D visual grounding exploring how to grasp the view knowledge from both text and 3D modalities.
no code implementations • 19 Dec 2022 • Xuelong Li
After introducing the task entropy, the noise can be classified into two kinds, Positive-incentive noise (Pi-noise or $\pi$-noise) and pure noise, according to whether the noise can reduce the complexity of the task.
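One way to make "reduces the complexity of the task" concrete is to compare the task entropy H(T) with the conditional entropy H(T|E) given the noise E. The sketch below assumes a small discrete joint distribution and treats a positive gap H(T) - H(T|E) as the sign of helpful noise. The function names and the toy distributions are illustrative assumptions, not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def conditional_entropy(joint):
    """H(T|E) where joint[e][t] = P(E = e, T = t)."""
    h = 0.0
    for row in joint:
        p_e = sum(row)
        if p_e > 0:
            h += p_e * entropy([p / p_e for p in row])
    return h


def entropy_reduction(joint):
    """H(T) - H(T|E), i.e. the mutual information between task and noise."""
    n_t = len(joint[0])
    p_t = [sum(row[t] for row in joint) for t in range(n_t)]
    return entropy(p_t) - conditional_entropy(joint)


if __name__ == "__main__":
    # Noise independent of the task: no reduction in task entropy.
    independent = [[0.25, 0.25], [0.25, 0.25]]
    # Noise fully informative about the task: entropy drops to zero.
    informative = [[0.5, 0.0], [0.0, 0.5]]
    print(entropy_reduction(independent))
    print(entropy_reduction(informative))
```

Under this reading, the first toy distribution behaves like pure noise (no reduction), while the second behaves like noise that simplifies the task.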
no code implementations • 7 Dec 2022 • Feiping Nie, Hong Chen, Rong Wang, Xuelong Li
This paper presents an algorithm to solve the Soft k-Means problem globally.
no code implementations • 2 Dec 2022 • Qi Wang, Juncheng Wang, Junyu Gao, Yuan Yuan, Xuelong Li
Mainstream crowd counting methods regress a density map and integrate it to obtain the count.
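The density-map idea can be illustrated with a toy example: each annotated head contributes a kernel of unit mass, so summing (integrating) the map recovers the number of people. This is a minimal sketch under assumed settings (a fixed 5x5 Gaussian kernel, heads far enough from the border that no mass is clipped); the helper names are hypothetical, and a real system would predict the map with a network rather than build it from annotations.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    c = size // 2
    k = [[math.exp(-((i - c) ** 2 + (j - c) ** 2) / (2 * sigma ** 2))
          for j in range(size)] for i in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]


def density_map(height, width, heads, size=5, sigma=1.0):
    """Place one unit-mass kernel at every (row, col) head position."""
    kernel = gaussian_kernel(size, sigma)
    c = size // 2
    dmap = [[0.0] * width for _ in range(height)]
    for r, col in heads:
        for i in range(size):
            for j in range(size):
                rr, cc = r + i - c, col + j - c
                # Mass falling outside the image is dropped.
                if 0 <= rr < height and 0 <= cc < width:
                    dmap[rr][cc] += kernel[i][j]
    return dmap


def count_from_density(dmap):
    """Integrate (sum) the density map to obtain the estimated count."""
    return sum(map(sum, dmap))


if __name__ == "__main__":
    heads = [(5, 5), (10, 12), (8, 3)]
    dmap = density_map(20, 20, heads)
    print(round(count_from_density(dmap), 6))
```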
1 code implementation • 30 Oct 2022 • Zhen Wang, Haotong Du, Quanming Yao, Xuelong Li
In particular, we develop a generalized framework to explore topological and temporal information in TKGs.
Ranked #1 on Link Prediction on GDELT
no code implementations • 19 Oct 2022 • Shupei Liu, Linfeng Feng, Yijun Gong, Chengdong Liang, Chen Zhang, Xiao-Lei Zhang, Xuelong Li
To further boost the estimation accuracy, we introduce a node selection algorithm that strategically filters the most reliable nodes.
no code implementations • 22 Aug 2022 • Yuqing Wang, Xiangxian Li, Zhuang Qi, Jingyu Li, Xuelong Li, Xiangxu Meng, Lei Meng
Causal inference has become a powerful tool to handle the out-of-distribution (OOD) generalization problem, which aims to extract the invariant features.
no code implementations • 20 Aug 2022 • Yake Wei, Di Hu, Yapeng Tian, Xuelong Li
A comprehensive survey that can systematically organize and analyze studies of the audio-visual field is expected.
no code implementations • 11 Aug 2022 • Zhong Ji, Zhishen Hou, Xiyao Liu, Yanwei Pang, Xuelong Li
Few-shot Class-Incremental Learning (FSCIL) aims at learning new concepts continually with only a few samples, which makes it prone to catastrophic forgetting and overfitting.
1 code implementation • 5 Aug 2022 • Xuelong Li, Guanlin Li, Bin Zhao
The illumination enhancement branch is adopted to enlighten the low-frequency component with reduced resolution.
no code implementations • 18 Jul 2022 • Xuelong Li, Ziheng Jiao, Hongyuan Zhang, Rui Zhang
Admittedly, Graph Convolution Network (GCN) has achieved excellent results on graph datasets such as social networks, citation networks, etc.
no code implementations • 14 Jul 2022 • Jinjing Shi, Ren-xin Zhao, Wenxuan Wang, Shichao Zhang, Xuelong Li
Self-Attention Mechanism (SAM) is good at capturing the internal connections of features and greatly improves the performance of machine learning models, especially when efficient characterization and feature extraction of high-dimensional data are required.
no code implementations • 7 Mar 2022 • Yifan Chen, Yang Zhao, Xuelong Li
In this paper, we try to enhance the discrimination of spatio-temporal gait features from two aspects: effective extraction of spatio-temporal gait features and reasonable refinement of extracted features.
no code implementations • 4 Mar 2022 • Xuelong Li, Hongyuan Zhang, Rui Zhang
We theoretically validate that it is equivalent to the existing matrix completion models.
no code implementations • 22 Feb 2022 • Yuyu Guo, Lianli Gao, Jingkuan Song, Peng Wang, Nicu Sebe, Heng Tao Shen, Xuelong Li
Inspired by this observation, in this article, we propose a relation regularized network (R2-Net), which can predict whether there is a relationship between two objects and encode this relation into object feature refinement and better SGG.
no code implementations • 9 Dec 2021 • Wei Chang, Feiping Nie, Rong Wang, Xuelong Li
Multi-task learning has been studied by many researchers; it assumes that different tasks share a common, latent low-rank subspace.
no code implementations • 18 Nov 2021 • Chuang Yang, Mulin Chen, Yuan Yuan, Qi Wang, Xuelong Li
It weakens the coupling of texts to shrink-masks, which improves the robustness of detection results.
no code implementations • 12 Nov 2021 • Hongyuan Zhang, Jiankun Shi, Rui Zhang, Xuelong Li
The core problems mainly come from two aspects: (1) the graph is unavailable in most clustering scenarios, so constructing high-quality graphs from non-graph data is usually the most important part; (2) given n samples, graph-based clustering methods usually consume at least $\mathcal O(n^2)$ time to build graphs, and graph convolution requires nearly $\mathcal O(n^2)$ for a dense graph and $\mathcal O(|\mathcal{E}|)$ for a sparse one with $|\mathcal{E}|$ edges.
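The quadratic cost of graph construction is easy to see in a brute-force k-nearest-neighbor graph, where every sample is compared with every other sample. The sketch below is only an illustration of that baseline cost, assuming squared Euclidean distance and an undirected kNN graph; it is not the construction proposed in the paper, and `knn_graph` is a hypothetical helper name.

```python
def knn_graph(points, k):
    """Brute-force undirected kNN graph.

    points: list of n equal-length tuples.
    Returns (edges, comparisons): edges is a set of (i, j) pairs with i < j,
    comparisons is the number of distance evaluations, which is n * (n - 1).
    """
    n = len(points)
    edges = set()
    comparisons = 0
    for i in range(n):
        dists = []
        for j in range(n):
            if j == i:
                continue
            # Squared Euclidean distance between samples i and j.
            d = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            dists.append((d, j))
            comparisons += 1
        # Keep the k closest neighbors; ties are broken by the smaller index.
        for _, j in sorted(dists)[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges, comparisons


if __name__ == "__main__":
    pts = [(0.0,), (1.0,), (2.0,), (10.0,)]
    edges, comparisons = knn_graph(pts, k=1)
    print(sorted(edges), comparisons)
```

The resulting graph is sparse (at most n * k edges), which is why the graph convolution cost depends on the number of edges while the construction itself stays quadratic in n.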
no code implementations • 28 Oct 2021 • Junyu Gao, Maoguo Gong, Xuelong Li
The second is an audio CNN for encoding Log Mel-Spectrogram of audio signals.
no code implementations • 10 Oct 2021 • Qi Wang, Tao Han, Junyu Gao, Yuan Yuan, Xuelong Li
The rapid development in visual crowd analysis shows a trend to count people by positioning or even detecting, rather than simply summing a density map.
no code implementations • 22 Sep 2021 • Bin Zhao, Maoguo Gong, Xuelong Li
To integrate the two kinds of information, they are encoded in a two-stream scheme, and a multimodal fusion mechanism is developed based on the hierarchical transformer.
Ranked #17 on Supervised Video Summarization on TvSum
no code implementations • 12 Sep 2021 • Qi Wang, Sikai Bai, Junyu Gao, Yuan Yuan, Xuelong Li
In addition, due to domain gaps between different datasets, the performance is dramatically decreased when re-ID models pre-trained on label-rich datasets (source domain) are directly applied to other unlabeled datasets (target domain).
1 code implementation • 2 Aug 2021 • Junyu Gao, Maoguo Gong, Xuelong Li
To this end, we propose a Dilated Convolutional Swin Transformer (DCST) for congested crowd scenes.
1 code implementation • 19 Jul 2021 • Dawei Du, Longyin Wen, Pengfei Zhu, Heng Fan, QinGhua Hu, Haibin Ling, Mubarak Shah, Junwen Pan, Ali Al-Ali, Amr Mohamed, Bakour Imene, Bin Dong, Binyu Zhang, Bouchali Hadia Nesma, Chenfeng Xu, Chenzhen Duan, Ciro Castiello, Corrado Mencar, Dingkang Liang, Florian Krüger, Gennaro Vessio, Giovanna Castellano, Jieru Wang, Junyu Gao, Khalid Abualsaud, Laihui Ding, Lei Zhao, Marco Cianciotta, Muhammad Saqib, Noor Almaadeed, Omar Elharrouss, Pei Lyu, Qi Wang, Shidong Liu, Shuang Qiu, Siyang Pan, Somaya Al-Maadeed, Sultan Daud Khan, Tamer Khattab, Tao Han, Thomas Golda, Wei Xu, Xiang Bai, Xiaoqing Xu, Xuelong Li, Yanyun Zhao, Ye Tian, Yingnan Lin, Yongchao Xu, Yuehan Yao, Zhenyu Xu, Zhijian Zhao, Zhipeng Luo, Zhiwei Wei, Zhiyuan Zhao
Crowd counting on the drone platform is an interesting topic in computer vision, which brings new challenges such as small object inference, background clutter and wide viewpoint.
no code implementations • 17 Jun 2021 • Rui Zhang, Chengjun Lu, Ziheng Jiao, Xuelong Li
In particular, in this paper, we apply AH to contrastive learning (AHCL) such that it can be effectively transferred from weakly supervised learning (given label priors) to unsupervised learning, where soft labels of contrastive learning are directly and adaptively learned.
no code implementations • 15 Jun 2021 • Rui Zhang, Ziheng Jiao, Hongyuan Zhang, Xuelong Li
Moreover, by unifying the flexible Stiefel manifold and adaptive support vector machine, we devise the novel decision layer which efficiently fits the manifold structure of the data and label information.
no code implementations • NeurIPS 2021 • Feiping Nie, Shenfei Pei, Rong Wang, Liang Zhang, Jun Wu, Qinglong Chang, Xuelong Li
We also developed a general model that unified LKM, KSUMS, and SC, and discussed the connection among them.
no code implementations • 17 May 2021 • Bin Zhao, Xuelong Li
Specifically, in the flow estimation stage, three edge-aware mechanisms are developed to emphasize the frame edges in estimating flow maps, so that the edge-maps are taken as the auxiliary information to provide more guidance to boost the flow accuracy.
no code implementations • 17 May 2021 • Bin Zhao, Maoguo Gong, Xuelong Li
Motivated by this, we propose to jointly exploit the audio and visual information for the video summarization task, and develop an AudioVisual Recurrent Network (AVRN) to achieve this.
no code implementations • 10 May 2021 • Bin Zhao, Haopeng Li, Xiaoqiang Lu, Xuelong Li
Then, the videos are summarized by exploiting both the local and global dependencies among shots.
no code implementations • 24 Apr 2021 • Qi Wang, Yanling Miao, Mulin Chen, Xuelong Li
In order to better handle the high dimensionality problem and preserve the spatial structures, this paper proposes a novel unsupervised approach called spatial-spectral clustering with anchor graph (SSCAG) for HSI data clustering.
no code implementations • 11 Apr 2021 • Qi Wang, Xu Jiang, Mulin Chen, Xuelong Li
In this paper, we focus on the unsupervised multi-view feature selection which tries to handle high dimensional data in the field of multi-view learning.
no code implementations • 25 Mar 2021 • Yanling Miao, Qi Wang, Mulin Chen, Xuelong Li
Graph-based semi-supervised learning methods, which deal well with the situation of limited labeled data, have shown dominant performance in practical applications.
no code implementations • 24 Mar 2021 • Mulin Chen, Xuelong Li
Considering that outliers are usually far fewer than normal samples, a new entropy loss function is established for matrix factorization, which minimizes the entropy of the residue distribution and allows a few samples to have large approximation errors.
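The intuition behind an entropy criterion on residues can be shown with a toy calculation: when the total error is concentrated on a few samples, the normalized residue distribution has low entropy, whereas spreading the same error evenly gives the maximum entropy. The sketch below assumes residues are normalized by their sum and uses Shannon entropy; the paper's exact loss may differ, and `residue_entropy` is a hypothetical name.

```python
import math


def residue_entropy(residues):
    """Shannon entropy (in nats) of non-negative residues normalized to sum to 1."""
    total = sum(residues)
    if total == 0:
        return 0.0
    probs = [r / total for r in residues if r > 0]
    return -sum(p * math.log(p) for p in probs)


if __name__ == "__main__":
    # Same total error (4.0), distributed in two different ways.
    spread_evenly = [1.0, 1.0, 1.0, 1.0]
    few_outliers = [0.01, 0.01, 0.01, 3.97]
    print(residue_entropy(spread_evenly))
    print(residue_entropy(few_outliers))
```

Minimizing such a quantity favors fits in which most samples are approximated well and only a few carry large errors, which matches the outlier assumption stated above.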
no code implementations • 24 Mar 2021 • Mulin Chen, Maoguo Gong, Xuelong Li
Non-negative Matrix Factorization (NMF) is one of the most popular techniques for data representation and clustering, and has been widely used in machine learning and data analysis.
no code implementations • 22 Mar 2021 • Rui Zhang, Hongyuan Zhang, Xuelong Li
Principal component analysis (PCA) frequently suffers from the disturbance of outliers and thus a spectrum of robust extensions and variations of PCA have been developed.
no code implementations • 17 Mar 2021 • Bo Wei, Mulin Chen, Qi Wang, Xuelong Li
To obtain the accurate supervision information of different channels, the MDSNet employs an auxiliary network called SupervisionNet (SN) to generate abundant supervision maps based on existing groundtruth.
no code implementations • 9 Mar 2021 • Xuelong Li, Kai Kou, Bin Zhao
To this end, the generator of Weather GAN is composed of an initial translation module, an attention module and a weather-cue segmentation module.
no code implementations • 6 Jan 2021 • Rong Wang, Yihang Lu, Qianrong Zhang, Feiping Nie, Zhen Wang, Xuelong Li
To alleviate this problem, we proposed a novel ensemble and random collaborative representation-based detector (ERCRD) for HAD, which comprises two closely related stages.
no code implementations • 29 Dec 2020 • Zhengxin Li, Feiping Nie, Jintang Bian, Xuelong Li
However, real-world data contain a large number of noisy samples and features, so the similarity matrix constructed from the original data cannot be completely reliable.
1 code implementation • 8 Dec 2020 • Junyu Gao, Tao Han, Qi Wang, Yuan Yuan, Xuelong Li
Furthermore, to improve the segmentation quality for different density regions, we present a differentiable Binarization Module (BM) to output structured instance maps.
1 code implementation • NeurIPS 2020 • Shenfei Pei, Feiping Nie, Rong Wang, Xuelong Li
In particular, over 15x and 7x speed-ups can be obtained with respect to $k$-means on the synthetic dataset of 1 million samples and the benchmark dataset (CelebA) of 200k samples, respectively.
1 code implementation • NeurIPS 2020 • Lai Tian, Feiping Nie, Rong Wang, Xuelong Li
This paper presents new algorithms to solve the feature-sparsity constrained PCA problem (FSPCA), which performs feature selection and PCA simultaneously.
1 code implementation • 4 Nov 2020 • Zheheng Jiang, Feixiang Zhou, Aite Zhao, Xin Li, Ling Li, DaCheng Tao, Xuelong Li, Huiyu Zhou
To address this problem, we here propose a novel multiview latent-attention and dynamic discriminative model that jointly learns view-specific and view-shared sub-structures, where the former captures unique dynamics of each view whilst the latter encodes the interaction between the views.
Ranked #4 on Question Answering on MultiTQ
no code implementations • 20 Feb 2020 • Hongyuan Zhang, Rui Zhang, Xuelong Li
Driven by theoretical analysis about relaxed k-means, we design a specific GAE-based model for graph clustering to be consistent with the theory, namely Embedding Graph Auto-Encoder (EGAE).
no code implementations • 14 Aug 2018 • Zhanxuan Hu, Feiping Nie, Rong Wang, Xuelong Li
Low rank regularization, in essence, involves introducing a low rank or approximately low rank assumption for the matrix we aim to learn, and it has achieved great success in many fields including machine learning, data mining and computer vision.