no code implementations • 29 Jan 2025 • Xinzhe Xia, Weiguang Zhao, Yuyao Yan, Guanyu Yang, Rui Zhang, Kaizhu Huang, Xi Yang
3D open-world classification is a challenging yet essential task in dynamic and unstructured real-world scenarios, requiring both open-category and open-pose recognition.
no code implementations • 29 Jan 2025 • Jingcheng Ni, Weiguang Zhao, Daniel Wang, Ziyao Zeng, Chenyu You, Alex Wong, Kaizhu Huang
Object removal is of great significance to 3D scene understanding, essential for applications in content filtering and scene editing.
no code implementations • 28 Jan 2025 • Chenru Jiang, Chengrui Zhang, Xi Yang, Jie Sun, Yifei Zhang, Bin Dong, Kaizhu Huang
First, we convert 3D structural priors derived from the initial 3D point cloud as a bound term to increase evidence in the variational Bayesian framework, leveraging these robust intrinsic priors to tightly govern the diffusion training process and bolster consistency in reconstruction.
1 code implementation • 15 Jan 2025 • Zhipeng Ye, Feng Jiang, Qiufeng Wang, Kaizhu Huang, Jiaqi Huang
As one important contribution, we employ the Llama model and design a comprehensive pipeline to generate textual descriptions for images of 11 datasets, resulting in a total of 1,637,795 image-text pairs, named "IMD-11".
no code implementations • 8 Jan 2025 • Zhiqiang Gao, Jiaqi Wang, Hangchi Shen, Zhihao Dou, Xiangbo Zhang, Kaizhu Huang
Hyperspectral image (HSI) classification is a crucial technique for remote sensing to build large-scale earth monitoring systems.
no code implementations • 24 Dec 2024 • Zihan Ye, Xinyuan Ru, Shiming Chen, Yaochu Jin, Kaizhu Huang, Xiaobo Jin
This paper delves into the pivotal influence of unseen class priors within the framework of transductive ZSL (TZSL) and illuminates the finding that even a marginal prior bias can result in substantial accuracy declines.
1 code implementation • 20 Dec 2024 • Xiaoqiang Kang, Zimu Wang, Xiaobo Jin, Wei Wang, Kaizhu Huang, Qiufeng Wang
In this paper, we propose a Template-driven LLM-paraphrased (TeLL) framework for generating high-quality TMWP samples with diverse backgrounds and accurate tables, questions, answers, and solutions.
no code implementations • 17 Dec 2024 • Jianan Ye, Weiguang Zhao, Xi Yang, Guangliang Cheng, Kaizhu Huang
Point cloud anomaly detection under the anomaly-free setting poses significant challenges as it requires accurately capturing the features of 3D normal data to identify deviations indicative of anomalies.
no code implementations • 27 Nov 2024 • Weiguang Zhao, Chenru Jiang, Chengrui Zhang, Jie Sun, Yuyao Yan, Rui Zhang, Kaizhu Huang
Leveraging the segmentation results, we propose to engage a training-free binary clustering algorithm that not only improves segmentation precision but also possesses the capability to cluster and localize unseen objects for executing grasping operations.
no code implementations • 18 Nov 2024 • Jing Li, Xueke Chi, Qiufeng Wang, DaHan Wang, Kaizhu Huang, Yongge Liu, Cheng-Lin Liu
Oracle character recognition, an analysis of ancient Chinese inscriptions found on oracle bones, has become a pivotal field intersecting archaeology, paleography, and historical cultural studies.
no code implementations • 16 Nov 2024 • Zixian Su, Jingwei Guo, Xi Yang, Qiufeng Wang, Kaizhu Huang
While Test-Time Adaptation (TTA) has shown promise in addressing distribution shifts between training and testing data, its effectiveness diminishes with heterogeneous data streams due to uniform target estimation.
1 code implementation • 12 Nov 2024 • Jianan Ye, Zhaorui Tan, Yijie Hu, Xi Yang, Guangliang Cheng, Kaizhu Huang
To our knowledge, this is a pioneering effort to apply the concept of disentanglement for one-class anomaly detection on tabular data.
no code implementations • 9 Nov 2024 • Zhaorui Tan, Xi Yang, Tan Pan, Tianyi Liu, Chen Jiang, Xin Guo, Qiufeng Wang, Anh Nguyen, Yuan Qi, Kaizhu Huang, Yuan Cheng
We validate the feasibility and benefits of learning a personalized ${X}_h$, showing that this representation is highly generalizable and transferable across various multi-modal medical tasks.
no code implementations • 5 Nov 2024 • Zixian Su, Jingwei Guo, Xi Yang, Qiufeng Wang, Frans Coenen, Kaizhu Huang
Medical Image Analysis (MedIA) has become indispensable in modern healthcare, enhancing clinical diagnostics and personalized treatment.
1 code implementation • 2 Nov 2024 • Yijie Hu, Guanyu Yang, Zhaorui Tan, Xiaowei Huang, Kaizhu Huang, Qiu-Feng Wang
In this paper, we aim to mitigate these issues by directly constraining the span of each class distribution from a covariance perspective.
class-incremental learning
Few-Shot Class-Incremental Learning
1 code implementation • 6 Oct 2024 • Zhaorui Tan, Xi Yang, Qiufeng Wang, Anh Nguyen, Kaizhu Huang
Vision models excel in image classification but struggle to generalize to unseen data, such as classifying images from unseen domains or discovering novel categories.
no code implementations • 28 Sep 2024 • Tianyi Liu, Zhaorui Tan, Haochuan Jiang, Xi Yang, Kaizhu Huang
Brain tumor segmentation is often based on multiple magnetic resonance imaging (MRI).
no code implementations • 23 Aug 2024 • Baoru Huang, Tuan Vo, Chayun Kongtongvattana, Giulio Dagnino, Dennis Kundrat, Wenqiang Chi, Mohamed Abdelaziz, Trevor Kwok, Tudor Jianu, Tuong Do, Hieu Le, Minh Nguyen, Hoan Nguyen, Erman Tjiputra, Quang Tran, Jianyang Xie, Yanda Meng, Binod Bhattarai, Zhaorui Tan, Hongbin Liu, Hong Seng Gan, Wei Wang, Xi Yang, Qiufeng Wang, Jionglong Su, Kaizhu Huang, Angelos Stefanidis, Min Guo, Bo Du, Rong Tao, Minh Vu, Guoyan Zheng, Yalin Zheng, Francisco Vasconcelos, Danail Stoyanov, Daniel Elson, Ferdinando Rodriguez y Baena, Anh Nguyen
Real-time visual feedback from catheterization analysis is crucial for enhancing surgical safety and efficiency during endovascular interventions.
no code implementations • 21 Aug 2024 • Chongwen Zhao, Zhihao Dou, Kaizhu Huang
Large Language Models (LLMs) are increasingly attracting attention in various applications.
no code implementations • 18 Aug 2024 • Tianyi Liu, Zhaorui Tan, Muyin Chen, Xi Yang, Haochuan Jiang, Kaizhu Huang
Along this line, in this paper, we propose a novel paradigm that aligns latent features of involved modalities to a well-defined distribution anchor as the substitution of the pre-trained model.
no code implementations • 11 Jul 2024 • ZiHao Zhou, Shudong Liu, Maizhen Ning, Wei Liu, Jindong Wang, Derek F. Wong, Xiaowei Huang, Qiufeng Wang, Kaizhu Huang
Exceptional mathematical reasoning ability is one of the key features that demonstrate the power of large language models (LLMs).
no code implementations • 10 Jun 2024 • Haochuan Jiang, Guanyu Yang, Kaizhu Huang, Rui Zhang
Due to the huge category number, the sophisticated combinations of various strokes and radicals, and the free writing or printing styles, generating Chinese characters with diverse styles is always considered as a difficult task.
no code implementations • 10 Jun 2024 • Haochuan Jiang, Guanyu Yang, Fei Cheng, Kaizhu Huang
Synthesizing Chinese characters with consistent style using few stylized examples is challenging.
no code implementations • 5 Jun 2024 • Zihan Ye, Shreyank N. Gowda, Xiaobo Jin, Xiaowei Huang, Haotian Xu, Yaochu Jin, Kaizhu Huang
For class-level effectiveness, we design a two-branch generation structure that consists of a Diffusion-based Feature Generator (DFG) and a Diffusion-based Representation Generator (DRG).
Ranked #1 on Zero-Shot Learning on AwA2
no code implementations • 31 May 2024 • Zhaorui Tan, Chengrui Zhang, Xi Yang, Jie Sun, Kaizhu Huang
Generalized category discovery presents a challenge in a realistic scenario, which requires the model's generalization ability to recognize unlabeled samples from known and unknown categories.
no code implementations • 23 May 2024 • Kai Yao, Zhaorui Tan, Zixian Su, Xi Yang, Jie Sun, Kaizhu Huang
Built upon this, we argue that conventional OCDA approaches may substantially underestimate the inherent variance inside the compound target domains for model generalization.
no code implementations • 14 Apr 2024 • Yuqi Wang, Zeqiang Wang, Wei Wang, Qi Chen, Kaizhu Huang, Anh Nguyen, Suparna De
Safe and reliable natural language inference is critical for extracting insights from clinical trial reports but poses challenges due to biases in large pre-trained language models.
no code implementations • 28 Mar 2024 • Tianyi Liu, Zhaorui Tan, Kaizhu Huang, Haochuan Jiang
Medical image segmentation presents the challenge of segmenting various-size targets, demanding the model to effectively capture both local and global information.
2 code implementations • CVPR 2024 • Zhaorui Tan, Xi Yang, Kaizhu Huang
Multi-domain generalization (mDG) is universally aimed to minimize the discrepancy between training and testing distributions to enhance marginal-to-label distribution mapping.
Ranked #2 on Domain Generalization on TerraIncognita
no code implementations • 17 Jan 2024 • Jingwei Guo, Kaizhu Huang, Xinping Yi, Zixian Su, Rui Zhang
Whilst spectral Graph Neural Networks (GNNs) are theoretically well-founded in the spectral domain, their practical reliance on polynomial approximation implies a profound linkage to the spatial domain.
no code implementations • 21 Dec 2023 • Jing Li, Qiu-Feng Wang, Siyuan Wang, Rui Zhang, Kaizhu Huang, Erik Cambria
In particular, on the challenging OBC306 dataset, Diff-Oracle leads to an accuracy gain of 7.70% in the zero-shot setting and is able to recognize unseen oracle character images with an accuracy of 84.62%, achieving a new benchmark for deciphering oracle bone scripts.
1 code implementation • 15 Dec 2023 • Zixian Su, Jingwei Guo, Kai Yao, Xi Yang, Qiufeng Wang, Kaizhu Huang
While recent test-time adaptations exhibit efficacy by adjusting batch normalization to narrow domain disparities, their effectiveness diminishes with realistic mini-batches due to inaccurate target estimation.
1 code implementation • 14 Dec 2023 • Jingwei Guo, Kaizhu Huang, Xinping Yi, Rui Zhang
Spectral Graph Neural Networks (GNNs) have achieved tremendous success in graph machine learning, with polynomial filters applied for graph convolutions, where all nodes share the identical filter weights to mine their local contexts.
no code implementations • 13 Dec 2023 • Weiguang Zhang, Qiufeng Wang, Kaizhu Huang
While Cartesian coordinates are typically leveraged by state-of-the-art approaches to learn a group of deformation control points, such a representation is not efficient for a dewarping model to learn the deformation information.
1 code implementation • 13 Dec 2023 • Zhaorui Tan, Xi Yang, Kaizhu Huang
In particular, we propose to augment texts in the semantic space via an Implicit Textual Semantic Preserving Augmentation ($ITA$), in conjunction with a specifically designed Image Semantic Regularization Loss ($L_r$) as Generated Image Semantic Conservation, to cope well with semantic mismatch and collapse.
2 code implementations • 12 Dec 2023 • Weiguang Zhao, Guanyu Yang, Rui Zhang, Chenru Jiang, Chaolong Yang, Yuyao Yan, Amir Hussain, Kaizhu Huang
To this end, we propose a more realistic and challenging scenario named open-pose 3D zero-shot classification, focusing on the recognition of 3D objects regardless of their orientation.
no code implementations • 31 Oct 2023 • Yuqi Wang, Zeqiang Wang, Wei Wang, Qi Chen, Kaizhu Huang, Anh Nguyen, Suparna De
In the era of the Internet of Things (IoT), the retrieval of relevant medical information has become essential for efficient clinical decision-making.
no code implementations • 25 Oct 2023 • Yiming Lin, Xiao-Bo Jin, Qiufeng Wang, Kaizhu Huang
The current state-of-the-art methods first refine the representation of the phrase by aggregating the most similar $k$ image pixels, and then match the refined text representations with the pixels of the image feature map to generate segmentation results.
no code implementations • 4 Sep 2023 • ZiHao Zhou, Qiufeng Wang, Mingyu Jin, Jie Yao, Jianan Ye, Wei Liu, Wei Wang, Xiaowei Huang, Kaizhu Huang
Instead of attacking prompts in the use of LLMs, we propose a MathAttack model to attack MWP samples which are closer to the essence of security in solving math problems.
1 code implementation • 5 Aug 2023 • Maizhen Ning, Qiu-Feng Wang, Kaizhu Huang, Xiaowei Huang
For the diagram encoder, we pre-train it under a multi-label classification framework with the symbolic characters as labels.
1 code implementation • 15 Jun 2023 • ZiHao Zhou, Maizhen Ning, Qiufeng Wang, Jie Yao, Wei Wang, Xiaowei Huang, Kaizhu Huang
We then feed them to a question generator together with the scenario to obtain the corresponding diverse questions, forming a new MWP with a variety of questions and equations.
no code implementations • 14 Jun 2023 • Jianan Ye, Yijie Hu, Xi Yang, Qiu-Feng Wang, Chao Huang, Kaizhu Huang
We then design a novel patch-wise residual module in the anomaly learning head to extract and assess the fine-grained anomaly features from each sample, facilitating the learning of discriminative representations of anomaly instances.
no code implementations • 20 Apr 2023 • Jiezhu Cheng, Kaizhu Huang, Zibin Zheng
By lowering the volatility of the stock recommendation model, SVAT effectively reduces investment risks and outperforms state-of-the-art baselines by more than 30% in terms of risk-adjusted profits.
1 code implementation • 12 Feb 2023 • Shiran Yuan, Kaizhu Huang
We present the Generalized CP Decomposition Tensor Completion (GCDTC) framework, the first generalizable framework for low-rank tensor completion that takes numerical priors of the data into account.
1 code implementation • ICCV 2023 • Zhiqiang Gao, Kaizhu Huang, Rui Zhang, Dawei Liu, Jieming Ma
Recent studies have investigated how to achieve robustness for unsupervised domain adaptation (UDA).
no code implementations • 13 Dec 2022 • Chaolong Yang, Yuyao Yan, Weiguang Zhao, Jianan Ye, Xi Yang, Amir Hussain, Kaizhu Huang
On the one hand, the unidirectional projection forces our model to focus more on the core task, i.e., 3D segmentation; on the other hand, unlocking the bidirectional to unidirectional projection enables a deeper cross-domain semantic alignment and enjoys the flexibility to fuse better and complicated features from very different spaces.
no code implementations • 7 Dec 2022 • M. Tanveer, M. A. Ganaie, Iman Beheshti, Tripti Goel, Nehal Ahmad, Kuan-Ting Lai, Kaizhu Huang, Yu-Dong Zhang, Javier Del Ser, Chin-Teng Lin
In this review, we offer a comprehensive analysis of the literature related to the adoption of deep learning for brain age estimation with neuroimaging data.
1 code implementation • 27 Nov 2022 • Zixian Su, Kai Yao, Xi Yang, Qiufeng Wang, Jie Sun, Kaizhu Huang
Single-source domain generalization (SDG) in medical image segmentation is a challenging yet essential task as domain shifts are quite common among clinical image datasets.
1 code implementation • 27 Oct 2022 • Zhaorui Tan, Xi Yang, Zihan Ye, Qiufeng Wang, Yuyao Yan, Anh Nguyen, Kaizhu Huang
Generating consistent and high-quality images from given texts is essential for visual-language understanding.
1 code implementation • 13 Oct 2022 • Zihan Ye, Guanyu Yang, Xiaobo Jin, Youfa Liu, Kaizhu Huang
Broadly speaking, present ZSL methods usually adopt class-level semantic labels and compare them with instance-level semantic predictions to infer unseen classes.
1 code implementation • 3 Aug 2022 • Penglei Gao, Xi Yang, Rui Zhang, Ping Guo, John Y. Goulermas, Kaizhu Huang
While exogenous variables have a major impact on performance improvement in time series analysis, inter-series correlation and time dependence among them are rarely considered in the present continuous methods.
1 code implementation • ICCV 2023 • Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang
Due to the uneven distribution of offset points, these existing methods can hardly cluster all instance points.
Ranked #3 on 3D Instance Segmentation on S3DIS
1 code implementation • 12 Jul 2022 • Kai Yao, Penglei Gao, Xi Yang, Kaizhu Huang, Jie Sun, Rui Zhang
Image outpainting, which is well studied with Convolution Neural Network (CNN) based framework, has recently drawn more attention in computer vision.
1 code implementation • 27 May 2022 • Jingwei Guo, Kaizhu Huang, Rui Zhang, Xinping Yi
While Graph Neural Networks (GNNs) have achieved enormous success in multiple graph analytical tasks, modern variants mostly rely on the strong inductive bias of homophily.
no code implementations • 24 May 2022 • Zixian Su, Kai Yao, Xi Yang, Qiufeng Wang, Yuyao Yan, Jie Sun, Kaizhu Huang
This combination of global and local alignment can precisely localize the crucial regions in segmentation target while preserving the overall semantic consistency.
1 code implementation • 8 Apr 2022 • Weiguang Zhao, Chaolong Yang, Jianan Ye, Rui Zhang, Yuyao Yan, Xi Yang, Bin Dong, Amir Hussain, Kaizhu Huang
Specifically, we present a novel multi-view feature fusion backbone that utilizes face masks to align features from multiple encoders and integrates one multi-layer attention mechanism to enhance feature interaction and fusion, resulting in one unified facial representation.
no code implementations • 26 Mar 2022 • Zhuang Qian, Kaizhu Huang, Qiu-Feng Wang, Xu-Yao Zhang
In this paper, we present a comprehensive survey trying to offer a systematic and structured investigation on robust adversarial training in pattern recognition.
no code implementations • 18 Feb 2022 • Chenru Jiang, Kaizhu Huang, Shufei Zhang, Jimin Xiao, Zhenxing Niu, Amir Hussain
In this paper, we focus on tackling the precise keypoint coordinates regression task.
1 code implementation • 27 Jan 2022 • Penglei Gao, Xi Yang, Rui Zhang, John Y. Goulermas, Yujie Geng, Yuyao Yan, Kaizhu Huang
In this paper, we develop a novel transformer-based generative adversarial neural network called U-Transformer for the generalised image outpainting problem.
1 code implementation • 1 Nov 2021 • Kai Yao, Kaizhu Huang, Jie Sun, Amir Hussain
Automatic nuclei segmentation and classification play a vital role in digital pathology.
Ranked #4 on Multi-tissue Nucleus Segmentation on CoNSeP
1 code implementation • 21 Oct 2021 • Liuqing Zhao, Fan Lyu, Fuyuan Hu, Kaizhu Huang, Fenglei Xu, Linyan Li
Sentence-based Image Editing (SIE) aims to deploy natural language to edit an image.
no code implementations • 29 Sep 2021 • Zhuang Qian, Shufei Zhang, Kaizhu Huang, Qiufeng Wang, Bin Gu, Huan Xiong, Xinping Yi
It is possibly due to the fact that the conventional adversarial training methods generate adversarial perturbations usually in a supervised way, so that the adversarial samples are highly biased towards the decision boundary, resulting in an inhomogeneous data distribution.
1 code implementation • 23 Jul 2021 • Kai Yao, Kaizhu Huang, Jie Sun, Curran Jude
We also propose a novel training algorithm able to align the disentangled content in the latent space to reduce micro-level lossy transformation.
1 code implementation • 8 Jul 2021 • Zhuang Qian, Shufei Zhang, Kaizhu Huang, Qiufeng Wang, Rui Zhang, Xinping Yi
The proposed adversarial training with latent distribution (ATLD) method defends against adversarial attacks by crafting LMAEs with the latent manifold in an unsupervised manner.
1 code implementation • 16 Jun 2021 • Zihan Ye, Fuyuan Hu, Fan Lyu, Linyan Li, Kaizhu Huang
However, the traditional TL cannot search reliable unseen disentangled representations due to the unavailability of unseen classes in ZSL.
no code implementations • 16 Jun 2021 • Shuyi Qu, Zhenxing Niu, Kaizhu Huang, Jianke Zhu, Matan Protter, Gadi Zimerman, Yinghui Xu
Recent deep generative models have achieved promising performance in image inpainting.
no code implementations • 13 Jun 2021 • Zhicheng Cai, Kaizhu Huang, Chenglei Peng
This paper proposes a novel nonlinear activation mechanism, primarily for convolutional neural networks (CNNs), named the reborn mechanism.
no code implementations • 28 Apr 2021 • Yangfan Zhou, Kaizhu Huang, Cheng Cheng, Xuguang Wang, Amir Hussain, Xin Liu
on how to exploit strong convexity to further improve the convergence rate of AdaBelief.
1 code implementation • 24 Apr 2021 • Jingwei Guo, Kaizhu Huang, Xinping Yi, Rui Zhang
We introduce a novel Local and Global Disentangled Graph Convolutional Network (LGD-GCN) to capture both local and global information for graph disentanglement.
no code implementations • 10 Mar 2021 • Ping Guo, Kaizhu Huang, Zenglin Xu
In this work, we generalize the reaction-diffusion equation in statistical physics, the Schrödinger equation in quantum mechanics, and the Helmholtz equation in paraxial optics into the neural partial differential equations (NPDE), which can be considered as the fundamental equations in the field of artificial intelligence research.
1 code implementation • ICCV 2021 • Zhiqiang Gao, Shufei Zhang, Kaizhu Huang, Qiufeng Wang, Chaoliang Zhong
In particular, we show that the distribution discrepancy can be reduced by constraining feature gradients of two domains to have similar distributions.
1 code implementation • 26 Nov 2020 • Penglei Gao, Xi Yang, Rui Zhang, Kaizhu Huang
We propose a continuous neural network architecture, termed Explainable Tensorized Neural Ordinary Differential Equations (ETN-ODE), for multi-step time series prediction at arbitrary time points.
2 code implementations • NeurIPS 2021 • Ye Ma, Zixun Lan, Lu Zong, Kaizhu Huang
A global scoring mechanism is then developed to regulate beam search to generate summaries in a near-global optimal fashion.
1 code implementation • 13 Dec 2018 • Haochuan Jiang, Guanyu Yang, Kaizhu Huang, Rui Zhang
Due to the huge category number, the sophisticated combinations of various strokes and radicals, and the free writing or printing styles, generating Chinese characters with diverse styles is always considered as a difficult task.
2 code implementations • 6 Dec 2017 • Kyeong Soo Kim, Sanghyuk Lee, Kaizhu Huang
Exploiting the hierarchical nature of the building/floor estimation and floor-level coordinates estimation of a location, we propose a new DNN architecture consisting of a stacked autoencoder for the reduction of feature space dimension and a feed-forward classifier for multi-label classification of building/floor/location, on which the multi-building and multi-floor indoor localization system based on Wi-Fi fingerprinting is built.
no code implementations • 15 Mar 2012 • Kaizhu Huang, Rong Jin, Zenglin Xu, Cheng-Lin Liu
Most existing distance metric learning methods assume perfect side information that is usually given in pairwise or triplet constraints.