no code implementations • 2 Apr 2025 • Zhaochen Wang, Yujun Cai, Zi Huang, Bryan Hooi, Yiwei Wang, Ming-Hsuan Yang
Vision-language models (VLMs) have advanced rapidly in processing multimodal information, but their ability to reconcile conflicting signals across modalities remains underexplored.
no code implementations • 24 Mar 2025 • Wenhao You, Bryan Hooi, Yiwei Wang, Youke Wang, Zong Ke, Ming-Hsuan Yang, Zi Huang, Yujun Cai
While safety mechanisms have significantly progressed in filtering harmful text inputs, MLLMs remain vulnerable to multimodal jailbreaks that exploit their cross-modal reasoning capabilities.
no code implementations • 18 Mar 2025 • Bowen Yuan, Yuxia Fu, Zijian Wang, Yadan Luo, Zi Huang
We argue that the three key properties to alleviate this performance-storage dilemma are informativeness, discriminativeness, and compressibility of the condensed data.
no code implementations • 14 Mar 2025 • Shuyang Hao, Yiwei Wang, Bryan Hooi, Jun Liu, Muhao Chen, Zi Huang, Yujun Cai
However, we identify a critical limitation: not every adversarial optimization step leads to a positive outcome, and indiscriminately accepting optimization results at each step may reduce the overall attack success rate.
1 code implementation • 13 Mar 2025 • Zecheng Zhao, Zhi Chen, Zi Huang, Shazia Sadiq, Tong Chen
Text-to-Video Retrieval (TVR) aims to match videos with corresponding textual queries, yet the continual influx of new video content poses a significant challenge for maintaining system performance over time.
no code implementations • 13 Mar 2025 • Zhi Chen, Zecheng Zhao, Jingcai Guo, Jingjing Li, Zi Huang
Zero-shot learning (ZSL) aims to recognize unseen classes without labeled training examples by leveraging class-level semantic descriptors such as attributes.
no code implementations • 11 Mar 2025 • Fengyi Zhang, Huitong Yang, Zheng Zhang, Zi Huang, Yadan Luo
Self-supervised 3D occupancy prediction offers a promising solution for understanding complex driving scenes without requiring costly 3D annotations.
no code implementations • 12 Feb 2025 • Peng-Fei Zhang, Guangdong Bai, Zi Huang
Current adversarial attacks for evaluating the robustness of vision-language pre-trained (VLP) models in multi-modal tasks suffer from limited transferability, where attacks crafted for a specific model often struggle to generalize effectively across different models, limiting their utility in assessing robustness more broadly.
no code implementations • 9 Feb 2025 • Danny Wang, Ruihong Qiu, Guangdong Bai, Zi Huang
The implicit adversarial training process employs a novel alternating optimisation framework by training: (1) a latent generative model to regularly imitate the in-distribution (ID) embeddings from an evolving GNN, and (2) a GNN encoder and an OOD detector to accurately classify ID data while increasing the energy divergence between the ID embeddings and the generative model's synthetic embeddings.
no code implementations • 5 Feb 2025 • Wenhao You, Bryan Hooi, Yiwei Wang, Euijin Choo, Ming-Hsuan Yang, Junsong Yuan, Zi Huang, Yujun Cai
Recent advancements in diffusion models have driven the growth of text-guided image editing tools, enabling precise and iterative modifications of synthesized content.
1 code implementation • 3 Feb 2025 • Zhizhen Zhang, Lei Zhu, Zhen Fang, Zi Huang, Yadan Luo
Pre-training vision-language representations on human action videos has emerged as a promising approach to reduce reliance on large-scale expert demonstrations for training embodied agents.
1 code implementation • 7 Jan 2025 • Zi Huang, Simon Denman, Akila Pemasiri, Clinton Fookes, Terrence Martin
Specifically, we investigate pre-training masked autoencoders (MAE) on baseband in-phase and quadrature (I/Q) signals from various RF domains and subsequently transfer the learned representation to the radar domain, where annotated data are limited.
1 code implementation • 6 Jan 2025 • Zi Huang, Simon Denman, Akila Pemasiri, Terrence Martin, Clinton Fookes
To address this challenge, we introduce a novel and challenging dataset for radar detection (RadDet), comprising a large corpus of radar signals occupying a wideband spectrum across diverse radar density environments and signal-to-noise ratios (SNR).
no code implementations • 27 Nov 2024 • Shuyang Hao, Bryan Hooi, Jun Liu, Kai-Wei Chang, Zi Huang, Yujun Cai
Despite inheriting security measures from underlying language models, Vision-Language Models (VLMs) may still be vulnerable to safety alignment issues.
1 code implementation • 18 Nov 2024 • Bowen Yuan, Zijian Wang, Mahsa Baktashmotlagh, Yadan Luo, Zi Huang
At the image level, we employ a palette network, a specialized neural network, to dynamically allocate colors from a reduced color space to each pixel.
no code implementations • 11 Nov 2024 • Jia Syuen Lim, Yadan Luo, Zhi Chen, Tianqi Wei, Scott Chapman, Zi Huang
In the Detection and Multi-Object Tracking of Sweet Peppers Challenge, we present Track Any Peppers (TAP) - a weakly supervised ensemble technique for sweet peppers tracking.
1 code implementation • 16 Oct 2024 • Zixin Wang, Dong Gong, Sen Wang, Zi Huang, Yadan Luo
To address these questions, we propose Token Condensation as Adaptation (TCA), a training-free adaptation method for CLIP by pruning class-irrelevant visual tokens while merging class-ambiguous tokens.
1 code implementation • 13 Sep 2024 • Zhi Chen, Tianqi Wei, Zecheng Zhao, Jia Syuen Lim, Yadan Luo, Hu Zhang, Xin Yu, Scott Chapman, Zi Huang
In modern agriculture, precise monitoring of plants and fruits is crucial for tasks such as high-throughput phenotyping and automated harvesting.
1 code implementation • 6 Sep 2024 • Tianqi Wei, Zhi Chen, Xin Yu, Scott Chapman, Paul Melloy, Zi Huang
(2) Image source: Unlike typical datasets that contain images from laboratory settings, PlantSeg primarily comprises in-the-wild plant disease images.
no code implementations • 6 Aug 2024 • Zhi Chen, Zecheng Zhao, Yadan Luo, Zi Huang
It can effectively align the target text prompt and input image within the same feature space and save substantial processing time.
no code implementations • 6 Aug 2024 • Tianqi Wei, Zhi Chen, Zi Huang, Xin Yu
Motivated by this observation, we propose an in-the-wild multimodal plant disease recognition dataset that contains not only the largest number of disease classes but also text-based descriptions for each disease.
1 code implementation • 31 Jul 2024 • Haodong Hong, Sen Wang, Zi Huang, Qi Wu, Jiajun Liu
Real-world navigation often involves dealing with unexpected obstructions such as closed doors, moved objects, and unpredictable entities.
1 code implementation • 18 Jul 2024 • Wei Jiang, Tong Chen, Guanhua Ye, Wentao Zhang, Lizhen Cui, Zi Huang, Hongzhi Yin
Moreover, due to the interval-based predictions and intermittent nature of data filing in many transportation services, the instantaneous dynamics of urban flows can hardly be captured, rendering differential equation-based continuous modeling a loose fit for this setting.
1 code implementation • 25 Jun 2024 • Hung Vinh Tran, Tong Chen, Quoc Viet Hung Nguyen, Zi Huang, Lizhen Cui, Hongzhi Yin
State-of-the-art RSs primarily depend on categorical features, which are encoded by embedding vectors, resulting in excessively large embedding tables.
no code implementations • 21 Jun 2024 • Zhuoxiao Chen, Junjie Meng, Mahsa Baktashmotlagh, Yonggang Zhang, Zi Huang, Yadan Luo
Specifically, we propose a Model Synergy (MOS) strategy that dynamically selects historical checkpoints with diverse knowledge and assembles them to best accommodate the current test batch.
1 code implementation • 21 Jun 2024 • Jia Syuen Lim, Zhuoxiao Chen, Mahsa Baktashmotlagh, Zhi Chen, Xin Yu, Zi Huang, Yadan Luo
We demonstrate the effectiveness of DiPEx through extensive class-agnostic OD and OOD-OD experiments on MS-COCO and LVIS, surpassing other prompting methods by up to 20.1% in AR and achieving a 21.3% AP improvement over SAM.
1 code implementation • 19 Jun 2024 • Zhuoxiao Chen, Zixin Wang, Yadan Luo, Sen Wang, Zi Huang
We minimize the sharpness to cultivate a flat loss landscape to ensure model resiliency to minor data variations, thereby enhancing the generalization of the adaptation process.
no code implementations • 13 Jun 2024 • Fengyi Zhang, Yadan Luo, Tianjun Zhang, Lin Zhang, Zi Huang
The field of novel-view synthesis has recently witnessed the emergence of 3D Gaussian Splatting, which represents scenes in a point-based manner and renders through rasterization.
1 code implementation • 4 Jun 2024 • Haodong Hong, Sen Wang, Zi Huang, Qi Wu, Jiajun Liu
Current Vision-and-Language Navigation (VLN) tasks mainly employ textual instructions to guide agents.
1 code implementation • 23 May 2024 • Yilun Liu, Ruihong Qiu, Zi Huang
Large-scale graphs are valuable for graph representation learning, yet the abundant data in these graphs hinders the efficiency of the training process.
1 code implementation • 20 May 2024 • Yanran Tang, Ruihong Qiu, Yilun Liu, Xue Li, Zi Huang
Specifically, an edge feature-based graph attention layer (EUGAT) is proposed to comprehensively update node and edge features during graph modelling, resulting in a full utilisation of structural information of legal cases.
no code implementations • 9 May 2024 • Peng-Fei Zhang, Zi Huang, Guangdong Bai
To this end, we propose a novel black-box method to generate Universal Adversarial Perturbations (UAPs), called the Effective and Transferable Universal Adversarial Attack (ETU), aiming to mislead a variety of existing VLP models in a range of downstream tasks.
no code implementations • 7 May 2024 • Peng-Fei Zhang, Zi Huang, Xin-Shun Xu, Guangdong Bai
We propose a hybrid adversarial training surrounding multiple potential adversarial perturbations, alongside a semi-supervised learning based on class-rebalancing sample selection to enhance the resilience of the model for dual corruption.
1 code implementation • 26 Mar 2024 • Yanran Tang, Ruihong Qiu, Hongzhi Yin, Xue Li, Zi Huang
In a case pool, there are three types of case connectivity relationships: the case reference relationship, the case semantic relationship, and the case legal charge relationship.
1 code implementation • 20 Mar 2024 • Djamahl Etchegaray, Zi Huang, Tatsuya Harada, Yadan Luo
In this work, we tackle the limitations of current LiDAR-based 3D object detection systems, which are hindered by a restricted class vocabulary and the high costs associated with annotating new object classes.
no code implementations • 29 Feb 2024 • Akila Pemasiri, Zi Huang, Fraser Williams, Ethan Goan, Simon Denman, Terrence Martin, Clinton Fookes
This paper addresses a critical preliminary step in radar signal processing: detecting the presence of a radar signal and robustly estimating its bandwidth.
1 code implementation • 22 Dec 2023 • Yilun Liu, Ruihong Qiu, Yanran Tang, Hongzhi Yin, Zi Huang
Although the CaT alleviates the catastrophic forgetting problem, there exist three issues: (1) The graph condensation only focuses on labelled nodes while neglecting abundant information carried by unlabelled nodes; (2) The continual training scheme of the CaT overemphasises the previously learned knowledge, limiting the model capacity to learn from newly added memories; (3) Both the condensation process and replaying process of the CaT are time-consuming.
1 code implementation • 18 Dec 2023 • Yanran Tang, Ruihong Qiu, Yilun Liu, Xue Li, Zi Huang
Previous neural legal case retrieval models mostly encode the unstructured raw text of case into a case representation, which causes the lack of important legal structural information in a case and leads to poor case representation; (2) Lengthy legal text limitation.
1 code implementation • 15 Dec 2023 • Zi Huang, Akila Pemasiri, Simon Denman, Clinton Fookes, Terrence Martin
Radio signal recognition is a crucial function in electronic warfare.
no code implementations • 12 Dec 2023 • Hu Zhang, Jianhua Xu, Tao Tang, Haiyang Sun, Xin Yu, Zi Huang, Kaicheng Yu
OpenSight utilizes 2D-3D geometric priors for the initial discernment and localization of generic objects, followed by a more specific semantic interpretation of the detected objects.
1 code implementation • 4 Dec 2023 • Yiyun Zhang, Zijian Wang, Yadan Luo, Xin Yu, Zi Huang
Existing Building Damage Detection (BDD) methods always require labour-intensive pixel-level annotations of buildings and their conditions, hence largely limiting their applications.
1 code implementation • 31 Oct 2023 • Zixin Wang, Yadan Luo, Liang Zheng, Zhuoxiao Chen, Sen Wang, Zi Huang
This article presents a comprehensive survey of online test-time adaptation (OTTA), focusing on effectively adapting machine learning models to distributionally different target data upon batch arrival.
1 code implementation • 26 Oct 2023 • Yudong Chen, Sen Wang, Jiajun Liu, Xuwei Xu, Frank de Hoog, Brano Kusy, Zi Huang
Interestingly, we discovered that even if the student and the teacher have the same feature dimensions, adding a projector still helps to improve the distillation performance.
1 code implementation • 16 Oct 2023 • Zhuoxiao Chen, Yadan Luo, Zixin Wang, Zijian Wang, Xin Yu, Zi Huang
This paper investigates a more practical and challenging research task: Open World Active Learning for 3D Object Detection (OWAL-3D), aimed at acquiring informative point clouds with new concepts.
no code implementations • 9 Oct 2023 • Hu Zhang, Xin Shen, Heming Du, Huiqiang Chen, Chen Liu, Hongwei Sheng, Qingzheng Xu, MD Wahiduzzaman Khan, Qingtao Yu, Tianqing Zhu, Scott Chapman, Zi Huang, Xin Yu
In the wheat nutrient deficiencies classification challenge, we present the DividE and EnseMble (DEEM) method for progressive test data predictions.
no code implementations • 6 Oct 2023 • Xiaoxiao Sun, Xingjian Leng, Zijian Wang, Yang Yang, Zi Huang, Liang Zheng
Analyzing model performance in various unseen environments is a critical research problem in the machine learning community.
3 code implementations • 18 Sep 2023 • Yilun Liu, Ruihong Qiu, Zi Huang
Recent replay-based methods intend to solve this problem by updating the model using both (1) the entire new-coming data and (2) a sampling-based memory bank that stores replayed graphs to approximate the distribution of historical data.
1 code implementation • 8 Sep 2023 • Yan Jiang, Ruihong Qiu, Yi Zhang, Zi Huang
As social media becomes increasingly popular, more and more activities related to public health emerge.
no code implementations • 20 Aug 2023 • Chen Liu, Peike Li, Hu Zhang, Lincheng Li, Zi Huang, Dadong Wang, Xin Yu
In a nutshell, our BAVS is designed to eliminate the interference of background noise or off-screen sounds in segmentation by establishing the audio-visual correspondences in an explicit manner.
1 code implementation • 6 Aug 2023 • Zixin Wang, Yadan Luo, Zhi Chen, Sen Wang, Zi Huang
The prevalence of domain adaptive semantic segmentation has prompted concerns regarding source domain data leakage, where private information from the source domain could inadvertently be exposed in the target domain.
no code implementations • 5 Aug 2023 • Cheng-Wei Ching, Chirag Gupta, Zi Huang, Liting Hu
However, the existing compressed data aggregation (CDA) frameworks (e.g., compressed sensing-based data aggregation, deep learning (DL)-based data aggregation) do not possess the flexibility and adaptivity required to handle distinct sensing tasks and environmental changes.
1 code implementation • 1 Aug 2023 • Zhi Chen, Pengfei Zhang, Jingjing Li, Sen Wang, Zi Huang
To take advantage of image augmentations while mitigating the semantic distortion issue, we propose a novel ZSL approach by Harnessing Adversarial Samples (HAS).
no code implementations • ICCV 2023 • Yadan Luo, Zhuoxiao Chen, Zhen Fang, Zheng Zhang, Zi Huang, Mahsa Baktashmotlagh
Achieving a reliable LiDAR-based object detector in autonomous driving is paramount, but its success hinges on obtaining large amounts of precise 3D annotations.
1 code implementation • ICCV 2023 • Zhuoxiao Chen, Yadan Luo, Zheng Wang, Mahsa Baktashmotlagh, Zi Huang
Unsupervised domain adaptation (DA) with the aid of pseudo labeling techniques has emerged as a crucial approach for domain-adaptive 3D object detection.
1 code implementation • 19 Jun 2023 • Zi Huang, Akila Pemasiri, Simon Denman, Clinton Fookes, Terrence Martin
Radio signal recognition is a crucial task in both civilian and military applications, as accurate and timely identification of unknown signals is an essential part of spectrum management and electronic warfare.
no code implementations • 1 May 2023 • Xuhui Ren, Tong Chen, Quoc Viet Hung Nguyen, Lizhen Cui, Zi Huang, Hongzhi Yin
Recent conversational recommender systems (CRSs) tackle those limitations by enabling recommender systems to interact with the user to obtain her/his current preference through a sequence of clarifying questions.
no code implementations • 10 Feb 2023 • Liang Qu, Ningzhi Tang, Ruiqi Zheng, Quoc Viet Hung Nguyen, Zi Huang, Yuhui Shi, Hongzhi Yin
In light of this, we propose a semi-decentralized federated ego graph learning framework for on-device recommendations, named SemiDFEGL, which introduces new device-to-device collaborations to improve scalability and reduce communication costs and innovatively utilizes predicted interacted item nodes to connect isolated ego graphs to augment local subgraphs such that the high-order user-item collaborative information could be used in a privacy-preserving manner.
1 code implementation • 23 Jan 2023 • Yadan Luo, Zhuoxiao Chen, Zijian Wang, Xin Yu, Zi Huang, Mahsa Baktashmotlagh
To alleviate the high annotation cost in LiDAR-based 3D object detection, active learning is a promising solution that learns to select only a small portion of unlabeled data to annotate, without compromising model performance.
no code implementations • CVPR 2023 • Heming Du, Lincheng Li, Zi Huang, Xin Yu
In HiNL, we propose a History-aware State Estimation (HaSE) module to alleviate the impacts of dominant historical states on the current state estimation.
no code implementations • ICCV 2023 • Zijian Wang, Yadan Luo, Liang Zheng, Zi Huang, Mahsa Baktashmotlagh
This paper focuses on model transferability estimation, i.e., assessing the performance of pre-trained models on a downstream task without performing fine-tuning.
1 code implementation • 27 Oct 2022 • Yudong Chen, Sen Wang, Jiajun Liu, Xuwei Xu, Frank de Hoog, Zi Huang
Motivated by the positive effect of the projector in feature distillation, we propose an ensemble of projectors to further improve the quality of student features.
no code implementations • 22 Oct 2022 • Yang Li, Tong Chen, Peng-Fei Zhang, Zi Huang, Hongzhi Yin
In order to counteract the scarcity and incompleteness of POI check-ins, we propose a novel self-supervised learning paradigm in \ssgrec, where the trajectory representations are contrastively learned from two augmented views on geolocations and temporal transitions.
1 code implementation • 8 Sep 2022 • Ruihong Qiu, Zi Huang, Hongzhi Yin
In this paper, we propose the Overparameterised Recommender (OverRec), which utilises a recurrent neural tangent kernel (RNTK) as a similarity measurement for user sequences to successfully bypass the restriction of hardware for huge models.
no code implementations • 5 Sep 2022 • Zhi Chen, Yadan Luo, Sen Wang, Jingjing Li, Zi Huang
We identify two key challenges in our FedZSL protocol: 1) the trained models are prone to be biased to the locally observed classes, thus failing to generalize to the unseen classes and/or seen classes that appear on other devices; 2) as each category in the training data comes from a single source, the central model is highly vulnerable to model replacement (backdoor) attacks.
1 code implementation • 11 Jul 2022 • Zixin Wang, Yadan Luo, Peng-Fei Zhang, Sen Wang, Zi Huang
A typical multi-source domain adaptation (MSDA) approach aims to transfer knowledge learned from a set of labeled source domains, to an unlabeled target domain.
no code implementations • 5 Jul 2022 • Zhi Chen, Yadan Luo, Sen Wang, Jingjing Li, Zi Huang
To address this issue, we propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
1 code implementation • 29 Mar 2022 • Junliang Yu, Hongzhi Yin, Xin Xia, Tong Chen, Jundong Li, Zi Huang
In recent years, neural architecture-based recommender systems have achieved tremendous success, but they still fall short of expectation when dealing with highly sparse data.
2 code implementations • 13 Feb 2022 • Yadan Luo, Zijian Wang, Zhuoxiao Chen, Zi Huang, Mahsa Baktashmotlagh
However, most existing OSDA approaches are limited for three main reasons: (1) the lack of essential theoretical analysis of generalization bound, (2) the reliance on the coexistence of source and target data during adaptation, and (3) the failure to accurately estimate the uncertainty of model predictions.
no code implementations • 8 Jan 2022 • Mubashir Imran, Hongzhi Yin, Tong Chen, Zi Huang, Kai Zheng
Such heterogeneous network embedding (HNE) methods effectively harness the heterogeneity of small-scale HINs.
no code implementations • 13 Dec 2021 • Yudi Li, Min Tang, Yun Yang, Zi Huang, Ruofeng Tong, Shuangcai Yang, Yao Li, Dinesh Manocha
We present a novel mesh-based learning approach (N-Cloth) for plausible 3D cloth deformation prediction.
no code implementations • 21 Oct 2021 • Shijie Zhang, Hongzhi Yin, Tong Chen, Zi Huang, Quoc Viet Hung Nguyen, Lizhen Cui
Evaluations on two real-world datasets show that 1) our attack model significantly boosts the exposure rate of the target item in a stealthy way, without harming the accuracy of the poisoned recommender; and 2) existing defenses are not effective enough, highlighting the need for new defenses against our local model poisoning attacks to federated recommender systems.
1 code implementation • 13 Oct 2021 • Fuming You, Jingjing Li, Lei Zhu, Ke Lu, Zhi Chen, Zi Huang
To address these problems, we investigate domain adaptive semantic segmentation without source data, which assumes that the model is pre-trained on the source domain and then adapted to the target domain without any further access to source data.
2 code implementations • 12 Oct 2021 • Ruihong Qiu, Zi Huang, Hongzhi Yin, Zijian Wang
In this paper, both empirical and theoretical investigations of this representation degeneration problem are first provided, based on which a novel recommender model DuoRec is proposed to improve the item embeddings distribution.
2 code implementations • 1 Sep 2021 • Ruihong Qiu, Zi Huang, Hongzhi Yin
In this paper, we propose a novel sequential recommendation framework to overcome these challenges based on a memory augmented multi-instance contrastive predictive coding scheme, denoted as MMInfoRec.
1 code implementation • ICCV 2021 • Zijian Wang, Yadan Luo, Ruihong Qiu, Zi Huang, Mahsa Baktashmotlagh
Domain generalization (DG) aims to generalize a model trained on multiple source (i.e., training) domains to a distributionally different target (i.e., test) domain.
Ranked #7 on Single-Source Domain Generalization on Digits-five
1 code implementation • 7 Jul 2021 • Zhi Chen, Yadan Luo, Sen Wang, Ruihong Qiu, Jingjing Li, Zi Huang
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training.
1 code implementation • 6 Jul 2021 • Ruihong Qiu, Sen Wang, Zhi Chen, Hongzhi Yin, Zi Huang
Existing visually-aware models make use of the visual features as a separate collaborative signal similarly to other features to directly predict the user's preference without considering a potential bias, which gives rise to a visually biased recommendation.
no code implementations • 4 Jul 2021 • Yang Li, Tong Chen, Zi Huang
As a result, this creates a severe bottleneck when we are trying to advance the recommendation accuracy and generate fine-grained explanations, since the explicit attributes have only loose connections to the actual recommendation process.
no code implementations • 2 Jul 2021 • Ruihong Qiu, Zi Huang, Tong Chen, Hongzhi Yin
According to our analysis, existing positional encoding schemes are generally forward-aware only, which can hardly represent the dynamics of the intention in a session.
no code implementations • 2 Jul 2021 • Ruihong Qiu, Zi Huang, Jingjing Li, Hongzhi Yin
Different from the traditional recommender system, the session-based recommender system introduces the concept of the session, i.e., a sequence of interactions between a user and multiple items within a period, to preserve the user's recent interest.
no code implementations • 30 Jun 2021 • Yang Li, Tong Chen, Yadan Luo, Hongzhi Yin, Zi Huang
Furthermore, the sparse POI-POI transitions restrict the ability of a model to learn effective sequential patterns for recommendation.
no code implementations • 4 Jun 2021 • Tong Chen, Hongzhi Yin, Yujia Zheng, Zi Huang, Yang Wang, Meng Wang
The core idea is to compose elastic embeddings for each item, where an elastic embedding is the concatenation of a set of embedding blocks that are carefully chosen by an automated search function.
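As a rough illustration of the composition step described in this abstract, the sketch below concatenates a chosen subset of embedding blocks into one item embedding. It is only a minimal sketch: the function name `compose_elastic_embedding`, the block layout, and the hand-picked block indices are hypothetical, and the automated search function that selects blocks in the paper is not modeled here.

```python
from typing import List, Sequence


def compose_elastic_embedding(
    blocks: Sequence[Sequence[float]],
    chosen: Sequence[int],
) -> List[float]:
    """Concatenate the chosen embedding blocks, in the given order.

    `blocks` holds the candidate embedding blocks for one item and
    `chosen` holds the indices picked for that item. In the paper the
    indices come from an automated search; here they are supplied by hand.
    """
    embedding: List[float] = []
    for index in chosen:
        # Append every value of the selected block to the item embedding.
        embedding.extend(blocks[index])
    return embedding


# Hypothetical candidate blocks of different widths for a single item.
candidate_blocks = [[1.0, 2.0], [3.0], [4.0, 5.0]]
item_embedding = compose_elastic_embedding(candidate_blocks, chosen=[0, 2])
```

Because blocks can differ in width, the resulting embedding size varies per item, which is the "elastic" property the abstract refers to.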
no code implementations • 11 May 2021 • Xuhui Ren, Hongzhi Yin, Tong Chen, Hao Wang, Zi Huang, Kai Zheng
Hence, the ability to generate suitable clarifying questions is the key to timely tracing users' dynamic preferences and achieving successful recommendations.
no code implementations • 5 Apr 2021 • Tong Chen, Hongzhi Yin, Xiangliang Zhang, Zi Huang, Yang Wang, Meng Wang
As a well-established approach, factorization machine (FM) is capable of automatically learning high-order interactions among features to make predictions without the need for manual feature engineering.
no code implementations • 4 Apr 2021 • Tong Chen, Hongzhi Yin, Jie Ren, Zi Huang, Xiangliang Zhang, Hao Wang
In WIDEN, we propose a novel inductive, meta path-free message passing scheme that packs up heterogeneous node features with their associated edges from both low- and high-order neighbor nodes.
no code implementations • 23 Feb 2021 • Ziwei Wang, Yadan Luo, Zi Huang
In this work, we explicitly build a Modality Transition Module (MTM) to transfer visual features into semantic representations before forwarding them to the language model.
no code implementations • 29 Jan 2021 • Shijie Zhang, Hongzhi Yin, Tong Chen, Zi Huang, Lizhen Cui, Xiangliang Zhang
Specifically, in GERAI, we bind the information perturbation mechanism in differential privacy with the recommendation capability of graph convolutional networks.
1 code implementation • ICCV 2021 • Zhi Chen, Yadan Luo, Ruihong Qiu, Sen Wang, Zi Huang, Jingjing Li, Zheng Zhang
Generalized zero-shot learning (GZSL) aims to classify samples under the assumption that some classes are not observable during training.
no code implementations • 9 Jan 2021 • Zhi Chen, Zi Huang, Jingjing Li, Zheng Zhang
To address these issues, in this paper, we propose a novel framework that leverages dual variational autoencoders with a triplet loss to learn discriminative latent features and applies the entropy-based calibration to minimize the uncertainty in the overlapped area between the seen and unseen classes.
1 code implementation • 25 Nov 2020 • Yadan Luo, Zi Huang, Hongxu Chen, Yang Yang, Mahsa Baktashmotlagh
Most of the prior efforts are devoted to learning node embeddings with graph neural networks (GNNs), which preserve the signed network topology by message-passing along edges to facilitate the downstream link prediction task.
1 code implementation • 31 Jul 2020 • Yadan Luo, Zi Huang, Zijian Wang, Zheng Zhang, Mahsa Baktashmotlagh
To further enhance the model capacity and test the robustness of the proposed architecture on difficult transfer tasks, we extend our model to work in a semi-supervised setting using an additional video-level bipartite graph.
Ranked #3 on Domain Adaptation on HMDB --> UCF (full)
no code implementations • 27 Jul 2020 • Zhi Chen, Sen Wang, Jingjing Li, Zi Huang
A voting strategy averages the probability distributions output from the classifiers and, given that some patches are more discriminative than others, a discrimination-based attention mechanism helps to weight each patch accordingly.
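The weighted voting described in this abstract can be sketched as follows. This is an assumption-laden illustration, not the authors' implementation: the function name `attention_weighted_vote` is hypothetical, and the per-patch discrimination scores are assumed to be turned into weights with a softmax, which the abstract does not specify.

```python
import math
from typing import List, Sequence


def attention_weighted_vote(
    patch_probs: Sequence[Sequence[float]],
    patch_scores: Sequence[float],
) -> List[float]:
    """Fuse per-patch class distributions into a single distribution.

    `patch_probs[i]` is the class distribution predicted for patch i and
    `patch_scores[i]` is a scalar discrimination score for that patch.
    Scores are normalised with a softmax (an assumption of this sketch),
    so equal scores reduce to a plain average of the distributions.
    """
    # Numerically stable softmax over the patch scores.
    top = max(patch_scores)
    exps = [math.exp(score - top) for score in patch_scores]
    total = sum(exps)
    weights = [value / total for value in exps]

    # Weighted average of the per-patch probabilities for each class.
    num_classes = len(patch_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, patch_probs))
        for c in range(num_classes)
    ]


# Two hypothetical patches voting over two classes with equal scores.
fused = attention_weighted_vote([[0.8, 0.2], [0.4, 0.6]], [0.0, 0.0])
```

Raising one patch's score shifts the fused distribution toward that patch's prediction, which is the role the abstract assigns to the attention mechanism.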
1 code implementation • 6 Jul 2020 • Ruihong Qiu, Hongzhi Yin, Zi Huang, Tong Chen
On one hand, when a new session arrives, a session graph with a global attribute is constructed based on the current session and its associated user.
1 code implementation • ICML 2020 • Yadan Luo, Zijian Wang, Zi Huang, Mahsa Baktashmotlagh
The existing domain adaptation approaches which tackle this problem work in the closed-set setting with the assumption that the source and the target data share exactly the same classes of objects.
no code implementations • 15 Jun 2020 • Ziwei Wang, Zi Huang, Yadan Luo, Huimin Lu
With the rapid advancement of image captioning and visual question answering at single-round level, the question of how to generate multi-round dialogue about visual content has not yet been well explored. Existing visual dialogue methods encode the image into a fixed feature vector directly, concatenated with the question and history embeddings to predict the response. Some recent methods tackle the co-reference resolution problem using a co-attention mechanism to cross-refer relevant elements from the image, history, and the target question. However, it remains challenging to reason about visual relationships, since the fine-grained object-level information is omitted before co-attentive reasoning.
1 code implementation • 20 May 2020 • Shijie Zhang, Hongzhi Yin, Tong Chen, Quoc Viet Nguyen Hung, Zi Huang, Lizhen Cui
Therefore, it is of great practical significance to construct a robust recommender system that is able to generate stable recommendations even in the presence of shilling attacks.
no code implementations • 19 May 2020 • Tong Chen, Hongzhi Yin, Guanhua Ye, Zi Huang, Yang Wang, Meng Wang
Then, by treating attributes as the bridge between users and items, we can thoroughly model the user-item preferences (i.e., personalization) and item-item relationships (i.e., substitution) for recommendation.
no code implementations • 16 Apr 2020 • Shaoxiong Ji, Xue Li, Zi Huang, Erik Cambria
Mental health is a critical issue in modern society, and mental disorders could sometimes turn to suicidal ideation without effective treatment.
no code implementations • 5 Apr 2020 • Junliang Yu, Hongzhi Yin, Jundong Li, Min Gao, Zi Huang, Lizhen Cui
Social recommender systems are expected to improve recommendation quality by incorporating social information when there is little user-item interaction data.
1 code implementation • 27 Nov 2019 • Ruihong Qiu, Jingjing Li, Zi Huang, Hongzhi Yin
In this paper, therefore, we study the item transition pattern by constructing a session graph and propose a novel model which collaboratively considers the sequence order and the latent order in the session graph for a session-based recommender system.
no code implementations • 12 Nov 2019 • Yadan Luo, Zi Huang, Zheng Zhang, Ziwei Wang, Mahsa Baktashmotlagh, Yang Yang
Meta-learning for few-shot learning allows a machine to leverage previously acquired knowledge as a prior, thus improving the performance on novel tasks with only small amounts of data.
no code implementations • 5 Nov 2019 • Zijian Wang, Zheng Zhang, Yadan Luo, Zi Huang
Existing deep hashing approaches fail to fully explore semantic correlations and neglect the effect of linguistic context on visual attention learning, leading to inferior performance.
no code implementations • 23 Oct 2019 • Shaoxiong Ji, Shirui Pan, Xue Li, Erik Cambria, Guodong Long, Zi Huang
Suicide is a critical issue in modern society.
no code implementations • 21 Sep 2019 • Zhi Chen, Jingjing Li, Yadan Luo, Zi Huang, Yang Yang
Thus, a multi-modal cycle-consistency loss between the synthesized semantic representations and the ground truth can be learned and leveraged to enforce the generated semantic features to approximate the real distribution in semantic space.
1 code implementation • 17 Sep 2019 • Jingjing Li, Mengmeng Jing, Ke Lu, Lei Zhu, Yang Yang, Zi Huang
An inevitable issue with such a paradigm is that the synthesized unseen features are biased toward seen references and incapable of reflecting the novelty and diversity of real unseen instances.
1 code implementation • 17 Sep 2019 • Jingjing Li, Erpeng Chen, Zhengming Ding, Lei Zhu, Ke Lu, Zi Huang
Domain adaptation investigates the problem of cross-domain knowledge transfer where the labeled source domain and unlabeled target domain have distinctive data distributions.
Ranked #4 on Domain Adaptation on USPS-to-MNIST
no code implementations • 1 Aug 2019 • Yadan Luo, Zi Huang, Zheng Zhang, Ziwei Wang, Jingjing Li, Yang Yang
Visual paragraph generation aims to automatically describe a given image from different perspectives and organize sentences in a coherent way.
no code implementations • 11 Jul 2019 • Jingjing Li, Mengmeng Jing, Yue Xie, Ke Lu, Zi Huang
Because of the distribution shifts, different target samples have distinct degrees of difficulty in adaptation.
1 code implementation • 20 Jun 2019 • Jingjing Li, Mengmeng Jing, Ke Lu, Lei Zhu, Yang Yang, Zi Huang
This work, for the first time, formulates CSR as a ZSL problem, and a tailor-made ZSL method is proposed to handle CSR.
no code implementations • 25 Apr 2019 • Lei Zhu, Zi Huang, Zhihui Li, Liang Xie, Heng Tao Shen
To address the problem, in this paper, we propose a novel hashing approach, dubbed Discrete Semantic Transfer Hashing (DSTH).
1 code implementation • CVPR 2019 • Jingjing Li, Mengmeng Jing, Ke Lu, Zhengming Ding, Lei Zhu, Zi Huang
In this paper, we take advantage of generative adversarial networks (GANs) and propose a novel method, named leveraging invariant side GAN (LisGAN), which can directly generate the unseen features from random noise conditioned on the semantic descriptions.
Ranked #5 on Generalized Zero-Shot Learning on SUN Attribute
no code implementations • 5 Apr 2019 • Yadan Luo, Ziwei Wang, Zi Huang, Yang Yang, Huimin Lu
With the increasing number of online stores, there is a pressing need for intelligent search systems to understand the item photos snapped by customers and search against large-scale product databases to find their desired items.
no code implementations • 3 Apr 2019 • Zheng Zhang, Guo-Sen Xie, Yang Li, Sheng Li, Zi Huang
Due to its low storage cost and fast query speed, hashing is recognized as an effective way to accomplish similarity search in large-scale multimedia retrieval applications.
4 code implementations • 17 Dec 2018 • Shaoxiong Ji, Shirui Pan, Guodong Long, Xue Li, Jing Jiang, Zi Huang
Federated learning (FL) provides a promising approach to learning private language modeling for intelligent personalized keyboard suggestion by training models in distributed clients rather than training in a central server.
no code implementations • ACM International Conference on Multimedia 2018 • Ziwei Wang, Yadan Luo, Yang Li, Zi Huang, Hongzhi Yin
Existing image paragraph captioning methods give a series of sentences to represent the objects and regions of interest, where the descriptions are essentially generated by feeding the image fragments containing objects and regions into conventional single-sentence image captioning models.
1 code implementation • 25 Sep 2018 • Yadan Luo, Zi Huang, Yang Li, Fumin Shen, Yang Yang, Peng Cui
Hashing techniques are in great demand for a wide range of real-world applications such as image retrieval and network compression.
no code implementations • 22 Aug 2018 • Yadan Luo, Ziwei Wang, Zi Huang, Yang Yang, Cong Zhao
Rich high-quality annotated data is critical for semantic segmentation learning, yet acquiring dense and pixel-wise ground-truth is both labor- and time-consuming.
no code implementations • ICCV 2017 • Chao Li, Jiewei Cao, Zi Huang, Lei Zhu, Heng Tao Shen
In this paper, we propose a novel approach to automatically maximize the utility of weak semantic annotations (formalized as the semantic relevance of video shots to the target event) to facilitate video event classification.
no code implementations • 13 Jul 2017 • Lei Zhu, Zi Huang, Xiaobai Liu, Xiangnan He, Jingkuan Song, Xiaofang Zhou
Finally, compact binary codes are learned on the intermediate representation within a tailored discrete binary embedding model, which preserves the visual relations of images measured with canonical views and removes the noise involved.
no code implementations • CVPR 2017 • Peng Wang, Lingqiao Liu, Chunhua Shen, Zi Huang, Anton Van Den Hengel, Heng Tao Shen
One-shot learning is a challenging problem where the aim is to recognize a class identified by a single training image.
no code implementations • 17 Jan 2017 • Hongyun Cai, Vincent W. Zheng, Fanwei Zhu, Kevin Chen-Chuan Chang, Zi Huang
Most existing community-related studies focus on detection, which aims to find the community membership of each user from user friendship links.
no code implementations • 6 Dec 2016 • Ruicong Xu, Yang Yang, Yadan Luo, Fumin Shen, Zi Huang, Heng Tao Shen
The first approach, termed Inner-product Binary Coding (IBC), preserves the inner relationships of images and videos in a common Hamming space.
no code implementations • 22 Jun 2016 • Jiewei Cao, Lingqiao Liu, Peng Wang, Zi Huang, Chunhua Shen, Heng Tao Shen
Instance retrieval requires one to search for images that contain a particular object within a large corpus.
no code implementations • 15 Jun 2016 • Yi Bin, Yang Yang, Zi Huang, Fumin Shen, Xing Xu, Heng Tao Shen
Video captioning has been attracting broad research attention in the multimedia community.
no code implementations • CVPR 2016 • Peng Wang, Lingqiao Liu, Chunhua Shen, Zi Huang, Anton Van Den Hengel, Heng Tao Shen
The key observation motivating our approach is that "regular object" images, "unusual object" images and "other objects" images exhibit different region-level scores in terms of both the score values and the spatial distributions.