no code implementations • 20 Mar 2024 • Djamahl Etchegaray, Zi Huang, Tatsuya Harada, Yadan Luo
In this work, we tackle the limitations of current LiDAR-based 3D object detection systems, which are hindered by a restricted class vocabulary and the high costs associated with annotating new object classes.
no code implementations • 27 Feb 2024 • Bo Peng, Yadan Luo, Yonggang Zhang, Yixuan Li, Zhen Fang
Extensive experiments across OOD detection benchmarks empirically demonstrate that our proposed ConjNorm has established a new state-of-the-art in a variety of OOD detection setups, outperforming the current best method by up to 13.25% and 28.19% (FPR95) on CIFAR-100 and ImageNet-1K, respectively.
no code implementations • 12 Jan 2024 • Ziying Song, Lin Liu, Feiyang Jia, Yadan Luo, Guoxin Zhang, Lei Yang, Li Wang, Caiyan Jia
In the realm of modern autonomous driving, the perception system is indispensable for accurately assessing the state of the surrounding environment, thereby enabling informed prediction and planning.
1 code implementation • 4 Dec 2023 • Yiyun Zhang, Zijian Wang, Yadan Luo, Xin Yu, Zi Huang
Existing Building Damage Detection (BDD) methods always require labour-intensive pixel-level annotations of buildings and their conditions, hence largely limiting their applications.
1 code implementation • 31 Oct 2023 • Zixin Wang, Yadan Luo, Liang Zheng, Zhuoxiao Chen, Sen Wang, Zi Huang
In this paper, we present a comprehensive survey on online test-time adaptation (OTTA), a paradigm focused on adapting machine learning models to novel data distributions upon batch arrival.
1 code implementation • 16 Oct 2023 • Zhuoxiao Chen, Yadan Luo, Zixin Wang, Zijian Wang, Xin Yu, Zi Huang
To seek effective solutions, we investigate a more practical yet challenging research task: Open World Active Learning for 3D Object Detection (OWAL-3D), aiming at selecting a small number of 3D boxes to annotate while maximizing detection performance on both known and unknown classes.
1 code implementation • 6 Aug 2023 • Zixin Wang, Yadan Luo, Zhi Chen, Sen Wang, Zi Huang
The prevalence of domain adaptive semantic segmentation has prompted concerns regarding source domain data leakage, where private information from the source domain could inadvertently be exposed in the target domain.
no code implementations • ICCV 2023 • Yadan Luo, Zhuoxiao Chen, Zhen Fang, Zheng Zhang, Zi Huang, Mahsa Baktashmotlagh
Achieving a reliable LiDAR-based object detector in autonomous driving is paramount, but its success hinges on obtaining large amounts of precise 3D annotations.
1 code implementation • ICCV 2023 • Zhuoxiao Chen, Yadan Luo, Zheng Wang, Mahsa Baktashmotlagh, Zi Huang
Unsupervised domain adaptation (DA) with the aid of pseudo labeling techniques has emerged as a crucial approach for domain-adaptive 3D object detection.
1 code implementation • 23 Jan 2023 • Yadan Luo, Zhuoxiao Chen, Zijian Wang, Xin Yu, Zi Huang, Mahsa Baktashmotlagh
To alleviate the high annotation cost in LiDAR-based 3D object detection, active learning is a promising solution that learns to select only a small portion of unlabeled data to annotate, without compromising model performance.
no code implementations • ICCV 2023 • Zijian Wang, Yadan Luo, Liang Zheng, Zi Huang, Mahsa Baktashmotlagh
This paper focuses on model transferability estimation, i.e., assessing the performance of pre-trained models on a downstream task without performing fine-tuning.
no code implementations • 5 Sep 2022 • Zhi Chen, Yadan Luo, Sen Wang, Jingjing Li, Zi Huang
We identify two key challenges in our FedZSL protocol: 1) the trained models are prone to bias toward the locally observed classes, thus failing to generalize to unseen classes and/or seen classes that appear on other devices; 2) as each category in the training data comes from a single source, the central model is highly vulnerable to model replacement (backdoor) attacks.
1 code implementation • 11 Jul 2022 • Zixin Wang, Yadan Luo, Peng-Fei Zhang, Sen Wang, Zi Huang
A typical multi-source domain adaptation (MSDA) approach aims to transfer knowledge learned from a set of labeled source domains to an unlabeled target domain.
no code implementations • 5 Jul 2022 • Zhi Chen, Yadan Luo, Sen Wang, Jingjing Li, Zi Huang
To address this issue, we propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
2 code implementations • 13 Feb 2022 • Yadan Luo, Zijian Wang, Zhuoxiao Chen, Zi Huang, Mahsa Baktashmotlagh
However, most existing OSDA approaches are limited for three main reasons: (1) the lack of essential theoretical analysis of the generalization bound, (2) the reliance on the coexistence of source and target data during adaptation, and (3) the failure to accurately estimate the uncertainty of model predictions.
no code implementations • 8 Sep 2021 • Zhuoxiao Chen, Yiyun Zhang, Yadan Luo, Zijian Wang, Jinjiang Zhong, Anthony Southon
With the rapid development of intelligent detection algorithms based on deep learning, much progress has been made in automatic road defect recognition and road marking parsing.
1 code implementation • 1 Sep 2021 • Zhuoxiao Chen, Yadan Luo, Mahsa Baktashmotlagh
The majority of video domain adaptation algorithms are proposed for closed-set scenarios in which all the classes are shared among the domains.
1 code implementation • ICCV 2021 • Zijian Wang, Yadan Luo, Ruihong Qiu, Zi Huang, Mahsa Baktashmotlagh
Domain generalization (DG) aims to generalize a model trained on multiple source (i.e., training) domains to a distributionally different target (i.e., test) domain.
1 code implementation • 7 Jul 2021 • Zhi Chen, Yadan Luo, Sen Wang, Ruihong Qiu, Jingjing Li, Zi Huang
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training.
no code implementations • 30 Jun 2021 • Yang Li, Yadan Luo, Zheng Zhang, Shazia W. Sadiq, Peng Cui
It aims at suggesting the next POI to a user in spatial and temporal context, which is a practical yet challenging task in various applications.
no code implementations • 30 Jun 2021 • Yang Li, Tong Chen, Yadan Luo, Hongzhi Yin, Zi Huang
Furthermore, the sparse POI-POI transitions restrict the ability of a model to learn effective sequential patterns for recommendation.
no code implementations • 23 Feb 2021 • Ziwei Wang, Yadan Luo, Zi Huang
In this work, we explicitly build a Modality Transition Module (MTM) to transfer visual features into semantic representations before forwarding them to the language model.
1 code implementation • ICCV 2021 • Zhi Chen, Yadan Luo, Ruihong Qiu, Sen Wang, Zi Huang, Jingjing Li, Zheng Zhang
Generalized zero-shot learning (GZSL) aims to classify samples under the assumption that some classes are not observable during training.
1 code implementation • 25 Nov 2020 • Yadan Luo, Zi Huang, Hongxu Chen, Yang Yang, Mahsa Baktashmotlagh
Most of the prior efforts are devoted to learning node embeddings with graph neural networks (GNNs), which preserve the signed network topology by message-passing along edges to facilitate the downstream link prediction task.
1 code implementation • 31 Jul 2020 • Yadan Luo, Zi Huang, Zijian Wang, Zheng Zhang, Mahsa Baktashmotlagh
To further enhance the model capacity and test the robustness of the proposed architecture on difficult transfer tasks, we extend our model to work in a semi-supervised setting using an additional video-level bipartite graph.
Ranked #2 on Domain Adaptation on HMDB --> UCF (full)
1 code implementation • ICML 2020 • Yadan Luo, Zijian Wang, Zi Huang, Mahsa Baktashmotlagh
The existing domain adaptation approaches which tackle this problem work in the closed-set setting with the assumption that the source and the target data share exactly the same classes of objects.
no code implementations • 15 Jun 2020 • Ziwei Wang, Zi Huang, Yadan Luo, Huimin Lu
Despite the rapid advancement of image captioning and visual question answering at the single-round level, the question of how to generate multi-round dialogue about visual content has not yet been well explored. Existing visual dialogue methods encode the image directly into a fixed feature vector, which is concatenated with the question and history embeddings to predict the response. Some recent methods tackle the co-reference resolution problem using a co-attention mechanism to cross-refer relevant elements from the image, history, and the target question. However, it remains challenging to reason about visual relationships, since the fine-grained object-level information is omitted before co-attentive reasoning.
no code implementations • 12 Nov 2019 • Yadan Luo, Zi Huang, Zheng Zhang, Ziwei Wang, Mahsa Baktashmotlagh, Yang Yang
Meta-learning for few-shot learning allows a machine to leverage previously acquired knowledge as a prior, thus improving the performance on novel tasks with only small amounts of data.
no code implementations • 5 Nov 2019 • Zijian Wang, Zheng Zhang, Yadan Luo, Zi Huang
Existing deep hashing approaches fail to fully explore semantic correlations and neglect the effect of linguistic context on visual attention learning, leading to inferior performance.
no code implementations • 21 Sep 2019 • Zhi Chen, Jingjing Li, Yadan Luo, Zi Huang, Yang Yang
Thus, a multi-modal cycle-consistency loss between the synthesized semantic representations and the ground truth can be learned and leveraged to enforce the generated semantic features to approximate the real distribution in semantic space.
no code implementations • 1 Aug 2019 • Yadan Luo, Zi Huang, Zheng Zhang, Ziwei Wang, Jingjing Li, Yang Yang
Visual paragraph generation aims to automatically describe a given image from different perspectives and organize sentences in a coherent way.
no code implementations • 5 Apr 2019 • Yadan Luo, Ziwei Wang, Zi Huang, Yang Yang, Huimin Lu
With the increasing number of online stores, there is a pressing need for intelligent search systems to understand the item photos snapped by customers and search against large-scale product databases to find their desired items.
no code implementations • ACM International Conference on Multimedia 2018 • Ziwei Wang, Yadan Luo, Yang Li, Zi Huang, Hongzhi Yin
Existing image paragraph captioning methods give a series of sentences to represent the objects and regions of interest, where the descriptions are essentially generated by feeding the image fragments containing objects and regions into conventional single-sentence image captioning models.
1 code implementation • 25 Sep 2018 • Yadan Luo, Zi Huang, Yang Li, Fumin Shen, Yang Yang, Peng Cui
Hashing techniques are in great demand for a wide range of real-world applications such as image retrieval and network compression.
no code implementations • 22 Aug 2018 • Yadan Luo, Ziwei Wang, Zi Huang, Yang Yang, Cong Zhao
Rich, high-quality annotated data is critical for semantic segmentation learning, yet acquiring dense, pixel-wise ground truth is both labor-intensive and time-consuming.
no code implementations • 6 Dec 2016 • Ruicong Xu, Yang Yang, Yadan Luo, Fumin Shen, Zi Huang, Heng Tao Shen
The first approach, termed Inner-product Binary Coding (IBC), preserves the inner relationships of images and videos in a common Hamming space.
no code implementations • 16 Jun 2016 • Yang Yang, Wei-Lun Chen, Yadan Luo, Fumin Shen, Jie Shao, Heng Tao Shen
Supervised knowledge (e.g., semantic labels or pair-wise relationships) associated with data is capable of significantly improving the quality of hash codes and hash functions.