no code implementations • 8 Jul 2024 • Zhenyu Wang, Aoxue Li, Zhenguo Li, Xihui Liu
For a complex problem, the MLLM agent decomposes it into simpler sub-problems and constructs a tree structure to systematically plan the procedure of generation, editing, and self-correction with step-by-step verification.
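The tree-structured plan with step-by-step verification can be sketched roughly as follows; this is a minimal illustration, not the paper's implementation, and `generate` / `verify` are hypothetical stand-ins for the MLLM calls:

```python
from dataclasses import dataclass, field

@dataclass
class PlanNode:
    """One node in the plan tree: a sub-problem plus its children."""
    task: str
    children: list["PlanNode"] = field(default_factory=list)
    verified: bool = False

def solve(node: PlanNode, generate, verify) -> bool:
    """Depth-first traversal: solve every sub-problem first, then run the
    generation/editing step for this node and verify it before returning."""
    for child in node.children:
        if not solve(child, generate, verify):
            return False  # a failed sub-step aborts (self-correction would retry here)
    generate(node.task)                # generation or editing step
    node.verified = verify(node.task)  # step-by-step verification
    return node.verified
```

Under this sketch, the root task only counts as solved once every leaf has been generated and verified in order.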
no code implementations • 24 May 2024 • Aoxue Li, Mingyang Yi, Zhenguo Li
This is followed by a fusion process that carefully integrates the intermediate (hidden) states of the source image (obtained by inversion) with those of the target image.
no code implementations • 24 May 2024 • Mingyang Yi, Aoxue Li, Yi Xin, Zhenguo Li
We conclude that in the earlier generation stage, the image is mostly decided by the special token [EOS] in the text prompt, and the information in the text prompt is already conveyed in this stage.
no code implementations • 14 Mar 2024 • Zhao Wang, Aoxue Li, Fengwei Zhou, Zhenguo Li, Qi Dou
Without using knowledge distillation, ensemble models, or extra training data during detector training, our proposed MIC outperforms previous SOTA methods trained with these complex techniques on LVIS.
no code implementations • 14 Mar 2024 • Zhao Wang, Aoxue Li, Zhenguo Li, Qi Dou
Given this zoo, we adopt 7 target datasets from 5 diverse domains as the downstream target tasks for evaluation.
no code implementations • 28 Jan 2024 • Zhenyu Wang, Enze Xie, Aoxue Li, Zhongdao Wang, Xihui Liu, Zhenguo Li
Given a complex text prompt containing multiple concepts including objects, attributes, and relationships, the LLM agent initially decomposes it, which entails the extraction of individual objects, their associated attributes, and the prediction of a coherent scene layout.
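The decomposition step can be illustrated with a small sketch of the kind of structured output the LLM agent might produce; the `SceneObject` type, the hand-written parse, and the normalized layout boxes are assumptions for illustration only:

```python
from dataclasses import dataclass

@dataclass
class SceneObject:
    """One extracted concept: an object, its attributes, and a layout box."""
    name: str
    attributes: list[str]
    box: tuple[float, float, float, float]  # normalized (x, y, w, h)

def decompose(prompt: str) -> list[SceneObject]:
    """Stand-in for the LLM call: returns a fixed parse of one example
    prompt, showing the shape of the decomposed scene layout."""
    # For "a red cube on top of a blue sphere", the agent would extract
    # two objects with their attributes and predict a coherent layout.
    return [
        SceneObject("cube", ["red"], (0.3, 0.1, 0.4, 0.4)),
        SceneObject("sphere", ["blue"], (0.3, 0.5, 0.4, 0.4)),
    ]
```

A downstream layout-to-image module could then condition generation on each box and attribute list separately.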
no code implementations • 18 Jan 2024 • Zhao Wang, Aoxue Li, Lingting Zhu, Yong Guo, Qi Dou, Zhenguo Li
Customized text-to-video generation aims to generate high-quality videos guided by text prompts and subject references.
no code implementations • 19 Dec 2023 • Yunhao Gou, Zhili Liu, Kai Chen, Lanqing Hong, Hang Xu, Aoxue Li, Dit-yan Yeung, James T. Kwok, Yu Zhang
Instruction tuning of Large Vision-language Models (LVLMs) has revolutionized the development of versatile models with zero-shot generalization across a wide range of downstream vision-language tasks.
3 code implementations • ICCV 2023 • Haiyang Wang, Hao Tang, Shaoshuai Shi, Aoxue Li, Zhenguo Li, Bernt Schiele, LiWei Wang
Jointly processing information from multiple sensors is crucial to achieving accurate and robust perception for reliable autonomous driving systems.
Ranked #8 on 3D Object Detection on nuScenes
no code implementations • CVPR 2023 • Hao Yang, Lanqing Hong, Aoxue Li, Tianyang Hu, Zhenguo Li, Gim Hee Lee, LiWei Wang
In this work, we first investigate the effects of synthetic data in synthetic-to-real novel view synthesis and surprisingly observe that models trained with synthetic data tend to produce sharper but less accurate volume densities.
1 code implementation • 9 Oct 2022 • Haiyang Wang, Lihe Ding, Shaocong Dong, Shaoshuai Shi, Aoxue Li, Jianan Li, Zhenguo Li, LiWei Wang
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
Ranked #1 on 3D Object Detection on SUN-RGBD
no code implementations • 23 Apr 2022 • Pengzhou Cheng, Mu Han, Aoxue Li, Fengwei Zhang
To address these limitations, we present a novel model for automotive intrusion detection by spatial-temporal correlation features of in-vehicle communication traffic (STC-IDS).
no code implementations • CVPR 2022 • Aoxue Li, Peng Yuan, Zhenguo Li
Semi-supervised object detection (SSOD) aims to improve the generalization ability of object detectors with large-scale unlabeled images.
no code implementations • CVPR 2021 • Aoxue Li, Zhenguo Li
To this end, we propose a simple yet effective Transformation Invariant Principle (TIP) that can be flexibly applied to various meta-learning models for boosting the detection performance on novel class objects.
1 code implementation • CVPR 2021 • Hanzhe Hu, Shuai Bai, Aoxue Li, Jinshi Cui, LiWei Wang
In this work, aiming to fully exploit features of annotated novel objects and capture fine-grained features of query objects, we propose Dense Relation Distillation with Context-aware Aggregation (DCNet) to tackle the few-shot detection problem.
no code implementations • CVPR 2020 • Aoxue Li, Weiran Huang, Xu Lan, Jiashi Feng, Zhenguo Li, Li-Wei Wang
Few-shot learning (FSL) has attracted increasing attention in recent years but remains challenging, due to the intrinsic difficulty in learning to generalize from a few examples.
Ranked #1 on Few-Shot Image Classification on ImageNet (1-shot)
2 code implementations • ICCV 2019 • Tiange Luo, Aoxue Li, Tao Xiang, Weiran Huang, Li-Wei Wang
In this paper, we propose to tackle the challenging few-shot learning (FSL) problem by learning global class representations using both base and novel class training samples.
1 code implementation • CVPR 2019 • Aoxue Li, Tiange Luo, Zhiwu Lu, Tao Xiang, Liwei Wang
Recently, large-scale few-shot learning (FSL) has become topical.
no code implementations • 19 Oct 2018 • Zhiwu Lu, Jiechao Guan, Aoxue Li, Tao Xiang, An Zhao, Ji-Rong Wen
Specifically, we assume that each synthesised data point can belong to any unseen class; and the most likely two class candidates are exploited to learn a robust projection function in a competitive fashion.
no code implementations • 19 Oct 2018 • Aoxue Li, Zhiwu Lu, Jiechao Guan, Tao Xiang, Li-Wei Wang, Ji-Rong Wen
Inspired by the fact that an unseen class is not exactly 'unseen' if it belongs to the same superclass as a seen class, we propose a novel inductive ZSL model that leverages superclasses as the bridge between seen and unseen classes to narrow the domain gap.
no code implementations • 4 Jul 2017 • Aoxue Li, Zhiwu Lu, Li-Wei Wang, Tao Xiang, Xinqi Li, Ji-Rong Wen
In this paper, to address the two issues, we propose a two-phase framework for recognizing images from unseen fine-grained classes, i.e., zero-shot fine-grained classification.
no code implementations • 14 Jun 2017 • Jia Ding, Aoxue Li, Zhiqiang Hu, Li-Wei Wang
Early detection of pulmonary cancer is the most promising way to enhance a patient's chance of survival.