Search Results for author: Yaqian Li

Found 18 papers, 9 papers with code

A Survey for Foundation Models in Autonomous Driving

no code implementations 2 Feb 2024 Haoxiang Gao, Yaqian Li, Kaiwen Long, Ming Yang, Yiqing Shen

The advent of foundation models has revolutionized the fields of natural language processing and computer vision, paving the way for their application in autonomous driving (AD).

3D Object Detection Autonomous Driving +2

u-LLaVA: Unifying Multi-Modal Tasks via Large Language Model

1 code implementation 9 Nov 2023 Jinjin Xu, Liwu Xu, Yuzhe Yang, Xiang Li, Fanyi Wang, Yanchun Xie, Yi-Jie Huang, Yaqian Li

Recent advancements in multi-modal large language models (MLLMs) have led to substantial improvements in visual understanding, primarily driven by sophisticated modality alignment strategies.

Instruction Following Language Modelling +1

Open-Set Image Tagging with Multi-Grained Text Supervision

2 code implementations 23 Oct 2023 Xinyu Huang, Yi-Jie Huang, Youcai Zhang, Weiwei Tian, Rui Feng, Yuejie Zhang, Yanchun Xie, Yaqian Li, Lei Zhang

Specifically, for predefined commonly used tag categories, RAM++ showcases 10.2 mAP and 15.4 mAP enhancements over CLIP on OpenImages and ImageNet.

Human-Object Interaction Detection Open Set Learning +1

Prototype Fission: Closing Set for Robust Open-set Semi-supervised Learning

no code implementations 29 Aug 2023 Xuwei Tan, Yi-Jie Huang, Yaqian Li

Instead of "opening set", i.e., modeling OOD distribution, Prototype Fission "closes set" and makes it hard for OOD samples to fit in sub-class latent space.

CLIP Brings Better Features to Visual Aesthetics Learners

no code implementations 28 Jul 2023 Liwu Xu, Jinjin Xu, Yuzhe Yang, YiJie Huang, Yanchun Xie, Yaqian Li

Specifically, we first integrate and leverage a multi-source unlabeled dataset to align rich features between a given visual encoder and an off-the-shelf CLIP image encoder via feature alignment loss.

MAM Faster R-CNN: Improved Faster R-CNN based on Malformed Attention Module for object detection on X-ray security inspection

no code implementations journal 2023 Wenming Zhang, Qikai Zhu, Yaqian Li, Haibin Li

First, in order to expand the receptive field of the feature map and effectively extract the regional features of the target object with shape distortion in the feature map, we propose the Malformed Attention Module (MAM).

Object Detection

Recognize Anything: A Strong Image Tagging Model

2 code implementations 6 Jun 2023 Youcai Zhang, Xinyu Huang, Jinyu Ma, Zhaoyang Li, Zhaochuan Luo, Yanchun Xie, Yuzhuo Qin, Tong Luo, Yaqian Li, Shilong Liu, Yandong Guo, Lei Zhang

We are releasing the RAM at https://recognize-anything.github.io/ to foster the advancements of large models in computer vision.

Semantic Parsing

Box-Level Active Detection

1 code implementation CVPR 2023 Mengyao Lyu, Jundong Zhou, Hui Chen, YiJie Huang, Dongdong Yu, Yaqian Li, Yandong Guo, Yuchen Guo, Liuyu Xiang, Guiguang Ding

Active learning selects informative samples for annotation within budget, which has proven efficient recently on object detection.

Active Learning object-detection +1

Tag2Text: Guiding Vision-Language Model via Image Tagging

2 code implementations 10 Mar 2023 Xinyu Huang, Youcai Zhang, Jinyu Ma, Weiwei Tian, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Lei Zhang

This paper presents Tag2Text, a vision language pre-training (VLP) framework, which introduces image tagging into vision-language models to guide the learning of visual-linguistic features.

Language Modelling TAG

Mixed Sample Augmentation for Online Distillation

no code implementations 24 Jun 2022 Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo

Mixed Sample Regularization (MSR), such as MixUp or CutMix, is a powerful data augmentation strategy to generalize convolutional neural networks.

Data Augmentation Knowledge Distillation

Personalized Image Aesthetics Assessment with Rich Attributes

no code implementations CVPR 2022 Yuzhe Yang, Liwu Xu, Leida Li, Nan Qie, Yaqian Li, Peng Zhang, Yandong Guo

To solve the dilemma, we conduct the most comprehensive subjective study of personalized image aesthetics so far and introduce a new Personalized image Aesthetics database with Rich Attributes (PARA), which consists of 31,220 images with annotations by 438 subjects.

Towards Communication-Efficient and Privacy-Preserving Federated Representation Learning

no code implementations 29 Sep 2021 Haizhou Shi, Youcai Zhang, Zijin Shen, Siliang Tang, Yaqian Li, Yandong Guo, Yueting Zhuang

This paper investigates the feasibility of federated representation learning under the constraints of communication cost and privacy protection.

Contrastive Learning Federated Learning +2
