Search Results for author: Qi Qian

Found 40 papers, 20 papers with code

mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model

1 code implementation30 Nov 2023 Anwen Hu, Yaya Shi, Haiyang Xu, Jiabo Ye, Qinghao Ye, Ming Yan, Chenliang Li, Qi Qian, Ji Zhang, Fei Huang

In this work, towards a more versatile copilot for academic paper writing, we mainly focus on strengthening the multi-modal diagram analysis ability of Multimodal LLMs.

Language Modelling Large Language Model

Stable Cluster Discrimination for Deep Clustering

1 code implementation ICCV 2023 Qi Qian

Meanwhile, one-stage methods are developed mainly for representation learning rather than clustering, where various constraints for cluster assignments are designed to avoid collapsing explicitly.

Clustering Deep Clustering +3

Graph Convolution Based Efficient Re-Ranking for Visual Retrieval

1 code implementation15 Jun 2023 Yuqi Zhang, Qi Qian, Hongsong Wang, Chong Liu, Weihua Chen, Fan Wang

In particular, the plain GCR is extended for cross-camera retrieval and an improved feature propagation formulation is presented to leverage affinity relationships across different cameras.

Distributed Computing Image Retrieval +3

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks

1 code implementation7 Jun 2023 Haiyang Xu, Qinghao Ye, Xuan Wu, Ming Yan, Yuan Miao, Jiabo Ye, Guohai Xu, Anwen Hu, Yaya Shi, Guangwei Xu, Chenliang Li, Qi Qian, Maofei Que, Ji Zhang, Xiao Zeng, Fei Huang

In addition, to facilitate a comprehensive evaluation of video-language models, we carefully build the largest human-annotated Chinese benchmarks covering three popular video-language tasks of cross-modal retrieval, video captioning, and video category classification.

Cross-Modal Retrieval Language Modelling +3

ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human

1 code implementation16 Apr 2023 Junfeng Tian, Hehong Chen, Guohai Xu, Ming Yan, Xing Gao, Jianhai Zhang, Chenliang Li, Jiayi Liu, Wenshen Xu, Haiyang Xu, Qi Qian, Wei Wang, Qinghao Ye, Jiejing Zhang, Ji Zhang, Fei Huang, Jingren Zhou

In this paper, we present ChatPLUG, a Chinese open-domain dialogue system for digital human applications that instruction finetunes on a wide range of dialogue tasks in a unified internet-augmented format.

World Knowledge

Improved Visual Fine-tuning with Natural Language Supervision

1 code implementation ICCV 2023 Junyang Wang, Yuanhong Xu, Juhua Hu, Ming Yan, Jitao Sang, Qi Qian

Fine-tuning a visual pre-trained model can leverage the semantic information from large-scale pre-training data and mitigate the over-fitting problem on downstream vision tasks with limited training examples.

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video

4 code implementations1 Feb 2023 Haiyang Xu, Qinghao Ye, Ming Yan, Yaya Shi, Jiabo Ye, Yuanhong Xu, Chenliang Li, Bin Bi, Qi Qian, Wei Wang, Guohai Xu, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou

In contrast to predominant paradigms of solely relying on sequence-to-sequence generation or encoder-based instance discrimination, mPLUG-2 introduces a multi-module composition network by sharing common universal modules for modality collaboration and disentangling different modality modules to deal with modality entanglement.

Action Classification Image Classification +7

HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training

no code implementations ICCV 2023 Qinghao Ye, Guohai Xu, Ming Yan, Haiyang Xu, Qi Qian, Ji Zhang, Fei Huang

We achieve state-of-the-art results on 15 well-established video-language understanding and generation tasks, especially on temporal-oriented datasets (e. g., SSv2-Template and SSv2-Label) with 8. 6% and 11. 1% improvement respectively.

TGIF-Action TGIF-Frame +7

Semantic Data Augmentation based Distance Metric Learning for Domain Generalization

no code implementations2 Aug 2022 Mengzhu Wang, Jianlong Yuan, Qi Qian, Zhibin Wang, Hao Li

Further, we provide an in-depth analysis of the mechanism and rational behind our approach, which gives us a better understanding of why leverage logits in lieu of features can help domain generalization.

Data Augmentation Domain Generalization +1

An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation

no code implementations25 May 2022 Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Rong Jin, Xiangyang Ji, Antoni B. Chan

With our empirical result obtained from 1, 330 models, we provide the following main observations: 1) ERM combined with data augmentation can achieve state-of-the-art performance if we choose a proper pre-trained model respecting the data property; 2) specialized algorithms further improve the robustness on top of ERM when handling a specific type of distribution shift, e. g., GroupDRO for spurious correlation and CORAL for large-scale out-of-distribution data; 3) Comparing different pre-training modes, architectures and data sizes, we provide novel observations about pre-training on distribution shift, which sheds light on designing or selecting pre-training strategy for different kinds of distribution shifts.

Data Augmentation

RBGNet: Ray-based Grouping for 3D Object Detection

1 code implementation CVPR 2022 Haiyang Wang, Shaoshuai Shi, Ze Yang, Rongyao Fang, Qi Qian, Hongsheng Li, Bernt Schiele, LiWei Wang

In order to learn better representations of object shape to enhance cluster features for predicting 3D boxes, we propose a ray-based feature grouping module, which aggregates the point-wise features on object surfaces using a group of determined rays uniformly emitted from cluster centers.

3D Object Detection Object +1

Reconstruction Task Finds Universal Winning Tickets

no code implementations23 Feb 2022 Ruichen Li, Binghui Li, Qi Qian, LiWei Wang

Pruning well-trained neural networks is effective to achieve a promising accuracy-efficiency trade-off in computer vision regimes.

Image Reconstruction object-detection +1

Improved Fine-Tuning by Better Leveraging Pre-Training Data

no code implementations24 Nov 2021 Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Xiangyang Ji, Antoni Chan, Rong Jin

The generalization result of using pre-training data shows that the excess risk bound on a target task can be improved when the appropriate pre-training data is included in fine-tuning.

Image Classification Learning Theory

Dash: Semi-Supervised Learning with Dynamic Thresholding

no code implementations1 Sep 2021 Yi Xu, Lei Shang, Jinxing Ye, Qi Qian, Yu-Feng Li, Baigui Sun, Hao Li, Rong Jin

In this work we develop a simple yet powerful framework, whose key idea is to select a subset of training examples from the unlabeled data when performing existing SSL methods so that only the unlabeled examples with pseudo labels related to the labeled data will be used to train models.

Semi-Supervised Image Classification

Why Does Multi-Epoch Training Help?

no code implementations13 May 2021 Yi Xu, Qi Qian, Hao Li, Rong Jin

Stochastic gradient descent (SGD) has become the most attractive optimization method in training large-scale deep neural networks due to its simplicity, low computational cost in each updating step, and good performance.

A Theoretical Analysis of Learning with Noisily Labeled Data

no code implementations8 Apr 2021 Yi Xu, Qi Qian, Hao Li, Rong Jin

Noisy labels are very common in deep supervised learning.

Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework

1 code implementation CVPR 2021 Qiang Zhou, Chaohui Yu, Zhibin Wang, Qi Qian, Hao Li

To alleviate the confirmation bias problem and improve the quality of pseudo annotations, we further propose a co-rectify scheme based on Instant-Teaching, denoted as Instant-Teaching$^*$.

Ranked #12 on Semi-Supervised Object Detection on COCO 100% labeled data (using extra training data)

Object object-detection +2

Zen-NAS: A Zero-Shot NAS for High-Performance Deep Image Recognition

2 code implementations1 Feb 2021 Ming Lin, Pichao Wang, Zhenhong Sun, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, Rong Jin

Comparing with previous NAS methods, the proposed Zen-NAS is magnitude times faster on multiple server-side and mobile-side GPU platforms with state-of-the-art accuracy on ImageNet.

Image Classification Neural Architecture Search

Zen-NAS: A Zero-Shot NAS for High-Performance Image Recognition

2 code implementations ICCV 2021 Ming Lin, Pichao Wang, Zhenhong Sun, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, Rong Jin

To address this issue, instead of using an accuracy predictor, we propose a novel zero-shot index dubbed Zen-Score to rank the architectures.

Neural Architecture Search Vocal Bursts Intensity Prediction

WeMix: How to Better Utilize Data Augmentation

no code implementations3 Oct 2020 Yi Xu, Asaf Noy, Ming Lin, Qi Qian, Hao Li, Rong Jin

To this end, we develop two novel algorithms, termed "AugDrop" and "MixLoss", to correct the data bias in the data augmentation.

Data Augmentation

Improved Knowledge Distillation via Full Kernel Matrix Transfer

1 code implementation30 Sep 2020 Qi Qian, Hao Li, Juhua Hu

Recently, a number of works propose to transfer the pairwise similarity between examples to distill relative information.

Knowledge Distillation Model Compression

Semi-Anchored Detector for One-Stage Object Detection

no code implementations10 Sep 2020 Lei Chen, Qi Qian, Hao Li

The anchor-free strategy benefits the classification task but can lead to sup-optimum for the regression task due to the lack of prior bounding boxes.

Classification General Classification +4

Neural Architecture Design for GPU-Efficient Networks

2 code implementations24 Jun 2020 Ming Lin, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, Rong Jin

To address this issue, we propose a general principle for designing GPU-efficient networks based on extensive empirical studies.

Neural Architecture Search

Towards Understanding Label Smoothing

no code implementations20 Jun 2020 Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Rong Jin

Label smoothing regularization (LSR) has a great success in training deep neural networks by stochastic algorithms such as stochastic gradient descent and its variants.

Weakly Supervised Representation Learning with Coarse Labels

1 code implementation ICCV 2021 Yuanhong Xu, Qi Qian, Hao Li, Rong Jin, Juhua Hu

To mitigate this challenge, we propose an algorithm to learn the fine-grained patterns for the target task, when only its coarse-class labels are available.

Learning with coarse labels Representation Learning

Hierarchically Robust Representation Learning

no code implementations CVPR 2020 Qi Qian, Juhua Hu, Hao Li

Experiments on benchmark data sets demonstrate the effectiveness of the robust deep representations.

Representation Learning

DR Loss: Improving Object Detection by Distributional Ranking

1 code implementation CVPR 2020 Qi Qian, Lei Chen, Hao Li, Rong Jin

This architecture is efficient but can suffer from the imbalance issue with respect to two aspects: the inter-class imbalance between the number of candidates from foreground and background classes and the intra-class imbalance in the hardness of background candidates, where only a few candidates are hard to be identified.

Object object-detection +1

Robust Gaussian Process Regression for Real-Time High Precision GPS Signal Enhancement

no code implementations3 Jun 2019 Ming Lin, Xiaomin Song, Qi Qian, Hao Li, Liang Sun, Shenghuo Zhu, Rong Jin

We validate the superiority of the proposed method in our real-time high precision positioning system against several popular state-of-the-art robust regression methods.

regression

Large-scale Distance Metric Learning with Uncertainty

no code implementations CVPR 2018 Qi Qian, Jiasheng Tang, Hao Li, Shenghuo Zhu, Rong Jin

Furthermore, we can show that the metric is learned from latent examples only, but it can preserve the large margin property even for the original data.

Metric Learning

Robust Optimization over Multiple Domains

no code implementations19 May 2018 Qi Qian, Shenghuo Zhu, Jiasheng Tang, Rong Jin, Baigui Sun, Hao Li

Hence, we propose to learn the model and the adversarial distribution simultaneously with the stochastic algorithm for efficiency.

BIG-bench Machine Learning Cloud Computing +1

Similarity Learning via Adaptive Regression and Its Application to Image Retrieval

no code implementations6 Dec 2015 Qi Qian, Inci M. Baytas, Rong Jin, Anil Jain, Shenghuo Zhu

The similarity between pairs of images can be measured by the distances between their high dimensional representations, and the problem of learning the appropriate similarity is often addressed by distance metric learning.

Image Retrieval Metric Learning +2

Towards Making High Dimensional Distance Metric Learning Practical

no code implementations15 Sep 2015 Qi Qian, Rong Jin, Lijun Zhang, Shenghuo Zhu

In this work, we present a dual random projection frame for DML with high dimensional data that explicitly addresses the limitation of dimensionality reduction for DML.

Dimensionality Reduction Metric Learning +1

Fine-Grained Visual Categorization via Multi-stage Metric Learning

no code implementations CVPR 2015 Qi Qian, Rong Jin, Shenghuo Zhu, Yuanqing Lin

To this end, we proposed a multi-stage metric learning framework that divides the large-scale high dimensional learning problem to a series of simple subproblems, achieving $\mathcal{O}(d)$ computational complexity.

Fine-Grained Visual Categorization Metric Learning

Efficient Distance Metric Learning by Adaptive Sampling and Mini-Batch Stochastic Gradient Descent (SGD)

no code implementations3 Apr 2013 Qi Qian, Rong Jin, Jin-Feng Yi, Lijun Zhang, Shenghuo Zhu

Although stochastic gradient descent (SGD) has been successfully applied to improve the efficiency of DML, it can still be computationally expensive because in order to ensure that the solution is a PSD matrix, it has to, at every iteration, project the updated distance metric onto the PSD cone, an expensive operation.

Computational Efficiency Metric Learning

Cannot find the paper you are looking for? You can Submit a new open access paper.