Search Results for author: Jiaqi Wang

Found 103 papers, 63 papers with code

Are We on the Right Way for Evaluating Large Vision-Language Models?

1 code implementation 29 Mar 2024 Lin Chen, Jinsong Li, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Zehui Chen, Haodong Duan, Jiaqi Wang, Yu Qiao, Dahua Lin, Feng Zhao

We evaluate 16 leading LVLMs on MMStar to assess their multi-modal capabilities, and on 7 benchmarks with the proposed metrics to investigate their data leakage and actual multi-modal gain.

World Knowledge

InternLM2 Technical Report

1 code implementation 26 Mar 2024 Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin

The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).

4k Long-Context Understanding

Long-CLIP: Unlocking the Long-Text Capability of CLIP

1 code implementation 22 Mar 2024 Beichen Zhang, Pan Zhang, Xiaoyi Dong, Yuhang Zang, Jiaqi Wang

Contrastive Language-Image Pre-training (CLIP) has been the cornerstone for zero-shot classification, text-image retrieval, and text-image generation by aligning image and text modalities.

Image Retrieval Language Modelling +3

RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition

1 code implementation 20 Mar 2024 Ziyu Liu, Zeyi Sun, Yuhang Zang, Wei Li, Pan Zhang, Xiaoyi Dong, Yuanjun Xiong, Dahua Lin, Jiaqi Wang

Notably, our approach demonstrates a significant improvement in performance on 5 fine-grained visual recognition benchmarks, 11 few-shot image recognition datasets, and 2 object detection datasets under the zero-shot recognition setting.

Contrastive Learning Fine-Grained Visual Recognition +3

AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models

no code implementations 13 Mar 2024 YiFei Gao, Jiaqi Wang, Zhiyu Lin, Jitao Sang

Remarkably, our findings shed light on a consistent AIGC hallucination bias: the object hallucinations induced by synthetic images are characterized by a greater quantity and a more uniform position distribution, even though these synthetic images do not manifest unrealistic or additional relevant visual features compared to natural images.

Hallucination

Deep learning for multi-label classification of coral conditions in the Indo-Pacific via underwater photogrammetry

1 code implementation 9 Mar 2024 Xinlei Shao, Hongruixuan Chen, Kirsty Magson, Jiaqi Wang, Jian Song, Jundong Chen, Jun Sasaki

A dataset containing over 20,000 high-resolution coral images of different health conditions and stressors was constructed based on the field survey.

Decision Making Ensemble Learning +1

Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder

no code implementations 27 Feb 2024 Jiaqi Wang, Zhenxi Song, Zhengyu Ma, Xipeng Qiu, Min Zhang, Zhiguo Zhang

Reconstructing natural language from non-invasive electroencephalography (EEG) holds great promise as a language decoding technology for brain-computer interfaces (BCIs).

Brain Decoding EEG +2

CoRelation: Boosting Automatic ICD Coding Through Contextualized Code Relation Learning

no code implementations 24 Feb 2024 Junyu Luo, Xiaochen Wang, Jiaqi Wang, Aofei Chang, Yaqing Wang, Fenglong Ma

Automatic International Classification of Diseases (ICD) coding plays a crucial role in the extraction of relevant information from clinical notes for proper recording and billing.

Relation

DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models

1 code implementation 22 Feb 2024 Yuhang Cao, Pan Zhang, Xiaoyi Dong, Dahua Lin, Jiaqi Wang

We present DualFocus, a novel framework for integrating macro and micro perspectives within multi-modal large language models (MLLMs) to enhance vision-language task performance.

Hallucination

VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models

no code implementations 16 Feb 2024 Ziyi Yin, Muchao Ye, Tianrong Zhang, Jiaqi Wang, Han Liu, Jinghui Chen, Ting Wang, Fenglong Ma

Correspondingly, we propose a novel VQAttack model, which can iteratively generate both image and text perturbations with the designed modules: the large language model (LLM)-enhanced image attack and the cross-modal joint attack module.

Adversarial Robustness Language Modelling +3

SepRep-Net: Multi-source Free Domain Adaptation via Model Separation And Reparameterization

no code implementations 13 Feb 2024 Ying Jin, Jiaqi Wang, Dahua Lin

We consider multi-source free domain adaptation, the problem of adapting multiple existing models to a new domain without accessing the source data.

Source-Free Domain Adaptation

Position Paper: Assessing Robustness, Privacy, and Fairness in Federated Learning Integrated with Foundation Models

no code implementations 2 Feb 2024 Xi Li, Jiaqi Wang

Federated Learning (FL), while a breakthrough in decentralized machine learning, contends with significant challenges such as limited data availability and the variability of computational resources, which can stifle the performance and scalability of the models.

Data Augmentation Fairness +2

Recent Advances in Predictive Modeling with Electronic Health Records

no code implementations 2 Feb 2024 Jiaqi Wang, Junyu Luo, Muchao Ye, Xiaochen Wang, Yuan Zhong, Aofei Chang, Guanjie Huang, Ziyi Yin, Cao Xiao, Jimeng Sun, Fenglong Ma

This survey systematically reviews recent advances in deep learning-based predictive models using EHR data.

Rethinking Personalized Federated Learning with Clustering-based Dynamic Graph Propagation

no code implementations 29 Jan 2024 Jiaqi Wang, Yuzhong Chen, Yuhang Wu, Mahashweta Das, Hao Yang, Fenglong Ma

Subsequently, we design a precise personalized model distribution strategy to allow clients to obtain the most suitable model from the server side.

Clustering Personalized Federated Learning

Automated Fusion of Multimodal Electronic Health Records for Better Medical Predictions

1 code implementation 20 Jan 2024 Suhan Cui, Jiaqi Wang, Yuan Zhong, Han Liu, Ting Wang, Fenglong Ma

The widespread adoption of Electronic Health Record (EHR) systems in healthcare institutes has generated vast amounts of medical data, offering significant opportunities for improving healthcare services through deep learning techniques.

Neural Architecture Search

Vulnerabilities of Foundation Model Integrated Federated Learning Under Adversarial Threats

no code implementations 18 Jan 2024 Chen Wu, Xi Li, Jiaqi Wang

Federated Learning (FL) addresses critical issues in machine learning related to data privacy and security, yet it suffers from data insufficiency and imbalance under certain circumstances.

Federated Learning

Enhancing Evolving Domain Generalization through Dynamic Latent Representations

no code implementations 16 Jan 2024 Binghui Xie, Yongqiang Chen, Jiaqi Wang, Kaiwen Zhou, Bo Han, Wei Meng, James Cheng

However, in non-stationary tasks where new domains evolve in an underlying continuous structure, such as time, merely extracting the invariant features is insufficient for generalization to the evolving new domains.

Evolving Domain Generalization

Large Language Models for Robotics: Opportunities, Challenges, and Perspectives

no code implementations 9 Jan 2024 Jiaqi Wang, Zihao Wu, Yiwei Li, Hanqi Jiang, Peng Shu, Enze Shi, Huawen Hu, Chong Ma, Yiheng Liu, Xuhui Wang, Yincheng Yao, Xuan Liu, Huaqin Zhao, Zhengliang Liu, Haixing Dai, Lin Zhao, Bao Ge, Xiang Li, Tianming Liu, Shu Zhang

Notably, in the realm of robot task planning, LLMs harness their advanced reasoning and language comprehension capabilities to formulate precise and efficient action plans based on natural language instructions.

Robot Task Planning

Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases

1 code implementation 22 Dec 2023 Zhangyang Qi, Ye Fang, Mengchen Zhang, Zeyi Sun, Tong Wu, Ziwei Liu, Dahua Lin, Jiaqi Wang, Hengshuang Zhao

We conducted a series of structured experiments to evaluate their performance in various industrial application scenarios, offering a comprehensive perspective on their practical utility.

HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image

no code implementations 7 Dec 2023 Tong Wu, Zhibing Li, Shuai Yang, Pan Zhang, Xinggang Pan, Jiaqi Wang, Dahua Lin, Ziwei Liu

Extensive experiments demonstrate the effectiveness of HyperDreamer in modeling region-aware materials with high-resolution textures and enabling user-friendly editing.

Semantic Segmentation

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

1 code implementation 6 Dec 2023 Zeyi Sun, Ye Fang, Tong Wu, Pan Zhang, Yuhang Zang, Shu Kong, Yuanjun Xiong, Dahua Lin, Jiaqi Wang

Alpha-CLIP not only preserves the visual recognition ability of CLIP but also enables precise control over the emphasis of image contents.

3D Generation

GPT4Point: A Unified Framework for Point-Language Understanding and Generation

1 code implementation 5 Dec 2023 Zhangyang Qi, Ye Fang, Zeyi Sun, Xiaoyang Wu, Tong Wu, Jiaqi Wang, Dahua Lin, Hengshuang Zhao

Multimodal Large Language Models (MLLMs) have excelled in 2D image-text comprehension and image generation, but their understanding of the 3D world is notably deficient, limiting progress in 3D language understanding and generation.

3D Generation Reading Comprehension

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

1 code implementation 29 Nov 2023 Qidong Huang, Xiaoyi Dong, Pan Zhang, Bin Wang, Conghui He, Jiaqi Wang, Dahua Lin, Weiming Zhang, Nenghai Yu

Based on the observation, OPERA introduces a penalty term on the model logits during beam-search decoding to mitigate the over-trust issue, along with a rollback strategy that retrospects the presence of summary tokens in the previously generated tokens and re-allocates the token selection if necessary.

Hallucination

Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization

1 code implementation 28 Nov 2023 Zhiyuan Zhao, Bin Wang, Linke Ouyang, Xiaoyi Dong, Jiaqi Wang, Conghui He

Multimodal large language models have made significant advancements in recent years, yet they still suffer from a common issue known as the "hallucination problem", in which the models generate textual descriptions that inaccurately depict or entirely fabricate content from associated images.

Hallucination

ShareGPT4V: Improving Large Multi-Modal Models with Better Captions

1 code implementation 21 Nov 2023 Lin Chen, Jinsong Li, Xiaoyi Dong, Pan Zhang, Conghui He, Jiaqi Wang, Feng Zhao, Dahua Lin

In the realm of large multi-modal models (LMMs), efficient modality alignment is crucial yet often constrained by the scarcity of high-quality image-text data.

Descriptive visual instruction following +2

Adversarial Prompt Tuning for Vision-Language Models

1 code implementation 19 Nov 2023 Jiaming Zhang, Xingjun Ma, Xin Wang, Lingyu Qiu, Jiaqi Wang, Yu-Gang Jiang, Jitao Sang

With the rapid advancement of multimodal learning, pre-trained Vision-Language Models (VLMs) such as CLIP have demonstrated remarkable capacities in bridging the gap between visual and language modalities.

Adversarial Robustness

AMBER: An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation

1 code implementation 13 Nov 2023 Junyang Wang, Yuhang Wang, Guohai Xu, Jing Zhang, Yukai Gu, Haitao Jia, Jiaqi Wang, Haiyang Xu, Ming Yan, Ji Zhang, Jitao Sang

Despite making significant progress in multi-modal tasks, current Multi-modal Large Language Models (MLLMs) encounter the significant challenge of hallucinations, which may lead to harmful consequences.

Attribute Hallucination +2

Hierarchical Pretraining on Multimodal Electronic Health Records

1 code implementation 11 Oct 2023 Xiaochen Wang, Junyu Luo, Jiaqi Wang, Ziyi Yin, Suhan Cui, Yuan Zhong, Yaqing Wang, Fenglong Ma

Pretraining has proven to be a powerful technique in natural language processing (NLP), exhibiting remarkable success in various NLP downstream tasks.

MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation

no code implementations 4 Oct 2023 Yuan Zhong, Suhan Cui, Jiaqi Wang, Xiaochen Wang, Ziyi Yin, Yaqing Wang, Houping Xiao, Mengdi Huai, Ting Wang, Fenglong Ma

Health risk prediction is one of the fundamental tasks under predictive modeling in the medical domain, which aims to forecast the potential health risks that patients may face in the future using their historical Electronic Health Records (EHR).

Data Augmentation

Intelligent machines work in unstructured environments by differential neuromorphic computing

no code implementations 16 Sep 2023 Shengbo Wang, Shuo Gao, Chenyu Tang, Edoardo Occhipinti, Cong Li, Shurui Wang, Jiaqi Wang, Hubin Zhao, Guohua Hu, Arokia Nathan, Ravinder Dahiya, Luigi Occhipinti

By mimicking the intrinsic nature of human low-level perception mechanisms, the electronic memristive neuromorphic circuit-based method presented here shows the potential for adapting to diverse sensing technologies and helping intelligent machines generate smart high-level decisions in the real world.

Autonomous Driving Decision Making

MLLM-DataEngine: An Iterative Refinement Approach for MLLM

1 code implementation 25 Aug 2023 Zhiyuan Zhao, Linke Ouyang, Bin Wang, Siyuan Huang, Pan Zhang, Xiaoyi Dong, Jiaqi Wang, Conghui He

Despite the great advances of Multimodal Large Language Models (MLLMs) in both instruction dataset building and benchmarking, the independence of training and evaluation makes it hard for current MLLMs to further improve their capability under the guidance of evaluation results at a relatively low human cost.

Benchmarking

VIGC: Visual Instruction Generation and Correction

2 code implementations 24 Aug 2023 Bin Wang, Fan Wu, Xiao Han, Jiahui Peng, Huaping Zhong, Pan Zhang, Xiaoyi Dong, Weijia Li, Wei Li, Jiaqi Wang, Conghui He

A practical solution to this problem would be to utilize the available multimodal large language models (MLLMs) to generate instruction data for vision-language tasks.

Hallucination Image Captioning +1

WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models

1 code implementation 21 Aug 2023 Conghui He, Zhenjiang Jin, Chao Xu, Jiantao Qiu, Bin Wang, Wei Li, Hang Yan, Jiaqi Wang, Dahua Lin

The rise in popularity of ChatGPT and GPT-4 has significantly accelerated the development of large models, leading to the creation of numerous impressive large language models (LLMs) and multimodal large language models (MLLMs).

FOLT: Fast Multiple Object Tracking from UAV-captured Videos Based on Optical Flow

no code implementations 14 Aug 2023 Mufeng Yao, Jiaqi Wang, Jinlong Peng, Mingmin Chi, Chao Liu

Given the extracted flow, the flow-guided feature augmentation is designed to augment the object detection feature based on its optical flow, which improves the detection of small objects.

motion prediction Multiple Object Tracking +4

Zero-shot Skeleton-based Action Recognition via Mutual Information Estimation and Maximization

1 code implementation 7 Aug 2023 Yujie Zhou, Wenwen Qiang, Anyi Rao, Ning Lin, Bing Su, Jiaqi Wang

Specifically, 1) we maximize the MI between visual and semantic space for distribution alignment; 2) we leverage the temporal information for estimating the MI by encouraging MI to increase as more frames are observed.

Action Recognition Mutual Information Estimation +1

Efficient Prediction of Peptide Self-assembly through Sequential and Graphical Encoding

1 code implementation 17 Jul 2023 Zihan Liu, Jiaqi Wang, Yun Luo, Shuang Zhao, Wenbin Li, Stan Z. Li

In recent years, there has been an explosion of research on the application of deep learning to the prediction of various peptide properties, due to the significant development and market potential of peptides.

Benchmarking

Real-time Workload Pattern Analysis for Large-scale Cloud Databases

no code implementations 5 Jul 2023 Jiaqi Wang, Tianyi Li, Anni Wang, Xiaoze Liu, Lu Chen, Jie Chen, Jianye Liu, Junyang Wu, Feifei Li, Yunjun Gao

This has led to the increasing volume of database workloads, which provides the opportunity for pattern analysis.

Review of Large Vision Models and Visual Prompt Engineering

no code implementations 3 Jul 2023 Jiaqi Wang, Zhengliang Liu, Lin Zhao, Zihao Wu, Chong Ma, Sigang Yu, Haixing Dai, Qiushi Yang, Yiheng Liu, Songyao Zhang, Enze Shi, Yi Pan, Tuo Zhang, Dajiang Zhu, Xiang Li, Xi Jiang, Bao Ge, Yixuan Yuan, Dinggang Shen, Tianming Liu, Shu Zhang

This review aims to summarize the methods employed in the computer vision domain for large vision models and visual prompt engineering, exploring the latest advancements in visual prompt engineering.

Prompt Engineering

OCBEV: Object-Centric BEV Transformer for Multi-View 3D Object Detection

no code implementations 2 Jun 2023 Zhangyang Qi, Jiaqi Wang, Xiaoyang Wu, Hengshuang Zhao

Multi-view 3D object detection is becoming popular in autonomous driving due to its high effectiveness and low cost.

3D Object Detection Autonomous Driving +2

CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers

1 code implementation 27 May 2023 Dachuan Shi, Chaofan Tao, Anyi Rao, Zhendong Yang, Chun Yuan, Jiaqi Wang

Although extensively studied for unimodal models, the acceleration for multimodal models, especially the vision-language Transformers, is relatively under-explored.

Image Captioning Image Retrieval +5

Prompt Engineering for Healthcare: Methodologies and Applications

no code implementations 28 Apr 2023 Jiaqi Wang, Enze Shi, Sigang Yu, Zihao Wu, Chong Ma, Haixing Dai, Qiushi Yang, Yanqing Kang, Jinru Wu, Huawen Hu, Chenxi Yue, Haiyang Zhang, Yiheng Liu, Yi Pan, Zhengliang Liu, Lichao Sun, Xiang Li, Bao Ge, Xi Jiang, Dajiang Zhu, Yixuan Yuan, Dinggang Shen, Tianming Liu, Shu Zhang

Prompt engineering is a critical technique in the field of natural language processing that involves designing and optimizing the prompts used to input information into models, aiming to enhance their performance on specific tasks.

Machine Translation Prompt Engineering +3

ImpressionGPT: An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT

2 code implementations 17 Apr 2023 Chong Ma, Zihao Wu, Jiaqi Wang, Shaochen Xu, Yaonai Wei, Zhengliang Liu, Xi Jiang, Lei Guo, Xiaoyan Cai, Shu Zhang, Tuo Zhang, Dajiang Zhu, Dinggang Shen, Tianming Liu, Xiang Li

The 'Impression' section of a radiology report is a critical basis for communication between radiologists and other physicians, and it is typically written by radiologists based on the 'Findings' section.

In-Context Learning

V3Det: Vast Vocabulary Visual Detection Dataset

no code implementations ICCV 2023 Jiaqi Wang, Pan Zhang, Tao Chu, Yuhang Cao, Yujie Zhou, Tong Wu, Bin Wang, Conghui He, Dahua Lin

2) Hierarchical Category Organization: The vast vocabulary of V3Det is organized by a hierarchical category tree which annotates the inclusion relationship among categories, encouraging the exploration of category relationships in vast and open vocabulary object detection.

Chatbot Object +2

Self-supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences

1 code implementation 17 Feb 2023 Yujie Zhou, Haodong Duan, Anyi Rao, Bing Su, Jiaqi Wang

Specifically, we construct a negative-sample-free triplet stream structure that is composed of an anchor stream without any masking, a spatial masking stream with Central Spatial Masking (CSM), and a temporal masking stream with Motion Attention Temporal Masking (MATM).

Action Recognition Contrastive Learning +4

Siamese transformer with hierarchical concept embedding for fine-grained image recognition

no code implementations Science China Information Sciences 2023 Yilin Lyu, Liping Jing, Jiaqi Wang, Mingzhe Guo, Xinyue Wang, Jian Yu

In particular, one subnetwork is for coarse-scale patches to learn the discriminative regions with the aid of the innate multi-head self-attention mechanism of the transformer.

Fine-Grained Image Recognition

UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers

1 code implementation 31 Jan 2023 Dachuan Shi, Chaofan Tao, Ying Jin, Zhendong Yang, Chun Yuan, Jiaqi Wang

Real-world data contains a vast amount of multimodal information, among which vision and language are the two most representative modalities.

Image Captioning Image Classification +7

Semi-Supervised Semantic Segmentation via Gentle Teaching Assistant

1 code implementation NeurIPS 2022 Ying Jin, Jiaqi Wang, Dahua Lin

Semi-Supervised Semantic Segmentation aims at training the segmentation model with limited labeled data and a large amount of unlabeled data.

Segmentation Semi-Supervised Semantic Segmentation

Self-supervised Domain Adaptation for Breaking the Limits of Low-quality Fundus Image Quality Enhancement

1 code implementation 17 Jan 2023 Qingshan Hou, Peng Cao, Jiaqi Wang, Xiaoli Liu, Jinzhu Yang, Osmar R. Zaiane

Most existing image enhancement methods focus on improving image quality by leveraging the guidance of high-quality images, which are difficult to collect in medical applications.

Domain Adaptation Image Enhancement

Multi-Level Logit Distillation

1 code implementation CVPR 2023 Ying Jin, Jiaqi Wang, Dahua Lin

Through this framework, the prediction alignment is not only conducted at the instance level, but also at the batch and class level, through which the student model learns instance prediction, input correlation, and category correlation simultaneously.

Knowledge Distillation

In Differential Privacy, There is Truth: On Vote Leakage in Ensemble Private Learning

1 code implementation 22 Sep 2022 Jiaqi Wang, Roei Schuster, Ilia Shumailov, David Lie, Nicolas Papernot

When learning from sensitive data, care must be taken to ensure that training algorithms address privacy concerns.

Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction

1 code implementation 26 Aug 2022 Tong Wu, Jiaqi Wang, Xingang Pan, Xudong Xu, Christian Theobalt, Ziwei Liu, Dahua Lin

Previous methods based on neural volume rendering mostly train a fully implicit model with MLPs, which typically require hours of training for a single scene.

Surface Reconstruction

An Understanding-Oriented Robust Machine Reading Comprehension Model

1 code implementation 1 Jul 2022 Feiliang Ren, Yongkang Liu, Bochao Li, Shilei Liu, Bingchao Wang, Jiaqi Wang, Chunchao Liu, Qi Ma

In this paper, we propose an understanding-oriented machine reading comprehension model to address three kinds of robustness issues: over-sensitivity, over-stability, and generalization.

Machine Reading Comprehension Multi-Task Learning +1

Deep Amortized Relational Model with Group-Wise Hierarchical Generative Process

no code implementations AAAI 2022 Huafeng Liu, Tong Zhou, Jiaqi Wang, Liping Jing

In this paper, we propose Deep amortized Relational Model (DaRM) with group-wise hierarchical generative process for community discovery and link prediction on relational data (e.g., graph, network).

Community Detection Link Prediction

What Are Expected Queries in End-to-End Object Detection?

1 code implementation 2 Jun 2022 Shilong Zhang, Xinjiang Wang, Jiaqi Wang, Jiangmiao Pang, Kai Chen

As both sparse and dense queries are imperfect, what are expected queries in end-to-end object detection?

Instance Segmentation object-detection +2

PYSKL: Towards Good Practices for Skeleton Action Recognition

1 code implementation 19 May 2022 Haodong Duan, Jiaqi Wang, Kai Chen, Dahua Lin

The toolbox supports a wide variety of skeleton action recognition algorithms, including approaches based on GCN and CNN.

Action Recognition Skeleton Based Action Recognition

MINI: Mining Implicit Novel Instances for Few-Shot Object Detection

no code implementations 6 May 2022 Yuhang Cao, Jiaqi Wang, Yiqi Lin, Dahua Lin

The offline mining mechanism leverages a self-supervised discriminative model to collaboratively mine implicit novel instances with a trained FSOD network.

Few-Shot Object Detection object-detection

Deep Understanding based Multi-Document Machine Reading Comprehension

no code implementations 25 Feb 2022 Feiliang Ren, Yongkang Liu, Bochao Li, Zhibo Wang, Yu Guo, Shilei Liu, Huimin Wu, Jiaqi Wang, Chunchao Liu, Bingchao Wang

Most existing multi-document machine reading comprehension models mainly focus on understanding the interactions between the input question and documents, but ignore the following two kinds of understanding.

Machine Reading Comprehension TriviaQA

Few-Shot Object Detection via Association and DIscrimination

1 code implementation NeurIPS 2021 Yuhang Cao, Jiaqi Wang, Ying Jin, Tong Wu, Kai Chen, Ziwei Liu, Dahua Lin

1) In the association step, in contrast to implicitly leveraging multiple base classes, we construct a compact novel class feature space via explicitly imitating a specific base class feature space.

Few-Shot Object Detection Object +3

FedTriNet: A Pseudo Labeling Method with Three Players for Federated Semi-supervised Learning

no code implementations 12 Sep 2021 Liwei Che, Zewei Long, Jiaqi Wang, Yaqing Wang, Houping Xiao, Fenglong Ma

In particular, we propose to use three networks and a dynamic quality control mechanism to generate high-quality pseudo labels for unlabeled data, which are added to the training set.

Federated Learning

UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-wise Perspective with Transformer

3 code implementations 9 Sep 2021 Haonan Wang, Peng Cao, Jiaqi Wang, Osmar R. Zaiane

Specifically, the CTrans module is an alternative to the U-Net skip connections, which consists of a sub-module to conduct the multi-scale Channel Cross fusion with Transformer (named CCT) and a sub-module Channel-wise Cross-Attention (named CCA) to guide the fused multi-scale channel-wise information to effectively connect to the decoder features for eliminating the ambiguity.

Ranked #2 on Medical Image Segmentation on GlaS (IoU metric)

Image Segmentation Medical Image Segmentation +2

FedCon: A Contrastive Framework for Federated Semi-Supervised Learning

no code implementations 9 Sep 2021 Zewei Long, Jiaqi Wang, Yaqing Wang, Houping Xiao, Fenglong Ma

Most existing FedSSL methods focus on the classical scenario, i.e., the labeled and unlabeled data are stored at the client side.

Adaptively Optimize Content Recommendation Using Multi Armed Bandit Algorithms in E-commerce

no code implementations 30 Jul 2021 Ding Xiang, Becky West, Jiaqi Wang, Xiquan Cui, Jinzhou Huang

Second, we compare the accumulative rewards of the three MAB algorithms with more than 1,000 trials using actual historical A/B test datasets.

Thompson Sampling

Cluster-Wise Hierarchical Generative Model for Deep Amortized Clustering

no code implementations CVPR 2021 Huafeng Liu, Jiaqi Wang, Liping Jing

In this paper, we propose Cluster-wise Hierarchical Generative Model for deep amortized clustering (CHiGac).

Clustering

Interpretable Image Recognition by Constructing Transparent Embedding Space

2 code implementations ICCV 2021 Jiaqi Wang, Huafeng Liu, Xinyue Wang, Liping Jing

This plug-in embedding space is spanned by transparent basis concepts which are constructed on the Grassmann manifold.

CARAFE++: Unified Content-Aware ReAssembly of FEatures

no code implementations 7 Dec 2020 Jiaqi Wang, Kai Chen, Rui Xu, Ziwei Liu, Chen Change Loy, Dahua Lin

Feature reassembly, i.e., feature downsampling and upsampling, is a key operation in a number of modern convolutional network architectures, e.g., residual networks and feature pyramids.

Image Inpainting Instance Segmentation +3

CodeCMR: Cross-Modal Retrieval For Function-Level Binary Source Code Matching

1 code implementation NeurIPS 2020 Zeping Yu, Wenxin Zheng, Jiaqi Wang, Qiyi Tang, Sen Nie, Shi Wu

We adopt Deep Pyramid Convolutional Neural Network (DPCNN) for source code feature extraction and Graph Neural Network (GNN) for binary code feature extraction.

Computer Security Cross-Modal Retrieval +2

Texture Memory-Augmented Deep Patch-Based Image Inpainting

1 code implementation 28 Sep 2020 Rui Xu, Minghao Guo, Jiaqi Wang, Xiaoxiao Li, Bolei Zhou, Chen Change Loy

By bringing together the best of both paradigms, we propose a new deep inpainting framework where texture generation is guided by a texture memory of patch samples extracted from unmasked regions.

Image Inpainting Retrieval +1

Active Learning for Product Type Ontology Enhancement in E-commerce

no code implementations 19 Sep 2020 Yun Zhu, Sayyed M. Zahiri, Jiaqi Wang, Han-Yu Chen, Faizan Javed

Entity-based semantic search has been widely adopted in modern search engines to improve search accuracy by understanding users' intent.

Active Learning Vocal Bursts Type Prediction

MMFashion: An Open-Source Toolbox for Visual Fashion Analysis

3 code implementations 18 May 2020 Xin Liu, Jiancheng Li, Jiaqi Wang, Ziwei Liu

This toolbox supports a wide spectrum of fashion analysis tasks, including Fashion Attribute Prediction, Fashion Recognition and Retrieval, Fashion Landmark Detection, Fashion Parsing and Segmentation and Fashion Compatibility and Recommendation.

Attribute Retrieval

FMore: An Incentive Scheme of Multi-dimensional Auction for Federated Learning in MEC

no code implementations 22 Feb 2020 Rongfei Zeng, Shixun Zhang, Jiaqi Wang, Xiaowen Chu

In MEC, edge nodes would not like to voluntarily participate in learning, and they differ in the provision of multi-dimensional resources, both of which might deteriorate the performance of federated learning.

Edge-computing Federated Learning

Side-Aware Boundary Localization for More Precise Object Detection

3 code implementations ECCV 2020 Jiaqi Wang, Wenwei Zhang, Yuhang Cao, Kai Chen, Jiangmiao Pang, Tao Gong, Jianping Shi, Chen Change Loy, Dahua Lin

To tackle the difficulty of precise localization in the presence of displacements with large variance, we further propose a two-step localization scheme, which first predicts a range of movement through bucket prediction and then pinpoints the precise position within the predicted bucket.

Object object-detection +2

On the Robustness of the Backdoor-based Watermarking in Deep Neural Networks

no code implementations 18 Jun 2019 Masoumeh Shafieinejad, Jiaqi Wang, Nils Lukas, Xinda Li, Florian Kerschbaum

We focus on backdoor-based watermarking and propose two attacks, one black-box and one white-box, that remove the watermark.

Hierarchical Attention Generative Adversarial Networks for Cross-domain Sentiment Classification

no code implementations 27 Mar 2019 Yuebing Zhang, Duoqian Miao, Jiaqi Wang

Cross-domain sentiment classification (CDSC) is an important task in domain adaptation and sentiment classification.

Classification Domain Adaptation +4

Hybrid Task Cascade for Instance Segmentation

5 code implementations CVPR 2019 Kai Chen, Jiangmiao Pang, Jiaqi Wang, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin

In exploring a more effective approach, we find that the key to a successful instance segmentation cascade is to fully leverage the reciprocal relationship between detection and segmentation.

Instance Segmentation object-detection +4

Region Proposal by Guided Anchoring

1 code implementation CVPR 2019 Jiaqi Wang, Kai Chen, Shuo Yang, Chen Change Loy, Dahua Lin

State-of-the-art detectors mostly rely on a dense anchoring scheme, where anchors are sampled uniformly over the spatial domain with a predefined set of scales and aspect ratios.

object-detection Object Detection +1

Optimizing Video Object Detection via a Scale-Time Lattice

1 code implementation CVPR 2018 Kai Chen, Jiaqi Wang, Shuo Yang, Xingcheng Zhang, Yuanjun Xiong, Chen Change Loy, Dahua Lin

High-performance object detection relies on expensive convolutional networks to compute features, often leading to significant challenges in applications, e.g., those that require detecting objects from video streams in real time.

Object object-detection +1
