Search Results for author: Qi Chen

Found 128 papers, 62 papers with code

Towards Long-Range ENSO Prediction with an Explainable Deep Learning Model

no code implementations 25 Mar 2025 Qi Chen, Yinghao Cui, Guobin Hong, Karumuri Ashok, Yuchun Pu, Xiaogu Zheng, Xuanze Zhang, Wei Zhong, Peng Zhan, Zhonglei Wang

El Niño-Southern Oscillation (ENSO) is a prominent mode of interannual climate variability with far-reaching global impacts.

Deep Learning

Collaborative Temporal Consistency Learning for Point-supervised Natural Language Video Localization

no code implementations 22 Mar 2025 Zhuo Tao, Liang Li, Qi Chen, Yunbin Tu, Zheng-Jun Zha, Ming-Hsuan Yang, Yuankai Qi, Qingming Huang

To address this problem, we propose a new COllaborative Temporal consistEncy Learning (COTEL) framework that leverages the synergy between saliency detection and moment localization to strengthen the video-language alignment.

Saliency Detection Sentence +1

Efficient Response Generation Method Selection for Fine-Tuning Large Language Models

no code implementations 17 Feb 2025 Xuan Ren, Qi Chen, Lingqiao Liu

Using this strategy, we can evaluate a small subset of the generated output from each response generation strategy option, then select the most effective strategy.

Response Generation

Beyond Prompt Content: Enhancing LLM Performance via Content-Format Integrated Prompt Optimization

1 code implementation 6 Feb 2025 Yuanye Liu, Jiahang Xu, Li Lyna Zhang, Qi Chen, Xuan Feng, Yang Chen, Zhongxin Guo, Yuqing Yang, Peng Cheng

Large Language Models (LLMs) have shown significant capability across various tasks, with their real-world effectiveness often driven by prompt design.

MADP: Multi-Agent Deductive Planning for Enhanced Cognitive-Behavioral Mental Health Question Answer

no code implementations 27 Jan 2025 Qi Chen, Dexi Liu

To address this, we propose a framework named Multi-Agent Deductive Planning (MADP), which is based on the interactions between the various psychological elements of CBT.

IPDN: Image-enhanced Prompt Decoding Network for 3D Referring Expression Segmentation

1 code implementation 9 Jan 2025 Qi Chen, Changli Wu, Jiayi Ji, Yiwei Ma, Danni Yang, Xiaoshuai Sun

To tackle intent ambiguity, we designed a Prompt-Aware Decoder (PAD) that guides the decoding process by deriving task-driven signals from the interaction between the expression and visual features.

Decoder Referring Expression +1

EpiCoder: Encompassing Diversity and Complexity in Code Generation

no code implementations 8 Jan 2025 Yaoxiang Wang, Haoling Li, Xin Zhang, Jie Wu, Xiao Liu, Wenxiang Hu, Zhongxin Guo, Yangyu Huang, Ying Xin, Yujiu Yang, Jinsong Su, Qi Chen, Scarlett Li

Effective instruction tuning is indispensable for optimizing code LLMs, aligning model behavior with user expectations and enhancing model performance in real-world applications.

Code Generation Diversity

Text-Driven Tumor Synthesis

no code implementations 24 Dec 2024 Xinran Li, Yi Shuai, Chen Liu, Qi Chen, Qilong Wu, Pengfei Guo, Dong Yang, Can Zhao, Pedro R. A. S. Bassi, Daguang Xu, Kang Wang, Yang Yang, Alan Yuille, Zongwei Zhou

Tumor synthesis can generate examples that AI often misses or over-detects, improving AI performance by training on these challenging cases.

Contrastive Learning Tumor Segmentation

Attention-driven GUI Grounding: Leveraging Pretrained Multimodal Large Language Models without Fine-Tuning

1 code implementation 14 Dec 2024 Hai-Ming Xu, Qi Chen, Lei Wang, Lingqiao Liu

Additionally, we demonstrate that our attention map-based grounding technique significantly outperforms direct localization predictions from MiniCPM-Llama3-V 2.5, highlighting the potential of using attention maps from pretrained MLLMs and paving the way for future innovations in this domain.

TAG

VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization

no code implementations 13 Dec 2024 Tao Liu, Ziyang Ma, Qi Chen, Feilong Chen, Shuai Fan, Xie Chen, Kai Yu

We present VQTalker, a Vector Quantization-based framework for multilingual talking head generation that addresses the challenges of lip synchronization and natural motion across diverse languages.

Motion Generation Quantization +2

Global Estimation of Building-Integrated Facade and Rooftop Photovoltaic Potential by Integrating 3D Building Footprint and Spatio-Temporal Datasets

1 code implementation 2 Dec 2024 Qing Yu, Kechuan Dong, Zhiling Guo, Jiaxing Li, Hongjun Tan, Yanxiu Jin, Jian Yuan, Haoran Zhang, Junwei Liu, Qi Chen, Jinyue Yan

This research tackles the challenges of estimating Building-Integrated Photovoltaics (BIPV) potential across various temporal and spatial scales, accounting for different geographical climates and urban morphology.

A Survey of Medical Vision-and-Language Applications and Their Techniques

1 code implementation 19 Nov 2024 Qi Chen, Ruoshan Zhao, Sinuo Wang, Vu Minh Hieu Phan, Anton Van Den Hengel, Johan Verjans, Zhibin Liao, Minh-Son To, Yong Xia, Jian Chen, Yutong Xie, Qi Wu

Unlike general vision-and-language models trained on diverse, non-specialized datasets, MVLMs are purpose-built for the medical domain, automatically extracting and interpreting critical information from medical images and textual reports to support clinical decision-making.

Decision Making Diagnostic +7

KMM: Key Frame Mask Mamba for Extended Motion Generation

1 code implementation 10 Nov 2024 Zeyu Zhang, Hang Gao, Akide Liu, Qi Chen, Feng Chen, Yiran Wang, Danning Li, Hao Tang

The recent Mamba architecture shows promising results in efficiently modeling long and complex sequences, yet two significant challenges remain: Firstly, directly applying Mamba to extended motion generation is ineffective, as the limited capacity of the implicit memory leads to memory decay.

Contrastive Learning Mamba +1

CIT: Rethinking Class-incremental Semantic Segmentation with a Class Independent Transformation

1 code implementation 5 Nov 2024 Jinchao Ge, BoWen Zhang, Akide Liu, Minh Hieu Phan, Qi Chen, Yangyang Shu, Yang Zhao

Class-incremental semantic segmentation (CSS) requires that a model learn to segment new classes without forgetting how to segment previous ones: this is typically achieved by distilling the current knowledge and incorporating the latest data.

Class-Incremental Semantic Segmentation Segmentation

Guardians of Discourse: Evaluating LLMs on Multilingual Offensive Language Detection

no code implementations 21 Oct 2024 Jianfei He, Lilin Wang, Jiaying Wang, Zhenyu Liu, Hongbin Na, Zimu Wang, Wei Wang, Qi Chen

Identifying offensive language is essential for maintaining safety and sustainability in the social media era.

SPFresh: Incremental In-Place Update for Billion-Scale Vector Search

no code implementations 18 Oct 2024 Yuming Xu, Hengyu Liang, Jin Li, Shuotao Xu, Qi Chen, Qianxi Zhang, Cheng Li, Ziyue Yang, Fan Yang, Yuqing Yang, Peng Cheng, Mao Yang

LIRE achieves low-overhead vector updates by only reassigning vectors at the boundary between partitions, where, in a high-quality vector index, the number of such vectors is deemed small.

Information Retrieval Question Answering

PAPL-SLAM: Principal Axis-Anchored Monocular Point-Line SLAM

no code implementations 16 Oct 2024 Guanghao Li, Yu Cao, Qi Chen, Yifan Yang, Jian Pu

In point-line SLAM systems, the utilization of line structural information and the optimization of lines are two significant problems.

Line Detection

Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles

1 code implementation 9 Oct 2024 Qi Chen, BoWen Zhang, Gang Wang, Qi Wu

To address these challenges, we introduce SPLAT, a benchmark leveraging Situation Puzzles to evaluate and elicit LAteral Thinking of LLMs.

Question Answering

Integrative Decoding: Improve Factuality via Implicit Self-consistency

1 code implementation 2 Oct 2024 Yi Cheng, Xiao Liang, Yeyun Gong, Wen Xiao, Song Wang, Yuji Zhang, Wenjun Hou, Kaishuai Xu, Wenge Liu, Wenjie Li, Jian Jiao, Qi Chen, Peng Cheng, Wayne Xiong

Self-consistency-based approaches, which involve repeatedly sampling multiple outputs and selecting the most consistent one as the final response, prove to be remarkably effective in improving the factual accuracy of large language models.

TruthfulQA

Accelerated Multi-Contrast MRI Reconstruction via Frequency and Spatial Mutual Learning

1 code implementation 21 Sep 2024 Qi Chen, Xiaohan Xing, Zhen Chen, Zhiwei Xiong

To exploit complementary information from the auxiliary modality, we propose a Cross-Modal Selective fusion (CMS-fusion) module that selectively incorporates the frequency and spatial features from the auxiliary modality to enhance the corresponding branch of the target modality.

MRI Reconstruction

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

1 code implementation 16 Sep 2024 Di Liu, Meng Chen, Baotong Lu, Huiqiang Jiang, Zhenhua Han, Qianxi Zhang, Qi Chen, Chengruidong Zhang, Bailu Ding, Kai Zhang, Chen Chen, Fan Yang, Yuqing Yang, Lili Qiu

This paper proposes RetrievalAttention, a training-free approach to both accelerate attention computation and reduce GPU memory consumption.

Retrieval

Analyzing Tumors by Synthesis

no code implementations 9 Sep 2024 Qi Chen, Yuxiang Lai, Xiaoxi Chen, Qixin Hu, Alan Yuille, Zongwei Zhou

We also present case studies in the liver, pancreas, and kidneys, which reveal that AI trained on synthetic tumors can achieve performance comparable to, or better than, AI trained only on real data.

XLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training

no code implementations 28 Jul 2024 Biao Wu, Yutong Xie, Zeyu Zhang, Minh Hieu Phan, Qi Chen, Ling Chen, Qi Wu

To this end, this paper proposes an XLIP (Masked modelling for medical Language-Image Pre-training) framework to enhance pathological learning and feature learning via unpaired data.

Contrastive Learning Language Modelling

InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation

1 code implementation 14 Jul 2024 Zeyu Zhang, Akide Liu, Qi Chen, Feng Chen, Ian Reid, Richard Hartley, Bohan Zhuang, Hao Tang

Text-to-motion generation holds potential for film, gaming, and robotics, yet current methods often prioritize short motion generation, making it challenging to produce long motion sequences effectively: (1) Current methods struggle to handle long motion sequences as a single input due to prohibitively high computational cost; (2) Breaking down the generation of long motion sequences into shorter segments can result in inconsistent transitions and requires interpolation or inpainting, which lacks entire sequence modeling.

Mamba Motion Generation

MMR-Mamba: Multi-Modal MRI Reconstruction with Mamba and Spatial-Frequency Information Fusion

no code implementations 27 Jun 2024 Jing Zou, Lanqing Liu, Qi Chen, Shujun Wang, Zhanli Hu, Xiaohan Xing, Jing Qin

To accelerate the acquisition process, a practical approach is to reconstruct images of the target modality, which requires longer scanning times, from under-sampled k-space data using the fully-sampled reference modality with shorter scanning times as guidance.

Mamba MRI Reconstruction

Meta-Learning Neural Procedural Biases

no code implementations 12 Jun 2024 Christian Raymond, Qi Chen, Bing Xue, Mengjie Zhan

The goal of few-shot learning is to generalize and achieve high performance on new unseen learning tasks, where each task has only a limited number of examples available.

Few-Shot Learning

Intersectional Unfairness Discovery

1 code implementation 31 May 2024 Gezheng Xu, Qi Chen, Charles Ling, Boyu Wang, Changjian Shui

To further evaluate the generated unseen but possible unfair intersectional sensitive attributes, we formulate them as prompts and use modern generative AI to produce new texts and images.

Attribute Fairness

Sharpness-Aware Minimization for Evolutionary Feature Construction in Regression

no code implementations 11 May 2024 Hengzhe Zhang, Qi Chen, Bing Xue, Wolfgang Banzhaf, Mengjie Zhang

In recent years, genetic programming (GP)-based evolutionary feature construction has achieved significant success.

regression Symbolic Regression

AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding

1 code implementation 6 May 2024 Tao Liu, Feilong Chen, Shuai Fan, Chenpeng Du, Qi Chen, Xie Chen, Kai Yu

The paper introduces AniTalker, an innovative framework designed to generate lifelike talking faces from a single portrait.

Metric Learning Self-Supervised Learning

STT: Stateful Tracking with Transformers for Autonomous Driving

no code implementations 30 Apr 2024 Longlong Jing, Ruichi Yu, Xu Chen, Zhengli Zhao, Shiwei Sheng, Colin Graber, Qi Chen, Qinru Li, Shangxuan Wu, Han Deng, Sangjin Lee, Chris Sweeney, Qiurui He, Wei-Chih Hung, Tong He, Xingyi Zhou, Farshid Moussavi, Zijian Guo, Yin Zhou, Mingxing Tan, Weilong Yang, CongCong Li

In this paper, we propose STT, a Stateful Tracking model built with Transformers, that can consistently track objects in the scenes while also predicting their states accurately.

Autonomous Driving

GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting

no code implementations 29 Apr 2024 Bo Chen, Shoukang Hu, Qi Chen, Chenpeng Du, Ran Yi, Yanmin Qian, Xie Chen

We present GSTalker, a 3D audio-driven talking face generation model with Gaussian Splatting for both fast training (40 minutes) and real-time rendering (125 FPS) with a 3$\sim$5 minute video for training material, in comparison with previous 2D and 3D NeRF-based modeling frameworks which require hours of training and seconds of rendering per frame.

NeRF Talking Face Generation

DKE-Research at SemEval-2024 Task 2: Incorporating Data Augmentation with Generative Models and Biomedical Knowledge to Enhance Inference Robustness

no code implementations 14 Apr 2024 Yuqi Wang, Zeqiang Wang, Wei Wang, Qi Chen, Kaizhu Huang, Anh Nguyen, Suparna De

Safe and reliable natural language inference is critical for extracting insights from clinical trial reports but poses challenges due to biases in large pre-trained language models.

Data Augmentation Diversity +3

A dataset of primary nasopharyngeal carcinoma MRI with multi-modalities segmentation

no code implementations 4 Apr 2024 Yin Li, Qi Chen, Kai Wang, Meige Li, Liping Si, Yingwei Guo, Yu Xiong, Qixing Wang, Yang Qin, Ling Xu, Patrick van der Smagt, Jun Tang, Nutan Chen

Multi-modality magnetic resonance imaging data with various sequences facilitate the early diagnosis, tumor segmentation, and disease staging in the management of nasopharyngeal carcinoma (NPC).

Management Tumor Segmentation

Fast and Efficient Local Search for Genetic Programming Based Loss Function Learning

1 code implementation 1 Mar 2024 Christian Raymond, Qi Chen, Bing Xue, Mengjie Zhang

In this paper, we build upon the topic of loss function learning, an emergent meta-learning paradigm that aims to learn loss functions that significantly improve the performance of the models trained under them.

Meta-Learning

Towards Generalizable Tumor Synthesis

1 code implementation CVPR 2024 Qi Chen, Xiaoxi Chen, Haorui Song, Zhiwei Xiong, Alan Yuille, Chen Wei, Zongwei Zhou

Tumor synthesis enables the creation of artificial tumors in medical images, facilitating the training of AI models for tumor detection and segmentation.

Computed Tomography (CT)

Understanding the Weakness of Large Language Model Agents within a Complex Android Environment

1 code implementation 9 Feb 2024 Mingzhe Xing, Rongkai Zhang, Hui Xue, Qi Chen, Fan Yang, Zhen Xiao

These challenges motivate AndroidArena, an environment and benchmark designed to evaluate LLM agents on a modern operating system.

Date Understanding Language Modeling +2

Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale

1 code implementation 2 Feb 2024 Yangyang Shu, Xiaofeng Cao, Qi Chen, BoWen Zhang, Ziqin Zhou, Anton Van Den Hengel, Lingqiao Liu

Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data.

Unsupervised Domain Adaptation

Attention-based Interactive Disentangling Network for Instance-level Emotional Voice Conversion

no code implementations 29 Dec 2023 Yun Chen, Lingxiao Yang, Qi Chen, Jian-Huang Lai, Xiaohua Xie

We introduce a two-stage pipeline to effectively train our network: Stage I utilizes inter-speech contrastive learning to model fine-grained emotion and intra-speech disentanglement learning to better separate emotion and content.

Contrastive Learning Disentanglement +1

WebVLN: Vision-and-Language Navigation on Websites

1 code implementation 25 Dec 2023 Qi Chen, Dileepa Pitawela, Chongyang Zhao, Gengze Zhou, Hsiang-Ting Chen, Qi Wu

The Vision-and-Language Navigation (VLN) task aims to enable AI agents to accurately understand and follow natural language instructions to navigate through real-world environments, ultimately reaching specific target locations.

Navigate Vision and Language Navigation

SiCP: Simultaneous Individual and Cooperative Perception for 3D Object Detection in Connected and Automated Vehicles

1 code implementation 8 Dec 2023 Deyuan Qu, Qi Chen, Tianyu Bai, HongSheng Lu, Heng Fan, Hao Zhang, Song Fu, Qing Yang

Cooperative perception for connected and automated vehicles is traditionally achieved through the fusion of feature maps from two or more vehicles.

3D Object Detection object-detection

Generating Valid and Natural Adversarial Examples with Large Language Models

no code implementations 20 Nov 2023 Zimu Wang, Wei Wang, Qi Chen, Qiufeng Wang, Anh Nguyen

Deep learning-based natural language processing (NLP) models, particularly pre-trained language models (PLMs), have been revealed to be vulnerable to adversarial attacks.

Adversarial Attack valid

Zero-Shot Medical Information Retrieval via Knowledge Graph Embedding

no code implementations 31 Oct 2023 Yuqi Wang, Zeqiang Wang, Wei Wang, Qi Chen, Kaizhu Huang, Anh Nguyen, Suparna De

In the era of the Internet of Things (IoT), the retrieval of relevant medical information has become essential for efficient clinical decision-making.

Decision Making Information Retrieval +2

Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning

1 code implementation 6 Oct 2023 Yinda Chen, Wei Huang, Shenglong Zhou, Qi Chen, Zhiwei Xiong

By extracting semantic information from unlabeled data, self-supervised methods can improve the performance of downstream tasks, among which the mask image model (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.

Multi-agent Reinforcement Learning reinforcement-learning +3

Model-enhanced Vector Index

1 code implementation NeurIPS 2023 Hailin Zhang, Yujing Wang, Qi Chen, Ruiheng Chang, Ting Zhang, Ziming Miao, Yingyan Hou, Yang Ding, Xupeng Miao, Haonan Wang, Bochen Pang, Yuefeng Zhan, Hao Sun, Weiwei Deng, Qi Zhang, Fan Yang, Xing Xie, Mao Yang, Bin Cui

We empirically show that our model achieves better performance on the commonly used academic benchmarks MSMARCO Passage and Natural Questions, with comparable serving latency to dense retrieval solutions.

model Natural Questions +2

Domain Adaptive Synapse Detection with Weak Point Annotations

no code implementations 31 Aug 2023 Qi Chen, Wei Huang, Yueyi Zhang, Zhiwei Xiong

In the second stage, we improve model generalizability on target data by regenerating square masks to get high-quality pseudo labels.

Segmentation

3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation

1 code implementation 31 Aug 2023 Changli Wu, Yiwei Ma, Qi Chen, Haowei Wang, Gen Luo, Jiayi Ji, Xiaoshuai Sun

In 3D Referring Expression Segmentation (3D-RES), the earlier approach adopts a two-stage paradigm, extracting segmentation proposals and then matching them with referring expressions.

Navigate Referring Expression +3

Dynamic Strategy Chain: Dynamic Zero-Shot CoT for Long Mental Health Support Generation

no code implementations 21 Aug 2023 Qi Chen, Dexi Liu

The combination of chain-of-thought (CoT) prompting and Large Language Models (LLMs) has been employed to achieve state-of-the-art (SOTA) performance on various NLP tasks, especially text generation tasks.

Text Generation

Likelihood-Based Text-to-Image Evaluation with Patch-Level Perceptual and Semantic Credit Assignment

1 code implementation 16 Aug 2023 Qi Chen, Chaorui Deng, Zixiong Huang, BoWen Zhang, Mingkui Tan, Qi Wu

In this paper, we propose to evaluate text-to-image generation performance by directly estimating the likelihood of the generated images using a pre-trained likelihood-based text-to-image generative model, i.e., a higher likelihood indicates better perceptual quality and better text-image alignment.

Text-to-Image Generation

Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval

1 code implementation ICCV 2023 Chaorui Deng, Qi Chen, Pengda Qin, Da Chen, Qi Wu

In text-video retrieval, recent works have benefited from the powerful learning capabilities of pre-trained text-image foundation models (e.g., CLIP) by adapting them to the video domain.

Retrieval Video Captioning +1

Chinese Financial Text Emotion Mining: GCGTS -- A Character Relationship-based Approach for Simultaneous Aspect-Opinion Pair Extraction

no code implementations 4 Aug 2023 Qi Chen, Dexi Liu

This innovative structure reduces the excessive reliance on pre-trained language models and emphasizes the modeling of structure and local relationships, thereby improving the performance of the model on Chinese financial texts.

Extract Aspect Sentiment Analysis

Improving Tuning-Free Real Image Editing with Proximal Guidance

1 code implementation 8 Jun 2023 Ligong Han, Song Wen, Qi Chen, Zhixing Zhang, Kunpeng Song, Mengwei Ren, Ruijiang Gao, Anastasis Stathopoulos, Xiaoxiao He, Yuxiao Chen, Di Liu, Qilong Zhangli, Jindong Jiang, Zhaoyang Xia, Akash Srivastava, Dimitris Metaxas

Null-text inversion (NTI) optimizes null embeddings to align the reconstruction and inversion trajectories with larger CFG scales, enabling real image editing with cross-attention control.

Act Like a Radiologist: Radiology Report Generation across Anatomical Regions

2 code implementations 26 May 2023 Qi Chen, Yutong Xie, Biao Wu, Xiaomin Chen, James Ang, Minh-Son To, Xiaojun Chang, Qi Wu

To address these issues, we propose X-RGen, a radiologist-minded report generation framework across six anatomical regions.

 Ranked #1 on Medical Report Generation on IU X-Ray (using extra training data)

Decoder Medical Report Generation +1

Algorithm-Dependent Bounds for Representation Learning of Multi-Source Domain Adaptation

1 code implementation 4 Apr 2023 Qi Chen, Mario Marchand

We further provide algorithm-dependent generalization bounds for these two settings, where the generalization is characterized by the mutual information between the parameters and the data.

Domain Adaptation Generalization Bounds +1

DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder

no code implementations 30 Mar 2023 Chenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang Bian

Additionally, we propose a novel method for generating continuous video frames with the DDIM image decoder trained on individual frames, eliminating the need for modelling the joint distribution of consecutive frames directly.

Decoder Talking Face Generation

IRGen: Generative Modeling for Image Retrieval

1 code implementation 17 Mar 2023 Yidan Zhang, Ting Zhang, Dong Chen, Yujing Wang, Qi Chen, Xing Xie, Hao Sun, Weiwei Deng, Qi Zhang, Fan Yang, Mao Yang, Qingmin Liao, Jingdong Wang, Baining Guo

While generative modeling has become prevalent across numerous research fields, its integration into the realm of image retrieval remains largely unexplored and underjustified.

Image Retrieval Retrieval

AutoMatch: A Large-scale Audio Beat Matching Benchmark for Boosting Deep Learning Assistant Video Editing

no code implementations 3 Mar 2023 Sen Pei, Jingya Yu, Qi Chen, Wozhou He

In this paper, we investigate a novel and practical problem, namely audio beat matching (ABM), which aims to recommend the proper transition time stamps based on the background music.

Video Editing

SUPS: A Simulated Underground Parking Scenario Dataset for Autonomous Driving

1 code implementation 25 Feb 2023 Jiawei Hou, Qi Chen, Yurong Cheng, Guang Chen, Xiangyang Xue, Taiping Zeng, Jian Pu

However, there is a lack of underground parking scenario datasets with multiple sensors and well-labeled images that support both SLAM tasks and perception tasks, such as semantic segmentation and parking slot detection.

3D Reconstruction Autonomous Driving +4

Path Integral Method for Pricing Proportional Step Double-Barrier Option with Time Dependent Parameters

no code implementations 15 Feb 2023 Qi Chen, Chao Guo

The path integral method in quantum mechanics provides a new way of thinking about barrier option pricing.

GMConv: Modulating Effective Receptive Fields for Convolutional Kernels

no code implementations 9 Feb 2023 Qi Chen, Chao Li, Jia Ning, Stephen Lin, Kun He

Inspired by the property that ERFs typically exhibit a Gaussian distribution, we propose a Gaussian Mask convolutional kernel (GMConv) in this work.

Image Classification object-detection +1

Online Loss Function Learning

no code implementations 30 Jan 2023 Christian Raymond, Qi Chen, Bing Xue, Mengjie Zhang

Loss function learning is a new meta-learning paradigm that aims to automate the essential task of designing a loss function for a machine learning model.

Meta-Learning

StrokeGAN+: Few-Shot Semi-Supervised Chinese Font Generation with Stroke Encoding

no code implementations 11 Nov 2022 Jinshan Zeng, Yefei Wang, Qi Chen, Yunxin Liu, Mingwen Wang, Yuan YAO

The effectiveness of the proposed model for the zero-shot traditional Chinese font generation is also evaluated in this paper.

Font Generation

On Learning Fairness and Accuracy on Multiple Subgroups

1 code implementation 19 Oct 2022 Changjian Shui, Gezheng Xu, Qi Chen, Jiaqi Li, Charles Ling, Tal Arbel, Boyu Wang, Christian Gagné

At the upper level, the fair predictor is updated to be close to all subgroup-specific predictors.

Fairness

Pareto-aware Neural Architecture Generation for Diverse Computational Budgets

1 code implementation 14 Oct 2022 Yong Guo, Yaofo Chen, Yin Zheng, Qi Chen, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan

More critically, these independent search processes cannot share their learned knowledge (i.e., the distribution of good architectures) with each other and thus often result in limited search results.

A Hamiltonian Approach to Floating Barrier Option Pricing

no code implementations 26 Sep 2022 Qi Chen, Hong-tao Wang, Chao Guo

The Hamiltonian approach in quantum mechanics provides a new way of thinking about barrier option pricing.

Learning Symbolic Model-Agnostic Loss Functions via Meta-Learning

2 code implementations 19 Sep 2022 Christian Raymond, Qi Chen, Bing Xue, Mengjie Zhang

In this paper, we build upon the emerging topic of loss function learning, which aims to learn loss functions that significantly improve the performance of the models trained under them.

Meta-Learning

Learning Distinct and Representative Styles for Image Captioning

1 code implementation 17 Sep 2022 Qi Chen, Chaorui Deng, Qi Wu

Our innovative idea is to explore the rich modes in the training caption corpus to learn a set of "mode embeddings", and further use them to control the mode of the generated captions for existing image captioning models.

Diversity Image Captioning +1

Towards Lightweight Super-Resolution with Dual Regression Learning

2 code implementations 16 Jul 2022 Yong Guo, Mingkui Tan, Zeshuai Deng, Jingdong Wang, Qi Chen, JieZhang Cao, Yanwu Xu, Jian Chen

Nevertheless, it is hard for existing model compression methods to accurately identify the redundant components due to the extremely large SR mapping space.

Image Super-Resolution Model Compression +1

Optimization-Induced Graph Implicit Nonlinear Diffusion

1 code implementation 29 Jun 2022 Qi Chen, Yifei Wang, Yisen Wang, Jiansheng Yang, Zhouchen Lin

Moreover, we show that the optimization-induced variants of our models can boost the performance and improve training stability and efficiency as well.

Formation Tracking for a Multi-AUV System Based on an Adaptive Sliding Mode Method in the Water Flow Environment

no code implementations 9 Jun 2022 Xin Li, Daqi Zhu, Bing Sun, Qi Chen, Wenyang Gan, Zhigang Li

Finally, a robust sliding mode controller with a continuous model predictive control strategy is developed for the multi-AUV system to achieve leader-follower formation tracking in the presence of bounded flow disturbances, and simulations are implemented to confirm the effectiveness of the proposed method.

Model Predictive Control

A Neural Corpus Indexer for Document Retrieval

1 code implementation 6 Jun 2022 Yujing Wang, Yingyan Hou, Haonan Wang, Ziming Miao, Shibin Wu, Hao Sun, Qi Chen, Yuqing Xia, Chengmin Chi, Guoshuai Zhao, Zheng Liu, Xing Xie, Hao Allen Sun, Weiwei Deng, Qi Zhang, Mao Yang

To this end, we propose Neural Corpus Indexer (NCI), a sequence-to-sequence network that generates relevant document identifiers directly for a designated query.

Decoder Retrieval +1

Fair Representation Learning through Implicit Path Alignment

no code implementations 26 May 2022 Changjian Shui, Qi Chen, Jiaqi Li, Boyu Wang, Christian Gagné

We consider a fair representation learning perspective, where optimal predictors, on top of the data representation, are ensured to be invariant with respect to different sub-groups.

Fairness Representation Learning

Proposal-free Lidar Panoptic Segmentation with Pillar-level Affinity

no code implementations 19 Apr 2022 Qi Chen, Sourabh Vora

We propose a simple yet effective proposal-free architecture for lidar panoptic segmentation.

Classification Clustering +4

Path Integral Method for Proportional Step and Proportional Double-Barrier Step Option Pricing

no code implementations 16 Dec 2021 Qi Chen, Chao Guo

The path integral method in quantum mechanics provides a new way of thinking about barrier option pricing.

PolarStream: Streaming Object Detection and Segmentation with Polar Pillars

no code implementations NeurIPS 2021 Qi Chen, Sourabh Vora, Oscar Beijbom

Recent works recognized lidars as an inherently streaming data source and showed that the end-to-end latency of lidar perception models can be reduced significantly by operating on wedge-shaped point cloud sectors rather than the full point cloud.

Object object-detection +1

V2C: Visual Voice Cloning

no code implementations CVPR 2022 Qi Chen, Yuanqing Li, Yuankai Qi, Jiaqiu Zhou, Mingkui Tan, Qi Wu

Existing Voice Cloning (VC) tasks aim to convert a paragraph of text into speech with the desired voice specified by a reference audio.

Voice Cloning

Tracklet-Switch Adversarial Attack against Pedestrian Multi-Object Tracking Trackers

5 code implementations 17 Nov 2021 Delv Lin, Qi Chen, Chengyu Zhou, Kun He

Multi-Object Tracking (MOT) has made rapid progress and produced many excellent deep learning trackers.

Adversarial Attack Multi-Object Tracking +1

PolarStream: Streaming Lidar Object Detection and Segmentation with Polar Pillars

no code implementations 14 Jun 2021 Qi Chen, Sourabh Vora, Oscar Beijbom

Recent works recognized lidars as an inherently streaming data source and showed that the end-to-end latency of lidar perception models can be reduced significantly by operating on wedge-shaped point cloud sectors rather than the full point cloud.

LIDAR Semantic Segmentation object-detection +1

Pareto-Frontier-aware Neural Architecture Generation for Diverse Budgets

no code implementations 27 Feb 2021 Yong Guo, Yaofo Chen, Yin Zheng, Qi Chen, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan

To this end, we propose a Pareto-Frontier-aware Neural Architecture Generator (NAG) which takes an arbitrary budget as input and produces the Pareto optimal architecture for the target budget.

Towards Accurate and Compact Architectures via Neural Architecture Transformer

2 code implementations20 Feb 2021 Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Zhipeng Li, Jian Chen, Peilin Zhao, Junzhou Huang

To address this issue, we propose a Neural Architecture Transformer++ (NAT++) method which further enlarges the set of candidate transitions to improve the performance of architecture optimization.

Neural Architecture Search valid

StrokeGAN: Reducing Mode Collapse in Chinese Font Generation via Stroke Encoding

1 code implementation16 Dec 2020 Jinshan Zeng, Qi Chen, Yunxin Liu, Mingwen Wang, Yuan YAO

However, these deep generative models may suffer from the mode collapse issue, which significantly degrades the diversity and quality of generated results.

Diversity Font Generation

Modular Graph Attention Network for Complex Visual Relational Reasoning

no code implementations22 Nov 2020 Yihan Zheng, Zhiquan Wen, Mingkui Tan, Runhao Zeng, Qi Chen, YaoWei Wang, Qi Wu

Moreover, to capture the complex logic in a query, we construct a relational graph to represent the visual objects and their relationships, and propose a multi-step reasoning method to progressively understand the complex logic.

Graph Attention Question Answering +5

CoFF: Cooperative Spatial Feature Fusion for 3D Object Detection on Autonomous Vehicles

no code implementations24 Sep 2020 Jingda Guo, Dominic Carrillo, Sihai Tang, Qi Chen, Qing Yang, Song Fu, Xi Wang, Nannan Wang, Paparao Palacharla

To reduce the amount of transmitted data, feature map based fusion is recently proposed as a practical solution to cooperative 3D object detection by autonomous vehicles.

3D Object Detection Autonomous Vehicles +2

Beyond $\mathcal{H}$-Divergence: Domain Adaptation Theory With Jensen-Shannon Divergence

no code implementations30 Jul 2020 Changjian Shui, Qi Chen, Jun Wen, Fan Zhou, Christian Gagné, Boyu Wang

We reveal the incoherence between the widely-adopted empirical domain adversarial training and its generally-assumed theoretical counterpart based on $\mathcal{H}$-divergence.

Domain Adaptation Transfer Learning

GLOW : Global Weighted Self-Attention Network for Web Search

1 code implementation10 Jul 2020 Xuan Shan, Chuanjie Liu, Yiqian Xia, Qi Chen, Yusi Zhang, Kaize Ding, Yaobo Liang, Angen Luo, Yuxiang Luo

Deep matching models aim to facilitate search engines retrieving more relevant documents by mapping queries and documents into semantic vectors in the first-stage retrieval.

Document Ranking Information Retrieval +2

DCANet: Learning Connected Attentions for Convolutional Neural Networks

no code implementations9 Jul 2020 Xu Ma, Jingda Guo, Sihai Tang, Zhinan Qiao, Qi Chen, Qing Yang, Song Fu

With DCANet, all attention blocks in a CNN model are trained jointly, which improves the ability of attention learning.

Attention-guided Context Feature Pyramid Network for Object Detection

2 code implementations23 May 2020 Junxu Cao, Qi Chen, Jun Guo, Ruichao Shi

For object detection, how to address the contradictory requirement between feature map resolution and receptive field on high-resolution inputs still remains an open question.

Instance Segmentation Object +4

Intelligent Home 3D: Automatic 3D-House Design from Linguistic Descriptions Only

1 code implementation CVPR 2020 Qi Chen, Qi Wu, Rui Tang, Yu-Han Wang, Shuai Wang, Mingkui Tan

To this end, we propose a House Plan Generative Model (HPGM) that first translates the language input to a structural graph representation and then predicts the layout of rooms with a Graph Conditioned Layout Prediction Network (GC LPN) and generates the interior texture with a Language Conditioned Texture GAN (LCT-GAN).

Text to 3D

MonaLog: a Lightweight System for Natural Language Inference Based on Monotonicity

1 code implementation SCiL 2020 Hai Hu, Qi Chen, Kyle Richardson, Atreyee Mukherjee, Lawrence S. Moss, Sandra Kuebler

We present a new logic-based inference engine for natural language inference (NLI) called MonaLog, which is based on natural logic and the monotonicity calculus.

Data Augmentation Natural Language Inference

F-Cooper: Feature based Cooperative Perception for Autonomous Vehicle Edge Computing System Using 3D Point Clouds

1 code implementation13 Sep 2019 Qi Chen

Autonomous vehicles are heavily reliant upon their sensors to perfect the perception of surrounding environments; however, with the current state of technology, the data which a vehicle uses is confined to that from its own sensors.

3D Object Detection Autonomous Driving +4

FPCNet: Fast Pavement Crack Detection Network Based on Encoder-Decoder Architecture

no code implementations4 Jul 2019 Wenjun Liu, Yuchun Huang, Ying Li, Qi Chen

Specifically, we first propose the Multi-Dilation (MD) module, which can synthesize the crack features of multiple context sizes via dilated convolution with multiple rates.

Decoder

Semi-parametric Bayesian variable selection for gene-environment interactions

3 code implementations3 Jun 2019 Jie Ren, Fei Zhou, Xiaoxi Li, Qi Chen, Hongmei Zhang, Shuangge Ma, Yu Jiang, Cen Wu

Existing Bayesian methods for G$\times$E interaction studies are challenged by the high-dimensional nature of the study and the complexity of environmental influences.

Methodology

Natural Language Inference with Monotonicity

no code implementations WS 2019 Hai Hu, Qi Chen, Larry Moss

This paper describes a working system which performs natural language inference using polarity-marked parse trees.

Natural Language Inference

Auto-Embedding Generative Adversarial Networks for High Resolution Image Synthesis

1 code implementation27 Mar 2019 Yong Guo, Qi Chen, Jian Chen, Qingyao Wu, Qinfeng Shi, Mingkui Tan

To address this issue, we develop a novel GAN called Auto-Embedding Generative Adversarial Network (AEGAN), which simultaneously encodes the global structure features and captures the fine-grained details.

Generative Adversarial Network Image Generation +2

DUNet: A deformable network for retinal vessel segmentation

no code implementations3 Nov 2018 Qiangguo Jin, Zhaopeng Meng, Tuan D. Pham, Qi Chen, Leyi Wei, Ran Su

Results show that more detailed vessels are extracted by DUNet and it exhibits state-of-the-art performance for retinal vessel segmentation with a global accuracy of 0.9697/0.9722/0.9724 and AUC of 0.9856/0.9868/0.9863 on DRIVE, STARE and CHASE_DB1 respectively.

Retinal Vessel Segmentation Segmentation

Semantic Segmentation for Urban Planning Maps based on U-Net

no code implementations28 Sep 2018 Zhiling Guo, Hiroaki Shengoku, Guangming Wu, Qi Chen, Wei Yuan, Xiaodan Shi, Xiaowei Shao, Yongwei Xu, Ryosuke Shibasaki

The results indicate the proposed method can serve as a viable tool for urban planning map semantic segmentation task with high accuracy and efficiency.

Segmentation Semantic Segmentation

Dual Reconstruction Nets for Image Super-Resolution with Gradient Sensitive Loss

no code implementations19 Sep 2018 Yong Guo, Qi Chen, Jian Chen, Junzhou Huang, Yanwu Xu, JieZhang Cao, Peilin Zhao, Mingkui Tan

However, most deep learning methods employ feed-forward architectures, and thus the dependencies between LR and HR images are not fully exploited, leading to limited learning performance.

Image Super-Resolution

SampleAhead: Online Classifier-Sampler Communication for Learning from Synthesized Data

no code implementations1 Apr 2018 Qi Chen, Weichao Qiu, Yi Zhang, Lingxi Xie, Alan Yuille

But this raises an important problem in active vision: given an infinite data space, how to effectively sample a finite subset to train a visual classifier?

Classification General Classification

UnrealStereo: Controlling Hazardous Factors to Analyze Stereo Vision

no code implementations14 Dec 2016 Yi Zhang, Weichao Qiu, Qi Chen, Xiaolin Hu, Alan Yuille

We generate a large synthetic image dataset with automatically computed hazardous regions and analyze algorithms on these regions.

Image Generation

A Hierarchical Distributed Processing Framework for Big Image Data

no code implementations3 Jul 2016 Le Dong, Zhiyu Lin, Yan Liang, Ling He, Ning Zhang, Qi Chen, Xiaochun Cao, Ebroul Izquierdo

The proposed ICP framework consists of two mechanisms, i.e., SICP (Static ICP) and DICP (Dynamic ICP).

Data classification using the Dempster-Shafer method

no code implementations2 Sep 2014 Qi Chen, Amanda Whitbrook, Uwe Aickelin, Chris Roadknight

In this paper, the Dempster-Shafer method is employed as the theoretical basis for creating data classification systems.

Attribute Classification +1
