Search Results for author: Kun Li

Found 130 papers, 44 papers with code

Small Molecule Drug Discovery Through Deep Learning: Progress, Challenges, and Opportunities

no code implementations 13 Feb 2025 Kun Li, Yida Xiong, Hongzhi Zhang, Xiantao Cai, Bo Du, Wenbin Hu

Due to their excellent drug-like and pharmacokinetic properties, small molecule drugs are widely used to treat various diseases, making them a critical component of drug discovery.

Drug Discovery Out-of-Distribution Generalization

Exploiting Ensemble Learning for Cross-View Isolated Sign Language Recognition

1 code implementation 4 Feb 2025 Fei Wang, Kun Li, Yiqi Nie, Zhangling Duan, Peng Zou, Zhiliang Wu, Yuwei Wang, Yanyan Wei

In this paper, we present our solution to the Cross-View Isolated Sign Language Recognition (CV-ISLR) challenge held at WWW 2025.

Ensemble Learning Sign Language Recognition

Can Molecular Evolution Mechanism Enhance Molecular Representation?

no code implementations 27 Jan 2025 Kun Li, Longtao Hu, Xiantao Cai, Jia Wu, Wenbin Hu

We extract and analyze the changes in the evolutionary pathway and explore combining it with existing molecular representations.

Molecular Property Prediction molecular representation +1

Prompt-Aware Controllable Shadow Removal

no code implementations 25 Jan 2025 Kerui Chen, Zhiliang Wu, Wenjin Hou, Kun Li, Hehe Fan, Yi Yang

PACSRNet consists of two key modules: a prompt-aware module that generates shadow masks for the specified subject based on the user prompt, and a shadow removal module that uses the shadow prior from the first module to restore the content in the shadowed regions.

Shadow Removal

Enhancing Intent Understanding for Ambiguous Prompts through Human-Machine Co-Adaptation

no code implementations 25 Jan 2025 Yangfan He, Jianhui Wang, Kun Li, Yijin Wang, Li Sun, Jun Yin, Miao Zhang, Xueqian Wang

Modern image generation systems can produce high-quality visuals, yet user prompts often contain ambiguities, requiring multiple revisions.

Image Generation Language Modeling +2

LeMo: Enabling LEss Token Involvement for MOre Context Fine-tuning

no code implementations 15 Jan 2025 Tuowei Wang, Xingyu Chen, Kun Li, Ting Cao, Ju Ren, Yaoxue Zhang

We implement LeMo as an end-to-end fine-tuning system compatible with various LLM architectures and other optimization techniques.

Computational Efficiency Informativeness +1

Enhancing Low-Cost Video Editing with Lightweight Adaptors and Temporal-Aware Inversion

2 code implementations 8 Jan 2025 Yangfan He, Sida Li, Kun Li, Jianhui Wang, Binxu Li, Tianyu Shi, Jun Yin, Miao Zhang, Xueqian Wang

Recent advancements in text-to-image (T2I) generation using diffusion models have enabled cost-effective video-editing applications by leveraging pre-trained models, eliminating the need for resource-intensive training.

Video Editing

Scale-wise Bidirectional Alignment Network for Referring Remote Sensing Image Segmentation

no code implementations 1 Jan 2025 Kun Li, George Vosselman, Michael Ying Yang

The goal of referring remote sensing image segmentation (RRSIS) is to extract specific pixel-level regions within an aerial image via a natural language expression.

feature selection Image Segmentation +1

Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition

1 code implementation 19 Dec 2024 Kun Li, Dan Guo, Guoliang Chen, Chunxiao Fan, Jingyuan Xu, Zhiliang Wu, Hehe Fan, Meng Wang

In addition, we propose a new prototypical diversity amplification loss to strengthen the model's capacity by amplifying the differences between different prototypes.

Emotion Recognition Micro-Action Recognition

Patch-level Sounding Object Tracking for Audio-Visual Question Answering

no code implementations 14 Dec 2024 Zhangbin Li, Jinxing Zhou, Jing Zhang, Shengeng Tang, Kun Li, Dan Guo

The M-KPT and S-KPT modules are performed in parallel for each temporal segment, allowing balanced tracking of salient and sounding objects.

Audio-visual Question Answering Object Tracking +2

GPTDrawer: Enhancing Visual Synthesis through ChatGPT

no code implementations 11 Dec 2024 Kun Li, Xinwei Chen, Tianyou Song, Hansong Zhang, Wenzhe Zhang, Qing Shan

In the burgeoning field of AI-driven image generation, the quest for precision and relevance in response to textual prompts remains paramount.

Image Generation Keyword Extraction

Repetitive Action Counting with Hybrid Temporal Relation Modeling

no code implementations 10 Dec 2024 Kun Li, Xinge Peng, Dan Guo, Xun Yang, Meng Wang

Repetitive Action Counting (RAC) aims to count the number of repetitive actions occurring in videos.

Relation Repetitive Action Counting

FOF-X: Towards Real-time Detailed Human Reconstruction from a Single Image

no code implementations 8 Dec 2024 Qiao Feng, Yebin Liu, Yu-Kun Lai, Jingyu Yang, Kun Li

We introduce FOF-X for real-time reconstruction of detailed human geometry from a single image.

Towards Pixel-Level Prediction for Gaze Following: Benchmark and Approach

no code implementations 30 Nov 2024 Feiyang Liu, Dan Guo, Jingyuan Xu, Zihao He, Shengeng Tang, Kun Li, Meng Wang

Following the gaze of other people and analyzing the target they are looking at can help us understand what they are thinking and doing, and predict the actions that may follow.

Segmentation

Crowd3D++: Robust Monocular Crowd Reconstruction with Upright Space

no code implementations 9 Nov 2024 Jing Huang, Hao Wen, Tianyi Zhou, Haozhe Lin, Yu-Kun Lai, Kun Li

This paper aims to reconstruct hundreds of people's 3D poses, shapes, and locations from a single image with unknown camera parameters.

Ripple: Accelerating LLM Inference on Smartphones with Correlation-Aware Neuron Management

no code implementations 25 Oct 2024 Tuowei Wang, Ruwen Fan, Minxing Huang, Zixu Hao, Kun Li, Ting Cao, Youyou Lu, Yaoxue Zhang, Ju Ren

In this paper, we propose Ripple, a novel approach that accelerates LLM inference on smartphones by optimizing neuron placement in flash memory.

Management

Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model

no code implementations 17 Oct 2024 Yida Xiong, Kun Li, Weiwei Liu, Jia Wu, Bo Du, Shirui Pan, Wenbin Hu

TransDLM leverages standardized chemical nomenclature as semantic representations of molecules and implicitly embeds property requirements into textual descriptions, thereby preventing error propagation during the diffusion process.

Drug Discovery Language Modeling +2

Joint Trajectory Replanning for Mars Ascent Vehicle under Propulsion System Faults: A Suboptimal Learning-Based Warm-Start Approach

no code implementations 29 Sep 2024 Kun Li, Guangtao Ran, Yanning Guo, Ju H. Park, Yao Zhang

During the Mars ascent vehicle (MAV) launch missions, when encountering a thrust drop type of propulsion system fault problem, the general trajectory replanning methods relying on step-by-step judgments may fail to make timely decisions, potentially leading to mission failure.

Decision Making

Offline Signature Verification Based on Feature Disentangling Aided Variational Autoencoder

no code implementations 29 Sep 2024 Hansong Zhang, Jiangjian Guo, Kun Li, Yang Zhang, Yimei Zhao

First, genuine signatures and skilled forgeries are highly similar in their appearances, resulting in a small inter-class distance.

ZeroMamba: Exploring Visual State Space Model for Zero-Shot Learning

no code implementations 27 Aug 2024 Wenjin Hou, Dingjie Fu, Kun Li, Shiming Chen, Hehe Fan, Yi Yang

Due to the limited receptive fields of CNNs and the quadratic complexity of ViTs, however, these visual backbones achieve suboptimal visual-semantic interactions.

Mamba Representation Learning +1

HumanCoser: Layered 3D Human Generation via Semantic-Aware Diffusion Model

no code implementations 21 Aug 2024 Yi Wang, Jian Ma, Ruizhi Shao, Qiao Feng, Yu-Kun Lai, Kun Li

Specifically, to achieve layer-wise clothing generation, we propose a dual-representation decoupling framework for generating clothing decoupled from the human body, in conjunction with an innovative multi-layer fusion volume rendering method.

Human Animation Virtual Try-on

Fragment-Masked Molecular Optimization

no code implementations 17 Aug 2024 Kun Li, Xiantao Cai, Jia Wu, Bo Du, Wenbin Hu

Molecular optimization is a crucial aspect of drug discovery, aimed at refining molecular structures to enhance drug efficacy and minimize side effects, ultimately accelerating the overall drug development process.

Drug Discovery

Prototype Learning for Micro-gesture Classification

no code implementations 6 Aug 2024 Guoliang Chen, Fei Wang, Kun Li, Zhiliang Wu, Hehe Fan, Yi Yang, Meng Wang, Dan Guo

In this paper, we briefly introduce the solution developed by our team, HFUT-VUT, for the track of Micro-gesture Classification in the MiGA challenge at IJCAI 2024.

Action Recognition Classification +2

A Novel Edge Laplacian-based Approach for Adaptive Formation Control of Uncertain Multi-agent Systems with Unified Relative Error Performance

no code implementations 1 Aug 2024 Kun Li, Kai Zhao, Yongduan Song, Lihua Xie

For most existing prescribed performance formation control methods, performance requirements are not directly imposed on the relative states between agents but on the consensus error, which lacks a clear physical interpretation of their solution.

PAS: Data-Efficient Plug-and-Play Prompt Augmentation System

no code implementations 8 Jul 2024 Miao Zheng, Hao Liang, Fan Yang, Haoze Sun, Tianpeng Li, Lingchu Xiong, Yan Zhang, Youzhen Wu, Kun Li, Yanjun Shen, MingAn Lin, Tao Zhang, Guosheng Dong, Yujing Qiao, Kun Fang, WeiPeng Chen, Bin Cui, Wentao Zhang, Zenan Zhou

This combination of high performance, efficiency, and flexibility makes PAS a valuable system for enhancing the usability and effectiveness of LLMs through improved prompt engineering.

Prompt Engineering

SBoRA: Low-Rank Adaptation with Regional Weight Updates

1 code implementation 7 Jul 2024 Lai-Man Po, Yuyang Liu, Haoxuan Wu, Tianqi Zhang, Wing-Yin Yu, Zhuohan Wang, Zeyu Jiang, Kun Li

This paper introduces Standard Basis LoRA (SBoRA), a novel parameter-efficient fine-tuning approach for Large Language Models that builds upon the pioneering works of Low-Rank Adaptation (LoRA) and Orthogonal Adaptation.

Arithmetic Reasoning parameter-efficient fine-tuning

MMAD: Multi-label Micro-Action Detection in Videos

1 code implementation 7 Jul 2024 Kun Li, Dan Guo, Pengyu Liu, Guoliang Chen, Meng Wang

To support the MMAD task, we introduce a new dataset named Multi-label Micro-Action-52 (MMA-52), specifically designed to facilitate the detailed analysis and exploration of complex human micro-actions.

Action Detection

Micro-gesture Online Recognition using Learnable Query Points

no code implementations 5 Jul 2024 Pengyu Liu, Fei Wang, Kun Li, Guoliang Chen, Yanyan Wei, Shengeng Tang, Zhiliang Wu, Dan Guo

The Micro-gesture Online Recognition task involves identifying the category and locating the start and end times of micro-gestures in video clips.

Action Detection

Solving Motion Planning Tasks with a Scalable Generative Model

1 code implementation 3 Jul 2024 Yihan Hu, Siqi Chai, Zhening Yang, Jingyu Qian, Kun Li, Wenxin Shao, Haichao Zhang, Wei Xu, Qiang Liu

We conclude that the proposed generative model may serve as a foundation for a variety of motion planning tasks, including data generation, simulation, planning, and online training.

Autonomous Driving Motion Planning +1

Purple-teaming LLMs with Adversarial Defender Training

no code implementations 1 Jul 2024 Jingyan Zhou, Kun Li, Junan Li, Jiawen Kang, Minda Hu, Xixin Wu, Helen Meng

In PAD, we automatically collect conversational data that cover the vulnerabilities of an LLM around specific safety risks in a self-play manner, where the attacker aims to elicit unsafe responses and the defender generates safe responses to these attacks.

Generative Adversarial Network Red Teaming

Learning from Exemplars for Interactive Image Segmentation

no code implementations 17 Jun 2024 Kun Li, Hao Cheng, George Vosselman, Michael Ying Yang

Previous studies have demonstrated impressive performance in extracting a single target mask through interactive segmentation.

Image Segmentation Interactive Segmentation +2

Adaptive Query Rewriting: Aligning Rewriters through Marginal Probability of Conversational Answers

no code implementations 16 Jun 2024 Tianhua Zhang, Kun Li, Hongyin Luo, Xixin Wu, James Glass, Helen Meng

A novel approach is then proposed to assess the retriever's preference for these candidates by the probability of answers conditioned on the conversational query by marginalizing the Top-$K$ passages.

Conversational Question Answering Passage Retrieval +1

A Two-Stage Adverse Weather Semantic Segmentation Method for WeatherProof Challenge CVPR 2024 Workshop UG2+

no code implementations 8 Jun 2024 Jianzhao Wang, Yanyan Wei, Dehua Hu, Yilin Zhang, Shengeng Tang, Kun Li, Zhao Zhang

In the second stage, we employ the InternImage network to train for the semantic segmentation task using the generated pseudo ground truths.

Rain Removal Segmentation +2

Joint Spatial-Temporal Modeling and Contrastive Learning for Self-supervised Heart Rate Measurement

no code implementations 7 Jun 2024 Wei Qian, Qi Li, Kun Li, Xinke Wang, Xiao Sun, Meng Wang, Dan Guo

This paper briefly introduces the solutions developed by our team, HFUT-VUT, for Track 1 of self-supervised heart rate measurement in the 3rd Vision-based Remote Physiological Signal Sensing (RePSS) Challenge hosted at IJCAI 2024.

Contrastive Learning Self-Supervised Learning

Beyond Imitation: Learning Key Reasoning Steps from Dual Chain-of-Thoughts in Reasoning Distillation

1 code implementation 30 May 2024 Chengwei Dai, Kun Li, Wei Zhou, Songlin Hu

Further analysis shows that EDIT can generate high-quality CoTs with more correct key reasoning steps.

Imitation Learning

Improve Student's Reasoning Generalizability through Cascading Decomposed CoTs Distillation

1 code implementation 30 May 2024 Chengwei Dai, Kun Li, Wei Zhou, Songlin Hu

Specifically, by restructuring the training objectives -- removing the answer from outputs and concatenating the question with the rationale as input -- CasCoD's two-step learning process ensures that students focus on learning rationales without interference from the preset answers, thus improving reasoning generalizability.

Diversity

Regressor-free Molecule Generation to Support Drug Response Prediction

no code implementations 23 May 2024 Kun Li, Xiuwen Gong, Shirui Pan, Jia Wu, Bo Du, Wenbin Hu

As a result, we introduce regressor-free guidance molecule generation to ensure sampling within a more effective space and support DRP.

Common Sense Reasoning Drug Response Prediction +2

Mix of Experts Language Model for Named Entity Recognition

no code implementations 30 Apr 2024 Xinwei Chen, Kun Li, Tianyou Song, Jiangjian Guo

Named Entity Recognition (NER) is an essential steppingstone in the field of natural language processing.

Language Modeling Language Modelling +3

Few-shot Name Entity Recognition on StackOverflow

no code implementations 15 Apr 2024 Xinwei Chen, Kun Li, Tianyou Song, Jiangjian Guo

StackOverflow, with its vast question repository and limited labeled examples, raises an annotation challenge for us.

Meta-Learning named-entity-recognition +2

AUG: A New Dataset and An Efficient Model for Aerial Image Urban Scene Graph Generation

no code implementations 11 Apr 2024 Yansheng Li, Kun Li, Yongjun Zhang, LinLin Wang, Dingwen Zhang

To fill in the gap of the overhead view dataset, this paper constructs and releases an aerial image urban scene graph generation (AUG) dataset.

Graph Generation Relationship Detection +1

LPSNet: End-to-End Human Pose and Shape Estimation with Lensless Imaging

no code implementations CVPR 2024 Haoyang Ge, Qiao Feng, Hailong Jia, Xiongzheng Li, Xiangjun Yin, You Zhou, Jingyu Yang, Kun Li

Human pose and shape (HPS) estimation with lensless imaging is not only beneficial to privacy protection but also can be used in covert surveillance scenarios due to the small size and simple structure of this device.

Decoder

Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding

1 code implementation 21 Mar 2024 Jingjing Hu, Dan Guo, Kun Li, Zhan Si, Xun Yang, Xiaojun Chang, Meng Wang

Inspired by the activity-silent and persistent activity mechanisms in human visual perception biology, we design a Unified Static and Dynamic Network (UniSDNet), to learn the semantic association between the video and text/audio queries in a cross-modal environment for efficient video grounding.

Video Grounding

Prototyping and Experimental Results for Environment-Aware Millimeter Wave Beam Alignment via Channel Knowledge Map

no code implementations 13 Mar 2024 Zhuoyin Dai, Di wu, Zhenjun Dong, Kun Li, Dingyang Ding, Sihan Wang, Yong Zeng

In this paper, to alleviate the large training overhead in millimeter wave (mmWave) beam alignment, an environment-aware and training-free beam alignment prototype is established based on a typical CKM, termed beam index map (BIM).

Frequency Decoupling for Motion Magnification via Multi-Level Isomorphic Architecture

2 code implementations CVPR 2024 Fei Wang, Dan Guo, Kun Li, Zhun Zhong, Meng Wang

To this end, we present FD4MM, a new paradigm of Frequency Decoupling for Motion Magnification with a Multi-level Isomorphic Architecture to capture multi-level high-frequency details and a stable low-frequency structure (motion field) in video space.

Motion Magnification Representation Learning

Benchmarking Micro-action Recognition: Dataset, Methods, and Applications

1 code implementation 8 Mar 2024 Dan Guo, Kun Li, Bin Hu, Yan Zhang, Meng Wang

It offers insights into the feelings and intentions of individuals and is important for human-oriented applications such as emotion recognition and psychological assessment.

Benchmarking Micro-Action Recognition

Convincing Rationales for Visual Question Answering Reasoning

1 code implementation 6 Feb 2024 Kun Li, George Vosselman, Michael Ying Yang

Visual Question Answering (VQA) is a challenging task of predicting the answer to a question about the content of an image.

Question Answering Visual Question Answering

KeDuSR: Real-World Dual-Lens Super-Resolution via Kernel-Free Matching

1 code implementation 28 Dec 2023 Huanjing Yue, Zifan Cui, Kun Li, Jingyu Yang

Different from them, we propose to first align the Ref with the center region (namely the overlapped FoV area) of the LR input by combining global warping and local warping, so that the aligned Ref is sharp and consistent.

Super-Resolution

CLDR: Contrastive Learning Drug Response Models from Natural Language Supervision

1 code implementation 17 Dec 2023 Kun Li, Wenbin Hu

At the same time, in order to enhance the continuous representation capability of the numerical text, a common-sense numerical knowledge graph is introduced.

Common Sense Reasoning Contrastive Learning +2

Joint2Human: High-quality 3D Human Generation via Compact Spherical Embedding of 3D Joints

no code implementations CVPR 2024 Muxin Zhang, Qiao Feng, Zhuo Su, Chao Wen, Zhou Xue, Kun Li

In this work, we introduce Joint2Human, a novel method that leverages 2D diffusion models to generate detailed 3D human geometry directly, ensuring both global structure and local details.

3D Generation

R2Human: Real-Time 3D Human Appearance Rendering from a Single Image

no code implementations 10 Dec 2023 Yuanwang Yang, Qiao Feng, Yu-Kun Lai, Kun Li

In this paper, we propose R2Human, the first approach for real-time inference and rendering of photorealistic 3D human appearance from a single image.

Neural Rendering

Layered 3D Human Generation via Semantic-Aware Diffusion Model

no code implementations 10 Dec 2023 Yi Wang, Jian Ma, Ruizhi Shao, Qiao Feng, Yu-Kun Lai, Yebin Liu, Kun Li

To keep the generated clothing consistent with the target text, we propose a semantic-confidence strategy for clothing that can eliminate the non-clothing content generated by the model.

EulerMormer: Robust Eulerian Motion Magnification via Dynamic Filtering within Transformer

1 code implementation 7 Dec 2023 Fei Wang, Dan Guo, Kun Li, Meng Wang

Then, we introduce a novel dynamic filter that eliminates noise cues and preserves critical features in the motion magnification and amplification generation phases.

Denoising Motion Magnification

SpeechAct: Towards Generating Whole-body Motion from Speech

no code implementations 29 Nov 2023 Jinsong Zhang, Minjie Zhu, Yuxiang Zhang, Yebin Liu, Kun Li

Then, we regress the motion representation from the audio signal by a translation model employing our contrastive motion learning method.

Decoder Motion Generation

High-Quality Animatable Dynamic Garment Reconstruction from Monocular Videos

no code implementations 2 Nov 2023 Xiongzheng Li, Jinsong Zhang, Yu-Kun Lai, Jingyu Yang, Kun Li

To alleviate the ambiguity in estimating 3D garments from monocular videos, we design a multi-hypothesis deformation module that learns spatial representations of multiple plausible deformations.

Garment Reconstruction

Towards Grouping in Large Scenes with Occlusion-aware Spatio-temporal Transformers

no code implementations 30 Oct 2023 Jinsong Zhang, Lingfeng Gu, Yu-Kun Lai, Xueyang Wang, Kun Li

To explore the potential spatio-temporal relationship, we propose spatio-temporal transformers to simultaneously extract trajectory information and fuse inter-person features in a hierarchical manner.

CT-GAT: Cross-Task Generative Adversarial Attack based on Transferability

1 code implementation 22 Oct 2023 Minxuan Lv, Chengwei Dai, Kun Li, Wei Zhou, Songlin Hu

Neural network models are vulnerable to adversarial examples, and adversarial transferability further increases the risk of adversarial attacks.

Adversarial Attack

MeaeQ: Mount Model Extraction Attacks with Efficient Queries

1 code implementation 21 Oct 2023 Chengwei Dai, Minxuan Lv, Kun Li, Wei Zhou

We study model extraction attacks in natural language processing (NLP) where attackers aim to steal victim models by repeatedly querying the open Application Programming Interfaces (APIs).

Active Learning Diversity +1

Transformer-based Multimodal Change Detection with Multitask Consistency Constraints

1 code implementation 13 Oct 2023 BiYuan Liu, HuaiXin Chen, Kun Li, Michael Ying Yang

We observe that the current change detection methods struggle with the multitask conflicts between semantic and height change detection tasks.

Change Detection Earth Observation

A New Transformation Approach for Uplift Modeling with Binary Outcome

no code implementations 9 Oct 2023 Kun Li, Liangshu Zhu

The main drawback of these approaches is that, in general, they do not use the information in the treatment indicator beyond the construction of the transformed outcome and are usually not efficient.

Marketing

Dual-Path Temporal Map Optimization for Make-up Temporal Video Grounding

1 code implementation 12 Sep 2023 Jiaxiu Li, Kun Li, Jia Li, Guoliang Chen, Dan Guo, Meng Wang

Compared with the general video grounding task, MTVG focuses on meticulous actions and changes on the face.

Sentence text similarity +1

Exploiting Diverse Feature for Multimodal Sentiment Analysis

no code implementations 25 Aug 2023 Jia Li, Wei Qian, Kun Li, Qi Li, Dan Guo, Meng Wang

Specifically, we achieve the results of 0.8492 and 0.8439 for MuSe-Personalisation in terms of arousal and valence CCC.

Multimodal Sentiment Analysis

Dual-path TokenLearner for Remote Photoplethysmography-based Physiological Measurement with Facial Videos

1 code implementation 15 Aug 2023 Wei Qian, Dan Guo, Kun Li, Xilan Tian, Meng Wang

Specifically, the proposed Dual-TL uses a Spatial TokenLearner (S-TL) to explore associations in different facial ROIs, which keeps the rPPG prediction far away from noisy ROI disturbances.

ViGT: Proposal-free Video Grounding with Learnable Token in Transformer

no code implementations 11 Aug 2023 Kun Li, Dan Guo, Meng Wang

First, we employed a sharing feature encoder to project both video and query into a joint feature space before performing cross-modal co-attention (i.e., video-to-query attention and query-to-video attention) to highlight discriminative features in each modality.

Feature Correlation regression +1

Data Augmentation for Human Behavior Analysis in Multi-Person Conversations

no code implementations 3 Aug 2023 Kun Li, Dan Guo, Guoliang Chen, Feiyang Liu, Meng Wang

In this paper, we present the solution of our team HFUT-VUT for the MultiMediate Grand Challenge 2023 at ACM Multimedia 2023.

FusionAD: Multi-modality Fusion for Prediction and Planning Tasks of Autonomous Driving

1 code implementation 2 Aug 2023 Tengju Ye, Wei Jing, Chunyong Hu, Shikun Huang, Lingping Gao, Fangzhen Li, Jingke Wang, Ke Guo, Wencong Xiao, Weibo Mao, Hang Zheng, Kun Li, Junbo Chen, Kaicheng Yu

Building a multi-modality multi-task neural network toward accurate and robust performance is a de-facto standard in the perception task of autonomous driving.

Autonomous Driving Prediction

Joint Skeletal and Semantic Embedding Loss for Micro-gesture Classification

1 code implementation 20 Jul 2023 Kun Li, Dan Guo, Guoliang Chen, Xinge Peng, Meng Wang

In this paper, we briefly introduce the solution of our team HFUT-VUT for the Micro-gesture Classification in the MiGA challenge at IJCAI 2023.

Action Classification Classification +2

Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer

1 code implementation ICCV 2023 Wing-Yin Yu, Lai-Man Po, Ray C. C. Cheung, Yuzhi Zhao, Yu Xue, Kun Li

To address these issues, we propose a novel Deformable Motion Modulation (DMM) that utilizes geometric kernel offset with adaptive weight modulation to simultaneously perform feature alignment and style transfer.

motion prediction Pose Transfer +2

ATWM: Defense against adversarial malware based on adversarial training

no code implementations 11 Jul 2023 Kun Li, Fan Zhang, Wei Guo

In order to defend against malware attacks, researchers have proposed many Windows malware detection models based on deep learning.

Adversarial Defense Deep Learning +1

CMDFusion: Bidirectional Fusion Network with Cross-modality Knowledge Distillation for LIDAR Semantic Segmentation

1 code implementation 9 Jul 2023 Jun Cen, Shiwei Zhang, Yixuan Pei, Kun Li, Hang Zheng, Maochun Luo, Yingya Zhang, Qifeng Chen

In this way, RGB images are not required during inference anymore since the 2D knowledge branch provides 2D information according to the 3D LIDAR input.

Autonomous Vehicles Knowledge Distillation +2

Interactive Image Segmentation with Cross-Modality Vision Transformers

1 code implementation 5 Jul 2023 Kun Li, George Vosselman, Michael Ying Yang

Interactive image segmentation aims to segment the target from the background with the manual guidance, which takes as input multimodal data such as images, clicks, scribbles, and bounding boxes.

Image Segmentation Interactive Segmentation +2

Efficient HDR Reconstruction from Real-World Raw Images

no code implementations 17 Jun 2023 Qirui Yang, Yihao Liu, Qihua Chen, Huanjing Yue, Kun Li, Jingyu Yang

The widespread usage of high-definition screens on edge devices stimulates a strong demand for efficient high dynamic range (HDR) algorithms.

4k Denoising +2

Physics-Informed Ensemble Representation for Light-Field Image Super-Resolution

1 code implementation 31 May 2023 Manchang Jin, Gaosheng Liu, Kunshu Hu, Xin Luo, Kun Li, Jingyu Yang

Recent learning-based approaches have achieved significant progress in light field (LF) image super-resolution (SR) by exploring convolution-based or transformer-based network structures.

Decoder Image Super-Resolution

FGAM: Fast Adversarial Malware Generation Method Based on Gradient Sign

no code implementations 22 May 2023 Kun Li, Fan Zhang, Wei Guo

Adversarial attacks deceive deep learning models by generating adversarial samples.

Deep Learning Malware Detection

SGP-TOD: Building Task Bots Effortlessly via Schema-Guided LLM Prompting

no code implementations 15 May 2023 Xiaoying Zhang, Baolin Peng, Kun Li, Jingyan Zhou, Helen Meng

Building end-to-end task bots and maintaining their integration with new functionalities using minimal human efforts is a long-standing challenge in dialog research.

dialog state tracking

Standardized Benchmark Dataset for Localized Exposure to a Realistic Source at 10$-$90 GHz

1 code implementation 3 May 2023 Ante Kapetanovic, Dragan Poljak, Kun Li

To address this issue, in this paper, the limited available data on the incident power density and resultant maximum temperature rise on the skin surface considering various steady-state exposure scenarios at 10$-$90 GHz have been statistically modeled.

HybridFusion: LiDAR and Vision Cross-Source Point Cloud Fusion

no code implementations 10 Apr 2023 Yu Wang, Shuhui Bu, Lin Chen, Yifei Dong, Kun Li, Xuefeng Cao, Ke Li

First, the point cloud is divided into small patches, and a matching patch set is selected based on global descriptors and spatial distribution, which constitutes the coarse matching process.

Point Cloud Registration

Narrator: Towards Natural Control of Human-Scene Interaction Generation via Relationship Reasoning

no code implementations ICCV 2023 Haibiao Xuan, Xiongzheng Li, Jinsong Zhang, Hongwen Zhang, Yebin Liu, Kun Li

Also, we model global and local spatial relationships in a 3D scene and a textual description respectively based on the scene graph, and introduce a part-level action mechanism to represent interactions as atomic body part states.

Angle-Constrained Formation Control under Directed Non-Triangulated Sensing Graphs (Extended Version)

no code implementations 6 Mar 2023 Kun Li, Zhixi Shen, Gangshan Jing, Yongduan Song

Angle-constrained formation control has attracted much attention from the control community due to the advantage that inter-edge angles are invariant under uniform translations, rotations, and scalings of the whole formation.

Causal Inference Based Single-branch Ensemble Trees For Uplift Modeling

no code implementations 3 Feb 2023 Fanglan Zheng, Menghan Wang, Kun Li, Jiang Tian, Xiaojia Xiang

In this manuscript (ms), we propose causal inference based single-branch ensemble trees for uplift modeling, namely CIET.

Causal Inference

Crowd3D: Towards Hundreds of People Reconstruction from a Single Image

no code implementations CVPR 2023 Hao Wen, Jing Huang, Huili Cui, Haozhe Lin, Yukun Lai, Lu Fang, Kun Li

However, existing methods cannot deal with large scenes containing hundreds of people, which encounter the challenges of a large number of people, large variations in human scale, and complex spatial distribution.

Fleet Rebalancing for Expanding Shared e-Mobility Systems: A Multi-agent Deep Reinforcement Learning Approach

1 code implementation 11 Nov 2022 Man Luo, Bowen Du, Wenzhe Zhang, Tianyou Song, Kun Li, HongMing Zhu, Mark Birkin, Hongkai Wen

This is particularly challenging in the context of expanding systems, because i) the range of the EVs is limited while charging time is typically long, which constrain the viable rebalancing operations; and ii) the EV stations in the system are dynamically changing, i.e., the legitimate targets for rebalancing operations can vary over time.

Deep Reinforcement Learning Multi-agent Reinforcement Learning

Towards a Better Model with Dual Transformer for Drug Response Prediction

1 code implementation 23 Oct 2022 Kun Li, Jia Wu, Bo Du, Sergey V. Petoukhov, Huiting Xu, Zheman Xiao, Wenbin Hu

For the branch of cell lines genomics, we use the multi-headed attention mechanism to globally represent the genomics sequence.

Drug Response Prediction

ITSRN++: Stronger and Better Implicit Transformer Network for Continuous Screen Content Image Super-Resolution

no code implementations 17 Oct 2022 Sheng Shen, Huanjing Yue, Jingyu Yang, Kun Li

Specifically, we propose a modulation based transformer as the upsampler, which modulates the pixel features in discrete space via a periodic nonlinear function to generate features for continuous pixels.

Image Super-Resolution

FOF: Learning Fourier Occupancy Field for Monocular Real-time Human Reconstruction

no code implementations5 Jun 2022 Qiao Feng, Yebin Liu, Yu-Kun Lai, Jingyu Yang, Kun Li

Based on FOF, we design the first 30+FPS high-fidelity real-time monocular human reconstruction framework.

Ranked #2 on 3D Human Reconstruction on CustomHumans (using extra training data)

3D Human Reconstruction

HDhuman: High-quality Human Novel-view Rendering from Sparse Views

no code implementations20 Jan 2022 Tiansong Zhou, Jing Huang, Tao Yu, Ruizhi Shao, Kun Li

To this end, we propose HDhuman, which uses a human reconstruction network with a pixel-aligned spatial transformer and a rendering network with geometry-guided pixel-wise feature integration to achieve high-quality human reconstruction and rendering.

2k Neural Rendering +2

Group Gated Fusion on Attention-based Bidirectional Alignment for Multimodal Emotion Recognition

1 code implementation17 Jan 2022 PengFei Liu, Kun Li, Helen Meng

Emotion recognition is a challenging and actively-studied research area that plays a critical role in emotion-aware human-computer interaction systems.

Multimodal Emotion Recognition

High-Fidelity Human Avatars From a Single RGB Camera

no code implementations CVPR 2022 Hao Zhao, Jinsong Zhang, Yu-Kun Lai, Zerong Zheng, Yingdi Xie, Yebin Liu, Kun Li

To cope with the complexity of textures and generate photo-realistic results, we propose a reference-based neural rendering network and exploit a bottom-up sharpening-guided fine-tuning strategy to obtain detailed textures.

Neural Rendering Vocal Bursts Intensity Prediction

Implicit Transformer Network for Screen Content Image Continuous Super-Resolution

1 code implementation NeurIPS 2021 Jingyu Yang, Sheng Shen, Huanjing Yue, Kun Li

Nowadays, there is an explosive growth of screen contents due to the wide application of screen sharing, remote cooperation, and online education.

Super-Resolution

Large-Scale Hyperspectral Image Clustering Using Contrastive Learning

1 code implementation15 Nov 2021 Yaoming Cai, Zijia Zhang, Yan Liu, Pedram Ghamisi, Kun Li, Xiaobo Liu, Zhihua Cai

Specifically, we exploit a symmetric twin neural network comprised of a projection head with a dimensionality of the cluster number to conduct dual contrastive learning from a spectral-spatial augmentation pool.

Clustering Contrastive Learning +3

LSTM-RPA: A Simple but Effective Long Sequence Prediction Algorithm for Music Popularity Prediction

1 code implementation27 Oct 2021 Kun Li, Meng Li, Yanling Li, Min Lin

The traditional trend prediction models can better predict the short trend than the long trend.

Prediction

Real-Time Anchor-Free Single-Stage 3D Detection with IoU-Awareness

no code implementations29 Jul 2021 Runzhou Ge, Zhuangzhuang Ding, Yihan Hu, Wenxin Shao, Li Huang, Kun Li, Qiang Liu

Extended from our last year's award-winning model AFDet, we have made a handful of modifications to the base model, to improve the accuracy and at the same time to greatly reduce the latency.

Data Augmentation

Economic Recession Prediction Using Deep Neural Network

no code implementations21 Jul 2021 ZiHao Wang, Kun Li, Steve Q. Xia, Hongfu Liu

We investigate the effectiveness of different machine learning methodologies in predicting economic cycles.

BIG-bench Machine Learning Prediction

Out-of-Scope Domain and Intent Classification through Hierarchical Joint Modeling

1 code implementation30 Apr 2021 PengFei Liu, Kun Li, Helen Meng

User queries for a real-world dialog system may sometimes fall outside the scope of the system's capabilities, but appropriate system responses will enable smooth processing throughout the human-computer interaction.

Classification General Classification +3

Open Intent Discovery through Unsupervised Semantic Clustering and Dependency Parsing

1 code implementation25 Apr 2021 PengFei Liu, Youzhang Ning, King Keung Wu, Kun Li, Helen Meng

This paper presents an unsupervised two-stage approach to discover intents and generate meaningful intent labels automatically from a collection of unlabeled utterances in a domain.

Clustering Dependency Parsing +4

An Accurate and Efficient Large-scale Regression Method through Best Friend Clustering

no code implementations22 Apr 2021 Kun Li, Liang Yuan, Yunquan Zhang, Gongwei Chen

As the data size in Machine Learning fields grows exponentially, it becomes necessary to accelerate computation by utilizing the ever-growing number of cores provided by high-performance computing hardware.

Clustering regression

PISE: Person Image Synthesis and Editing with Decoupled GAN

1 code implementation CVPR 2021 Jinsong Zhang, Kun Li, Yu-Kun Lai, Jingyu Yang

The results of qualitative and quantitative experiments demonstrate the superiority of our model on human pose transfer.

Human Parsing Pose Transfer

Polycaprolactone/graphite nanoplates composite nanopapers

no code implementations25 Jan 2021 Kun Li, Daniele Battegazzore, Orietta Monticelli, Alberto Fina

Nanopapers based on graphene and related materials were recently proposed for heat spreader applications.

Materials Science

PoNA: Pose-guided Non-local Attention for Human Pose Transfer

1 code implementation13 Dec 2020 Kun Li, Jinsong Zhang, Yebin Liu, Yu-Kun Lai, Qionghai Dai

In each block, we propose a pose-guided non-local attention (PoNA) mechanism with a long-range dependency scheme to select more important regions of image features to transfer.

Generative Adversarial Network Person Re-Identification +1

Human Pose Transfer by Adaptive Hierarchical Deformation

1 code implementation13 Dec 2020 Jinsong Zhang, Xingzi Liu, Kun Li

Existing methods cannot effectively utilize the input information and often fail to preserve the style and shape of hair and clothes.

Pose Transfer Semantic Parsing +1

Constituency Lattice Encoding for Aspect Term Extraction

1 code implementation COLING 2020 Yunyi Yang, Kun Li, Xiaojun Quan, Weizhou Shen, Qinliang Su

One of the remaining challenges for aspect term extraction in sentiment analysis resides in the extraction of phrase-level aspect terms, where determining the boundaries of such terms is non-trivial.

Aspect Term Extraction and Sentiment Classification Sentence +1

GPS-Net: Graph-based Photometric Stereo Network

no code implementations NeurIPS 2020 Zhuokun Yao, Kun Li, Ying Fu, Haofeng Hu, Boxin Shi

For all-pixel operation, we propose the Normal Regression Network to make efficient use of the intra-image spatial information for predicting a surface normal map with rich details.

Cross-MPI: Cross-scale Stereo for Image Super-Resolution using Multiplane Images

no code implementations CVPR 2021 Yuemei Zhou, Gaochang Wu, Ying Fu, Kun Li, Yebin Liu

Various combinations of cameras enrich computational photography, among which reference-based superresolution (RefSR) plays a critical role in multiscale imaging systems.

Image Super-Resolution

Unsupervised Pre-training for Biomedical Question Answering

no code implementations27 Sep 2020 Vaishnavi Kommaraju, Karthick Gunasekaran, Kun Li, Trapit Bansal, Andrew McCallum, Ivana Williams, Ana-Maria Istrate

We explore the suitability of unsupervised representation learning methods on biomedical text -- BioBERT, SciBERT, and BioSentVec -- for biomedical question answering.

Question Answering Representation Learning +1

A Vertical Federated Learning Method for Interpretable Scorecard and Its Application in Credit Scoring

no code implementations14 Sep 2020 Fanglan Zheng, Erihe, Kun Li, Jiang Tian, Xiaojia Xiang

With the success of big data and artificial intelligence in many fields, the applications of big data driven models are expected in financial risk management especially credit scoring and rating.

Management Vertical Federated Learning

Visual-speech Synthesis of Exaggerated Corrective Feedback

no code implementations12 Sep 2020 Yaohua Bu, Weijun Li, Tianyi Ma, Shengqi Chen, Jia Jia, Kun Li, Xiaobo Lu

To provide more discriminative feedback for the second language (L2) learners to better identify their mispronunciation, we propose a method for exaggerated visual-speech feedback in computer-assisted pronunciation training (CAPT).

Speech Synthesis

Adaptive 3D Face Reconstruction from a Single Image

no code implementations8 Jul 2020 Kun Li, Jing Yang, Nianhong Jiao, Jinsong Zhang, Yu-Kun Lai

3D face reconstruction from a single image is a challenging problem, especially under partial occlusions and extreme poses.

3D Face Reconstruction Pose Estimation

A Federated F-score Based Ensemble Model for Automatic Rule Extraction

no code implementations7 Jul 2020 Kun Li, Fanglan Zheng, Jiang Tian, Xiaojia Xiang

In this manuscript, we propose a federated F-score based ensemble tree model for automatic rule extraction, namely Fed-FEARE.

Federated Learning Marketing

Conditional Augmentation for Aspect Term Extraction via Masked Sequence-to-Sequence Generation

no code implementations ACL 2020 Kun Li, Chengbo Chen, Xiaojun Quan, Qing Ling, Yan Song

In this paper, we formulate the data augmentation as a conditional generation task: generating a new sentence while preserving the original opinion targets and labels.

Data Augmentation Extract Aspect +3

Accurate 3D Localization for MAV Swarms by UWB and IMU Fusion

1 code implementation28 Jul 2018 Jiaxin Li, Yingcai Bi, Kun Li, Kangli Wang, Feng Lin, Ben M. Chen

Driven by applications like Micro Aerial Vehicles (MAVs), driver-less cars, etc., localization has become an active research topic in the past decade.

Robotics

Meta Inverse Reinforcement Learning via Maximum Reward Sharing for Human Motion Analysis

no code implementations7 Oct 2017 Kun Li, Joel W. Burdick

Observing that each demonstrator has an inherent reward for each state and the task-specific behaviors mainly depend on a small number of key states, we propose a meta IRL algorithm that first models the reward function for each task as a distribution conditioned on a baseline reward function shared by all tasks and dependent only on the demonstrator, and then finds the most likely reward function in the distribution that explains the task-specific behaviors.

reinforcement-learning Reinforcement Learning +1

A Function Approximation Method for Model-based High-Dimensional Inverse Reinforcement Learning

no code implementations23 Aug 2017 Kun Li, Joel W. Burdick

This work handles the inverse reinforcement learning problem in high-dimensional state spaces, which relies on an efficient solution of model-based high-dimensional reinforcement learning problems.

reinforcement-learning Reinforcement Learning +1

Bellman Gradient Iteration for Inverse Reinforcement Learning

no code implementations24 Jul 2017 Kun Li, Yanan Sui, Joel W. Burdick

We introduce a strategy to flexibly handle different types of actions with two approximations of the Bellman Optimality Equation, and a Bellman Gradient Iteration method to compute the gradient of the Q-value with respect to the reward function.

reinforcement-learning Reinforcement Learning +1

Clinical Patient Tracking in the Presence of Transient and Permanent Occlusions via Geodesic Feature

no code implementations22 Jul 2017 Kun Li, Joel W. Burdick

This paper develops a method to use RGB-D cameras to track the motions of a human spinal cord injury patient undergoing spinal stimulation and physical rehabilitation.

Robust Non-Rigid Registration with Reweighted Position and Transformation Sparsity

no code implementations15 Mar 2017 Kun Li, Jingyu Yang, Yu-Kun Lai, Daoliang Guo

Non-rigid registration is challenging because it is ill-posed with high degrees of freedom and is thus sensitive to noise and outliers.

Position

Inverse Reinforcement Learning with Multi-Relational Chains for Robot-Centered Smart Home

no code implementations16 Aug 2014 Kun Li, Max Q.-H. Meng

In a robot-centered smart home, the robot observes the home states with its own sensors, and then it can change certain object states according to an operator's commands for remote operations, or imitate the operator's behaviors in the house for autonomous operations.

reinforcement-learning Reinforcement Learning +1
