Search Results for author: Kun Wu

Found 36 papers, 13 papers with code

KAT-V1: Kwai-AutoThink Technical Report

no code implementations 11 Jul 2025 Zizheng Zhan, Ken Deng, Huaixi Tang, Wen Xiang, Kun Wu, Weihao Li, Wenqiang Zhu, Jingxuan Xu, Lecheng Huang, Zongxian Feng, Shaojie Wang, Shangpeng Yan, Jiaheng Liu, Zhongyuan Peng, Zuchen Gao, Haoyang Huang, Ziqi Zhan, Yanan Wu, Yuanxing Zhang, Jian Yang, Guang Chen, Haotian Zhang, Bin Chen, Bing Yu

We present Kwaipilot-AutoThink (KAT), an open-source 40B large language model developed to address the overthinking problem in reasoning-intensive tasks, where an automatic thinking training paradigm is proposed to dynamically switch between reasoning and non-reasoning modes based on task complexity.

FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency

no code implementations 10 Jun 2025 Yifei Su, Ning Liu, Dong Chen, Zhen Zhao, Kun Wu, Meng Li, Zhiyuan Xu, Zhengping Che, Jian Tang

To effectively exploit temporal information in robotic manipulation, we propose FreqPolicy, a novel approach that first imposes frequency consistency constraints on flow-based visuomotor policies.

Action Generation Image Generation +1

HACTS: a Human-As-Copilot Teleoperation System for Robot Learning

no code implementations 31 Mar 2025 Zhiyuan Xu, Yinuo Zhao, Kun Wu, Ning Liu, Junjie Ji, Zhengping Che, Chi Harold Liu, Jian Tang

Teleoperation is essential for autonomous robot learning, especially in manipulation tasks that require human demonstrations or corrections.

Autonomous Vehicles Imitation Learning +2

Stabilization Analysis and Mode Recognition of Kerosene Supersonic Combustion: A Deep Learning Approach Based on Res-CNN-beta-VAE

no code implementations 17 Mar 2025 Weiming Xu, Tao Yang, Chang Liu, Kun Wu, Peng Zhang

The scramjet engine is a key propulsion system for hypersonic vehicles, leveraging supersonic airflow to achieve high specific impulse, making it a promising technology for aerospace applications.

Clustering Dimensionality Reduction

ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning

no code implementations 22 Dec 2024 Kun Wu, Yinuo Zhao, Zhiyuan Xu, Zhengping Che, Chengxiang Yin, Chi Harold Liu, Feifei Feng, Jian Tang

Motivated by the theoretical analysis, we propose a novel algorithm, ACL-QL, which uses two learnable adaptive weight functions to control the conservative level over each transition.

D4RL Q-Learning +1

Code generation and runtime techniques for enabling data-efficient deep learning training on GPUs

no code implementations 6 Dec 2024 Kun Wu

Together, these contributions show that code generation and runtime techniques can systematically mitigate the data management bottlenecks in deep learning training, which stem from the data-intensive nature of workloads and the oversimplification inherent in the deep learning training software stack.

Code Generation Deep Learning

TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation

1 code implementation 19 Sep 2024 Junjie Wen, Yichen Zhu, Jinming Li, Minjie Zhu, Kun Wu, Zhiyuan Xu, Ning Liu, Ran Cheng, Chaomin Shen, Yaxin Peng, Feifei Feng, Jian Tang

Vision-Language-Action (VLA) models have shown remarkable potential in visuomotor control and instruction comprehension through end-to-end learning processes.

Vision-Language-Action

Mamba Policy: Towards Efficient 3D Diffusion Policy with Hybrid Selective State Models

no code implementations 11 Sep 2024 Jiahang Cao, Qiang Zhang, Jingkai Sun, Jiaxu Wang, Hao Cheng, Yulin Li, Jun Ma, Kun Wu, Zhiyuan Xu, Yecheng Shao, Wen Zhao, Gang Han, Yijie Guo, Renjing Xu

Diffusion models have been widely employed in the field of 3D manipulation due to their efficient capability to learn distributions, allowing for precise prediction of action trajectories.

Mamba

Lifelong Histopathology Whole Slide Image Retrieval via Distance Consistency Rehearsal

1 code implementation 11 Jul 2024 Xinyu Zhu, Zhiguo Jiang, Kun Wu, Jun Shi, Yushan Zheng

Content-based histopathological image retrieval (CBHIR) has gained attention in recent years, offering the capability to return histopathology images that are content-wise similar to the query one from an established database.

Image Retrieval Retrieval

Pan-cancer Histopathology WSI Pre-training with Position-aware Masked Autoencoder

1 code implementation 10 Jul 2024 Kun Wu, Zhiguo Jiang, Kunming Tang, Jun Shi, Fengying Xie, Wei Wang, Haibo Wu, Yushan Zheng

The results have demonstrated the effectiveness and generalization of PAMA in discriminative WSI representation learning and pan-cancer WSI pre-training.

Cancer Classification Position +2

TransMA: an explainable multi-modal deep learning model for predicting properties of ionizable lipid nanoparticles in mRNA delivery

1 code implementation 8 Jul 2024 Kun Wu, Zixu Wang, Xiulong Yang, Yangyang Chen, Zhenqi Han, Jialu Zhang, Lizhuang Liu

We design the mol-attention mechanism block, enabling it to align coarse and fine-grained atomic features and capture relationships between atomic spatial and sequential structures.

Mamba

Unsupervised Domain Adaptation for Brain Vessel Segmentation through Transwarp Contrastive Learning

1 code implementation 23 Feb 2024 Fengming Lin, Yan Xia, Michael MacRaild, Yash Deo, Haoran Dou, Qiongyao Liu, Kun Wu, Nishant Ravikumar, Alejandro F. Frangi

Unsupervised domain adaptation (UDA) aims to align the labelled source distribution with the unlabelled target distribution to obtain domain-invariant predictive models.

Contrastive Learning Medical Image Analysis +1

SoMeLVLM: A Large Vision Language Model for Social Media Processing

no code implementations 20 Feb 2024 Xinnong Zhang, Haoyu Kuang, Xinyi Mou, Hanjia Lyu, Kun Wu, Siming Chen, Jiebo Luo, Xuanjing Huang, Zhongyu Wei

The powerful Large Vision Language Models make it possible to handle a variety of tasks simultaneously, but even with carefully designed prompting methods, the general domain models often fall short in aligning with the unique speaking style and context of social media tasks.

Language Modeling Language Modelling

Multi-Clue Reasoning with Memory Augmentation for Knowledge-based Visual Question Answering

no code implementations 20 Dec 2023 Chengxiang Yin, Zhengping Che, Kun Wu, Zhiyuan Xu, Jian Tang

Visual Question Answering (VQA) has emerged as one of the most challenging tasks in artificial intelligence due to its multi-modal nature.

Question Answering Visual Question Answering

Cross-Modal Reasoning with Event Correlation for Video Question Answering

no code implementations 20 Dec 2023 Chengxiang Yin, Zhengping Che, Kun Wu, Zhiyuan Xu, Qinru Qiu, Jian Tang

Video Question Answering (VideoQA) is an attractive and challenging research direction that aims to understand the complex semantics of heterogeneous data from two domains, i.e., the spatio-temporal video content and the word sequence of the question.

Question Answering Video Question Answering

Hector: An Efficient Programming and Compilation Framework for Implementing Relational Graph Neural Networks in GPU Architectures

no code implementations 16 Jan 2023 Kun Wu, Mert Hidayetoğlu, Xiang Song, Sitao Huang, Da Zheng, Israt Nisa, Wen-mei Hwu

Relational graph neural networks (RGNNs) are graph neural networks with dedicated structures for modeling the different types of nodes and edges in heterogeneous graphs.

8k C++ code +1

Continual Few-Shot Learning with Adversarial Class Storage

no code implementations 10 Jul 2022 Kun Wu, Chengxiang Yin, Jian Tang, Zhiyuan Xu, Yanzhi Wang, Dejun Yang

In this paper, we define a new problem called continual few-shot learning, in which tasks arrive sequentially and each task is associated with a few training samples.

continual few-shot learning Few-Shot Learning +1

Lesion-Aware Contrastive Representation Learning for Histopathology Whole Slide Images Analysis

1 code implementation 27 Jun 2022 Jun Li, Yushan Zheng, Kun Wu, Jun Shi, Fengying Xie, Zhiguo Jiang

In this paper, we propose a novel contrastive representation learning framework named Lesion-Aware Contrastive Learning (LACL) for histopathology whole slide image analysis.

Contrastive Learning Representation Learning +1

Faster and Better Grammar-based Text-to-SQL Parsing via Clause-level Parallel Decoding and Alignment Loss

no code implementations 26 Apr 2022 Kun Wu, Lijie Wang, Zhenghua Li, Xinyan Xiao

Grammar-based parsers have achieved high performance in the cross-domain text-to-SQL parsing task, but suffer from low decoding efficiency due to the much larger number of actions for grammar selection than that of tokens in SQL queries.

SQL Parsing Text to SQL +1

CADRE: A Cascade Deep Reinforcement Learning Framework for Vision-based Autonomous Urban Driving

1 code implementation 17 Feb 2022 Yinuo Zhao, Kun Wu, Zhiyuan Xu, Zhengping Che, Qi Lu, Jian Tang, Chi Harold Liu

Vision-based autonomous urban driving in dense traffic is quite challenging due to the complicated urban environment and the dynamics of the driving behaviors.

Deep Reinforcement Learning reinforcement-learning +1

Graph Neural Network Training with Data Tiering

no code implementations 10 Nov 2021 Seung Won Min, Kun Wu, Mert Hidayetoğlu, JinJun Xiong, Xiang Song, Wen-mei Hwu

With our data tiering method, we additionally provide a new data placement and access strategy to further minimize the CPU-GPU communication overhead.

Fraud Detection Graph Neural Network

Human Pose Transfer with Augmented Disentangled Feature Consistency

no code implementations 23 Jul 2021 Kun Wu, Chengxiang Yin, Zhengping Che, Bo Jiang, Jian Tang, Zheng Guan, Gangyi Ding

Deep generative models have made great progress in synthesizing images with arbitrary human poses and transferring poses of one person to others.

Data Augmentation Pose Transfer

Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture

1 code implementation 4 Mar 2021 Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoğlu, JinJun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu

In this work, we propose a novel GPU-oriented data communication approach for GCN training, where GPU threads directly access sparse features in host memory through zero-copy accesses without much CPU help.

Recommendation Systems

PyTorch-Direct: Enabling GPU Centric Data Access for Very Large Graph Neural Network Training with Irregular Accesses

1 code implementation 20 Jan 2021 Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoğlu, JinJun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu

While this process accounts for a significant portion of the training time, we find existing GNN implementations using popular deep neural network (DNN) libraries such as PyTorch are limited to a CPU-centric approach for the entire data preparation step.

Graph Neural Network

Hierarchical Graph Attention Network for Few-Shot Visual-Semantic Learning

no code implementations ICCV 2021 Chengxiang Yin, Kun Wu, Zhengping Che, Bo Jiang, Zhiyuan Xu, Jian Tang

Deep learning has achieved tremendous success in computer vision, natural language processing and even visual-semantic learning, which requires a huge amount of labeled training data.

Graph Attention Image Captioning +2

TEMPI: An Interposed MPI Library with a Canonical Representation of CUDA-aware Datatypes

1 code implementation 28 Dec 2020 Carl Pearson, Kun Wu, I-Hsin Chung, JinJun Xiong, Wen-mei Hwu

MPI derived datatypes are an abstraction that simplifies handling of non-contiguous data in MPI applications.

Distributed, Parallel, and Cluster Computing

Knowledge Transfer in Multi-Task Deep Reinforcement Learning for Continuous Control

1 code implementation NeurIPS 2020 Zhiyuan Xu, Kun Wu, Zhengping Che, Jian Tang, Jieping Ye

While Deep Reinforcement Learning (DRL) has emerged as a promising approach to many complex tasks, it remains challenging to train a single DRL agent that is capable of undertaking multiple different continuous control tasks.

continuous-control Continuous Control +5
