Search Results for author: Kun Hu

Found 26 papers, 14 papers with code

DC-PCN: Point Cloud Completion Network with Dual-Codebook Guided Quantization

no code implementations • 19 Jan 2025 • Qiuxia Wu, Haiyang Huang, Kunming Su, Zhiyong Wang, Kun Hu

Despite achieving encouraging results, a significant issue remains: these methods often overlook the variability in point clouds sampled from a single 3D object surface.

Decoder Point Cloud Completion +1

B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens

1 code implementation • 13 Dec 2024 • Zhuqiang Lu, Zhenfei Yin, Mengwei He, Zhihui Wang, Zicheng Liu, Zhiyong Wang, Kun Hu

To restrict the number of visual tokens, existing VLLMs either (1) uniformly downsample videos into a fixed number of frames or (2) reduce the number of visual tokens encoded from each frame.

Language Modeling Language Modelling +2

DuoCast: Duo-Probabilistic Meteorology-Aware Model for Extended Precipitation Nowcasting

1 code implementation • 2 Dec 2024 • Penghui Wen, Lei Bai, Mengwei He, Patrick Filippi, Feng Zhang, Thomas Francis Bishop, Zhiyong Wang, Kun Hu

Extended short-term precipitation nowcasting struggles with decreasing precision because of insufficient consideration of meteorological knowledge, such as weather fronts, which significantly influence precipitation intensity, duration, and spatial distribution.

SFDFusion: An Efficient Spatial-Frequency Domain Fusion Network for Infrared and Visible Image Fusion

1 code implementation • 30 Oct 2024 • Kun Hu, Qingle Zhang, Maoxun Yuan, Yitian Zhang

Next, to introduce frequency domain information, we construct a Frequency Domain Fusion Module (FDFM) that transforms the spatial domain to the frequency domain through Fast Fourier Transform (FFT) and then integrates frequency domain information.

Infrared And Visible Image Fusion
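
The frequency-domain step described in the SFDFusion excerpt above can be illustrated with a minimal sketch. This is not the paper's FDFM: it works on 1D signals rather than images, and the fusion rule (keeping, per frequency bin, the coefficient with the larger magnitude) is an assumption chosen for illustration. The function names `fft`, `ifft`, and `fuse_frequency` are hypothetical.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT. len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    half = n // 2
    # Twiddle factors applied to the odd-indexed half.
    tw = [cmath.exp(-2j * math.pi * k / n) * odd[k] for k in range(half)]
    return [even[k] + tw[k] for k in range(half)] + \
           [even[k] - tw[k] for k in range(half)]


def ifft(spectrum):
    """Inverse FFT via the conjugate trick: ifft(X) = conj(fft(conj(X))) / n."""
    n = len(spectrum)
    return [c.conjugate() / n for c in fft([c.conjugate() for c in spectrum])]


def fuse_frequency(a, b):
    """Fuse two equal-length real signals in the frequency domain.

    Assumed rule (illustrative only): per frequency bin, keep the
    coefficient with the larger magnitude, then transform back.
    """
    spec_a, spec_b = fft(a), fft(b)
    fused = [ca if abs(ca) >= abs(cb) else cb for ca, cb in zip(spec_a, spec_b)]
    # Inputs are real, so the imaginary part is only rounding noise.
    return [c.real for c in ifft(fused)]
```

When the two inputs occupy different frequency bins, for example a slow cosine and a faster one, this rule keeps both components, so the fused signal is close to their sum.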

RI-MAE: Rotation-Invariant Masked AutoEncoders for Self-Supervised Point Cloud Representation Learning

1 code implementation • 31 Aug 2024 • Kunming Su, Qiuxia Wu, Panpan Cai, Xiaogang Zhu, Xuequan Lu, Zhiyong Wang, Kun Hu

Finally, the predictor predicts the latent features of the masked patches using the output latent embeddings from the student, supervised by the outputs from the teacher.

Representation Learning Self-Supervised Learning

SITransformer: Shared Information-Guided Transformer for Extreme Multimodal Summarization

1 code implementation • 28 Aug 2024 • Sicheng Liu, Lintao Wang, Xiaogang Zhu, Xuequan Lu, Zhiyong Wang, Kun Hu

Extreme Multimodal Summarization with Multimodal Output (XMSMO) has become an attractive summarization approach, integrating various types of information to create extremely concise yet informative summaries for individual modalities.

Radio Frequency Signal based Human Silhouette Segmentation: A Sequential Diffusion Approach

1 code implementation • 27 Jul 2024 • Penghui Wen, Kun Hu, Dong Yuan, Zhiyuan Ning, Changyang Li, Zhiyong Wang

Additionally, the spatio-temporal patterns have not been fully explored for human motion dynamics in HSS.

Segmentation

Motion Keyframe Interpolation for Any Human Skeleton via Temporally Consistent Point Cloud Sampling and Reconstruction

no code implementations • 13 May 2024 • Clinton Mo, Kun Hu, Chengjiang Long, Dong Yuan, Zhiyong Wang

Comprehensive experiments demonstrate the effectiveness of PC-MRL in motion interpolation for desired skeletons without supervision from native datasets.

Motion Interpolation Representation Learning

SurgicalPart-SAM: Part-to-Whole Collaborative Prompting for Surgical Instrument Segmentation

2 code implementations • 22 Dec 2023 • Wenxi Yue, Jing Zhang, Kun Hu, Qiuxia Wu, ZongYuan Ge, Yong Xia, Jiebo Luo, Zhiyong Wang

Specifically, we achieve this by proposing (1) Collaborative Prompts that describe instrument structures via collaborating category-level and part-level texts; (2) Cross-Modal Prompt Encoder that encodes text prompts jointly with visual embeddings into discriminative part-level representations; and (3) Part-to-Whole Adaptive Fusion and Hierarchical Decoding that adaptively fuse the part-level representations into a whole for accurate instrument segmentation in surgical scenarios.

Segmentation Semantic Segmentation

Terrain Diffusion Network: Climatic-Aware Terrain Generation with Geological Sketch Guidance

no code implementations • 31 Aug 2023 • Zexin Hu, Kun Hu, Clinton Mo, Lei Pan, Zhiyong Wang

Sketch-based terrain generation seeks to create realistic landscapes for virtual environments in various applications such as computer games, animation and virtual reality.

Denoising

Bridging the Gap: Sketch-Aware Interpolation Network for High-Quality Animation Sketch Inbetweening

1 code implementation • 25 Aug 2023 • Jiaming Shen, Kun Hu, Wei Bao, Chang Wen Chen, Zhiyong Wang

The hand-drawn 2D animation workflow typically begins with the creation of sketch keyframes.

Robust Audio Anti-Spoofing with Fusion-Reconstruction Learning on Multi-Order Spectrograms

1 code implementation • 18 Aug 2023 • Penghui Wen, Kun Hu, Wenxi Yue, Sen Zhang, Wanlei Zhou, Zhiyong Wang

Robust audio anti-spoofing has become increasingly challenging due to recent advances in deepfake techniques.

Face Swapping

SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation

1 code implementation • 17 Aug 2023 • Wenxi Yue, Jing Zhang, Kun Hu, Yong Xia, Jiebo Luo, Zhiyong Wang

However, we observe two problems with this naive pipeline: (1) the domain gap between natural objects and surgical instruments leads to inferior generalisation of SAM; and (2) SAM relies on precise point or box locations for accurate segmentation, requiring either extensive manual guidance or a well-performing specialist detector for prompt preparation, which leads to a complex multi-stage pipeline.

Image Segmentation Segmentation +1

Continuous Intermediate Token Learning with Implicit Motion Manifold for Keyframe Based Motion Interpolation

1 code implementation • CVPR 2023 • Clinton Ansun Mo, Kun Hu, Chengjiang Long, Zhiyong Wang

Deriving sophisticated 3D motions from sparse keyframes is a particularly challenging problem, due to the continuity and exceptional skeletal precision required.

Motion Interpolation Motion Synthesis

Multi-Scale Control Signal-Aware Transformer for Motion Synthesis without Phase

no code implementations • 3 Mar 2023 • Lintao Wang, Kun Hu, Lei Bai, Yu Ding, Wanli Ouyang, Zhiyong Wang

As past poses often contain useful auxiliary hints, in this paper, we propose a task-agnostic deep learning method, namely Multi-scale Control Signal-aware Transformer (MCS-T), with an attention based encoder-decoder architecture to discover the auxiliary information implicitly for synthesizing controllable motion without explicitly requiring auxiliary information such as phase.

Decoder Feature Engineering +1

Robust Knowledge Adaptation for Federated Unsupervised Person ReID

no code implementations • 18 Jan 2023 • Jianfeng Weng, Kun Hu, Tingting Yao, Jingya Wang, Zhiyong Wang

Thus, in this work, a federated unsupervised cluster-contrastive (FedUCC) learning method is proposed for Person ReID.

Federated Learning Person Re-Identification

ICD-Face: Intra-class Compactness Distillation for Face Recognition

no code implementations • ICCV 2023 • Zhipeng Yu, Jiaheng Liu, Haoyu Qin, Yichao Wu, Kun Hu, Jiayi Tian, Ding Liang

Knowledge distillation is an effective model compression method that improves the performance of a lightweight student model by transferring the knowledge of a well-performing teacher model, and it has been widely adopted in many computer vision tasks, including face recognition (FR).

Face Recognition Knowledge Distillation +1

TLDW: Extreme Multimodal Summarisation of News Videos

no code implementations • 16 Oct 2022 • Peggy Tang, Kun Hu, Lei Zhang, Jiebo Luo, Zhiyong Wang

Multimodal summarisation with multimodal output is drawing increasing attention due to the rapid growth of multimedia data.

Sentence

Multi-level Adversarial Spatio-temporal Learning for Footstep Pressure based FoG Detection

no code implementations • 22 Sep 2022 • Kun Hu, Shaohui Mei, Wei Wang, Kaylena A. Ehgoetz Martens, Liang Wang, Simon J. G. Lewis, David D. Feng, Zhiyong Wang

The proposed scheme also sheds light on improving subject-level clinical studies from other scenarios as it can be integrated with many existing deep architectures.

M2-Net: Multi-stages Specular Highlight Detection and Removal in Multi-scenes

1 code implementation • 20 Jul 2022 • Zhaoyangfan Huang, Kun Hu, Xingjun Wang

The framework consists of three main components: a highlight feature extractor module, a highlight coarse removal module, and a highlight refine removal module.

Highlight Detection highlight removal

OTExtSum: Extractive Text Summarisation with Optimal Transport

1 code implementation • Findings (NAACL) 2022 • Peggy Tang, Kun Hu, Rui Yan, Lei Zhang, Junbin Gao, Zhiyong Wang

Optimal sentence extraction is conceptualised as obtaining an optimal summary that minimises the transportation cost to a given document regarding their semantic distributions.

Sentence
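
The idea in the OTExtSum excerpt above, choosing the summary whose distribution is cheapest to transport to the document's distribution, can be sketched as follows. This is not the paper's implementation: it assumes bag-of-words token distributions and a 0/1 ground cost between distinct tokens, in which case the optimal transport cost reduces to the total variation distance. The names `token_distribution`, `transport_cost`, and `extract_summary` are hypothetical, and the exhaustive search is only practical for short documents.

```python
from collections import Counter
from itertools import combinations


def token_distribution(sentences):
    """Normalised token frequencies over a list of sentences."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def transport_cost(p, q):
    """Optimal transport cost under an assumed 0/1 ground cost.

    With cost 0 for identical tokens and 1 otherwise, the optimal
    transport cost equals the total variation distance.
    """
    vocab = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in vocab)


def extract_summary(sentences, k):
    """Pick the k sentences whose distribution is cheapest to transport
    to the whole document's distribution (exhaustive search)."""
    doc = token_distribution(sentences)
    best = min(
        combinations(range(len(sentences)), k),
        key=lambda idx: transport_cost(
            token_distribution([sentences[i] for i in idx]), doc
        ),
    )
    return [sentences[i] for i in best]
```

A semantic ground cost, such as distances between word embeddings, would need a general optimal transport solver in place of the closed form used in `transport_cost`.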

Sign Language Translation with Hierarchical Spatio-Temporal Graph Neural Network

no code implementations • 14 Nov 2021 • Jichao Kan, Kun Hu, Markus Hagenbuchner, Ah Chung Tsoi, Mohammed Bennamounm, Zhiyong Wang

Therefore, in this paper, these unique characteristics of sign languages are formulated as hierarchical spatio-temporal graph representations, including high-level and fine-level graphs, in which a vertex characterizes a specified body part and an edge represents the interaction between two body parts.

Graph Neural Network Machine Translation +3

A Framework in CRM Customer Lifecycle: Identify Downward Trend and Potential Issues Detection

no code implementations • 25 Feb 2018 • Kun Hu, Zhe Li, Ying Liu, Luyin Cheng, Qi Yang, Yan Li

In the first prediction part, we focus on predicting the downward trend, which is an earlier stage of the customer lifecycle compared to churn.

Causal Inference Management +1
