Search Results for author: Kai Liu

Found 103 papers, 39 papers with code

Understanding the Tradeoff between Cost and Quality of Expert Annotations for Keyphrase Extraction

no code implementations COLING (LAW) 2020 Hung Chau, Saeid Balaneshin, Kai Liu, Ondrej Linda

We evaluate these annotation strategies with respect to their cost and on the task of learning keyphrase extraction models applied with an experimental dataset in the real-estate domain.

Keyphrase Extraction

Seed1.5-VL Technical Report

no code implementations 11 May 2025 Dong Guo, Faming Wu, Feida Zhu, Fuxing Leng, Guang Shi, Haobin Chen, Haoqi Fan, Jian Wang, Jianyu Jiang, Jiawei Wang, Jingji Chen, Jingjia Huang, Kang Lei, Liping Yuan, Lishu Luo, PengFei Liu, Qinghao Ye, Rui Qian, Shen Yan, Shixiong Zhao, Shuai Peng, Shuangye Li, Sihang Yuan, Sijin Wu, Tianheng Cheng, Weiwei Liu, Wenqian Wang, Xianhan Zeng, Xiao Liu, Xiaobo Qin, Xiaohan Ding, Xiaojun Xiao, Xiaoying Zhang, Xuanwei Zhang, Xuehan Xiong, Yanghua Peng, Yangrui Chen, Yanwei Li, Yanxu Hu, Yi Lin, Yiyuan Hu, Yiyuan Zhang, Youbin Wu, Yu Li, Yudong Liu, Yue Ling, Yujia Qin, Zanbo Wang, Zhiwu He, Aoxue Zhang, Bairen Yi, Bencheng Liao, Can Huang, Can Zhang, Chaorui Deng, Chaoyi Deng, Cheng Lin, Cheng Yuan, Chenggang Li, Chenhui Gou, Chenwei Lou, Chengzhi Wei, Chundian Liu, Chunyuan Li, Deyao Zhu, Donghong Zhong, Feng Li, Feng Zhang, Gang Wu, Guodong Li, Guohong Xiao, Haibin Lin, Haihua Yang, Haoming Wang, Heng Ji, Hongxiang Hao, Hui Shen, Huixia Li, Jiahao Li, Jialong Wu, Jianhua Zhu, Jianpeng Jiao, Jiashi Feng, Jiaze Chen, Jianhui Duan, Jihao Liu, Jin Zeng, Jingqun Tang, Jingyu Sun, Joya Chen, Jun Long, Junda Feng, Junfeng Zhan, Junjie Fang, Junting Lu, Kai Hua, Kai Liu, Kai Shen, Kaiyuan Zhang, Ke Shen, Ke Wang, Keyu Pan, Kun Zhang, Kunchang Li, Lanxin Li, Lei LI, Lei Shi, Li Han, Liang Xiang, Liangqiang Chen, Lin Chen, Lin Li, Lin Yan, Liying Chi, Longxiang Liu, Mengfei Du, Mingxuan Wang, Ningxin Pan, Peibin Chen, Pengfei Chen, Pengfei Wu, Qingqing Yuan, Qingyao Shuai, Qiuyan Tao, Renjie Zheng, Renrui Zhang, Ru Zhang, Rui Wang, Rui Yang, Rui Zhao, Shaoqiang Xu, Shihao Liang, Shipeng Yan, Shu Zhong, Shuaishuai Cao, Shuangzhi Wu, Shufan Liu, Shuhan Chang, Songhua Cai, Tenglong Ao, Tianhao Yang, Tingting Zhang, Wanjun Zhong, Wei Jia, Wei Weng, Weihao Yu, Wenhao Huang, Wenjia Zhu, Wenli Yang, Wenzhi Wang, Xiang Long, XiangRui Yin, Xiao Li, Xiaolei Zhu, Xiaoying Jia, Xijin Zhang, Xin Liu, Xinchen Zhang, Xinyu Yang, Xiongcai Luo, Xiuli Chen, Xuantong Zhong, Xuefeng Xiao, Xujing Li, Yan Wu, Yawei Wen, Yifan Du, Yihao Zhang, Yining Ye, Yonghui Wu, Yu Liu, Yu Yue, Yufeng Zhou, Yufeng Yuan, Yuhang Xu, Yuhong Yang, Yun Zhang, Yunhao Fang, Yuntao Li, Yurui Ren, Yuwen Xiong, Zehua Hong, Zehua Wang, Zewei Sun, Zeyu Wang, Zhao Cai, Zhaoyue Zha, Zhecheng An, Zhehui Zhao, Zhengzhuo Xu, Zhipeng Chen, Zhiyong Wu, Zhuofan Zheng, ZiHao Wang, Zilong Huang, Ziyu Zhu, Zuquan Song

We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning.

Mixture-of-Experts Multimodal Reasoning +1

Low-bit Model Quantization for Deep Neural Networks: A Survey

no code implementations 8 May 2025 Kai Liu, Qian Zheng, Kaiwen Tao, Zhiteng Li, Haotong Qin, Wenbo Li, Yong Guo, Xianglong Liu, Linghe Kong, Guihai Chen, Yulun Zhang, Xiaokang Yang

Therefore, it has become increasingly popular and critical to investigate how to perform the conversion and how to compensate for the information loss.

Quantization

A Thin Flexible Acoustic Transducer with piezoelectric-actuated microdomes for Underwater Communication

no code implementations 22 Apr 2025 Rong Fu, Xinyu Zhang, Cheng-Hao Yu, Kai Liu, Tauhidul Haque, Leixin Ouyang, Mark Ming-Cheng Cheng

This paper presents a flexible thin-film underwater transducer based on a mesoporous PVDF membrane embedded with piezoelectrical-actuated microdomes.

NTIRE 2025 Challenge on Image Super-Resolution ($\times$4): Methods and Results

1 code implementation 20 Apr 2025 Zheng Chen, Kai Liu, Jue Gong, Jingkai Wang, Lei Sun, Zongwei Wu, Radu Timofte, Yulun Zhang, Xiangyu Kong, Xiaoxuan Yu, Hyunhee Park, Suejin Han, Hakjae Jeon, Dafeng Zhang, Hyung-Ju Chun, Donghun Ryou, Inju Ha, Bohyung Han, Lu Zhao, Yuyi Zhang, Pengyu Yan, Jiawei Hu, Pengwei Liu, Fengjun Guo, Hongyuan Yu, Pufan Xu, Zhijuan Huang, Shuyuan Cui, Peng Guo, Jiahui Liu, Dongkai Zhang, Heng Zhang, Huiyuan Fu, Huadong Ma, Yanhui Guo, Sisi Tian, Xin Liu, Jinwen Liang, Jie Liu, Jie Tang, Gangshan Wu, Zeyu Xiao, Zhuoyuan Li, Yinxiang Zhang, Wenxuan Cai, Vijayalaxmi Ashok Aralikatti, Nikhil Akalwadi, G Gyaneshwar Rao, Chaitra Desai, Ramesh Ashok Tabib, Uma Mudenagudi, Marcos V. Conde, Alejandro Merino, Bruno Longarela, Javier Abad, Weijun Yuan, Zhan Li, Zhanglu Chen, Boyang Yao, Aagam Jain, Milan Kumar Singh, Ankit Kumar, Shubh Kawa, Divyavardhan Singh, Anjali Sarvaiya, Kishor Upla, Raghavendra Ramachandra, Chia-Ming Lee, Yu-Fan Lin, Chih-Chung Hsu, Risheek V Hiremath, Yashaswini Palani, YuXuan Jiang, Qiang Zhu, Siyue Teng, Fan Zhang, Shuyuan Zhu, Bing Zeng, David Bull, Jingwei Liao, Yuqing Yang, Wenda Shao, Junyi Zhao, Qisheng Xu, Kele Xu, Sunder Ali Khowaja, Ik Hyun Lee, Snehal Singh Tomar, Rajarshi Ray, Klaus Mueller, Sachin Chaudhary, Surya Vashisth, Akshay Dudhane, Praful Hambarde, Satya Naryan Tazi, Prashant Patil, Santosh Kumar Vipparthi, Subrahmanyam Murala, Bilel Benjdira, Anas M. Ali, Wadii Boulila, Zahra Moammeri, Ahmad Mahmoudi-Aznaveh, Ali Karbasi, Hossein Motamednia, Liangyan Li, Guanhua Zhao, Kevin Le, Yimo Ning, Haoxuan Huang, Jun Chen

This paper presents the NTIRE 2025 image super-resolution ($\times$4) challenge, one of the associated competitions of the 10th NTIRE Workshop at CVPR 2025.

Image Super-Resolution valid

JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization

no code implementations 30 Mar 2025 Kai Liu, Wei Li, Lai Chen, Shengqiong Wu, Yanhao Zheng, Jiayi Ji, Fan Zhou, Rongxin Jiang, Jiebo Luo, Hao Fei, Tat-Seng Chua

This paper introduces JavisDiT, a novel Joint Audio-Video Diffusion Transformer designed for synchronized audio-video generation (JAVG).

Video Generation

UniMoMo: Unified Generative Modeling of 3D Molecules for De Novo Binder Design

1 code implementation 25 Mar 2025 Xiangzhe Kong, Zishen Zhang, Ziting Zhang, Rui Jiao, Jianzhu Ma, Wenbing Huang, Kai Liu, Yang Liu

The design of target-specific molecules such as small molecules, peptides, and antibodies is vital for biological research and drug discovery.

Drug Discovery Latent Diffusion Model for 3D

OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery

no code implementations 22 Mar 2025 Vignesh Prabhakar, Md Amirul Islam, Adam Atanas, Yao-Ting Wang, Joah Han, Aastha Jhunjhunwala, Rucha Apte, Robert Clark, Kang Xu, Zihan Wang, Kai Liu

Large Language Models (LLMs) have demonstrated remarkable potential in advancing scientific knowledge and addressing complex challenges.

Knowledge Distillation

A Modular Dataset to Demonstrate LLM Abstraction Capability

no code implementations 22 Mar 2025 Adam Atanas, Kai Liu

Large language models (LLMs) exhibit impressive capabilities but struggle with reasoning errors due to hallucinations and flawed logic.

UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection

no code implementations 15 Mar 2025 Xin Jin, Haisheng Su, Kai Liu, Cong Ma, Wei Wu, Fei Hui, Junchi Yan

Inspired by the impressive performance of State Space Models (SSM) achieved in the field of 2D vision tasks, in this paper, we propose a novel Unified Mamba (UniMamba), which seamlessly integrates the merits of 3D convolution and SSM in a concise multi-head manner, aiming to perform "local and global" spatial context aggregation efficiently and simultaneously.

3D Object Detection Mamba +3

Fast Critical Clearing Time Calculation for Power Systems with Synchronous and Asynchronous Generation

no code implementations 15 Mar 2025 Xuezao Wang, Yijun Xu, Wei Gu, Kai Liu, Shuai Lu, Mert Korkali, Lamine Mili

The increasing penetration of renewables is replacing traditional synchronous generation in modern power systems with low-inertia asynchronous converter-interfaced generators (CIGs).

CondiQuant: Condition Number Based Low-Bit Quantization for Image Super-Resolution

1 code implementation 21 Feb 2025 Kai Liu, Dehui Wang, Zhiteng Li, Zheng Chen, Yong Guo, Wenbo Li, Linghe Kong, Yulun Zhang

Experimentally, we observe that the degradation of quantization is mainly attributed to the quantization of activation instead of model weights.

Image Super-Resolution Quantization

Robust 6DoF Pose Tracking Considering Contour and Interior Correspondence Uncertainty for AR Assembly Guidance

no code implementations 17 Feb 2025 Jixiang Chen, Jing Chen, Kai Liu, Haochen Chang, Shanfeng Fu, Jian Yang

It utilizes a fan-shaped search strategy to refine correspondences and models local contour shape and noise uncertainty as a mixed probability distribution, resulting in a highly robust contour energy function.

Optical Flow Estimation Pose Tracking

SCDiar: a streaming diarization system based on speaker change detection and speech recognition

no code implementations 28 Jan 2025 Naijun Zheng, Xucheng Wan, Kai Liu, Zhou Huan

In hours-long meeting scenarios, real-time speech stream often struggles with achieving accurate speaker diarization, commonly leading to speaker identification and speaker count errors.

Change Detection speaker-diarization +4

Distributed Model Predictive Control Design for Multi-agent Systems via Bayesian Optimization

no code implementations 22 Jan 2025 Hossein Nejatbakhsh Esfahani, Kai Liu, Javad Mohammadpour Velni

This paper introduces a new approach that leverages Multi-agent Bayesian Optimization (MABO) to design Distributed Model Predictive Control (DMPC) schemes for multi-agent systems.

Bayesian Optimization Distributed Optimization +1

UAV-DETR: Efficient End-to-End Object Detection for Unmanned Aerial Vehicle Imagery

1 code implementation 3 Jan 2025 Huaxiang Zhang, Kai Liu, Zhongxue Gan, Guo-Niu Zhu

End-to-end models that do not depend on such manually designed components are mainly designed for natural images, which are less effective for UAV imagery.

object-detection Object Detection

ACQ: A Unified Framework for Automated Programmatic Creativity in Online Advertising

no code implementations 9 Dec 2024 Ruizhi Wang, Kai Liu, Bingjie Li, Yu Rong, Qingpeng Cai, Fei Pan, Peng Jiang

ACQ comprises two components: a prediction module to estimate the cost of a photo under different numbers of ad creatives, and an allocation module to decide the quota for photos considering their estimated costs in the prediction module.

Multiple-choice Multi-Task Learning

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

3 code implementations 4 Nov 2024 Xingwu Sun, Yanfeng Chen, Yiqing Huang, Ruobing Xie, Jiaqi Zhu, Kai Zhang, Shuaipeng Li, Zhen Yang, Jonny Han, Xiaobo Shu, Jiahao Bu, Zhongzhi Chen, Xuemeng Huang, Fengzong Lian, Saiyong Yang, Jianfeng Yan, Yuyuan Zeng, Xiaoqin Ren, Chao Yu, Lulu Wu, Yue Mao, Jun Xia, Tao Yang, Suncong Zheng, Kan Wu, Dian Jiao, Jinbao Xue, Xipeng Zhang, Decheng Wu, Kai Liu, Dengpeng Wu, Guanghui Xu, Shaohua Chen, Shuang Chen, Xiao Feng, Yigeng Hong, Junqiang Zheng, Chengcheng Xu, Zongwei Li, Xiong Kuang, Jianglu Hu, Yiqi Chen, Yuchi Deng, Guiyang Li, Ao Liu, Chenchen Zhang, Shihui Hu, Zilong Zhao, Zifan Wu, Yao Ding, Weichao Wang, Han Liu, Roberts Wang, Hao Fei, Peijie Yu, Ze Zhao, Xun Cao, Hai Wang, Fusheng Xiang, Mengyuan Huang, Zhiyuan Xiong, Bin Hu, Xuebin Hou, Lei Jiang, Jianqiang Ma, Jiajia Wu, Yaping Deng, Yi Shen, Qian Wang, Weijie Liu, Jie Liu, Meng Chen, Liang Dong, Weiwen Jia, Hu Chen, Feifei Liu, Rui Yuan, Huilin Xu, Zhenxiang Yan, Tengfei Cao, Zhichao Hu, Xinhua Feng, Dong Du, TingHao Yu, Yangyu Tao, Feng Zhang, Jianchen Zhu, Chengzhong Xu, Xirui Li, Chong Zha, Wen Ouyang, Yinben Xia, Xiang Li, Zekun He, Rongpeng Chen, Jiawei Song, Ruibin Chen, Fan Jiang, Chongqing Zhao, Bo wang, Hao Gong, Rong Gan, Winston Hu, Zhanhui Kang, Yong Yang, Yuhong Liu, Di Wang, Jie Jiang

In this paper, we introduce Hunyuan-Large, which is currently the largest open-source Transformer-based mixture of experts model, with a total of 389 billion parameters and 52 billion activation parameters, capable of handling up to 256K tokens.

Logical Reasoning Mathematical Problem-Solving +1

Learning Identifiable Factorized Causal Representations of Cellular Responses

1 code implementation 29 Oct 2024 Haiyi Mao, Romain Lopez, Kai Liu, Jan-Christian Hütter, David Richmond, Panayiotis V. Benos, Lin Qiu

The study of cells and their responses to genetic or chemical perturbations promises to accelerate the discovery of therapeutic targets.

Delving into the Reversal Curse: How Far Can Large Language Models Generalize?

1 code implementation 24 Oct 2024 Zhengkai Lin, Zhihang Fu, Kai Liu, Liang Xie, Binbin Lin, Wenxiao Wang, Deng Cai, Yue Wu, Jieping Ye

(2) This generalization ability is highly correlated to the structure of the fact "A is B" in the training documents.

Multiple-choice

Att2CPC: Attention-Guided Lossy Attribute Compression of Point Clouds

1 code implementation 23 Oct 2024 Kai Liu, Kang You, Pan Gao, Manoranjan Paul

In this paper, we focus on the task of learned lossy point cloud attribute compression (PCAC).

Attribute

CKSP: Cross-species Knowledge Sharing and Preserving for Universal Animal Activity Recognition

no code implementations 22 Oct 2024 Axiu Mao, Meilu Zhu, Zhaojin Guo, Zheng He, Tomas Norton, Kai Liu

The results show that our approach remarkably boosts the classification performance compared to the baseline method (one-for-one framework) solely trained on individual-species data, with increments of 6.04%, 2.06%, and 3.66% in accuracy, and 10.33%, 3.67%, and 7.90% in F1-score for the horse, sheep, and cattle datasets, respectively.

Activity Recognition

Diff-PCC: Diffusion-based Neural Compression for 3D Point Clouds

no code implementations 20 Aug 2024 Kai Liu, Kang You, Pan Gao

Stable diffusion networks have emerged as a groundbreaking development for their ability to produce realistic and detailed visual content.

XCB: an effective contextual biasing approach to bias cross-lingual phrases in speech recognition

no code implementations 20 Aug 2024 Xucheng Wan, Naijun Zheng, Kai Liu, Huan Zhou

Contextualized ASR models have been demonstrated to effectively improve the recognition accuracy of uncommon phrases when a predefined phrase list is available.

speech-recognition Speech Recognition

Parallel Speculative Decoding with Adaptive Draft Length

1 code implementation 13 Aug 2024 Tianyu Liu, Yun Li, Qitan Lv, Kai Liu, Jianchen Zhu, Winston Hu

Speculative decoding (SD), where an extra draft model is employed to provide multiple draft tokens first and then the original target model verifies these tokens in parallel, has shown great power for LLM inference acceleration.

Text Generation
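
The draft-then-verify loop described in the Parallel Speculative Decoding entry above can be sketched as follows. This is a minimal greedy illustration, not the paper's method: it does not implement adaptive draft length, the verification loop runs sequentially where a real system would use one batched forward pass of the target model, and `speculative_step`, `toy_draft`, and `toy_target` are hypothetical names invented for this example.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]


def speculative_step(prefix: List[Token],
                     draft_next: NextToken,
                     target_next: NextToken,
                     draft_len: int) -> List[Token]:
    """Run one greedy draft-then-verify round and return the new tokens."""
    # 1) The cheap draft model proposes draft_len tokens autoregressively.
    draft: List[Token] = []
    ctx = list(prefix)
    for _ in range(draft_len):
        tok = draft_next(ctx)
        draft.append(tok)
        ctx.append(tok)

    # 2) The target model checks each drafted position. Written as a loop
    #    here for clarity; in practice this is a single parallel pass.
    accepted: List[Token] = []
    ctx = list(prefix)
    for tok in draft:
        expected = target_next(ctx)
        if expected != tok:
            # First mismatch: keep the target's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3) Every draft token matched, so the target adds one extra token.
    accepted.append(target_next(ctx))
    return accepted


# Toy stand-ins for real models (assumptions for illustration only):
# the target counts upward modulo 10; the draft agrees except after a 3.
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10
```

In this greedy variant the output is always identical to what the target model alone would produce; the draft model only changes how many target-verified tokens are obtained per round.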

Enhancing LLM's Cognition via Structurization

1 code implementation 23 Jul 2024 Kai Liu, Zhihang Fu, Chao Chen, Wei zhang, Rongxin Jiang, Fan Zhou, Yaowu Chen, Yue Wu, Jieping Ye

Besides, we show the feasibility of distilling advanced LLMs' language processing abilities to a smaller yet effective StruXGPT-7B to execute structurization, addressing the practicality of our approach.

Hallucination Hallucination Evaluation +1

ESOD: Efficient Small Object Detection on High-Resolution Images

1 code implementation 23 Jul 2024 Kai Liu, Zhihang Fu, Sheng Jin, Ze Chen, Fan Zhou, Rongxin Jiang, Yaowu Chen, Jieping Ye

The resulting Efficient Small Object Detection (ESOD) approach is a generic framework, which can be applied to both CNN- and ViT-based detectors to save the computation and GPU memory costs.

Object object-detection +1

Rethinking Out-of-Distribution Detection on Imbalanced Data Distribution

1 code implementation 23 Jul 2024 Kai Liu, Zhihang Fu, Sheng Jin, Chao Chen, Ze Chen, Rongxin Jiang, Fan Zhou, Yaowu Chen, Jieping Ye

Detecting and rejecting unknown out-of-distribution (OOD) samples is critical for deployed neural networks to avoid unreliable predictions.

Out-of-Distribution Detection

Structure-aware Domain Knowledge Injection for Large Language Models

1 code implementation 23 Jul 2024 Kai Liu, Ze Chen, Zhihang Fu, Rongxin Jiang, Fan Zhou, Yaowu Chen, Yue Wu, Jieping Ye

Remarkably, our method demonstrates the potential of comparable improvement against the state-of-the-art MMedLM2 on MMedBench, while significantly reducing the training costs to 5%.

Question Answering

Bucket Pre-training is All You Need

no code implementations 10 Jul 2024 Hongtao Liu, Qiyao Peng, Qing Yang, Kai Liu, Hongyan Xu

Large language models (LLMs) have demonstrated exceptional performance across various natural language processing tasks.

All

An efficient text augmentation approach for contextualized Mandarin speech recognition

no code implementations 14 Jun 2024 Naijun Zheng, Xucheng Wan, Kai Liu, Ziqing Du, Zhou Huan

Although contextualized automatic speech recognition (ASR) systems are commonly used to improve the recognition of uncommon words, their effectiveness is hindered by the inherent limitations of speech-text data availability.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution

1 code implementation 10 Jun 2024 Kai Liu, Haotong Qin, Yong Guo, Xin Yuan, Linghe Kong, Guihai Chen, Yulun Zhang

Low-bit quantization has become widespread for compressing image super-resolution (SR) models for edge deployment, which allows advanced SR models to enjoy compact low-bit parameters and efficient integer/bitwise constructions for storage compression and inference acceleration, respectively.

Image Super-Resolution Quantization

Differentiable Model Scaling using Differentiable Topk

1 code implementation 12 May 2024 Kai Liu, Ruohui Wang, Jianfei Gao, Kai Chen

Specifically, for image classification on ImageNet, our DMS improves the top-1 accuracy of EfficientNet-B0 and Deit-Tiny by 1.4% and 0.6%, respectively, and outperforms the state-of-the-art zero-shot NAS method, ZiCo, by 1.3% while requiring only 0.4 GPU days for searching.

Image Classification Language Modeling +6

Learning-to-solve unit commitment based on few-shot physics-guided spatial-temporal graph convolution network

no code implementations 2 May 2024 Mei Yang, Gao Qiu, Junyong Liu, Kai Liu

This letter proposes a few-shot physics-guided spatial-temporal graph convolutional network (FPG-STGCN) to quickly solve unit commitment (UC).

Mixture-of-Instructions: Comprehensive Alignment of a Large Language Model through the Mixture of Diverse System Prompting Instructions

no code implementations 29 Apr 2024 Bowen Xu, Shaoyu Wu, Kai Liu, Lulu Hu

With the proliferation of large language models (LLMs), the comprehensive alignment of such models across multiple tasks has emerged as a critical area of research.

Language Modeling Language Modelling +2

Pointsoup: High-Performance and Extremely Low-Decoding-Latency Learned Geometry Codec for Large-Scale Point Cloud Scenes

1 code implementation 21 Apr 2024 Kang You, Kai Liu, Li Yu, Pan Gao, Dandan Ding

Despite considerable progress being achieved in point cloud geometry compression, there still remains a challenge in effectively compressing large-scale scenes with sparse surfaces.

Decoder

Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation

1 code implementation 12 Apr 2024 Yanhao Zheng, Kai Liu

Specifically, in the region-proposal stage, proposals that contain novel instances showcase lower objectness scores, since they are treated as background proposals during the training phase.

Object object-detection +3

Binomial Self-compensation for Motion Error in Dynamic 3D Scanning

1 code implementation 10 Apr 2024 Geyou Zhang, Ce Zhu, Kai Liu

Phase shifting profilometry (PSP) is favored in high-precision 3D scanning due to its high accuracy, robustness, and pixel-wise property.

3D Reconstruction

EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs

no code implementations 5 Mar 2024 Hanlin Tang, Yifu Sun, Decheng Wu, Kai Liu, Jianchen Zhu, Zhanhui Kang

To the best of our knowledge, this is the first work to achieve almost lossless quantization performance for LLMs under a data-independent setting, and our algorithm runs over 10 times faster than the data-dependent methods.

Data Free Quantization

INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection

1 code implementation 6 Feb 2024 Chao Chen, Kai Liu, Ze Chen, Yi Gu, Yue Wu, Mingyuan Tao, Zhihang Fu, Jieping Ye

Knowledge hallucination has raised widespread concerns about the security and reliability of deployed LLMs.

Diversity Hallucination +1

Optimal Parameter and Neuron Pruning for Out-of-Distribution Detection

no code implementations NeurIPS 2023 Chao Chen, Zhihang Fu, Kai Liu, Ze Chen, Mingyuan Tao, Jieping Ye

Most existing OOD detection methods focus on exploring advanced training skills or training-free tricks to prevent the model from yielding overconfident confidence scores for unknown samples.

Out-of-Distribution Detection

Query-LIFE: Query-aware Language Image Fusion Embedding for E-Commerce Relevance

no code implementations 26 Nov 2023 Hai Zhu, Yuankai Guo, Ronggang Dou, Kai Liu

Query-LIFE utilizes a query-based multimodal fusion to effectively incorporate the image and title based on the product types.

Contrastive Learning

Phase Guided Light Field for Spatial-Depth High Resolution 3D Imaging

no code implementations 17 Nov 2023 Geyou Zhang, Ce Zhu, Kai Liu, Yipeng Liu

In 3D imaging, light field cameras are typically single-shot; however, they suffer heavily from low spatial resolution and depth accuracy.

Stereo Matching

A Spatial-Temporal Transformer based Framework For Human Pose Assessment And Correction in Education Scenarios

no code implementations 1 Nov 2023 Wenyang Hu, Kai Liu, Libin Liu, Huiliang Shang

Human pose assessment and correction play a crucial role in applications across various fields, including computer vision, robotics, sports analysis, healthcare, and entertainment.

Pose Estimation

E-Sparse: Boosting the Large Language Model Inference through Entropy-based N:M Sparsity

no code implementations 24 Oct 2023 Yun Li, Lin Niu, Xipeng Zhang, Kai Liu, Jianchen Zhu, Zhanhui Kang

Traditional pruning methods are known to be difficult to apply to Large Language Models (LLMs) for Generative AI because of their unaffordable training process and large computational demands.

Language Modeling Language Modelling +1

GraphText: Graph Reasoning in Text Space

1 code implementation 2 Oct 2023 Jianan Zhao, Le Zhuo, Yikang Shen, Meng Qu, Kai Liu, Michael Bronstein, Zhaocheng Zhu, Jian Tang

Furthermore, GraphText paves the way for interactive graph reasoning, allowing both humans and LLMs to communicate with the model seamlessly using natural language.

In-Context Learning Text Generation

On Regularized Sparse Logistic Regression

no code implementations 12 Sep 2023 Mengyuan Zhang, Kai Liu

Sparse logistic regression performs classification and feature selection simultaneously.

Binary Classification Classification +2

Strictly Low Rank Constraint Optimization -- An Asymptotically $\mathcal{O}(\frac{1}{t^2})$ Method

no code implementations 4 Jul 2023 Mengyuan Zhang, Kai Liu

We study a class of non-convex and non-smooth problems with rank regularization to promote sparsity in the optimal solution.

Physics-Guided Graph Neural Networks for Real-time AC/DC Power Flow Analysis

no code implementations 29 Apr 2023 Mei Yang, Gao Qiu, Yong Wu, Junyong Liu, Nina Dai, Yue Shui, Kai Liu, Lijie Ding

The increasing scale of alternating current and direct current (AC/DC) hybrid systems necessitates a faster power flow analysis tool than ever.

Computational Efficiency Graph Neural Network

SDFReg: Learning Signed Distance Functions for Point Cloud Registration

no code implementations 18 Apr 2023 Leida Zhang, Zhengda Lu, Kai Liu, Yiqun Wang

We then propose to alternately optimize the implicit function and the registration between the implicit function and point cloud.

Point Cloud Registration

Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition

no code implementations 23 Mar 2023 Kai Liu, Hailiang Xiong, Gangqiang Yang, Zhengfeng Du, Yewen Cao, Danyal Shah

On the other hand, we need to reduce the dimension of each subspace to keep the size of the overall feature space unchanged when we increase the number of heads, which will significantly weaken the ability to represent the feature of each subspace.

Automatic Speech Recognition speech-recognition +1
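
The head-count trade-off mentioned in the entry above comes down to simple arithmetic: with the total feature size held fixed, each head's subspace shrinks as the number of heads grows. A minimal sketch, assuming the standard multi-head attention split where the model dimension is divided evenly across heads; `head_dim` and the value 512 are illustrative choices, not taken from the paper.

```python
def head_dim(d_model: int, num_heads: int) -> int:
    """Per-head subspace size when d_model is split evenly across heads."""
    if num_heads <= 0 or d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by a positive num_heads")
    return d_model // num_heads


# With a fixed model width, doubling the heads halves each subspace.
for heads in (4, 8, 16):
    print(heads, "heads ->", head_dim(512, heads), "dims per head")
```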

X-SepFormer: End-to-end Speaker Extraction Network with Explicit Optimization on Speaker Confusion

no code implementations 9 Mar 2023 Kai Liu, Ziqing Du, Xucheng Wan, Huan Zhou

To mitigate the imperative SC issue, we reformulate the training objective and propose two novel loss schemes that explore the metric of reconstruction improvement performance defined at small chunk-level and leverage the metric associated distribution information.

Speech Extraction

Adaptive Weighted Multiview Kernel Matrix Factorization with its application in Alzheimer's Disease Analysis -- A clustering Perspective

no code implementations 7 Mar 2023 Kai Liu, Yarui Cao

Recent technology and equipment advancements provide us with opportunities to better analyze Alzheimer's disease (AD), where we could collect and employ the data from different image and genetic modalities that may potentially enhance the predictive performance.

Clustering

A Provable Splitting Approach for Symmetric Nonnegative Matrix Factorization

no code implementations 25 Jan 2023 Xiao Li, Zhihui Zhu, Qiuwei Li, Kai Liu

The symmetric Nonnegative Matrix Factorization (NMF), a special but important class of the general NMF, has found numerous applications in data analysis such as various clustering tasks.

Clustering Image Clustering +1

Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings

no code implementations 16 Jan 2023 Kai Liu, Xucheng Wan, Ziqing Du, Huan Zhou

As a practical alternative to speech separation, target speaker extraction (TSE) aims to extract the speech from the desired speaker using an additional speaker cue extracted from the speaker.

Speaker Verification Speech Separation +1

Randomized Greedy Algorithms and Composable Coreset for k-Center Clustering with Outliers

1 code implementation 7 Jan 2023 Hu Ding, Ruomin Huang, Kai Liu, Haikuo Yu, Zixiu Wang

Though a number of methods have been developed in the past decades, it is still quite challenging to design a quality-guaranteed algorithm with low complexity for this problem.

Clustering

Deep Biological Pathway Informed Pathology-Genomic Multimodal Survival Prediction

1 code implementation 6 Jan 2023 Lin Qiu, Aminollah Khormali, Kai Liu

The integration of multi-modal data, such as pathological images and genomic data, is essential for understanding cancer heterogeneity and complexity for personalized treatments, as well as for enhancing survival predictions.

Prognosis Survival Prediction

Multi-Task Learning with Prior Information

no code implementations 4 Jan 2023 Mengyuan Zhang, Kai Liu

Multi-task learning aims to boost the generalization performance of multiple related tasks simultaneously by leveraging information contained in those tasks.

Multi-Task Learning

KAST: Knowledge Aware Adaptive Session Multi-Topic Network for Click-Through Rate Prediction

no code implementations 7 Oct 2022 Dike Sun, Kai Liu, ShengKai Yang

Capturing the evolving trends of user interest is important for both recommendation systems and advertising systems, and user behavior sequences have been successfully used in Click-Through-Rate (CTR) prediction problems.

Click-Through Rate Prediction Recommendation Systems

A GPU-accelerated Algorithm for Distinct Discriminant Canonical Correlation Network

no code implementations 26 Sep 2022 Kai Liu, Lei Gao, Ling Guan

In this paper, a GPU-based accelerated algorithm is proposed to further optimize the DDCCANet algorithm.

Image Classification

Speech Enhancement with Perceptually-motivated Optimization and Dual Transformations

no code implementations 24 Sep 2022 Xucheng Wan, Kai Liu, Ziqing Du, Huan Zhou

To validate the effectiveness of our proposed model, extensive experiments are conducted on the DNS2020 dataset.

Speech Enhancement

Joint Speech Activity and Overlap Detection with Multi-Exit Architecture

no code implementations 24 Sep 2022 Ziqing Du, Kai Liu, Xucheng Wan, Huan Zhou

Overlapped speech detection (OSD) is critical for speech applications in multi-party conversation scenarios.

Action Detection Activity Detection +1

Rethinking Symmetric Matrix Factorization: A More General and Better Clustering Perspective

1 code implementation 6 Sep 2022 Mengyuan Zhang, Kai Liu

Nonnegative matrix factorization (NMF) is widely used for clustering with strong interpretability.

Clustering Graph Clustering

Maximum Correntropy Value Decomposition for Multi-agent Deep Reinforcemen Learning

no code implementations 7 Aug 2022 Kai Liu, Tianxian Zhang, Lingjiang Kong

In this paper, we first demonstrate the flaw of Weighted QMIX using an ordinary One-Step Matrix Game (OMG), that no matter how the weight is chosen, Weighted QMIX struggles to deal with non-monotonic value decomposition problems with a large variance of reward distributions.

Deep Reinforcement Learning SMAC+ +1

Weakly-supervised High-fidelity Ultrasound Video Synthesis with Feature Decoupling

no code implementations 1 Jul 2022 Jiamin Liang, Xin Yang, Yuhao Huang, Kai Liu, Xinrui Zhou, Xindi Hu, Zehui Lin, Huanjia Luo, Yuanji Zhang, Yi Xiong, Dong Ni

First, leveraging the advantages of self- and fully-supervised learning, our proposed system is trained in a weakly-supervised manner for keypoint detection.

Keypoint Detection Vocal Bursts Intensity Prediction

Occlusion-Resistant Instance Segmentation of Piglets in Farrowing Pens Using Center Clustering Network

no code implementations 4 Jun 2022 Endai Huang, Axiu Mao, Junhui Hou, Yongjian Wu, Weitao Xu, Maria Camila Ceballos, Thomas D. Parsons, Kai Liu

Specifically, CClusnet-Inseg uses each pixel to predict object centers and trace these centers to form masks based on clustering results, which consists of a network for segmentation and center offset vector map, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, Centers-to-Mask (C2M), and Remain-Centers-to-Mask (RC2M) algorithms.

Clustering Instance Segmentation +4

Enriched Robust Multi-View Kernel Subspace Clustering

no code implementations 21 May 2022 Mengyuan Zhang, Kai Liu

To address the above issues, in this paper we propose a novel Enriched Robust Multi-View Kernel Subspace Clustering framework where the consensus affinity matrix is learned from both multi-view data and spectral clustering.

Clustering Multi-view Subspace Clustering

MKQ-BERT: Quantized BERT with 4-bits Weights and Activations

no code implementations 25 Mar 2022 Hanlin Tang, Xipeng Zhang, Kai Liu, Jianchen Zhu, Zhanhui Kang

In this work, we propose MKQ-BERT, which further improves the compression level and uses 4-bits for quantization.

Quantization

Exploring the impact of spatiotemporal granularity on the demand prediction of dynamic ride-hailing

no code implementations 19 Mar 2022 Kai Liu, Zhiju Chen, Toshiyuki Yamamoto, Liheng Tuo

A convolutional, long short-term memory model combined with a hexagonal convolution operation (H-ConvLSTM) is proposed to explore the complex spatial and temporal relations.

Prediction

Joint CNN and Transformer Network via weakly supervised Learning for efficient crowd counting

no code implementations 12 Mar 2022 Fusen Wang, Kai Liu, Fei Long, Nong Sang, Xiaofeng Xia, Jun Sang

However, the transformer directly partitions the crowd images into a series of tokens, which may not be a good choice due to each pedestrian being an independent individual, and the parameter number of the network is very large.

Crowd Counting Weakly-supervised Learning

Dynamic Group Transformer: A General Vision Transformer Backbone with Dynamic Group Attention

no code implementations 8 Mar 2022 Kai Liu, Tianyi Wu, Cong Liu, Guodong Guo

To reduce the quadratic computation complexity caused by each query attending to all keys/values, various methods have constrained the range of attention within local regions, where each query only attends to keys/values within a hand-crafted window.

Image Classification Instance Segmentation +3

Spherical Matrix Factorization

no code implementations 29 Nov 2021 Kai Liu

However, most of the studies aim to minimize the loss by measuring the Euclidean distance, though in some fields, angle distance is known to be more important and critical for analysis.

Dictionary Learning

Robust Principal Component Analysis: A Construction Error Minimization Perspective

no code implementations 23 Nov 2021 Kai Liu, Yarui Cao

In this paper we propose a novel optimization framework to systematically solve the robust PCA problem with rigorous theoretical guarantees, based on which we investigate computationally economical updating algorithms.

Exact Sparse Orthogonal Dictionary Learning

no code implementations 14 Mar 2021 Kai Liu, Yongjian Zhao, Hua Wang

Over the past decade, learning a dictionary from input images for sparse modeling has been one of the topics that has received the most research attention in image processing and compressed sensing.

compressed sensing Denoising +1

RRCN: A Reinforced Random Convolutional Network based Reciprocal Recommendation Approach for Online Dating

no code implementations 25 Nov 2020 Linhao Luo, Liqi Yang, Ju Xin, Yixiang Fang, Xiaofeng Zhang, Xiaofei Yang, Kai Chen, Zhiyuan Zhang, Kai Liu

In particular, we technically propose a novel random CNN component that can randomly convolute non-adjacent features to capture their interaction information and learn feature embeddings of key attributes to make the final recommendation.

Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech

9 code implementations Interspeech2020 2020 Geng Yang, Shan Yang, Kai Liu, Peng Fang, Wei Chen, Lei Xie

In this paper, we propose multi-band MelGAN, a much faster waveform generation model targeting high-quality text-to-speech.

Sound Audio and Speech Processing

Incentivized Exploration for Multi-Armed Bandits under Reward Drift

no code implementations 12 Nov 2019 Zhiyuan Liu, Huazheng Wang, Fan Shen, Kai Liu, Lijun Chen

We study incentivized exploration for the multi-armed bandit (MAB) problem where the players receive compensation for exploring arms other than the greedy choice and may provide biased feedback on reward.

Thompson Sampling

Gated Multiple Feedback Network for Image Super-Resolution

1 code implementation 9 Jul 2019 Qilei Li, Zhen Li, Lu Lu, Gwanggil Jeon, Kai Liu, Xiaomin Yang

The rapid development of deep learning (DL) has driven single image super-resolution (SR) into a new era.

Image Super-Resolution

Spherical Principal Component Analysis

1 code implementation 16 Mar 2019 Kai Liu, Qiuwei Li, Hua Wang, Gongguo Tang

However, most of the studies on PCA aim to minimize the loss after projection, which usually measures the Euclidean distance, though in some fields, angle distance is known to be more important and critical for analysis.

Clustering

Shubnikov-de Haas and de Haas-van Alphen oscillations in topological semimetal CaAl4

no code implementations 15 Nov 2018 Sheng Xu, Jian-Feng Zhang, Yi-Yan Wang, Lin-Lin Sun, Huan Wang, Yuan Su, Xiao-Yan Wang, Kai Liu, Tian-Long Xia

An electron-type quasi-2D Fermi surface is found by the angle-dependent Shubnikov-de Haas oscillations, de Haas-van Alphen oscillations and the first-principles calculations.

Materials Science Mesoscale and Nanoscale Physics

Dropping Symmetry for Fast Symmetric Nonnegative Matrix Factorization

no code implementations NeurIPS 2018 Zhihui Zhu, Xiao Li, Kai Liu, Qiuwei Li

Symmetric nonnegative matrix factorization (NMF), a special but important class of the general NMF, is demonstrated to be useful for data analysis and in particular for various clustering tasks.

Clustering Image Clustering

Deep Item-based Collaborative Filtering for Top-N Recommendation

1 code implementation 11 Nov 2018 Feng Xue, Xiangnan He, Xiang Wang, Jiandong Xu, Kai Liu, Richang Hong

In this work, we propose a more expressive ICF solution by accounting for the nonlinear and higher-order relationship among items.

Collaborative Filtering Decision Making +1

Adaptations of ROUGE and BLEU to Better Evaluate Machine Reading Comprehension Task

no code implementations WS 2018 An Yang, Kai Liu, Jing Liu, Yajuan Lyu, Sujian Li

Current evaluation metrics for question-answering-based machine reading comprehension (MRC) systems, such as ROUGE and BLEU, generally focus on the lexical overlap between the candidate and reference answers.

Machine Reading Comprehension Question Answering

Learning Multi-Instance Enriched Image Representations via Non-Greedy Ratio Maximization of the l1-Norm Distances

no code implementations CVPR 2018 Kai Liu, Hua Wang, Feiping Nie, Hao Zhang

To tackle these two challenges, in this paper we propose a novel image representation learning method that can integrate the local patches (the instances) of an input image (the bag) and its holistic representation into one single-vector representation.

Representation Learning

Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification

no code implementations ACL 2018 Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li, Haifeng Wang

Machine reading comprehension (MRC) on real web data usually requires the machine to answer a question by analyzing multiple passages retrieved by a search engine.

Machine Reading Comprehension Question Answering

Spatial Image Steganography Based on Generative Adversarial Network

1 code implementation 21 Apr 2018 Jianhua Yang, Kai Liu, Xiangui Kang, Edward K. Wong, Yun-Qing Shi

The architecture contains three component modules: a generator, an embedding simulator and a discriminator.

Multimedia

DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications

3 code implementations WS 2018 Wei He, Kai Liu, Jing Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yu-An Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu, Haifeng Wang

Experiments show that human performance is well above current state-of-the-art baseline systems, leaving plenty of room for the community to make improvements.

Machine Reading Comprehension

Structured Light Phase Measuring Profilometry Pattern Design for Binary Spatial Light Modulators

no code implementations 8 Jun 2017 Daniel L. Lau, Yu Zhang, Kai Liu

In the case of phase measuring profilometry (PMP), the projected patterns are composed of a rolling sinusoidal wave, but as a set of time-multiplexed patterns, PMP requires the target surface to remain motionless or for scanning to be performed at such high rates that any movement is small.

Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction

3 code implementations 18 Apr 2017 Kun Gai, Xiaoqiang Zhu, Han Li, Kai Liu, Zhe Wang

CTR prediction in real-world business is a difficult machine learning problem with large scale nonlinear sparse data.

Click-Through Rate Prediction Feature Engineering
