no code implementations • 1 Jun 2025 • Zhu Li, Yuqing Zhang, Xiyuan Gao, Shekhar Nayak, Matt Coler
Sarcasm fundamentally alters meaning through tone and context, yet detecting it in speech remains a challenge due to data scarcity.
no code implementations • 18 May 2025 • Yating Liu, Yujie Zhang, Qi Yang, Yiling Xu, Zhu Li, Ye-kui Wang
Recent advancements in Virtual/Augmented Reality (VR/AR) have driven demand for Dynamic Point Clouds (DPC).
1 code implementation • 16 May 2025 • Kaifa Yang, Qi Yang, Zhu Li, Yiling Xu
Motivated by the effectiveness of fields in representing both 3D geometry and color information, we propose a novel point-based TMQA method called field mesh quality metric (FMQM).
1 code implementation • 13 May 2025 • He Huang, Qi Yang, Mufan Liu, Yiling Xu, Zhu Li
Existing 4D Gaussian Splatting methods rely on per-Gaussian deformation from a canonical space to target frames, which overlooks redundancy among adjacent Gaussian primitives and results in suboptimal performance.
1 code implementation • 3 May 2025 • Qi Yang, Le Yang, Geert Van der Auwera, Zhu Li
Most existing 3D Gaussian Splatting (3DGS) compression schemes focus on producing compact 3DGS representation via implicit data embedding.
no code implementations • 17 Apr 2025 • Xiangrui Liu, Xinju Wu, Shiqi Wang, Zhu Li, Sam Kwong
We further devise a temporal primitive prediction module to handle dynamic scenes, which exploits primitive correlations across timestamps to effectively reduce temporal redundancy.
no code implementations • 18 Mar 2025 • Mufan Liu, Qi Yang, He Huang, Wenjie Huang, Zhenlong Yuan, Zhu Li, Yiling Xu
Specifically, our framework is built upon two core components: (1) a spatio-temporal significance pruning strategy that eliminates over 64% of the deformable primitives, followed by an entropy-constrained spherical harmonics compression applied to the remainder; and (2) a deep context model that integrates intra- and inter-prediction with hyperprior into a coarse-to-fine context structure to enable efficient multiscale latent embedding compression.
1 code implementation • 7 Mar 2025 • Mufan Liu, Qi Yang, Miaoran Zhao, He Huang, Le Yang, Zhu Li, Yiling Xu
Implicit Neural Representations (INRs) have emerged as a powerful approach for video representation, offering versatility across tasks such as compression and inpainting.
no code implementations • 23 Jan 2025 • Yipeng Liu, Qi Yang, Yujie Zhang, Yiling Xu, Le Yang, Zhu Li
We present a novel quality assessment method which can predict the perceptual quality of point clouds from new scenes without available annotations by leveraging the rich prior knowledge in images, called the Distribution-Weighted Image-Transferred Point Cloud Quality Assessment (DWIT-PCQA).
no code implementations • 9 Jan 2025 • Juno Kim, Dimitri Meunier, Arthur Gretton, Taiji Suzuki, Zhu Li
We prove that the DFIV algorithm achieves the minimax optimal learning rate when the target structural function lies in a Besov space.
no code implementations • 20 Dec 2024 • Yiheng Jiang, Haotian Zhang, Li Li, Dong Liu, Zhu Li
In this paper, motivated by the recent success of learned image compression, we propose a new framework that uses sparse point clouds to assist in learned image compression in the autonomous driving scenario.
no code implementations • 15 Dec 2024 • Yujie Zhang, Bingyang Cui, Qi Yang, Zhu Li, Yiling Xu
Text-to-3D generation has achieved remarkable progress in recent years, yet evaluating these methods remains challenging for two reasons: i) Existing benchmarks lack fine-grained evaluation on different prompt categories and evaluation dimensions.
no code implementations • 13 Dec 2024 • Xiyuan Gao, Shubhi Bansal, Kushaan Gowda, Zhu Li, Shekhar Nayak, Nagendra Kumar, Matt Coler
This approach utilizes the Multimodal Sarcasm Detection Dataset (MUStARD) and introduces a two-phase bimodal data augmentation strategy.
no code implementations • 29 Nov 2024 • Dimitri Meunier, Zhu Li, Tim Christensen, Arthur Gretton
We study the kernel instrumental variable algorithm of Singh et al. (2019), a nonparametric two-stage least squares (2SLS) procedure which has demonstrated strong empirical performance.
no code implementations • 15 Nov 2024 • Jingyi Cao, Xiangyi Chen, Bo Liu, Ming Ding, Rong Xie, Li Song, Zhu Li, Wenjun Zhang
The widespread use of image acquisition technologies, along with advances in facial recognition, has raised serious privacy concerns.
no code implementations • 11 Nov 2024 • He Huang, Wenjie Huang, Qi Yang, Yiling Xu, Zhu Li
For non-anchor primitives, each is predicted based on the k-nearest anchor primitives.
no code implementations • 29 Aug 2024 • BoYu Chen, Junjie Liu, Zhu Li, Mengyue Yang
We address these challenges by first conceptualizing multimodal representations as comprising modality-invariant and modality-specific components.
no code implementations • 27 Aug 2024 • Zhu Li, Xiyuan Gao, Yuqing Zhang, Shekhar Nayak, Matt Coler
This study investigates the acoustic features of sarcasm and disentangles the interplay between the propensity of an utterance being used sarcastically and the presence of prosodic cues signaling sarcasm.
1 code implementation • 19 Jul 2024 • Qi Yang, Kaifa Yang, Yuke Xing, Yiling Xu, Zhu Li
Second, based on GGSC, we create a GS Quality Assessment dataset (GSQA) with 120 samples.
no code implementations • 11 Jul 2024 • Yuke Xing, Qi Yang, Kaifa Yang, Yilin Xu, Zhu Li
The state-of-the-art objective metrics are tested in the new dataset.
no code implementations • CVPR Workshop 2024 • Raghunath Sai Puttagunta, Birendra Kathariya, Zhu Li, George York
MSFFCT achieved state-of-the-art results on the ×8 and ×16 GTISR tasks of the 2024 Perception Beyond Visual Spectrum (PBVS) challenge, winning 2nd place in both tasks and demonstrating its effectiveness in real-world scenarios.
no code implementations • 23 May 2024 • Dimitri Meunier, Zikai Shen, Mattes Mollenhauer, Arthur Gretton, Zhu Li
First, we rigorously confirm the so-called saturation effect for ridge regression with vector-valued output by deriving a novel lower bound on learning rates; this bound is shown to be suboptimal when the smoothness of the regression function exceeds a certain level.
no code implementations • 15 Apr 2024 • Xiangrui Liu, Xinju Wu, Pingping Zhang, Shiqi Wang, Zhu Li, Sam Kwong
Gaussian splatting, renowned for its exceptional rendering quality and efficiency, has emerged as a prominent technique in 3D scene representation.
1 code implementation • 12 Feb 2024 • Shubhabrata Mukherjee, Cory Beard, Zhu Li
YOLO Phantom utilizes the novel Phantom Convolution block, achieving comparable accuracy to the latest YOLOv8n model while simultaneously reducing both parameters and model size by 43%, resulting in a significant 19% reduction in Giga Floating-Point Operations (GFLOPs).
no code implementations • 12 Dec 2023 • Zhu Li, Dimitri Meunier, Mattes Mollenhauer, Arthur Gretton
We present the first optimal rates for infinite-dimensional vector-valued ridge regression on a continuous scale of norms that interpolate between $L_2$ and the hypothesis space, which we consider as a vector-valued reproducing kernel Hilbert space.
no code implementations • 20 Nov 2023 • Yu Huang, Yue Chen, Zhu Li
Since the DARPA Grand Challenges (rural) in 2004/05 and the Urban Challenge in 2007, autonomous driving has been the most active field of AI applications.
1 code implementation • 1 Nov 2023 • Wei Wu, Hao Chang, Zhu Li
One is a difference of Gaussian (DoG) pyramid recovery network (DPRNet) for SIFT detection, and the other is a gradients of Gaussian images recovery network (GGIRNet) for SIFT description.
no code implementations • 20 Jul 2023 • Dimitri Meunier, Zhu Li, Arthur Gretton, Samory Kpotufe
The main aim of theoretical guarantees on the subject is to establish the extent to which convergence rates, in learning a common representation, may scale with the number $N$ of tasks (as well as the number of samples per task).
1 code implementation • 9 May 2023 • Shuting Xia, Tingyu Fan, Yiling Xu, Jenq-Neng Hwang, Zhu Li
3D dynamic point cloud (DPC) compression relies on mining its temporal context, which faces significant challenges due to DPC's sparsity and non-uniform structure.
no code implementations • 26 Sep 2022 • Tingyu Fan, Linyao Gao, Yiling Xu, Dong Wang, Zhu Li
In addition, we propose a residual coding framework for the compression of the latent variable, which explores the spatial correlation of each layer by progressive downsampling and models the corresponding residual with a fully-factorized entropy model.
no code implementations • 2 Aug 2022 • Zhu Li, Dimitri Meunier, Mattes Mollenhauer, Arthur Gretton
We address the misspecified setting, where the target CME is in the space of Hilbert-Schmidt operators acting from an input interpolation space between $\mathcal{H}_X$ and $L_2$, to $\mathcal{H}_Y$.
no code implementations • 25 Jul 2022 • Anique Akhtar, Zhu Li, Geert Van der Auwera
The proposed method introduces a novel predictor network for motion compensation in the feature domain to map the latent representation of the previous frame to the coordinates of the current frame to predict the current frame's feature embedding.
no code implementations • 31 May 2022 • Zeyan Liu, Fengjun Li, Jingqiang Lin, Zhu Li, Bo Luo
In this paper, we present the first large-scale study on the stealthiness of adversarial samples used in the attacks against deep learning.
1 code implementation • 2 May 2022 • Tingyu Fan, Linyao Gao, Yiling Xu, Zhu Li, Dong Wang
This paper proposes a novel 3D sparse convolution-based Deep Dynamic Point Cloud Compression (D-DPCC) network to compensate and compress the DPC geometry with 3D motion estimation and motion compensation in the feature space.
2 code implementations • 20 Nov 2021 • Jianqiang Wang, Dandan Ding, Zhu Li, Xiaoxing Feng, Chuntong Cao, Zhan Ma
We call this compression method SparsePCGC.
no code implementations • 15 Nov 2021 • Zhu Li, Yuqing Zhang, Mengxi Nie, Ming Yan, Mengnan He, Ruixiong Zhang, Caixia Gong
Recent advancements in end-to-end speech synthesis have made it possible to generate highly natural speech.
no code implementations • 22 Sep 2021 • Zhu Li
Utilizing the regularity condition, we show for the first time that random Fourier features classification can achieve $O(1/\sqrt{n})$ learning rate with only $\Omega(\sqrt{n} \log n)$ features, as opposed to $\Omega(n)$ features suggested by previous results.
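As a rough illustration of the feature map this result concerns, the sketch below builds random Fourier features for a Gaussian kernel and checks that their inner product approximates the exact kernel value. It is not the paper's method or analysis: the Gaussian kernel, the bandwidth, the helper names, and the constant factor of 1 in the feature budget `ceil(sqrt(n) * log(n))` are all assumptions chosen for illustration, and the sketch covers only kernel approximation, not the classification setting.

```python
import math
import random


def num_features(n):
    """Feature budget ceil(sqrt(n) * log(n)); the constant factor 1 is an assumption."""
    return math.ceil(math.sqrt(n) * math.log(n))


def sample_rff(dim, n_feat, bandwidth, rng):
    """Draw frequencies w ~ N(0, I / bandwidth^2) and phases b ~ U[0, 2*pi]."""
    freqs = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
             for _ in range(n_feat)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_feat)]
    return freqs, phases


def featurize(x, freqs, phases):
    """Map x to sqrt(2 / D) * cos(w . x + b) for each of the D sampled features."""
    scale = math.sqrt(2.0 / len(freqs))
    return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(freqs, phases)]


def gaussian_kernel(x, y, bandwidth):
    """Exact Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


rng = random.Random(0)           # fixed seed so the sketch is reproducible
n = 10_000                       # hypothetical sample size
n_feat = num_features(n)         # about sqrt(n) * log(n) features instead of n
freqs, phases = sample_rff(dim=3, n_feat=n_feat, bandwidth=1.0, rng=rng)

x = [0.1, -0.4, 0.3]
y = [0.5, 0.2, -0.1]
approx = sum(a * b for a, b in zip(featurize(x, freqs, phases),
                                   featurize(y, freqs, phases)))
exact = gaussian_kernel(x, y, bandwidth=1.0)
print(n_feat, round(approx, 3), round(exact, 3))
```

The inner product of the two feature vectors is a Monte Carlo estimate of the kernel, so its error shrinks as the number of features grows; the abstract's claim is about how few features suffice for the stated learning rate.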
1 code implementation • 9 Aug 2021 • Baoliang Chen, Lingyu Zhu, Chenqi Kong, Hanwei Zhu, Shiqi Wang, Zhu Li
In this paper, we propose a no-reference (NR) image quality assessment (IQA) method via feature level pseudo-reference (PR) hallucination.
no code implementations • 15 Jun 2021 • Md Adnan Arefeen, Sumaiya Tabassum Nimi, Md Yusuf Sarwar Uddin, Zhu Li
In this paper, we propose a transfer-learning based model construction technique for the aerial scene classification problem.
Ranked #5 on Aerial Scene Classification on UCM (50% as trainset)
no code implementations • 6 Jun 2021 • Zhu Li, Zhi-Hua Zhou, Arthur Gretton
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss; yet surprisingly, they possess near-optimal prediction performance, contradicting classical learning theory.
no code implementations • 1 Jan 2021 • Wenqing Hu, Tiefeng Jiang, Zhu Li
We propose a novel local Subspace Indexing Model with Interpolation (SIM-I) for low-dimensional embedding of image datasets.
2 code implementations • 7 Nov 2020 • Jianqiang Wang, Dandan Ding, Zhu Li, Zhan Ma
Recent years have witnessed the growth of point cloud based applications because of the point cloud's realistic and fine-grained representation of 3D objects and scenes.
no code implementations • 26 Sep 2020 • Rijun Liao, Weizhi An, Shiqi Yu, Zhu Li, Yongzhen Huang
In this paper, we therefore introduce a Dense-View GEIs Set (DV-GEIs) to deal with the challenge of limited view angles.
no code implementations • 6 Aug 2020 • Zhu Li, Weijie Su, Dino Sejdinovic
Modern machine learning often operates in the regime where the number of parameters is much higher than the number of data points, with zero training loss and yet good generalization, thereby contradicting the classical bias-variance trade-off.
1 code implementation • 31 May 2020 • Qi Yang, Zhan Ma, Yiling Xu, Zhu Li, Jun Sun
We propose GraphSIM, an objective metric that accurately predicts the subjective quality of point clouds with superimposed geometry and color impairments.
1 code implementation • 25 May 2020 • Renlong Hang, Zhu Li, Qingshan Liu, Pedram Ghamisi, Shuvra S. Bhattacharyya
Specifically, a spectral attention sub-network and a spatial attention sub-network are proposed for spectral and spatial classification, respectively.
no code implementations • 13 Mar 2020 • Kaihua Zhang, Long Wang, Dong Liu, Bo Liu, Qingshan Liu, Zhu Li
We present an end-to-end network which stores short- and long-term video sequence information preceding the current frame as the temporal memories to address the temporal modeling in VOS.
no code implementations • 4 Feb 2020 • Renlong Hang, Zhu Li, Pedram Ghamisi, Danfeng Hong, Guiyu Xia, Qingshan Liu
For the feature-level fusion, three different fusion strategies are evaluated, including the concatenation strategy, the maximization strategy, and the summation strategy.
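The three fusion strategies named above can be sketched on plain feature vectors as follows. This is a minimal illustration rather than the paper's network: the function name `fuse_features`, the strategy labels, and the requirement that both vectors have equal length are assumptions made for this example.

```python
def fuse_features(feat_a, feat_b, strategy):
    """Fuse two equal-length feature vectors with one of three strategies.

    The names and the equal-length requirement are illustrative assumptions.
    """
    if len(feat_a) != len(feat_b):
        raise ValueError("feature vectors must have the same length")
    if strategy == "concatenation":
        # Stack the two vectors end to end; the output is twice as long.
        return list(feat_a) + list(feat_b)
    if strategy == "maximization":
        # Keep the larger response at each position.
        return [max(a, b) for a, b in zip(feat_a, feat_b)]
    if strategy == "summation":
        # Add the responses position by position.
        return [a + b for a, b in zip(feat_a, feat_b)]
    raise ValueError(f"unknown strategy: {strategy}")


feat_a = [1.0, 4.0, 2.0]
feat_b = [3.0, 0.5, 2.0]
for name in ("concatenation", "maximization", "summation"):
    print(name, fuse_features(feat_a, feat_b, name))
```

Concatenation preserves both sources at the cost of a wider feature vector, while maximization and summation keep the original width.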
no code implementations • 11 Nov 2019 • Zhu Li, Adrian Perez-Suay, Gustau Camps-Valls, Dino Sejdinovic
We present a regularization approach to this problem that trades off predictive accuracy of the learned models (with respect to biased labels) for fairness in terms of statistical parity, i.e., independence of the decisions from the sensitive covariates.
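To make the trade-off concrete, the sketch below measures a statistical parity gap for binary decisions and a binary sensitive attribute, then adds it to a loss as a penalty. This is a simplified stand-in, not the paper's regularizer: the binary setting, the absolute-difference gap, and the names `statistical_parity_gap`, `penalized_objective`, and `weight` are assumptions for illustration.

```python
def positive_rate(decisions, group, value):
    """Fraction of positive decisions among members whose group label equals value."""
    members = [d for d, g in zip(decisions, group) if g == value]
    if not members:
        raise ValueError(f"no members with group label {value}")
    return sum(members) / len(members)


def statistical_parity_gap(decisions, group):
    """Absolute difference in positive-decision rates between groups 1 and 0.

    A gap of 0 means the decisions are independent of the binary group label.
    """
    return abs(positive_rate(decisions, group, 1) - positive_rate(decisions, group, 0))


def penalized_objective(loss, decisions, group, weight):
    """Prediction loss plus a weighted parity penalty (illustrative form only)."""
    return loss + weight * statistical_parity_gap(decisions, group)


decisions = [1, 1, 1, 0, 1, 0, 0, 0]   # hypothetical binary decisions
group = [1, 1, 1, 1, 0, 0, 0, 0]       # hypothetical sensitive attribute
gap = statistical_parity_gap(decisions, group)
objective = penalized_objective(loss=0.25, decisions=decisions, group=group, weight=2.0)
print(gap, objective)
```

A larger `weight` pushes a learner toward decisions with a smaller gap, typically at some cost in accuracy on the (possibly biased) labels.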
no code implementations • 29 Jul 2019 • Wei Jia, Li Li, Zhu Li, Xiang Zhang, Shan Liu
The block-based coding structure in the hybrid video coding framework inevitably introduces compression artifacts such as blocking, ringing, etc.
no code implementations • 18 Apr 2019 • Wei Yan, Yiting Shao, Shan Liu, Thomas H. Li, Zhu Li, Ge Li
The point cloud is a fundamental 3D representation that is widely used in real-world applications such as autonomous driving.
no code implementations • 24 Jun 2018 • Zhu Li, Jean-Francois Ton, Dino Oglic, Dino Sejdinovic
We study both the standard random Fourier features method, for which we improve the existing bounds on the number of features required to guarantee the corresponding minimax risk convergence rate of kernel ridge regression, as well as a data-dependent modification which samples features proportional to ridge leverage scores and further reduces the required number of features.
no code implementations • 28 Apr 2018 • Yiting Shao, Qi Zhang, Ge Li, Zhu Li
In intra-frame compression of point cloud color attributes, results demonstrate that our method performs better than the state-of-the-art region-adaptive hierarchical transform (RAHT) system, and on average a 29.37% BD-rate gain is achieved.
Multimedia
no code implementations • 10 Sep 2017 • Bowen Cheng, Zhangyang Wang, Zhaobin Zhang, Zhu Li, Ding Liu, Jianchao Yang, Shuai Huang, Thomas S. Huang
Emotion recognition from facial expressions is tremendously useful, especially when coupled with smart devices and wireless multimedia applications.
no code implementations • 21 Feb 2017 • Li Li, Zhu Li, Madhukar Budagavi, Houqiang Li
This paper proposes a novel advanced motion model to handle the irregular motion for the cubic map projection of 360-degree video.