Search Results for author: Chen Li

Found 258 papers, 96 papers with code

Chinese Grammatical Error Diagnosis with Graph Convolution Network and Multi-task Learning

no code implementations AACL (NLP-TEA) 2020 Yikang Luo, Zuyi Bao, Chen Li, Rui Wang

For the correction subtask, we utilize the masked language model, the seq2seq model and the spelling check model to generate corrections based on the detection results.

Language Modeling Language Modelling +2

TAD-Bench: A Comprehensive Benchmark for Embedding-Based Text Anomaly Detection

no code implementations 21 Jan 2025 Yang Cao, Sikun Yang, Chen Li, Haolong Xiang, Lianyong Qi, Bo Liu, Rongsheng Li, Ming Liu

Text anomaly detection is crucial for identifying spam, misinformation, and offensive language in natural language processing tasks.

DISC: Plug-and-Play Decoding Intervention with Similarity of Characters for Chinese Spelling Check

no code implementations 17 Dec 2024 Ziheng Qiao, Houquan Zhou, Yumeng Liu, Zhenghua Li, Min Zhang, Bo Zhang, Chen Li, Ji Zhang, Fei Huang

One key characteristic of the Chinese spelling check (CSC) task is that incorrect characters are usually similar to the correct ones in either phonetics or glyph.

RemDet: Rethinking Efficient Model Design for UAV Object Detection

1 code implementation 13 Dec 2024 Chen Li, Rui Zhao, Zeyu Wang, Huiying Xu, Xinzhong Zhu

On the challenging UAV dataset VisDrone, our methods not only provide state-of-the-art results, improving detection by more than 3.4%, but also achieve 110 FPS on a single 4090.

Object object-detection +1

TopoCellGen: Generating Histopathology Cell Topology with a Diffusion Model

no code implementations 8 Dec 2024 Meilong Xu, Saumya Gupta, Xiaoling Hu, Chen Li, Shahira Abousamra, Dimitris Samaras, Prateek Prasanna, Chao Chen

Accurately modeling multi-class cell topology is crucial in digital pathology, as it provides critical insights into tissue structure and pathology.

Cell Detection

Efficient Deployment of Transformer Models in Analog In-Memory Computing Hardware

1 code implementation 26 Nov 2024 Chen Li, Corey Lammie, Manuel Le Gallo, Bipin Rajendran

Moreover, it supports on-chip adaptation to new hardware constraints and tasks without updating analog weights, providing a flexible and versatile solution for real-world AI applications.

Computational Efficiency

Noise Adaptor: Enhancing Low-Latency Spiking Neural Networks through Noise-Injected Low-Bit ANN Conversion

no code implementations 26 Nov 2024 Chen Li, Bipin Rajendran

We present Noise Adaptor, a novel method for constructing competitive low-latency spiking neural networks (SNNs) by converting noise-injected, low-bit artificial neural networks (ANNs).

Morph: A Motion-free Physics Optimization Framework for Human Motion Generation

no code implementations 22 Nov 2024 Zhuo Li, Mingshuang Luo, Ruibing Hou, Xin Zhao, Hao Liu, Hong Chang, Zimo Liu, Chen Li

Human motion generation plays a vital role in applications such as digital humans and humanoid robot control.

MORPH Motion Generation

Stacking Brick by Brick: Aligned Feature Isolation for Incremental Face Forgery Detection

no code implementations 18 Nov 2024 Jikang Cheng, Zhiyuan Yan, Ying Zhang, Li Hao, Jiaxin Ai, Qin Zou, Chen Li, Zhongyuan Wang

Incremental Face Forgery Detection (IFFD), involving gradually adding new forgery data to fine-tune the previously trained model, has been introduced as a promising strategy to deal with evolving forgery methods.

Specificity

DiHuR: Diffusion-Guided Generalizable Human Reconstruction

no code implementations 16 Nov 2024 Jinnan Chen, Chen Li, Gim Hee Lee

We introduce DiHuR, a novel Diffusion-guided model for generalizable Human 3D Reconstruction and view synthesis from sparse, minimally overlapping images.

3D Reconstruction Novel View Synthesis +1

Situational Scene Graph for Structured Human-centric Situation Understanding

1 code implementation 30 Oct 2024 Chinthani Sugandhika, Chen Li, Deepu Rajan, Basura Fernando

Based on our proposed representation, we introduce the task of situational scene graph generation and propose a multi-stage pipeline Interactive and Complementary Network (InComNet) to address the task.

Graph Generation Predicate Classification +2

MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps

1 code implementation 28 Oct 2024 Yating Xu, Chen Li, Gim Hee Lee

However, the geometry extracted from NeRF is generally inaccurate, which leads to sub-optimal detection performance.

3D Object Detection Depth Estimation +2

Evaluating AI-Generated Essays with GRE Analytical Writing Assessment

no code implementations 22 Oct 2024 Yang Zhong, Jiangang Hao, Michael Fauss, Chen Li, YuAn Wang

The recent revolutionary advance in generative AI enables the generation of realistic and coherent texts by large language models (LLMs).

BELM: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models

no code implementations 9 Oct 2024 Fangyikang Wang, Hubery Yin, Yuejiang Dong, Huminhao Zhu, Chao Zhang, Hanbin Zhao, Hui Qian, Chen Li

In this paper, we introduce a generic formulation, Bidirectional Explicit Linear Multi-step (BELM) samplers, of the exact inversion samplers, which includes all previously proposed heuristic exact inversion samplers as special cases.

Hardware-Software Co-optimised Fast and Accurate Deep Reconfigurable Spiking Inference Accelerator Architecture Design Methodology

no code implementations 7 Oct 2024 Anagha Nimbekar, Prabodh Katti, Chen Li, Bashir M. Al-Hashimi, Amit Acharyya, Bipin Rajendran

In this paper, we develop a hardware-software co-optimisation strategy to port software-trained deep neural networks (DNN) to reduced-precision spiking models demonstrating fast and accurate inference in a novel event-driven CMOS reconfigurable spiking inference accelerator.

A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction Based on Large Language Models

1 code implementation 5 Oct 2024 Houquan Zhou, Zhenghua Li, Bo Zhang, Chen Li, Shaopeng Lai, Ji Zhang, Fei Huang, Min Zhang

This work proposes a simple training-free prompt-free approach to leverage large language models (LLMs) for the Chinese spelling correction (CSC) task, which is totally different from all previous CSC approaches.

Language Modeling Language Modelling +2

Tailored Federated Learning: Leveraging Direction Regulation & Knowledge Distillation

no code implementations 29 Sep 2024 Huidong Tang, Chen Li, Huachong Yu, Sayaka Kamei, Yasuhiko Morimoto

To address such a challenge, we propose an FL optimization algorithm that integrates model delta regularization, personalized models, federated knowledge distillation, and mix-pooling.

Federated Learning Knowledge Distillation

When Molecular GAN Meets Byte-Pair Encoding

no code implementations 29 Sep 2024 Huidong Tang, Chen Li, Yasuhiko Morimoto

Deep generative models, such as generative adversarial networks (GANs), are pivotal in discovering novel drug-like candidates via de novo molecular generation.

Computational Efficiency Diversity

Spatial Visibility and Temporal Dynamics: Revolutionizing Field of View Prediction in Adaptive Point Cloud Video Streaming

no code implementations 26 Sep 2024 Chen Li, Tongyu Zong, Yueyu Hu, Yao Wang, Yong Liu

Such approaches do not explicitly consider video content's impact on viewer attention, and the conversion from FoV to point visibility is often error-prone and time-consuming.

Decision Making

Leveraging Estimated Transferability Over Human Intuition for Model Selection in Text Ranking

1 code implementation 24 Sep 2024 Jun Bai, Zhuofan Chen, Zhenzi Li, Hanhua Hong, Jianfei Zhang, Chen Li, Chenghua Lin, Wenge Rong

As a promising alternative to human intuition and brute-force fine-tuning, Transferability Estimation (TE) has emerged as an effective approach to model selection.

Model Selection Sentence +1

Spatial Diffusion for Cell Layout Generation

1 code implementation 4 Sep 2024 Chen Li, Xiaoling Hu, Shahira Abousamra, Meilong Xu, Chao Chen

In downstream tasks, we show that the generated cell layouts can be used to guide the generation of high-quality pathology images.

Cell Detection

Can We Leave Deepfake Data Behind in Training Deepfake Detector?

no code implementations 30 Aug 2024 Jikang Cheng, Zhiyuan Yan, Ying Zhang, Yuhao Luo, Zhongyuan Wang, Chen Li

The accumulation of forgery information should be oriented and progressively increasing during this transition process.

Face Swapping

HistoGym: A Reinforcement Learning Environment for Histopathological Image Analysis

1 code implementation 16 Aug 2024 Zhi-Bo Liu, Xiaobo Pang, Jizhao Wang, Shuai Liu, Chen Li

In pathological research, education, and clinical practice, the decision-making process based on pathological images is critically important.

Cancer Classification OpenAI Gym +3

DePatch: Towards Robust Adversarial Patch for Evading Person Detectors in the Real World

no code implementations 13 Aug 2024 Jikang Cheng, Ying Zhang, Zhongyuan Wang, Zou Qin, Chen Li

Recent years have seen an increasing interest in physical adversarial attacks, which aim to craft deployable patterns for deceiving deep neural networks, especially for person detectors.

ED$^4$: Explicit Data-level Debiasing for Deepfake Detection

no code implementations 13 Aug 2024 Jikang Cheng, Ying Zhang, Qin Zou, Zhiyuan Yan, Chao Liang, Zhongyuan Wang, Chen Li

Learning intrinsic bias from limited data has been considered the main reason for the failure of deepfake detection with generalizability.

DeepFake Detection Disentanglement +1

TreeSBA: Tree-Transformer for Self-Supervised Sequential Brick Assembly

1 code implementation 22 Jul 2024 Mengqi Guo, Chen Li, Yuyang Zhao, Gim Hee Lee

Based on the LEGO-Tree structure, we then design a class-agnostic tree-transformer framework to predict the sequential assembly actions from the input multi-view images.

3D Assembly Transfer Learning

We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?

1 code implementation 1 Jul 2024 Runqi Qiao, Qiuna Tan, Guanting Dong, Minhui Wu, Chong Sun, Xiaoshuai Song, Zhuoma Gongque, Shanglin Lei, Zhe Wei, Miaoxuan Zhang, Runfeng Qiao, Yifan Zhang, Xiao Zong, Yida Xu, Muxi Diao, Zhimin Bao, Chen Li, Honggang Zhang

More notably, the primary challenge of GPT-4o has significantly transitioned from IK to IG, establishing it as the first LMM advancing towards the knowledge generalization stage.

Math Mathematical Reasoning +2

PFME: A Modular Approach for Fine-grained Hallucination Detection and Editing of Large Language Models

no code implementations 29 Jun 2024 Kunquan Deng, Zeyu Huang, Chen Li, Chenghua Lin, Min Gao, Wenge Rong

In editing tasks, PFME further enhances the FActScore of FActScore-Alpaca13B and FActScore-ChatGPT datasets, increasing by 16.2pp and 4.6pp, respectively.

Hallucination Sentence

How Does Distribution Matching Help Domain Generalization: An Information-theoretic Analysis

1 code implementation 14 Jun 2024 Yuxin Dong, Tieliang Gong, Hong Chen, Shuangyong Song, Weizhan Zhang, Chen Li

Domain generalization aims to learn invariance across multiple training domains, thereby enhancing generalization against out-of-distribution data.

Domain Generalization

Generalizable Human Gaussians from Single-View Image

1 code implementation 10 Jun 2024 Jinnan Chen, Chen Li, Jianfeng Zhang, Lingting Zhu, Buzhen Huang, Hanlin Chen, Gim Hee Lee

To mitigate the potential generation of unrealistic human poses and shapes, we incorporate human priors from the SMPL-X model as a dual branch, propagating image features from the SMPL-X volume to the image Gaussians using sparse convolution and attention mechanisms.

Novel View Synthesis SSIM +1

Mamba YOLO: A Simple Baseline for Object Detection with State Space Model

1 code implementation 9 Jun 2024 Zeyu Wang, Chen Li, Huiying Xu, Xinzhong Zhu, Hongbo Li

Our contributions are as follows: 1) We propose that the ODMamba backbone introduce a State Space Model (SSM) with linear complexity to address the quadratic complexity of self-attention.

Mamba Novel Object Detection +4

VCR-GauS: View Consistent Depth-Normal Regularizer for Gaussian Surface Reconstruction

no code implementations 9 Jun 2024 Hanlin Chen, Fangyin Wei, Chen Li, Tianxin Huang, Yunsong Wang, Gim Hee Lee

Although 3D Gaussian Splatting has been widely studied because of its realistic and efficient novel-view synthesis, it is still challenging to extract a high-quality surface from the point-based representation.

Novel View Synthesis Surface Reconstruction

Medication Recommendation via Dual Molecular Modalities and Multi-Step Enhancement

1 code implementation 30 May 2024 Shi Mu, Chen Li, Xiang Li, Shunpan Liang

Existing works based on molecular knowledge neglect the 3D geometric structure of molecules and fail to learn the high-dimensional information of medications, leading to structural confusion.

Contrastive Learning Recommendation Systems

LabObf: A Label Protection Scheme for Vertical Federated Learning Through Label Obfuscation

no code implementations 27 May 2024 Ying He, Mingyang Niu, Jingyu Hua, Yunlong Mao, Xu Huang, Chen Li, Sheng Zhong

In this paper, we first propose an embedding extension attack manipulating embeddings to undermine existing defense strategies, which rely on constraining the correlation between the embeddings uploaded by participants and the labels.

Privacy Preserving Vertical Federated Learning

Detecting Adversarial Data via Perturbation Forgery

1 code implementation 25 May 2024 Qian Wang, Chen Li, Yuchen Luo, Hefei Ling, Ping Li, Jiazhong Chen, Shijuan Huang, Ning Yu

By learning to distinguish this open covering from the distribution of natural data, we can develop a detector with strong generalization capabilities against all types of adversarial attacks.

Multi-View Attentive Contextualization for Multi-View 3D Object Detection

no code implementations CVPR 2024 Xianpeng Liu, Ce Zheng, Ming Qian, Nan Xue, Chen Chen, Zhebin Zhang, Chen Li, Tianfu Wu

We present Multi-View Attentive Contextualization (MvACon), a simple yet effective method for improving 2D-to-3D feature lifting in query-based multi-view 3D (MV3D) object detection.

3D Object Detection Object +1

SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing

1 code implementation 7 May 2024 Yuying Ge, Sijie Zhao, Chen Li, Yixiao Ge, Ying Shan

In this technical report, we introduce SEED-Data-Edit: a unique hybrid dataset for instruction-guided image editing, which aims to facilitate image manipulation using open-form language.

Image Manipulation Language Modeling +3

SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation

1 code implementation 22 Apr 2024 Yuying Ge, Sijie Zhao, Jinguo Zhu, Yixiao Ge, Kun Yi, Lin Song, Chen Li, Xiaohan Ding, Ying Shan

We hope that our work will inspire future research into what can be achieved by versatile multimodal foundation models in real-world applications.

Image Generation

Knowledge-Aware Multi-Intent Contrastive Learning for Multi-Behavior Recommendation

no code implementations 18 Apr 2024 Shunpan Liang, Junjie Zhao, Chen Li, Yu Lei

This model uses relationships in the knowledge graph to construct intents, aiming to mine the connections between users' multi-behaviors from the perspective of intents to achieve more accurate recommendations.

Contrastive Learning

CausalMed: Causality-Based Personalized Medication Recommendation Centered on Patient health state

1 code implementation 18 Apr 2024 Xiang Li, Shunpan Liang, Yu Lei, Chen Li, Yulei Hou, Tengfei Ma

However, these methods are limited in capturing personalized patient representations due to the following primary limitations: (i) they are unable to capture the differences in the impact of diseases/procedures on patients across various patient health states; (ii) they fail to model the direct causal relationships between medications and the specific health states of patients, resulting in an inability to determine which specific disease each medication is treating.

Causal Discovery Causal Inference +1

Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption

1 code implementation CVPR 2024 Buzhen Huang, Chen Li, Chongyang Xu, Liang Pan, Yangang Wang, Gim Hee Lee

Specifically, we first design a latent representation based on Vector Quantised-Variational AutoEncoder (VQ-VAE) to model human interaction.

Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior

1 code implementation 16 Apr 2024 Yiqian Wu, Hao Xu, Xiangjun Tang, Xien Chen, Siyu Tang, Zhebin Zhang, Chen Li, Xiaogang Jin

Existing neural rendering-based text-to-3D-portrait generation methods typically make use of human geometry prior and diffusion models to obtain guidance.

Neural Rendering Text to 3D

Advancing Aspect-Based Sentiment Analysis through Deep Learning Models

no code implementations 4 Apr 2024 Chen Li, Huidong Tang, Jinli Zhang, Xiujing Guo, Debo Cheng, Yasuhiko Morimoto

This study introduces an innovative edge-enhanced GCN, named SentiSys, to navigate the syntactic graph while preserving intact feature information, leading to enhanced performance.

Aspect-Based Sentiment Analysis Deep Learning +2

DiSR-NeRF: Diffusion-Guided View-Consistent Super-Resolution NeRF

1 code implementation CVPR 2024 Jie Long Lee, Chen Li, Gim Hee Lee

We further introduce Renoised Score Distillation (RSD), a novel score-distillation objective for 2D image resolution.

Super-Resolution

ST-LLM: Large Language Models Are Effective Temporal Learners

1 code implementation 30 Mar 2024 Ruyang Liu, Chen Li, Haoran Tang, Yixiao Ge, Ying Shan, Ge Li

In this paper, we investigate a straightforward yet unexplored question: Can we feed all spatial-temporal tokens into the LLM, thus delegating the task of video sequence modeling to the LLMs?

MVBench Reading Comprehension +3

Molecular Generative Adversarial Network with Multi-Property Optimization

no code implementations 29 Mar 2024 Huidong Tang, Chen Li, Sayaka Kamei, Yoshihiro Yamanishi, Yasuhiko Morimoto

Deep generative models, such as generative adversarial networks (GANs), have been employed for de novo molecular generation in drug discovery.

Drug Discovery Generative Adversarial Network +1

Enhancing Industrial Transfer Learning with Style Filter: Cost Reduction and Defect-Focus

no code implementations 25 Mar 2024 Chen Li, Ruijie Ma, Xiang Qian, Xiaohao Wang, Xinghui Li

Addressing the challenge of data scarcity in industrial domains, transfer learning emerges as a pivotal paradigm.

Transfer Learning

mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding

1 code implementation 19 Mar 2024 Anwen Hu, Haiyang Xu, Jiabo Ye, Ming Yan, Liang Zhang, Bo Zhang, Chen Li, Ji Zhang, Qin Jin, Fei Huang, Jingren Zhou

In this work, we emphasize the importance of structure information in Visual Document Understanding and propose the Unified Structure Learning to boost the performance of MLLMs.

document understanding Optical Character Recognition (OCR)

Enhancing Multi-Hop Knowledge Graph Reasoning through Reward Shaping Techniques

no code implementations 9 Mar 2024 Chen Li, Haotian Zheng, Yiping Sun, Cangqing Wang, Liqiang Yu, Che Chang, Xinyu Tian, Bo Liu

In the realm of computational knowledge representation, Knowledge Graph Reasoning (KG-R) stands at the forefront of facilitating sophisticated inferential capabilities across multifarious domains.

Knowledge Graphs Navigate +1

Stochastic Analysis of Touch-Tone Frequency Recognition in Two-Way Radio Systems for Dialed Telephone Number Identification

no code implementations 9 Mar 2024 Liqiang Yu, Chen Li, Bo Liu, Chang Che

This paper focuses on recognizing dialed numbers in a touch-tone telephone system based on the Dual Tone MultiFrequency (DTMF) signaling technique with analysis of stochastic aspects during the noise and random duration of characters.

Decoder

Common 7B Language Models Already Possess Strong Math Capabilities

2 code implementations 7 Mar 2024 Chen Li, Weiqi Wang, Jingcheng Hu, Yixuan Wei, Nanning Zheng, Han Hu, Zheng Zhang, Houwen Peng

This paper shows that the LLaMA-2 7B model with common pre-training already exhibits strong mathematical abilities, as evidenced by its impressive accuracy of 97.7% and 72.0% on the GSM8K and MATH benchmarks, respectively, when selecting the best response from 256 random generations.

GSM8K Math

CIDGMed: Causal Inference-Driven Medication Recommendation with Enhanced Dual-Granularity Learning

2 code implementations 1 Mar 2024 Shunpan Liang, Xiang Li, Shi Mu, Chen Li, Yu Lei, Yulei Hou, Tengfei Ma

Medication recommendation aims to integrate patients' long-term health records to provide accurate and safe medication combinations for specific health states.

Causal Inference Recommendation Systems +1

Partially Recentralization Softmax Loss for Vision-Language Models Robustness

no code implementations 6 Feb 2024 Hao Wang, JinZhe Jiang, Xin Zhang, Chen Li

However, it has been shown that multimodal NLP models are vulnerable to adversarial attacks, where the outputs of a model can be dramatically changed by a perturbation to the input.

Adversarial Robustness Diversity

Bayesian Inference Accelerator for Spiking Neural Networks

no code implementations 27 Jan 2024 Prabodh Katti, Anagha Nimbekar, Chen Li, Amit Acharyya, Bashir M. Al-Hashimi, Bipin Rajendran

Bayesian neural networks offer better estimates of model uncertainty compared to frequentist networks.

Bayesian Inference

Consensus Focus for Object Detection and minority classes

1 code implementation 10 Jan 2024 Erik Isai Valle Salgado, Chen Li, Yaqi Han, Linchao Shi, Xinghui Li

Ensemble methods exploit the availability of a given number of classifiers or detectors trained in single or multiple source domains and tasks to address machine learning problems such as domain adaptation or multi-source transfer learning.

Domain Adaptation Long-tailed Object Detection +4

SwitchTab: Switched Autoencoders Are Effective Tabular Learners

1 code implementation 4 Jan 2024 Jing Wu, Suiyao Chen, Qi Zhao, Renat Sergazinov, Chen Li, ShengJie Liu, Chongchao Zhao, Tianpei Xie, Hanqing Guo, Cheng Ji, Daniel Cociorva, Hakan Brunzel

Self-supervised representation learning methods have achieved significant success in computer vision and natural language processing, where data samples exhibit explicit spatial or semantic dependencies.

Decoder Representation Learning

ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification

no code implementations CVPR 2024 Jiangbo Shi, Chen Li, Tieliang Gong, Yefeng Zheng, Huazhu Fu

Specifically, we propose a dual-scale visual descriptive text prompt based on the frozen large language model (LLM) to boost the performance of VLM effectively.

Decoder Descriptive +5

ESCAPE: Encoding Super-keypoints for Category-Agnostic Pose Estimation

1 code implementation CVPR 2024 Khoi Duc Nguyen, Chen Li, Gim Hee Lee

For the first step, we propose a learnable matching network to capture the relationship between the novel keypoints and the super-keypoints, resulting in a more reliable matching.

Category-Agnostic Pose Estimation Pose Estimation

Multi-Granularity Information Interaction Framework for Incomplete Utterance Rewriting

no code implementations 19 Dec 2023 Haowei Du, Dinghao Zhang, Chen Li, Yang Li, Dongyan Zhao

Recent approaches in Incomplete Utterance Rewriting (IUR) fail to capture the source of important words, which is crucial to edit the incomplete utterance, and introduce words from irrelevant utterances.

Relation-Aware Question Answering for Heterogeneous Knowledge Graphs

1 code implementation 19 Dec 2023 Haowei Du, Quzhe Huang, Chen Li, Chen Zhang, Yang Li, Dongyan Zhao

To address this issue, we construct a dual relation graph where each node denotes a relation in the original KG (primal entity graph) and edges are constructed between relations sharing the same head or tail entities.

Knowledge Base Question Answering Knowledge Graphs +1

External Knowledge Augmented Polyphone Disambiguation Using Large Language Model

no code implementations 19 Dec 2023 Chen Li

One of the key issues in Mandarin Chinese text-to-speech (TTS) systems is polyphone disambiguation when doing grapheme-to-phoneme (G2P) conversion.

Decoder Language Modeling +7

Noise Adaptor in Spiking Neural Networks

no code implementations 8 Dec 2023 Chen Li, Bipin Rajendran

Our research utilizes the ResNet model for a comprehensive analysis of the impact of the noise adaptor on low-latency SNNs.

NeuSG: Neural Implicit Surface Reconstruction with 3D Gaussian Splatting Guidance

no code implementations 1 Dec 2023 Hanlin Chen, Chen Li, Gim Hee Lee

In this work, we propose a neural implicit surface reconstruction pipeline with guidance from 3D Gaussian Splatting to recover highly detailed surfaces.

3D Reconstruction Multi-View 3D Reconstruction +1

Optimal Power Flow in Highly Renewable Power System Based on Attention Neural Networks

no code implementations 23 Nov 2023 Chen Li, Alexander Kies, Kai Zhou, Markus Schlott, Omar El Sayed, Mariia Bilousova, Horst Stoecker

The Optimal Power Flow (OPF) problem is pivotal for power system operations, guiding generator output and power distribution to meet demand at minimized costs, while adhering to physical and engineering constraints.

Imitation Learning Physics-informed machine learning

Vision-Language Instruction Tuning: A Review and Analysis

1 code implementation 14 Nov 2023 Chen Li, Yixiao Ge, Dian Li, Ying Shan

Instruction tuning is a crucial supervised training phase in Large Language Models (LLMs), aiming to enhance the LLM's ability to generalize instruction execution and adapt to user preferences.

Improving Seq2Seq Grammatical Error Correction via Decoding Interventions

1 code implementation 23 Oct 2023 Houquan Zhou, Yumeng Liu, Zhenghua Li, Min Zhang, Bo Zhang, Chen Li, Ji Zhang, Fei Huang

In this paper, we propose a unified decoding intervention framework that employs an external critic to assess the appropriateness of the token to be generated incrementally, and then dynamically influence the choice of the next token.

Decoder Grammatical Error Correction +2

Making LLaMA SEE and Draw with SEED Tokenizer

1 code implementation 2 Oct 2023 Yuying Ge, Sijie Zhao, Ziyun Zeng, Yixiao Ge, Chen Li, Xintao Wang, Ying Shan

We identify two crucial design principles: (1) Image tokens should be independent of 2D physical patch positions and instead be produced with a 1D causal dependency, exhibiting intrinsic interdependence that aligns with the left-to-right autoregressive prediction mechanism in LLMs.

multimodal generation

A Visual Interpretation-Based Self-Improved Classification System Using Virtual Adversarial Training

no code implementations 3 Sep 2023 Shuai Jiang, Sayaka Kamei, Chen Li, Shengzhe Hou, Yasuhiko Morimoto

The successful application of large pre-trained models such as BERT in natural language processing has attracted more attention from researchers.

Classification Sentiment Analysis +1

GHuNeRF: Generalizable Human NeRF from a Monocular Video

1 code implementation 31 Aug 2023 Chen Li, Jiahao Lin, Gim Hee Lee

In view of these limitations, we propose GHuNeRF to learn a generalizable human NeRF model from a monocular video of the human performer.

Calibrating Uncertainty for Semi-Supervised Crowd Counting

no code implementations ICCV 2023 Chen Li, Xiaoling Hu, Shahira Abousamra, Chao Chen

A popular approach is to iteratively generate pseudo-labels for unlabeled data and add them to the training set.

Crowd Counting

ECPC-IDS: A benchmark endometrial cancer PET/CT image dataset for evaluation of semantic segmentation and detection of hypermetabolic regions

no code implementations 16 Aug 2023 Dechao Tang, Tianming Du, Deguo Ma, Zhiyu Ma, Hongzan Sun, Marcin Grzegorzek, Huiyan Jiang, Chen Li

As far as we know, this is the first publicly available dataset of endometrial cancer with a large number of multiple images, including a large amount of information required for image and target detection.

Image Segmentation object-detection +3

DETR Doesn't Need Multi-Scale or Locality Design

1 code implementation 3 Aug 2023 Yutong Lin, Yuhui Yuan, Zheng Zhang, Chen Li, Nanning Zheng, Han Hu

This paper presents an improved DETR detector that maintains a "plain" nature: using a single-scale feature map and global cross-attention calculations without specific locality constraints, in contrast to previous leading DETR-based detectors that reintroduce architectural inductive biases of multi-scale and locality into the decoder.

Decoder

Weakly-supervised 3D Pose Transfer with Keypoints

1 code implementation ICCV 2023 Jinnan Chen, Chen Li, Gim Hee Lee

The main challenges of 3D pose transfer are: 1) Lack of paired training data with different characters performing the same pose; 2) Disentangling pose and shape information from the target mesh; 3) Difficulty in applying to meshes with different topologies.

Pose Transfer

Confidence Estimation Using Unlabeled Data

1 code implementation 19 Jul 2023 Chen Li, Xiaoling Hu, Chao Chen

We stipulate that even with limited training labels, we can still reasonably approximate the confidence of the model on unlabeled samples by inspecting the prediction consistency through the training process.

Active Learning Image Classification

PTVD: A Large-Scale Plot-Oriented Multimodal Dataset Based on Television Dramas

1 code implementation 26 Jun 2023 Chen Li, Xutan Peng, Teng Wang, Yixiao Ge, Mengyang Liu, Xuyuan Xu, Yexin Wang, Ying Shan

Art forms such as movies and television (TV) dramas are reflections of the real world, which have attracted much attention from the multimodal learning community recently.

Genre classification Retrieval +1

NaSGEC: a Multi-Domain Chinese Grammatical Error Correction Dataset from Native Speaker Texts

1 code implementation 25 May 2023 Yue Zhang, Bo Zhang, Haochen Jiang, Zhenghua Li, Chen Li, Fei Huang, Min Zhang

We introduce NaSGEC, a new dataset to facilitate research on Chinese grammatical error correction (CGEC) for native speaker texts from multiple domains.

Grammatical Error Correction

OD-NeRF: Efficient Training of On-the-Fly Dynamic Neural Radiance Fields

no code implementations 24 May 2023 Zhiwen Yan, Chen Li, Gim Hee Lee

Dynamic neural radiance fields (dynamic NeRFs) have demonstrated impressive results in novel view synthesis on 3D dynamic scenes.

Novel View Synthesis

A unified front-end framework for English text-to-speech synthesis

no code implementations 18 May 2023 Zelin Ying, Chen Li, Yu Dong, Qiuqiang Kong, Qiao Tian, YuanYuan Huo, Yuxuan Wang

The front-end is a critical component of English text-to-speech (TTS) systems, responsible for extracting linguistic features that are essential for a text-to-speech model to synthesize speech, such as prosodies and phonemes.

Speech Synthesis Text to Speech +1

Ripple Knowledge Graph Convolutional Networks For Recommendation Systems

no code implementations 2 May 2023 Chen Li, Yang Cao, Ye Zhu, Debo Cheng, Chengyuan Li, Yasuhiko Morimoto

Using knowledge graphs to assist deep learning models in making recommendation decisions has recently been proven to effectively improve the model's interpretability and accuracy.

Deep Learning Knowledge Graphs +1

Understanding the Generalization Ability of Deep Learning Algorithms: A Kernelized Renyi's Entropy Perspective

1 code implementation 2 May 2023 Yuxin Dong, Tieliang Gong, Hong Chen, Chen Li

However, the current generalization error bounds within this framework are still far from optimal, while substantial improvements on these bounds are quite challenging due to the intractability of high-dimensional information quantities.

TagGPT: Large Language Models are Zero-shot Multimodal Taggers

1 code implementation6 Apr 2023 Chen Li, Yixiao Ge, Jiayong Mao, Dian Li, Ying Shan

Given a new entity that needs tagging for distribution, TagGPT introduces two alternative options for zero-shot tagging, i.e., a generative method with late semantic matching with the tag set, and another selective method with early matching in prompts.

Optical Character Recognition (OCR) Prompt Engineering +5

ScarceNet: Animal Pose Estimation with Scarce Annotations

1 code implementation CVPR 2023 Chen Li, Gim Hee Lee

To this end, we propose the ScarceNet, a pseudo label-based approach to generate artificial labels for the unlabeled images.

Animal Pose Estimation Domain Adaptation +1

NeRF-DS: Neural Radiance Fields for Dynamic Specular Objects

1 code implementation CVPR 2023 Zhiwen Yan, Chen Li, Gim Hee Lee

We evaluate our model based on the novel view synthesis quality with a self-collected dataset of different moving specular objects in realistic environments.

Novel View Synthesis

Distribution-restrained Softmax Loss for the Model Robustness

no code implementations22 Mar 2023 Hao Wang, Chen Li, JinZhe Jiang, Xin Zhang, YaQian Zhao, Weifeng Gong

Recently, the robustness of deep learning models has received widespread attention, and various methods for improving model robustness have been proposed, including adversarial training, model architecture modification, design of loss functions, certified defenses, and so on.

Diversity

DataLight: Offline Data-Driven Traffic Signal Control

1 code implementation20 Mar 2023 Liang Zhang, Yutong Zhang, Jianming Deng, Chen Li

Reinforcement learning (RL) has emerged as a promising solution for addressing traffic signal control (TSC) challenges.

Offline RL Reinforcement Learning (RL) +1

Unleashing the Potential of Spiking Neural Networks by Dynamic Confidence

1 code implementation17 Mar 2023 Chen Li, Edward Jones, Steve Furber

In this regard, Dynamic Confidence represents a meaningful step toward realizing the potential of SNNs.

Decision Making

Efficient Diffusion Training via Min-SNR Weighting Strategy

2 code implementations ICCV 2023 Tiankai Hang, Shuyang Gu, Chen Li, Jianmin Bao, Dong Chen, Han Hu, Xin Geng, Baining Guo

Denoising diffusion models have been a mainstream approach for image generation; however, training these models often suffers from slow convergence.

Denoising Image Generation +2

A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT

no code implementations18 Feb 2023 Ce Zhou, Qian Li, Chen Li, Jun Yu, Yixin Liu, Guangjing Wang, Kai Zhang, Cheng Ji, Qiben Yan, Lifang He, Hao Peng, JianXin Li, Jia Wu, Ziwei Liu, Pengtao Xie, Caiming Xiong, Jian Pei, Philip S. Yu, Lichao Sun

This study provides a comprehensive review of recent research advancements, challenges, and opportunities for PFMs in text, image, graph, as well as other data modalities.

Graph Learning Language Modelling +1

ACTIVE: A Deep Model for Sperm and Impurity Detection in Microscopic Videos

no code implementations15 Jan 2023 Ao Chen, Jinghua Zhang, Md Mamunur Rahaman, Hongzan Sun, M. D., Tieyong Zeng, Marcin Grzegorzek, Feng-Lei Fan, Chen Li

The accurate detection of sperms and impurities is a very challenging task, facing problems such as the small size of targets, indefinite target morphologies, low contrast and resolution of the video, and similarity of sperms and impurities.

Deep Learning object-detection +1

All in Tokens: Unifying Output Space of Visual Tasks via Soft Token

1 code implementation ICCV 2023 Jia Ning, Chen Li, Zheng Zhang, Zigang Geng, Qi Dai, Kun He, Han Hu

With these new techniques and other designs, we show that the proposed general-purpose task-solver can perform both instance segmentation and depth estimation well.

Instance Segmentation Monocular Depth Estimation +1

Unleashing the Potential of Spiking Neural Networks with Dynamic Confidence

no code implementations ICCV 2023 Chen Li, Edward G Jones, Steve Furber

In this regard, Dynamic Confidence represents a meaningful step toward realizing the potential of SNNs.

Decision Making

Weak-Shot Object Detection Through Mutual Knowledge Transfer

no code implementations CVPR 2023 Xuanyi Du, Weitao Wan, Chong Sun, Chen Li

We propose a novel Knowledge Transfer (KT) loss which simultaneously distills the knowledge of objectness and class entropy from a proposal generator trained on the S dataset to optimize a multiple instance learning module on the T dataset.

Multiple Instance Learning Object +3

DETR Does Not Need Multi-Scale or Locality Design

1 code implementation ICCV 2023 Yutong Lin, Yuhui Yuan, Zheng Zhang, Chen Li, Nanning Zheng, Han Hu

This paper presents an improved DETR detector that maintains a "plain" nature: using a single-scale feature map and global cross-attention calculations without specific locality constraints, in contrast to previous leading DETR-based detectors that reintroduce architectural inductive biases of multi-scale and locality into the decoder.

Ranked #12 on Object Detection on COCO test-dev (using extra training data)

Decoder Object Detection

Robust and Fast Measure of Information via Low-rank Representation

1 code implementation30 Nov 2022 Yuxin Dong, Tieliang Gong, Shujian Yu, Hong Chen, Chen Li

The matrix-based Rényi's entropy allows us to directly quantify information measures from given data, without explicit estimation of the underlying probability distribution.

Computational Efficiency

A Knowledge-based Learning Framework for Self-supervised Pre-training Towards Enhanced Recognition of Biomedical Microscopy Images

1 code implementation27 Nov 2022 Wei Chen, Chen Li, Dan Chen, Xin Luo

Self-supervised pre-training has become the priory choice to establish reliable neural networks for automated recognition of massive biomedical microscopy images, which are routinely annotation-free, without semantics, and without guarantee of quality.

Contrastive Learning Image Restoration +3

DynamicLight: Two-Stage Dynamic Traffic Signal Timing

1 code implementation2 Nov 2022 Liang Zhang, Yutong Zhang, Shubin Xie, Jianming Deng, Chen Li

Reinforcement learning (RL) is gaining popularity as an effective approach for traffic signal control (TSC) and is increasingly applied in this domain.

Q-Learning Reinforcement Learning (RL) +1

Predictive Edge Caching through Deep Mining of Sequential Patterns in User Content Retrievals

no code implementations6 Oct 2022 Chen Li, Xiaoyu Wang, Tongyu Zong, Houwei Cao, Yong Liu

Edge caching plays an increasingly important role in boosting user content retrieval performance while reducing redundant network traffic.

Retrieval

Quasi-supervised Learning for Super-resolution PET

1 code implementation3 Sep 2022 Guangtong Yang, Chen Li, YuDong Yao, Ge Wang, Yueyang Teng

In this paper, we propose a quasi-supervised learning method, which is a new type of weakly-supervised learning method, to recover HR PET images from LR counterparts by leveraging similarity between unpaired LR and HR image patches.

Generative Adversarial Network Super-Resolution +1

A Multi-Channel Next POI Recommendation Framework with Multi-Granularity Check-in Signals

1 code implementation1 Sep 2022 Zhu Sun, Yu Lei, Lu Zhang, Chen Li, Yew-Soon Ong, Jie Zhang

Being equipped with three modules (i.e., global user behavior encoder, local multi-channel encoder, and region-aware weighting strategy), MCMG is capable of capturing both fine- and coarse-grained sequential regularities as well as exploring the dynamic impact of multi-channel by differentiating the region check-in patterns.

Segmentation of Weakly Visible Environmental Microorganism Images Using Pair-wise Deep Learning Features

no code implementations31 Aug 2022 Frank Kulwa, Chen Li, Marcin Grzegorzek, Md Mamunur Rahaman, Kimiaki Shirahama, Sergey Kosov

The use of PDLFs enables the network to focus more on the foreground (EMs) by concatenating the pairwise deep learning features of each image to different blocks of the base model SegNet.

Deep Learning Specificity

An Energy Activity Dataset for Smart Homes

no code implementations29 Aug 2022 Chen Li

The proposed energy activity dataset (EAD) has a high data type diversity in contrast to existing load monitoring datasets.

Diversity Miscellaneous +2

Artificial Neural Networks for Finger Vein Recognition: A Survey

no code implementations29 Aug 2022 Yimin Yin, Renye Zhang, PengFei Liu, Wanxia Deng, Siliang He, Chen Li, Jinghua Zhang

To our best knowledge, this paper is the first comprehensive survey focusing on finger vein recognition based on artificial neural networks.

Feature Engineering Finger Vein Recognition +1

Simulation of snakes using vertical body bending to traverse terrain with large height variation

no code implementations26 Jul 2022 Yifeng Zhang, Qihan Xuan, Qiyuan Fu, Chen Li

Remarkably, even when frictional drag is low (snake-terrain kinetic friction coefficient of 0.20), the body must push against the wedge with a pressure 5 times that from body weight to generate sufficient forward propulsion to move forward.

Friction

Deep Learning for Finger Vein Recognition: A Brief Survey of Recent Trend

no code implementations5 Jul 2022 Renye Zhang, Yimin Yin, Wanxia Deng, Chen Li, Jinghua Zhang

Finger vein image recognition technology plays an important role in biometric recognition and has been successfully applied in many fields.

Finger Vein Recognition

Adaptive Weighted Nonnegative Matrix Factorization for Robust Feature Representation

1 code implementation7 Jun 2022 Tingting Shen, Junhang Li, Can Tong, Qiang He, Chen Li, YuDong Yao, Yueyang Teng

Nonnegative matrix factorization (NMF) has been widely used for dimensionality reduction in machine learning.

Dimensionality Reduction

A Transistor Operations Model for Deep Learning Energy Consumption Scaling Law

no code implementations30 May 2022 Chen Li, Antonios Tsourdos, Weisi Guo

At a general computational energy model level, there is a strong dependency on both the hardware architecture (e.g., generic processors with different configurations of inner components - CPU and GPU, programmable integrated circuits - FPGA) and different interacting energy consumption aspects (e.g., data movement, calculation, control).

A Comparative Study of Gastric Histopathology Sub-size Image Classification: from Linear Regression to Visual Transformer

no code implementations25 May 2022 Weiming Hu, HaoYuan Chen, Wanli Liu, Xiaoyan Li, Hongzan Sun, Xinyu Huang, Marcin Grzegorzek, Chen Li

Ensemble learning is a way to improve the accuracy of algorithms, and finding multiple learning models with complementarity types is the basis of ensemble learning.

BIG-bench Machine Learning Deep Learning +3

Optimal Randomized Approximations for Matrix based Renyi's Entropy

no code implementations16 May 2022 Yuxin Dong, Tieliang Gong, Shujian Yu, Chen Li

The Matrix-based Renyi's entropy enables us to directly measure information quantities from given data without the costly probability density estimation of underlying distributions, thus has been widely adopted in numerous statistical learning and inference tasks.

Density Estimation

MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction

2 code implementations NAACL 2022 Yue Zhang, Zhenghua Li, Zuyi Bao, Jiacheng Li, Bo Zhang, Chen Li, Fei Huang, Min Zhang

This paper presents MuCGEC, a multi-reference multi-source evaluation dataset for Chinese Grammatical Error Correction (CGEC), consisting of 7,063 sentences collected from three Chinese-as-a-Second-Language (CSL) learner sources.

Grammatical Error Correction Sentence